- Minor fix to basisu_frontend::optimize_selector_codebook() so the new_selector_cluster_indices array is initialized correctly. (The backend doesn't use this array so no harm was being done.)
In find_optimal_selector_clusters_for_each_block, a noticeable amount of
time is spent computing color distances between a block's pixels and the
colors selected by each candidate cluster. Each of the 16 pixels in a block
is only ever compared against 4 colors, so there are just 16 x 4 = 64
unique color deltas per block; the cluster count, however, is typically
~200, and we previously computed 16 deltas for each cluster (~3200 total).
It's thus cheaper to precompute all 64 deltas ahead of time and just add
the right deltas up for each cluster.
This reduces the time to encode a 2Kx2K image with a mip chain in a
single thread with SSE4.1 enabled from 7.8 seconds to 7.3 seconds; the
resulting image is binary identical before/after this change.
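
The precomputation described above can be sketched roughly as follows. This is a minimal illustration, not the actual basisu code: the types (Rgb, Block, Palette, Selectors), the function names, and the plain squared-RGB distance are all assumptions made for the example, and the real encoder may use a different metric and data layout.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical types for illustration only (not basisu's own).
struct Rgb { int r, g, b; };
using Block     = std::array<Rgb, 16>;      // 4x4 source pixels
using Palette   = std::array<Rgb, 4>;       // the 4 colors a pixel can map to
using Selectors = std::array<uint8_t, 16>;  // one selector (0..3) per pixel

// Assumed metric: plain squared RGB distance.
inline uint32_t color_delta(const Rgb& a, const Rgb& b) {
    const int dr = a.r - b.r, dg = a.g - b.g, db = a.b - b.b;
    return static_cast<uint32_t>(dr * dr + dg * dg + db * db);
}

// Baseline: recompute 16 color deltas for every cluster.
size_t best_cluster_naive(const Block& px, const Palette& pal,
                          const std::vector<Selectors>& clusters) {
    size_t best = 0;
    uint64_t best_err = std::numeric_limits<uint64_t>::max();
    for (size_t c = 0; c < clusters.size(); ++c) {
        uint64_t err = 0;
        for (size_t p = 0; p < 16; ++p) {
            assert(clusters[c][p] < 4);
            err += color_delta(px[p], pal[clusters[c][p]]);
        }
        if (err < best_err) { best_err = err; best = c; }
    }
    return best;  // 0 if clusters is empty
}

// Precompute the 16 x 4 = 64 deltas once, then each cluster only needs
// 16 table lookups and additions.
size_t best_cluster_precomputed(const Block& px, const Palette& pal,
                                const std::vector<Selectors>& clusters) {
    uint32_t delta[16][4];
    for (size_t p = 0; p < 16; ++p)
        for (size_t s = 0; s < 4; ++s)
            delta[p][s] = color_delta(px[p], pal[s]);

    size_t best = 0;
    uint64_t best_err = std::numeric_limits<uint64_t>::max();
    for (size_t c = 0; c < clusters.size(); ++c) {
        uint64_t err = 0;
        for (size_t p = 0; p < 16; ++p) {
            assert(clusters[c][p] < 4);
            err += delta[p][clusters[c][p]];
        }
        if (err < best_err) { best_err = err; best = c; }
    }
    return best;  // 0 if clusters is empty
}
```

Both functions sum the same values in the same order and break ties the same way, so they pick the same cluster; that matches the note above that the output is unchanged by the optimization.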
- Fixed find_optimal_selector_clusters_for_each_block() so that, if some clusters don't get assigned any blocks during refinement, the selector cluster block index array is still correctly sized.
- Optimized ETC1S encoder (3-4.5x faster)
- Added optional SSE 4.1 support to encoder
- Switched from std::vector to a custom vector in the encoder and transcoder
- Added CppSPMD (SSE only for now)
- UASTC RDO is now more effective, but the command line parameter controlling quality has changed (to "lambda")
- Encoder is now a library in the "encoder" directory
- JavaScript wrappers now expose the entire codec: encoder, transcoder, container independent transcoding, and .basis file information