Interleaving Bits with x86 SIMD instructions (SSE)
The first snippet requires about 1.24 cycles per output byte,
and the second about 1.83 cycles per output byte (measured on my Core 2 Duo).
This work is licensed under the terms of the GPL.
Compiler flags: -msse2 -O2 -mtune=native -march=native -flax-vector-conversions
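For orientation, here is a minimal sketch of what bit interleaving with SSE2 can look like. It is not one of the two snippets timed above, and it makes an assumption about the operation: each output 16-bit word takes the bits of one byte from stream x at the even positions and the bits of one byte from stream y at the odd positions (a Morton-style interleave). The names `spread_bits`, `interleave_scalar` and `interleave16_sse2` are hypothetical, chosen for this illustration only.

```c
#include <stdint.h>
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Spread the 8 bits of b to the even bit positions of a 16-bit word. */
static uint16_t spread_bits(uint8_t b)
{
    uint16_t v = b;
    v = (uint16_t)((v | (v << 4)) & 0x0F0F);
    v = (uint16_t)((v | (v << 2)) & 0x3333);
    v = (uint16_t)((v | (v << 1)) & 0x5555);
    return v;
}

/* Scalar reference: bits of x land on even positions, bits of y on odd. */
static uint16_t interleave_scalar(uint8_t x, uint8_t y)
{
    return (uint16_t)(spread_bits(x) | (spread_bits(y) << 1));
}

/* Same spreading steps as spread_bits, applied to eight 16-bit lanes. */
static __m128i spread_bits_sse2(__m128i v)
{
    v = _mm_and_si128(_mm_or_si128(v, _mm_slli_epi16(v, 4)),
                      _mm_set1_epi16(0x0F0F));
    v = _mm_and_si128(_mm_or_si128(v, _mm_slli_epi16(v, 2)),
                      _mm_set1_epi16(0x3333));
    v = _mm_and_si128(_mm_or_si128(v, _mm_slli_epi16(v, 1)),
                      _mm_set1_epi16(0x5555));
    return v;
}

/* Interleave 16 bytes of x with 16 bytes of y into 16 output words
 * (32 output bytes). Assumption of this sketch: unaligned buffers. */
static void interleave16_sse2(const uint8_t *x, const uint8_t *y,
                              uint16_t *out)
{
    const __m128i zero = _mm_setzero_si128();
    __m128i vx = _mm_loadu_si128((const __m128i *)x);
    __m128i vy = _mm_loadu_si128((const __m128i *)y);

    /* Zero-extend the low and high 8 bytes to 16-bit lanes. */
    __m128i x_lo = spread_bits_sse2(_mm_unpacklo_epi8(vx, zero));
    __m128i x_hi = spread_bits_sse2(_mm_unpackhi_epi8(vx, zero));
    __m128i y_lo = spread_bits_sse2(_mm_unpacklo_epi8(vy, zero));
    __m128i y_hi = spread_bits_sse2(_mm_unpackhi_epi8(vy, zero));

    /* Shift the y bits to the odd positions and merge. */
    __m128i lo = _mm_or_si128(x_lo, _mm_slli_epi16(y_lo, 1));
    __m128i hi = _mm_or_si128(x_hi, _mm_slli_epi16(y_hi, 1));

    _mm_storeu_si128((__m128i *)out, lo);
    _mm_storeu_si128((__m128i *)(out + 8), hi);
}
```

The scalar function doubles as a reference for checking the vector version lane by lane. The sketch uses explicit casts, so of the flags listed above it should only need SSE2 to be enabled; no timing claim is made for it.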
Code Snippet 1
Code Snippet 2