A Gbit/s convolutional encoder on x86 GPP with SSSE3 SIMD

Overview

Author: Dominik Auras
Type: C code
State: Done
License: GPL
Features:
  • Fast convolutional encoder
  • Exploits the palignr SIMD instruction introduced with SSSE3
  • DVB-T mother code, r = 1/2

Introduction

Useful links:
Wikipedia on "Convolutional code"
Wikipedia on DVB-T

The encoder uses the code polynomials G0 = 1 + x^1 + x^2 + x^3 + x^6 and G1 = 1 + x^2 + x^3 + x^5 + x^6 (171 and 133 in octal). These are the generator polynomials of the DVB-T mother code as defined in ETSI EN 300 744 V1.5.1.
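As a point of comparison for the SIMD version, the two polynomials can be written down as a plain bit-serial encoder. This is only a reference sketch, not the code from the archive: the function name conv_encode_ref, the tap-mask macros and the one-bit-per-byte data layout are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Tap masks: bit k set means "use the input bit delayed by k steps". */
#define G0_TAPS 0x4Fu /* 1 + x + x^2 + x^3 + x^6   (171 octal) */
#define G1_TAPS 0x6Du /* 1 + x^2 + x^3 + x^5 + x^6 (133 octal) */

/* XOR of all bits of a value that fits in 8 bits. */
static unsigned parity8(unsigned v)
{
    v ^= v >> 4;
    v ^= v >> 2;
    v ^= v >> 1;
    return v & 1u;
}

/* Hypothetical reference encoder: in[] holds one input bit per byte,
 * out0[]/out1[] receive one code bit per byte for G0 and G1.
 * The shift register starts at zero. */
void conv_encode_ref(const uint8_t *in, size_t nbits,
                     uint8_t *out0, uint8_t *out1)
{
    assert(in != NULL && out0 != NULL && out1 != NULL);
    unsigned state = 0; /* bit k = input bit from k steps ago */
    for (size_t n = 0; n < nbits; n++) {
        state = ((state << 1) | (in[n] & 1u)) & 0x7Fu;
        out0[n] = (uint8_t)parity8(state & G0_TAPS);
        out1[n] = (uint8_t)parity8(state & G1_TAPS);
    }
}
```

Feeding a single 1 followed by zeros returns the tap patterns themselves, which is a quick way to check that the masks match the polynomials.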

Implementation

The x86 SIMD instruction set SSSE3 offers the wonderful palignr instruction. It takes three operands, two 128-bit registers and one immediate. It concatenates the two registers into a 256-bit value, shifts that value right by the number of bytes given in the immediate, and returns the lower 128 bits. Using this instruction, we assemble 128-bit vectors from the input that partially overlap (by 8 bits).
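The behaviour of palignr is easy to get wrong, so here is a portable scalar model of it. It is a sketch for illustration only: bytes128 and palignr_model are made-up names, and the real code would use the instruction itself (available in C as the _mm_alignr_epi8 intrinsic, whose first argument is the high half).

```c
#include <assert.h>
#include <stdint.h>

/* Scalar stand-in for a 128-bit register; b[0] is the least significant byte. */
typedef struct { uint8_t b[16]; } bytes128;

/* Model of palignr: concatenate hi:lo (32 bytes), shift right by imm bytes,
 * keep the low 16 bytes. Bytes shifted in from beyond hi are zero. */
bytes128 palignr_model(bytes128 hi, bytes128 lo, unsigned imm)
{
    assert(imm < 256u); /* the immediate is an 8-bit field */
    bytes128 r;
    for (unsigned i = 0; i < 16; i++) {
        unsigned j = i + imm; /* source index in the 32-byte concatenation */
        r.b[i] = (j < 16) ? lo.b[j] : (j < 32) ? hi.b[j - 16] : 0;
    }
    return r;
}
```

With lo holding input bytes 0..15 and hi holding bytes 16..31, palignr_model(hi, lo, 15) yields bytes 15..30, i.e. a vector that starts one byte (8 bits) before the next aligned load. Passing the same register twice with an immediate of 8 swaps its two 64-bit halves.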

We then right shift the assembled vectors (by 1, 2, 3, 5 and 6 bits) and XOR them; G0 uses the shifts by 1, 2, 3 and 6, G1 those by 2, 3, 5 and 6. Since there is no instruction that shifts a whole 128-bit vector by a number of bits (the 128-bit shifts only move whole bytes), we use a trick: the 64-bit SIMD right shift (which shifts the two 64-bit operands of a 128-bit vector independently), the SIMD left shift, and again palignr. With palignr, we swap the two 64-bit operands. A right shift by 1 bit then looks like this: a1 = (a >> 1) | (swap64(a) << 63).
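The shift trick can be modelled in portable C with two 64-bit lanes. This is a sketch under assumptions, not the archive's code: u64x2, shr_bits and g0_taps are hypothetical names, and the bit order (older input bits at higher bit positions) is my assumption. In real SSSE3 code the helpers would correspond to psrlq, psllq, palignr with an immediate of 8, por and pxor.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar stand-in for one 128-bit register seen as two 64-bit lanes. */
typedef struct { uint64_t lo, hi; } u64x2;

static u64x2 srl64(u64x2 a, unsigned n) { u64x2 r = { a.lo >> n, a.hi >> n }; return r; }
static u64x2 sll64(u64x2 a, unsigned n) { u64x2 r = { a.lo << n, a.hi << n }; return r; }
static u64x2 swap64(u64x2 a)            { u64x2 r = { a.hi, a.lo }; return r; }
static u64x2 or128(u64x2 a, u64x2 b)    { u64x2 r = { a.lo | b.lo, a.hi | b.hi }; return r; }
static u64x2 xor128(u64x2 a, u64x2 b)   { u64x2 r = { a.lo ^ b.lo, a.hi ^ b.hi }; return r; }

/* (a >> n) | (swap64(a) << (64 - n)), for n = 1..63. The low lane receives
 * the bits that cross over from the high lane. The high lane in turn
 * receives the low lane's bottom n bits, so its top n bits are not a true
 * logical shift and must not be used. */
u64x2 shr_bits(u64x2 a, unsigned n)
{
    assert(n >= 1 && n <= 63);
    return or128(srl64(a, n), sll64(swap64(a), 64 - n));
}

/* XOR of the shifts selected by G0 = 1 + x + x^2 + x^3 + x^6. */
u64x2 g0_taps(u64x2 a)
{
    u64x2 y = a;
    y = xor128(y, shr_bits(a, 1));
    y = xor128(y, shr_bits(a, 2));
    y = xor128(y, shr_bits(a, 3));
    y = xor128(y, shr_bits(a, 6));
    return y;
}
```

Note that the expression as written behaves like a rotation across the full 128 bits. Presumably this is why the vectors overlap by 8 bits: the largest shift is 6, so the affected bits can be dropped when the overlap is undone. That reading is mine and is not stated in the source.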

Finally, after the XOR, palignr is used once more to undo the partial overlap.

The encoder processes blocks of 1920 bits (240 bytes), producing 2 x 1920 code bits. It also requires the last 16 bytes of the previous block (all zero for the first block), so the input is an array of 256 bytes. The output is stored in two 240-byte arrays.
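The buffer sizes follow directly from these numbers. The sketch below spells them out and shows one way a caller might carry the 16-byte history from block to block. The struct, the function conv_block_load and the choice to place the history at the start of the input array are all assumptions; the archive may arrange this differently.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum {
    BLOCK_BITS    = 1920,
    BLOCK_BYTES   = BLOCK_BITS / 8,              /* 240 */
    HISTORY_BYTES = 16,
    INPUT_BYTES   = HISTORY_BYTES + BLOCK_BYTES  /* 256 */
};

/* Hypothetical container for one encoder call. */
typedef struct {
    uint8_t in[INPUT_BYTES];   /* assumed layout: 16 history bytes, then payload */
    uint8_t out0[BLOCK_BYTES]; /* code bits for G0 */
    uint8_t out1[BLOCK_BYTES]; /* code bits for G1 */
} conv_block;

/* Prepare blk->in for the next call. For the first block the history is
 * zeroed; otherwise the tail of the previous input becomes the history. */
void conv_block_load(conv_block *blk, const uint8_t *payload, int first)
{
    assert(blk != NULL && payload != NULL);
    if (first)
        memset(blk->in, 0, HISTORY_BYTES);
    else
        memmove(blk->in, blk->in + INPUT_BYTES - HISTORY_BYTES, HISTORY_BYTES);
    memcpy(blk->in + HISTORY_BYTES, payload, BLOCK_BYTES);
}
```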

Performance figures

Measured on a Core 2 Duo at 2.66 GHz:
sudo chrt -f 99 taskset 1 time ./conv_test_inlined
Inlined: ~408 cycles / 1920 bit, i.e. ~12.5 Gbit/s
Not inlined: ~1144 cycles / 1920 bit, i.e. ~4.4 Gbit/s
Note: the benchmark runs the encoder several times on the same input and output arrays, and nothing else reads or writes these memory locations between runs. They are therefore certainly held entirely in cache, which is unlikely in a real application.
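The conversion from cycle counts to throughput is simple arithmetic: bits per block divided by cycles per block, times the clock frequency. A small helper (throughput_gbit_s is a hypothetical name, not part of the archive) makes the figures reproducible.

```c
#include <assert.h>

/* Throughput in Gbit/s for a block of `bits` input bits that takes
 * `cycles` clock cycles on a core running at `clock_hz`. */
double throughput_gbit_s(double bits, double cycles, double clock_hz)
{
    assert(cycles > 0.0);
    return bits / cycles * clock_hz / 1e9;
}
```

For the inlined case, throughput_gbit_s(1920, 408, 2.66e9) gives about 12.5; for the non-inlined case, throughput_gbit_s(1920, 1144, 2.66e9) gives about 4.46.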

License

This work is licensed under the terms of the GPL.

Source code

Convolutional Encoder 0.1 TAR-Archive 2009-12-12 8.5kb