| Author | Dominik Auras |
| Type | VHDL-Implementation of CORDIC algorithm integrated in USRP FPGA code |
| State | Verified |
| Benefit |
|
| Changelog | 2008-06-13: structural architecture with megafunction, minimizing logic cell count (-40%) |
See USRP and GNU Radio. More documentation is in preparation.
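// Instantiation of the VHDL CORDIC unit from the Verilog code
cordic_vhdl rx_cordic (
    .clock(clock), .reset(reset), .enable(enable),
    .xi(i_in), .yi(q_in), .zi(phase[31:16]),  // upper 16 bits of phase
    .xo(bb_i), .yo(bb_q), .zo()               // zo left unconnected
);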
The code is fully parameterized. You can vary the bit widths of the X and Y signals, the bit width of
the Z signal, and the number of stages. The pipeline narrows intermediate signals if desired (change the
constant "narrow_z" in the pipeline code if you don't want that feature). For each stage, the angle
increment constant is computed during analysis/elaboration; these constants are not predefined in the code.
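As an illustration of what computing the per-stage constants amounts to, here is a minimal Python sketch. It assumes the usual CORDIC angle table atan(2^-i) and a Z scaling in which the full Z range covers one full circle. Both the scaling and the name `cordic_angle_constants` are assumptions made for this example; they are not taken from the VHDL source.

```python
import math

def cordic_angle_constants(z_width, stages):
    """Return atan(2**-i) for each stage i as an integer phase word.

    Assumption: 2**z_width corresponds to a full circle (2*pi), so
    45 degrees maps to 2**z_width / 8.
    """
    scale = 2 ** z_width / (2 * math.pi)
    return [round(math.atan(2.0 ** -i) * scale) for i in range(stages)]

if __name__ == "__main__":
    # Hypothetical parameter set: 16-bit Z signal, 12 stages.
    for i, c in enumerate(cordic_angle_constants(16, 12)):
        print(f"stage {i:2d}: {c}")
```

Deriving the table from the two parameters, instead of listing it by hand, is what lets the Z bit width and the number of stages change without editing any constants.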
Behavioral architecture:
Compared to the existing Verilog implementation, this architecture frees 150 logic cells (LCs) per
instantiation: the existing unit consumes 976 LCs, while the presented one takes only 826 LCs. The
reduction comes from narrowing signals, which shrinks register and adder bit widths. There is still
a small gain even if you don't narrow the signals.
Structural architecture:
The update from June 2008 introduced a new architecture that contains a structural description
of the algorithmic unit. It matches the hardware representation more closely and therefore optimizes
the logic cell usage. Specifically for the Altera FPGA, the Altera megafunction for a combined adder
and subtractor is explicitly instantiated. Every logic cell consists of a register and a LUT that
performs Boolean logic. Nearly all the logic we need now sits in a cell whose register is also used;
only a few cells are left that perform logic-only operations. Since the number of registers is dictated
by our parameter set, the optimal logic cell count equals the minimum number of required registers.
The implementation comes very close to this optimum.
Here are the numbers: the unit consumes 588 logic cells instead of 976, a reduction of 388 LCs
or almost 40%. The total LC reduction is about 3.2% per CORDIC unit, and therefore ~6.4% less space
is used for the standard configuration (2rx w/ hb, 2tx).
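The quoted figures can be re-derived with a few lines of arithmetic. The sketch below uses only the logic cell counts stated in the text. Reading "2rx" as two CORDIC instances in the FPGA is an assumption made here to explain the step from 3.2% to ~6.4%.

```python
# Logic cell (LC) counts quoted in the text.
LC_VERILOG = 976      # existing Verilog implementation
LC_BEHAVIORAL = 826   # behavioral VHDL architecture
LC_STRUCTURAL = 588   # structural VHDL architecture

def savings(old, new):
    """Return (absolute LC savings, relative savings in percent)."""
    saved = old - new
    return saved, 100.0 * saved / old

behavioral = savings(LC_VERILOG, LC_BEHAVIORAL)
structural = savings(LC_VERILOG, LC_STRUCTURAL)

# Assumption: the standard configuration contains two CORDIC units,
# so the quoted 3.2% per unit doubles for the whole design.
PER_UNIT_PERCENT = 3.2
total_percent = 2 * PER_UNIT_PERCENT

if __name__ == "__main__":
    print(f"behavioral: {behavioral[0]} LCs saved ({behavioral[1]:.1f}%)")
    print(f"structural: {structural[0]} LCs saved ({structural[1]:.1f}%)")
    print(f"two units:  about {total_percent:.1f}% of the design")
```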
There is a new testbench written in VHDL. It checks some corner cases and simulates a typical use case. It also reproduces the old testbench so you can compare the outputs of both units. Except for that part, the testbench is an automatic self-test: the output of the unit is compared with the directly computed rotation of the stimulus.
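The self-test idea (compare the unit's output against a directly computed rotation) can be sketched in Python as follows. This is a simplified integer model, not a translation of the VHDL testbench. The 180-degree pre-rotation, the parameter values, the stimulus, and the tolerance are all assumptions chosen for illustration.

```python
import math

Z_WIDTH = 16   # assumed phase word width; 2**Z_WIDTH = full circle
STAGES = 12    # assumed number of CORDIC stages

def cordic_rotate(x, y, z, stages=STAGES, z_width=Z_WIDTH):
    """Integer CORDIC in rotation mode; returns (x_out, y_out, residual z)."""
    scale = 2 ** z_width / (2 * math.pi)
    angles = [round(math.atan(2.0 ** -i) * scale) for i in range(stages)]
    half = 1 << (z_width - 1)                  # 180 degrees
    quarter = half >> 1                        # 90 degrees
    z = ((z + half) % (1 << z_width)) - half   # interpret z as signed
    # Pre-rotate by 180 degrees so the remaining angle is within +/-90 degrees.
    if z >= quarter:
        x, y, z = -x, -y, z - half
    elif z < -quarter:
        x, y, z = -x, -y, z + half
    for i, a in enumerate(angles):
        # One add/subtract per signal and stage; direction follows the sign of z.
        if z >= 0:
            x, y, z = x - (y >> i), y + (x >> i), z - a
        else:
            x, y, z = x + (y >> i), y - (x >> i), z + a
    return x, y, z

def reference_rotate(x, y, z, stages=STAGES, z_width=Z_WIDTH):
    """Direct floating-point rotation, including the CORDIC gain."""
    gain = math.prod(math.sqrt(1 + 4.0 ** -i) for i in range(stages))
    theta = 2 * math.pi * z / 2 ** z_width
    return (gain * (x * math.cos(theta) - y * math.sin(theta)),
            gain * (x * math.sin(theta) + y * math.cos(theta)))

def self_test(tolerance=64):
    """Sweep the phase word and report (passed, worst absolute error)."""
    worst = 0.0
    for z in range(0, 1 << Z_WIDTH, 257):
        xo, yo, _ = cordic_rotate(10000, -3000, z)
        xr, yr = reference_rotate(10000, -3000, z)
        worst = max(worst, abs(xo - xr), abs(yo - yr))
    return worst <= tolerance, worst

if __name__ == "__main__":
    print(self_test())
```

The tolerance is deliberately loose. It has to absorb the residual angle after the last stage, the rounding of the angle constants, and the truncation from the right shifts.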
Altera Quartus allows mixing VHDL and Verilog without restriction, but some of the freely available simulators accept only VHDL or only Verilog. I tried to use GHDL (GPL'ed) to verify functionality and GTKWave to display waveforms. With the latest GHDL revision, however, I encountered a bug related to the use of recursion within the generate statement. A bug ticket has been sent to the developer.
Nevertheless, the functionality has been verified with the ModelSim-Altera Web Edition simulator. All tests passed.
Additionally, the CORDIC unit has been tested on a USRP. The bitstream generated with the new unit incorporated has been uploaded and tested on hardware.
I am not the author of the usrp code. See the files AUTHORS and ChangeLog for more info. There have been two changes to existing files. The instantiation of the CORDIC unit is altered to use the new implementation.
The complete archive contains all relevant files from the GNU Radio project related to usrp (that is, the subdirectory usrp from the subversion server). The diff-archive contains only the new files. The last changed revision of the original directory was r8086. Existing RBF files (FPGA bitstreams) have been excluded (better to get a current copy of those). The included RBF uses a standard configuration along with the new CORDIC implementation.
| Complete Archive | TAR-Archive 2008-06-13 817kb |
| Diff-Archive | TAR-Archive 2008-06-13 29kb |
| Complete Archive | TAR-Archive 2008-04-17 821kb |
| Diff-Archive | TAR-Archive 2008-04-17 28kb |