Parameterized CORDIC implementation for USRP

Overview

Author: Dominik Auras
Type: VHDL implementation of the CORDIC algorithm, integrated into the USRP FPGA code
State: Verified
Benefits:
  • Parameterized code
  • Easy replacement of the existing unit
  • Complex testbench with automatic verification
  • Iterative narrowing of intermediate signals between stages
  • Large area gain due to narrowing
  • Structural architecture: minimal number of logic cells used
  • Even without narrowing, the structural architecture yields a large area reduction
Changelog: 2008-06-13: structural architecture with megafunction, minimizing logic cell count (-40%)

Introduction

See USRP and GNU Radio

More documentation in preparation

Implementation

This is a simple implementation of the CORDIC algorithm in VHDL. The algorithm is implemented as a pipeline with a variable number of stages. Additionally, the bitwidths can be narrowed iteratively from stage to stage, which is achieved with a recursive generate statement. The unit is designed as a drop-in replacement for the existing implementation: its interface is identical to that of the Verilog module. The new unit is therefore instantiated in rx_chain.v simply by replacing the module name cordic with cordic_vhdl:
   cordic_vhdl rx_cordic
     ( .clock(clock),.reset(reset),.enable(enable), 
       .xi(i_in),.yi(q_in),.zi(phase[31:16]),
       .xo(bb_i),.yo(bb_q),.zo() );
The code is fully parameterized: you can vary the bitwidth of the X and Y signals, the bitwidth of the Z signal, and the number of stages. The pipeline narrows intermediate signals if desired (change the constant "narrow_z" in the pipeline code if you don't want that feature). The angle increment constant for each stage is computed during analysis/elaboration; these constants are not predefined in the code.
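
To make the roles of the parameters concrete, here is a minimal software sketch of such a pipeline in Python. It is not the VHDL source, only an integer model under stated assumptions: the angle is scaled so that a full turn equals 2**z_width, a coarse 180 degree pre-rotation handles angles outside +-90 degrees, and a positive z rotates counterclockwise. The names angle_constants and cordic_rotate are hypothetical, and the CORDIC gain (about 1.647) is not compensated.

```python
import math

def angle_constants(z_width, stages):
    """atan(2**-i) per stage, scaled so that a full turn equals 2**z_width.

    Computed from the parameters rather than taken from a predefined table.
    """
    full_turn = 1 << z_width
    return [round(math.atan(2.0 ** -i) / (2.0 * math.pi) * full_turn)
            for i in range(stages)]

def cordic_rotate(x, y, z, z_width=16, stages=12):
    """Rotate (x, y) by the angle z using shifts and adds only.

    Returns (x, y, residual z). The result is scaled by the CORDIC gain.
    """
    full_turn = 1 << z_width
    half, quarter = full_turn // 2, full_turn // 4
    z = ((z + half) % full_turn) - half          # wrap into [-half, half)
    if z >= quarter:                              # assumed coarse rotation:
        x, y, z = -x, -y, z - half                # turn by 180 degrees first
    elif z < -quarter:
        x, y, z = -x, -y, z + half
    for i, c in enumerate(angle_constants(z_width, stages)):
        if z >= 0:                                # rotate by +atan(2**-i)
            x, y, z = x - (y >> i), y + (x >> i), z - c
        else:                                     # rotate by -atan(2**-i)
            x, y, z = x + (y >> i), y - (x >> i), z + c
    return x, y, z

if __name__ == "__main__":
    print(angle_constants(16, 4))
    print(cordic_rotate(1000, 0, 16384))          # quarter turn of (1000, 0)
```

Each loop iteration corresponds to one pipeline stage; in hardware the stages run concurrently on successive samples, while the model simply applies them in sequence to one sample.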

Behavioral architecture:
Compared to the existing Verilog implementation, this architecture saves 150 logic cells (LCs) per instantiation: the existing unit consumes 976 LCs, while the presented one takes only 826 LCs. The reduction is due to narrowing the signals, and therefore the register and adder bitwidths, though there is also a small gain if you don't narrow the signals.
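
Why narrowing saves registers and adder bits can be seen from the residual angle: each stage moves Z toward zero by its angle constant, so the largest possible magnitude of Z shrinks from stage to stage and the upper bits become redundant. The Python sketch below estimates the signed Z width entering each stage. It is only one plausible narrowing rule, not necessarily the one used in the VHDL code; the function name z_widths, the full turn of 2**z_width, and the starting bound of a quarter turn are assumptions.

```python
import math

def z_widths(z_width, stages):
    """Signed bit width needed for the residual angle entering each stage."""
    full_turn = 1 << z_width
    consts = [round(math.atan(2.0 ** -i) / (2.0 * math.pi) * full_turn)
              for i in range(stages)]
    widths = []
    bound = full_turn // 4            # assumed: |z| <= quarter turn at stage 0
    for c in consts:
        # magnitude bits for 'bound' plus one sign bit
        widths.append(bound.bit_length() + 1)
        # the stage adds or subtracts c toward zero, so afterwards
        # |z| <= max(bound - c, c)
        bound = max(bound - c, c)
    return widths

if __name__ == "__main__":
    for stage, width in enumerate(z_widths(16, 12)):
        print(f"stage {stage:2d}: z needs {width} bits")
```

Under these assumptions the Z register loses about one bit per stage, which is where the smaller registers and adders would come from.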

Structural architecture:
The update from June 2008 introduced a new architecture that contains a structural description of the algorithmic unit. It matches the hardware representation more closely and therefore optimizes the logic cell usage. Specifically for the Altera FPGA, the Altera megafunction for a combined adder and subtractor is explicitly instantiated. Every logic cell consists of a register and a LUT that performs Boolean logic. Almost all of the logic we need now shares a logic cell with a register; only a few cells are left that perform logic-only operations. Since the number of registers is dictated by the parameter set, the optimal logic cell count equals the minimum number of required registers, and the implementation comes very close to this optimum.
Here are the numbers: the unit consumes 588 logic cells instead of 976, a reduction of 388 LCs or almost 40%. The total LC reduction is about 3.2% per CORDIC unit, and therefore about 6.4% for the standard configuration (2 RX with halfband filters, 2 TX).
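
The figures can be reproduced with a few lines of arithmetic. The per-unit counts below are taken from the text. The device size is an assumption that is not stated here: 12,060 logic cells, the size of an Altera Cyclone EP1C12, is one reading that reproduces the 3.2% figure. The count of two CORDIC units in the standard configuration is likewise assumed.

```python
# Logic cell (LC) counts per CORDIC instantiation, taken from the text.
LC_VERILOG = 976       # existing Verilog unit
LC_BEHAVIORAL = 826    # VHDL, behavioral architecture
LC_STRUCTURAL = 588    # VHDL, structural architecture

# Assumptions, not stated in the text:
ASSUMED_DEVICE_LCS = 12060   # logic cells of an Altera Cyclone EP1C12
CORDIC_UNITS = 2             # one CORDIC per RX chain

def saving(new, old=LC_VERILOG):
    """Return (cells saved, percent saved relative to the old unit)."""
    return old - new, 100.0 * (old - new) / old

saved_b, pct_b = saving(LC_BEHAVIORAL)
saved_s, pct_s = saving(LC_STRUCTURAL)
pct_device = 100.0 * saved_s / ASSUMED_DEVICE_LCS

print(f"behavioral: {saved_b} LCs saved ({pct_b:.1f}% of the unit)")
print(f"structural: {saved_s} LCs saved ({pct_s:.1f}% of the unit)")
print(f"assumed device: {pct_device:.1f}% per unit, "
      f"{CORDIC_UNITS * pct_device:.1f}% for {CORDIC_UNITS} units")
```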

There is a new testbench written in VHDL. It checks some corner cases and simulates a typical use case. It also reproduces the old testbench, so you can compare the outputs of both units. Except for that part, the testbench is an automatic self-test: the output of the unit is compared with a direct computation of the rotation of the stimulus.
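
The self-checking idea can be sketched in Python as follows. This is not the VHDL testbench: the function dut is a hypothetical integer CORDIC model standing in for the unit under test, and the parameter set, the stimulus grid, and the error tolerance are all assumptions chosen for illustration. The reference is the direct rotation computed in floating point, scaled by the CORDIC gain.

```python
import math

Z_WIDTH, STAGES = 16, 12          # assumed parameter set
FULL = 1 << Z_WIDTH               # assumed scaling: full turn = 2**Z_WIDTH
CONSTS = [round(math.atan(2.0 ** -i) / (2 * math.pi) * FULL)
          for i in range(STAGES)]
GAIN = math.prod(math.sqrt(1 + 4.0 ** -i) for i in range(STAGES))

def dut(x, y, z):
    """Stand-in for the unit under test: integer CORDIC rotation."""
    z = ((z + FULL // 2) % FULL) - FULL // 2
    if z >= FULL // 4 or z < -FULL // 4:      # coarse 180 degree rotation
        x, y = -x, -y
        z = z - FULL // 2 if z > 0 else z + FULL // 2
    for i, c in enumerate(CONSTS):
        if z >= 0:
            x, y, z = x - (y >> i), y + (x >> i), z - c
        else:
            x, y, z = x + (y >> i), y - (x >> i), z + c
    return x, y

def reference(x, y, z):
    """Direct computation of the rotation, including the CORDIC gain."""
    phi = 2 * math.pi * z / FULL
    return (GAIN * (x * math.cos(phi) - y * math.sin(phi)),
            GAIN * (x * math.sin(phi) + y * math.cos(phi)))

def self_test():
    """Return the worst absolute output error (in LSB) over all stimuli."""
    worst = 0.0
    for x in (-10000, 0, 1, 4321):
        for y in (-10000, 0, 7777):
            for z in range(-FULL // 2, FULL // 2, 1021):
                got_x, got_y = dut(x, y, z)
                ref_x, ref_y = reference(x, y, z)
                worst = max(worst, abs(got_x - ref_x), abs(got_y - ref_y))
    return worst

if __name__ == "__main__":
    TOLERANCE = 48                # hand-picked assumption, in LSB
    worst = self_test()
    print("worst error in LSB:", worst)
    assert worst < TOLERANCE, "unit output deviates from direct rotation"
```

The tolerance has to cover both the truncation in the shifts and the finite angular resolution of the last stage, so it grows with the input amplitude.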

Altera Quartus allows mixing VHDL and Verilog without restriction, but some of the freely available simulators accept only VHDL or only Verilog. I tried to use GHDL (GPL'ed) to verify the functionality and GTKWave to display the waveforms, but with the latest GHDL revision I encountered a bug related to the use of recursion within the generate statement. A bug report has been sent to the developer.

Nevertheless, the functionality has been verified with the ModelSim-Altera Web Edition simulator. All tests passed.

Additionally, the CORDIC unit has been tested on a USRP: the bitstream generated with the new unit incorporated has been uploaded and tested on hardware.

License

This work is licensed under the terms of the GPL. A copy is included in the module archive.

I am not the author of the USRP code; see the files AUTHORS and ChangeLog for more info. There have been two changes to existing files: the instantiation of the CORDIC unit is altered to use the new implementation.

The complete archive contains all relevant files from the GNU Radio project related to the USRP (that is, the subdirectory usrp from the Subversion server). The diff archive contains only the new files. The last changed revision of the original directory was r8086. Existing RBF files (FPGA bitstreams) have been excluded (it is better to get a current copy of those). The included RBF uses a standard configuration along with the new CORDIC implementation.

Source code

Complete archive (TAR archive), 2008-06-13, 817 kB
Diff archive (TAR archive), 2008-06-13, 29 kB
Complete archive (TAR archive), 2008-04-17, 821 kB
Diff archive (TAR archive), 2008-04-17, 28 kB