| Author | Dominik Auras |
| Type | Modified USRP1 FX2 firmware to boost supported USB bandwidth |
| State | Done, tests needed |
| Key features |
|
| Changelog | 2009-05-09: first version uploaded |
|
See USRP and GNU Radio.

The following changes increase the achievable USB bandwidth of the USRP by modifying the firmware for the Cypress FX2 USB chip on the USRP board. The previous firmware allowed at most 32 MB/s; this one can sustain 45 MB/s when only receiving (the exact figure depends on your computer). Even though many applications will not benefit from a higher signal bandwidth, the faster USB transfers may reduce the time spent in I/O routines and in the USRP sink/source, and hence free processor resources for signal processing.

In the previous firmware, the main loop running on the FX2's 8051 core limited the bandwidth. Now, in unidirectional mode, only your computer is the bottleneck: the FX2's GPIF transfers at 96 MB/s between USB and FPGA, well above the theoretical USB bandwidth of 60 MB/s (which is not achievable in practice).

Apply the patch from within gnuradio/trunk/usrp:

    patch -p 0 < fx2.patch
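The two bandwidth limits quoted above follow from simple arithmetic. The sketch below reproduces them under two assumptions that are not stated in the text: that the GPIF interface is clocked at 48 MHz (the 16-bit data bus is mentioned further down), and that the USB figure is the raw 480 Mbit/s high-speed signalling rate. The function names are made up for illustration.

```c
#include <assert.h>

/* Bytes per second moved by a parallel bus that transfers one word of
 * bus_bits bits on every clock cycle. */
double bus_bytes_per_sec(double clock_hz, int bus_bits)
{
    assert(bus_bits > 0 && bus_bits % 8 == 0);
    return clock_hz * (bus_bits / 8);
}

/* Raw signalling rate of a serial link, converted to bytes per second.
 * Protocol overhead is ignored, so real throughput is lower. */
double link_bytes_per_sec(double bits_per_sec)
{
    return bits_per_sec / 8.0;
}
```

With these assumptions, `bus_bytes_per_sec(48e6, 16)` gives the 96 MB/s GPIF rate and `link_bytes_per_sec(480e6)` gives the 60 MB/s USB ceiling.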
First of all, the firmware now distinguishes the unidirectional communication modes from the bidirectional one. When we
transfer data in one direction only, i.e. we want either to transmit _OR_ to receive, we can switch the firmware into a state that
gives maximum throughput.
All four possible waveform descriptors are in use. Waveforms 1 and 2 are used in the unidirectional modes (named readonly and writeonly).
The bidirectional transfer mode uses waveforms 3 and 4 (still named FIFOwr and FIFOrd). See below for screenshots of the waveforms.
The GPIF flags are the programmable flags of both FIFOs. The programmable levels are set close to zero and close to 2048 bytes, respectively.
They inform the state machines that the internal FIFOs are almost empty or almost full.
The Auto Commit feature is turned on. This means that buffers are committed directly from FIFO to USB and vice versa without
involving the 8051 core.
Let me explain the course of my development:
The old firmware used flow states to transfer data and had a problem with signals being asserted for one additional cycle (known as bug 257).
According to the manual, the conditions of a decision point state are sampled during the state, and their evaluation takes one cycle. Hence, when the
GPIF decides to branch to another state, the branch is executed one cycle late. With this observation, we can readily avoid bug 257
by setting the transaction count to 255 instead of 256.
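The off-by-one effect can be illustrated with a toy model. This is not a cycle-accurate description of the GPIF, only a sketch of the reasoning above: it assumes that the loop state moves one word per cycle, that the "count expired" condition is sampled in one cycle, and that the branch takes effect in the next. The function name is hypothetical.

```c
/* Toy model: number of words moved by a loop state when the exit branch
 * is executed one cycle after the transaction count reaches zero. */
unsigned words_moved(unsigned programmed_count)
{
    unsigned words = 0;
    unsigned remaining = programmed_count;
    int expired_seen = 0;   /* condition as sampled in the current cycle */

    while (!expired_seen) {
        expired_seen = (remaining == 0);  /* sample the condition */
        words++;                          /* still in the loop state: move a word */
        if (remaining > 0)
            remaining--;
    }
    return words;
}
```

In this model a programmed count of 255 moves 256 words, which matches the workaround described above, while a count of 256 would move 257.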
Next, once the old workaround for bug 257 is removed, we are no longer restricted to transferring multiples of 512 bytes. Thus we can change the state
machines to transfer as many words as possible (the data bus is 16 bits wide, so words, not bytes, are transferred). We must avoid underruns and overruns
of the internal FX2 FIFOs, so we use the programmable level flag to signal early enough to stop the transfer. Note that when the flag deasserts,
the decision point's evaluation takes one cycle, during which one more word is transferred. The FIFO level threshold must be set so that it accounts for
this additional word.
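As a rough illustration of that margin, the sketch below computes the highest "almost full" threshold that still leaves room for the words transferred while the decision point is being evaluated. It assumes a 2048-byte FIFO, 2-byte words, and a latency of one cycle; the actual register values used by the patched firmware are in the patch file, not here, and both function names are hypothetical.

```c
#include <assert.h>

#define BYTES_PER_WORD 2u   /* 16-bit data bus */

/* Highest fill level (in words) at which the flag may signal "stop" so that
 * the words moved during the reaction latency still fit into the FIFO. */
unsigned max_safe_threshold_words(unsigned capacity_bytes, unsigned latency_cycles)
{
    unsigned capacity_words = capacity_bytes / BYTES_PER_WORD;
    assert(latency_cycles <= capacity_words);
    return capacity_words - latency_cycles;
}

/* Fill an empty FIFO one word per cycle until the threshold is reached,
 * then keep writing for latency_cycles more cycles. Returns the final level. */
unsigned fill_until_stopped(unsigned threshold_words, unsigned latency_cycles)
{
    unsigned level = 0;
    unsigned extra;

    while (level < threshold_words)
        level++;                      /* flag not yet signalling "stop" */
    for (extra = 0; extra < latency_cycles; extra++)
        level++;                      /* words moved before the branch executes */
    return level;
}
```

With a threshold of 1023 words the model ends exactly at the 1024-word capacity; a threshold of 1024 words would overrun by one word.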
I kept the FIFO levels in the FPGA. These are asserted or deasserted depending on whether at least 256 words are available or there is space for another 260 words.
In theory, we could tune these as well to get closer to the FIFO capacity. However, the FIFO itself has a delay (see the Altera documentation) that must be taken into account.
When using the flow state feature, we must write different values to a configuration register when switching between reading and writing.
The GPIF also supports loops in decision points, which means that the control task (here: sample or drive the next data word) is re-executed on every
cycle spent in the loop state. Because we do not need flow control, we can use loop states instead of flow states. There can be only one
flow state but several loop states, and we no longer need to rewrite the configuration register.
To achieve the highest USB bandwidth, we would ideally never leave the GPIF state machine. When sending and receiving simultaneously,
this is not possible because one waveform is restricted to one FIFO. But if we need only one direction, it is feasible. So the
next optimization is to recognize the unidirectional modes (RX only, TX only) and set up the GPIF to never exit. This is easily done by moving
the condition checking into the state machine and appending a decision point that jumps back to the first state (which evaluates the transfer start
condition, as our main loop did before). Of the four waveforms, the first two are normally dedicated to single-word transfers. We can use them
for our purposes if we rewrite the waveform select configuration register before entering the GPIF loop (and, of course, restore the register when we exit).
Hence, in unidirectional mode the work is shared between the GPIF and the 8051 core, which run in parallel. The GPIF transfers data (remember:
once a buffer is full, it is committed automatically, and likewise once it is empty) and the 8051 handles setup messages. To leave
the mode, we simply abort the GPIF by writing 0xFF to GPIFABORT. Here is the main loop that handles the TX-only mode:

    if( g_tx_enable && ! g_rx_enable ) // TX ONLY
    {
        // flow state not used
        //setup_flowstate_write (); SYNCDELAY;

        // select GPIF waveform writeonly, won't return
        GPIFWFSELECT = 0x44; SYNCDELAY; // Waveform writeonly
        GPIFTCB1 = 0x00; SYNCDELAY;
        GPIFTCB0 = 0x01; SYNCDELAY;
        GPIFTRIG = bmGPIF_EP2_START | bmGPIF_WRITE; SYNCDELAY;

        while(1){ // TX ONLY
            if (usb_setup_packet_avail ())
                usb_handle_setup_packet ();

            if( !g_tx_enable || g_rx_enable )
            {
                // state change, abort GPIF waveform
                GPIFABORT = 0xFF;
                break;
            }

            // Check for underruns and overruns
            if (UC_BOARD_HAS_FPGA && (USRP_PA & (bmPA_TX_UNDERRUN))){
                // record the under/over run
                if (USRP_PA & bmPA_TX_UNDERRUN)
                    g_tx_underrun = 1;

                // tell the FPGA to clear the flags
                fpga_clear_flags ();
            }
        } // while(1) inner

        // restore
        GPIFWFSELECT = InitData[ 5 ]; SYNCDELAY;
    }

Receive testing:

    gnuradio-svn/trunk/usrp/host/apps$ ./test_usrp_standard_rx -c -D 4

You will get overruns, but you can ignore them. If you reach at least 42.67 MB/s, you can try running the usrp_fft program with decim=6 (which gives roughly 10.67 MHz of signal bandwidth).
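The relation between the decimation factor and the required USB data rate can be sketched as follows. The 64 MS/s ADC rate is an assumption, chosen because it is consistent with the figures quoted above; the 4 bytes per sample come from each complex sample being two 16-bit shorts. The function names are made up for illustration.

```c
#include <assert.h>

#define ADC_RATE_HZ      64e6  /* assumed ADC sample rate */
#define BYTES_PER_SAMPLE 4.0   /* complex sample: two 16-bit shorts */

/* Complex sample rate (and thus signal bandwidth in Hz) after decimation. */
double rx_bandwidth_hz(unsigned decim)
{
    assert(decim > 0);
    return ADC_RATE_HZ / decim;
}

/* USB data rate in bytes per second needed to receive without overruns. */
double rx_bytes_per_sec(unsigned decim)
{
    return rx_bandwidth_hz(decim) * BYTES_PER_SAMPLE;
}
```

Under these assumptions, decimation 4 asks for 64 MB/s, which is more than USB can deliver, so overruns are expected in the test above; decimation 6 needs about 42.67 MB/s.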
Loopback testing:

Send a large amount of data (a sine wave) at half bandwidth:

    test_usrp_standard_tx -i 32 --sine -N 1024000000

In another terminal, in parallel, receive via loopback:

    test_usrp_standard_rx -D 16 -l -o test.cshort

Inspect test.cshort, which contains complex values, each composed of two shorts, with your favorite viewer. You should see a complex sinusoid.
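If no viewer is at hand, the file can also be read with a few lines of C. The sketch below assumes that each complex value is stored as two consecutive 16-bit signed integers in host byte order, with the real (I) part first; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* One complex sample as assumed to be laid out in the file: I, then Q. */
struct cshort {
    int16_t i;
    int16_t q;
};

/* Read up to max_samples complex samples from fp into out.
 * Returns the number of complete samples read. */
size_t read_cshort(FILE *fp, struct cshort *out, size_t max_samples)
{
    size_t n = 0;
    int16_t pair[2];

    assert(fp != NULL && out != NULL);
    while (n < max_samples && fread(pair, sizeof pair[0], 2, fp) == 2) {
        out[n].i = pair[0];
        out[n].q = pair[1];
        n++;
    }
    return n;
}
```

Printing the `i` and `q` columns of the first few hundred samples and plotting them should show two sinusoids of the same frequency, shifted by a quarter period.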
Screenshots of the GPIF waveforms:
write only (USRP is sending)
FIFO read (USRP is sending and receiving)
FIFO write (USRP is sending and receiving)
GPIF state machine (configuration)
I am not the author of the USRP code. See the files or gnuradio.org for more information. The patch file shows all the changes I made.
Regarding the program, and especially the provided binary files, please remember:
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
| Patch file | TAR-Archive 2009-05-09 15kb |
| Some prebuilt RBFs and FX2 firmware binary | TAR-Archive 2009-05-09 481kb |