FFT Implementation With No Data Scaling

From Texas Instruments Embedded Processors Wiki

Introduction

The FFT (DSP_fft16x16) and iFFT (DSP_ifft16x16) implementations provided with the C64x+ DSPLIB scale the data to avoid overflow. The user's guide describes the scaling as follows:

All stages are radix-4 except the last one, which can be radix-2 or radix-4, depending on the size of the FFT. All stages except the last one scale by two the stage output data.

In some use cases it is desirable that the FFT routines do not scale the data. This article suggests modifications to the FFT source provided in the DSPLIB so that the data is not scaled. An example is also provided to demonstrate the effect of the suggested changes.
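As a rough illustration of how much scaling the stock kernels apply, the sketch below counts the stages implied by the user's guide statement (radix-4 stages, with a final radix-2 stage when the size is not a power of 4) and derives the total scale factor from "all stages except the last one scale by two". The function names are hypothetical and are not part of DSPLIB; the code is a host-side aid, not the library's own logic.

```c
#include <stdint.h>

/* Hypothetical helper (not a DSPLIB function).
 * Number of stages for an n-point FFT built from radix-4 stages, with a
 * final radix-2 stage when n is not a power of 4.
 * Returns 0 if n is not a power of two or is smaller than 4. */
static int fft_num_stages(uint32_t n)
{
    if (n < 4 || (n & (n - 1)) != 0)
        return 0;

    int log2n = 0;
    while ((n >> log2n) > 1)
        log2n++;

    return (log2n + 1) / 2;   /* ceil(log2(n) / 2) */
}

/* Hypothetical helper (not a DSPLIB function).
 * Total factor by which the scaled kernel reduces its output, assuming
 * every stage except the last one halves its output. */
static uint32_t fft_scale_factor(uint32_t n)
{
    int stages = fft_num_stages(n);
    return stages > 0 ? (1u << (stages - 1)) : 1u;
}
```

Under these assumptions a 1024-point transform has five stages and a total scale factor of 16. Once the scaling is removed, the output of an N-point transform can grow to as much as N times the largest input magnitude, so the input must leave enough headroom in the 16-bit format.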

Suggested Change

The change is similar for both routines (DSP_fft16x16 and DSP_ifft16x16). The description below covers the modifications to the serial assembly (SA) implementation of the kernels. The kernels are located at:

Change 1: Locate the following code in the SA files:

       ;----------------------------------------------------------;
       ; Compute first set of outputs:                            ;
       ;                                                          ;
       ;  x0[0]= xh0_0 + xh20_0 + 1 >> 1                          ;
       ;  x0[1]= xh1_0 + xh21_0 + 1 >> 1                          ;
       ;  x0[2]= xh0_1 + xh20_1 +1  >> 1                          ;
       ;  x0[3]= xh1_1 + xh21_1 +1  >> 1                          ;
       ;----------------------------------------------------------;
       AVG2.2     B_xh1_0_xh0_0,       A_xh21_0_xh20_0,      B_x_1o_x_0o 
       AVG2.2     B_xh1_1_xh0_1,       A_xh21_1_xh20_1,      B_x_3o_x_2o

Update the code to:

       ADD2.2     B_xh1_0_xh0_0,       A_xh21_0_xh20_0,      B_x_1o_x_0o 
       ADD2.2     B_xh1_1_xh0_1,       A_xh21_1_xh20_1,      B_x_3o_x_2o

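Note that the AVG2 instruction has been replaced with ADD2. AVG2 computes the rounded average of each pair of packed 16-bit values, (a + b + 1) >> 1, as the comment block shows, so it halves the butterfly sum; ADD2 produces the plain sum.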
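The effect of Change 1 can be checked on a host PC with a small portable model of the two instructions. This is a sketch only: `pack2`, `half_hi`, `half_lo`, `avg2_model` and `add2_model` are hypothetical names introduced here, not DSPLIB or compiler functions. The model assumes that ADD2 adds each 16-bit half independently without saturation, so a half that overflows wraps around.

```c
#include <stdint.h>

/* Interpret the low 16 bits of v as a signed value. */
static int32_t to_s16(uint32_t v)
{
    v &= 0xFFFFu;
    return v >= 0x8000u ? (int32_t)v - 0x10000 : (int32_t)v;
}

/* Pack two signed 16-bit values into one 32-bit word (hi:lo). */
static uint32_t pack2(int32_t hi, int32_t lo)
{
    return ((uint32_t)(hi & 0xFFFF) << 16) | (uint32_t)(lo & 0xFFFF);
}

static int32_t half_hi(uint32_t w) { return to_s16(w >> 16); }
static int32_t half_lo(uint32_t w) { return to_s16(w); }

/* floor(v / 2) without relying on right-shifting a negative value. */
static int32_t floor_half(int32_t v)
{
    return (v - (v < 0 ? 1 : 0)) / 2;
}

/* Model of AVG2: rounded average of each signed 16-bit half. */
static uint32_t avg2_model(uint32_t a, uint32_t b)
{
    int32_t hi = floor_half(half_hi(a) + half_hi(b) + 1);
    int32_t lo = floor_half(half_lo(a) + half_lo(b) + 1);
    return pack2(hi, lo);
}

/* Model of ADD2: plain sum of each half, wrapping on overflow. */
static uint32_t add2_model(uint32_t a, uint32_t b)
{
    return pack2(half_hi(a) + half_hi(b), half_lo(a) + half_lo(b));
}
```

For example, with halves (1000, -3000) and (2000, -1000) the AVG2 model yields (1500, -2000), while the ADD2 model yields (3000, -4000). Adding 20000 to 20000 in one half wraps to -25536, which is the overflow risk taken on when the scaling is removed.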

Change 2: Locate the following code in the SA files:

       ;---------------------------------------------------------;
       ; The following code computes intermediate results for:   ;
       ;                                                         ; 
       ;  si10' = -si10  twiddle table has -sin factors          ;
       ;                                                         ;
       ; x2[h2  ] = (co10 * xt0_0 + si10'* yt0_0 + 0x8000) >> 16 ;
       ; x2[h2+1] = (co10 * yt0_0 - si10'* xt0_0 + 0x8000) >> 16 ;
       ; x2[h2+2] = (co11 * xt0_1 + si11'* yt0_1 + 0x8000) >> 16 ;
       ; x2[h2+3] = (co11 * yt0_1 - si11'* xt0_1 + 0x8000) >> 16 ;
       ;---------------------------------------------------------;
        CMPYR .M1  A_co10_si10,   B_yt1_0_xt1_0, A_xh2_1_0; 
        CMPYR .M1  A_co11_si11,   B_yt1_1_xt1_1, A_xh2_3_2;
       ;---------------------------------------------------------;
       ;                                                         ;
       ; x2[l1  ] = (co20 * xt1_0 + si20'* yt1_0 + 0x8000) >> 16 ;
       ; x2[l1+1] = (co20 * yt1_0 - si20'* xt1_0 + 0x8000) >> 16 ;
       ; x2[l1+2] = (co21 * xt1_1 + si21'* yt1_1 + 0x8000) >> 16 ;
       ; x2[l1+3] = (co21 * yt1_1 - si21'* xt1_1 + 0x8000) >> 16 ;
       ;                                                         ;
       ; These four results are retained in registers and a      ;
       ; double word is formed so that it can be stored with     ;
       ; one STDW.                                               ;
       ;---------------------------------------------------------;
       ; This equation ONLY has minus sign for x, y components
       CMPYR .M1  A_co20_si20,   A_myt0_0_mxt0_0, A_xl1_1_0;
       CMPYR .M1  A_co21_si21,   A_myt0_1_mxt0_1, A_xl1_3_2;
       ;---------------------------------------------------------;
       ; The following code computes intermediate results for:   ;
       ;                                                         ;
       ; x2[l2  ] = (co30 * xt2_0 + si30'* yt2_0 + 0x8000) >> 16 ;
       ; x2[l2+1] = (co30 * yt2_0 - si30'* xt2_0 + 0x8000) >> 16 ;
       ; x2[l2+2] = (co31 * xt2_1 + si31'* yt2_1 + 0x8000) >> 16 ;
       ; x2[l2+3] = (co31 * yt2_1 - si31'* xt2_1 + 0x8000) >> 16 ;
       ;---------------------------------------------------------;
       CMPYR .M2  B_co30_si30,   B_yt2_0_xt2_0,  B_xl2_1_0
       CMPYR .M2  B_co31_si31,   B_yt2_1_xt2_1,  B_xl2_3_2

Update the code to:

       ;---------------------------------------------------------;
       ; The following code computes intermediate results for:   ;
       ;                                                         ; 
       ;  si10' = -si10  twiddle table has -sin factors          ;
       ;                                                         ;
       ; x2[h2  ] = (co10 * xt0_0 + si10'* yt0_0 + 0x8000) >> 16 ;
       ; x2[h2+1] = (co10 * yt0_0 - si10'* xt0_0 + 0x8000) >> 16 ;
       ; x2[h2+2] = (co11 * xt0_1 + si11'* yt0_1 + 0x8000) >> 16 ;
       ; x2[h2+3] = (co11 * yt0_1 - si11'* xt0_1 + 0x8000) >> 16 ;
       ;---------------------------------------------------------;
        CMPYR1 .M1  A_co10_si10,   B_yt1_0_xt1_0, A_xh2_1_0; 
        CMPYR1 .M1  A_co11_si11,   B_yt1_1_xt1_1, A_xh2_3_2;
       ;---------------------------------------------------------;
       ;                                                         ;
       ; x2[l1  ] = (co20 * xt1_0 + si20'* yt1_0 + 0x8000) >> 16 ;
       ; x2[l1+1] = (co20 * yt1_0 - si20'* xt1_0 + 0x8000) >> 16 ;
       ; x2[l1+2] = (co21 * xt1_1 + si21'* yt1_1 + 0x8000) >> 16 ;
       ; x2[l1+3] = (co21 * yt1_1 - si21'* xt1_1 + 0x8000) >> 16 ;
       ;                                                         ;
       ; These four results are retained in registers and a      ;
       ; double word is formed so that it can be stored with     ;
       ; one STDW.                                               ;
       ;---------------------------------------------------------;
       ; This equation ONLY has minus sign for x, y components
       CMPYR1 .M1  A_co20_si20,   A_myt0_0_mxt0_0, A_xl1_1_0;
       CMPYR1 .M1  A_co21_si21,   A_myt0_1_mxt0_1, A_xl1_3_2;
       ;---------------------------------------------------------;
       ; The following code computes intermediate results for:   ;
       ;                                                         ;
       ; x2[l2  ] = (co30 * xt2_0 + si30'* yt2_0 + 0x8000) >> 16 ;
       ; x2[l2+1] = (co30 * yt2_0 - si30'* xt2_0 + 0x8000) >> 16 ;
       ; x2[l2+2] = (co31 * xt2_1 + si31'* yt2_1 + 0x8000) >> 16 ;
       ; x2[l2+3] = (co31 * yt2_1 - si31'* xt2_1 + 0x8000) >> 16 ;
       ;---------------------------------------------------------;
       CMPYR1 .M2  B_co30_si30,   B_yt2_0_xt2_0,  B_xl2_1_0
       CMPYR1 .M2  B_co31_si31,   B_yt2_1_xt2_1,  B_xl2_3_2
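The difference between the two instructions in Change 2 lies in how the 32-bit sum of products is reduced to 16 bits. The sketch below is a simplified scalar model of that final rounding step only, under the assumption that CMPYR rounds and keeps the upper 16 bits, (acc + 0x8000) >> 16, while CMPYR1 applies an extra left shift by one, which is equivalent to (acc + 0x4000) >> 15 with saturation. The function names are hypothetical, and the exact behavior should be confirmed against the C64x+ instruction set reference.

```c
#include <stdint.h>

/* floor(v / 2^n), written without right-shifting a negative value. */
static int32_t floor_shift(int64_t v, int n)
{
    int64_t d = (int64_t)1 << n;
    int64_t q = v / d;
    if ((v % d) < 0)
        q--;
    return (int32_t)q;
}

/* Clamp to the signed 16-bit range. */
static int16_t sat16(int32_t v)
{
    if (v > 32767)
        return 32767;
    if (v < -32768)
        return -32768;
    return (int16_t)v;
}

/* CMPYR-style reduction (assumed): round, then keep the upper 16 bits. */
static int16_t round_shift16(int32_t acc)
{
    return sat16(floor_shift((int64_t)acc + 0x8000, 16));
}

/* CMPYR1-style reduction (assumed): one bit less of right shift. */
static int16_t round_shift15(int32_t acc)
{
    return sat16(floor_shift((int64_t)acc + 0x4000, 15));
}
```

With a Q15 twiddle factor close to one (32767) and a data value of 1000, the product is 32767000. The CMPYR-style reduction returns 500, which is half the input, while the CMPYR1-style reduction returns 1000, so the implicit divide by two is gone.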

Note that each CMPYR instruction has been replaced with the CMPYR1 instruction; the comment blocks are carried over unchanged from the original listing.

The updates to the FFT routines can be incorporated in the application in two ways:

Example Application

An example that demonstrates the impact of the changes suggested above is attached to this article (fft_scaling_example.zip). Unarchive the example at [DSPLIB_INSTALLATION_DIR]\dsplib_v210\example. Note that the example assumes the updated FFT and iFFT files are located at [DSPLIB_INSTALLATION_DIR]\dsplib_v210\src\DSP_fft16x16 and [DSPLIB_INSTALLATION_DIR]\dsplib_v210\src\DSP_ifft16x16. The following files are included:

Conclusions

The following conclusions can be derived:

Thus, the suggested changes correctly remove the scaling from the two kernels.

Comments

Comments on FFT Implementation With No Data Scaling


Ayhankm said ...

Hi. During a debugging session I get this error: _TI_EABI_ undefined. How can I build this example with Code Composer 3.3? How can I enable the EABI option? Thanks

--Ayhankm 03:02, 9 June 2011 (CDT)
