Page 3 - Preface; Read This First; About This Manual; This document uses the following conventions:; Related Documentation From Texas Instruments; The following books describe the C6000
i Read This First Preface Read This First About This Manual This document describes the C64x+ little-endian digital signal processor (DSP) Library, or DSPLIB for short. Notational Conventions This document uses the following conventions: - Hexadecimal numbers are shown with the suffix h. For example,...
Page 4 - Trademarks
Trademarks ii SPRAA84: TMS320C64x to TMS320C64x+ CPU Migration Guide. Describes migrating from the Texas Instruments TMS320C64x digital signal processor (DSP) to the TMS320C64x+ DSP. The objective of this document is to indicate differences between the two cores. Functionality in the devices that is id...
Page 5 - Contents; Introduction
Contents iii Contents 1 Introduction 1-1 Provides a brief introduction to the TI C64x+ DSPLIBs, shows the organization of the routines contained in the libraries, ...
Page 6 - Performance/Fractional Q Formats; Performance Considerations; Software Updates and Customer Support; DSPLIB Software Updates; Glossary; Defines terms and abbreviations used in this book.
Contents iv A Performance/Fractional Q Formats A-1 Describes performance considerations related to the C64x+ DSPLIB and provides information about the Q format used by DSPLIB functions. A.1 Performance Consi...
Page 7 - Tables
Tables v Contents Tables 2−1 DSPLIB Data Types 2-3 3−1 Argument Conventions 3-2 ...
Page 9 - Topic; Chapter 1
1-1 Introduction This chapter provides a brief introduction to the TI C64x+ DSP Libraries (DSPLIB), shows the organization of the routines contained in the library, and lists the features and benefits of the DSPLIB. Topic Page 1.1 Introduction to the TI C64x+ DSPLIB 1-2 ...
Page 10 - Adaptive filtering
Introduction to the TI C64x+ DSPLIB 1-2 1.1 Introduction to the TI C64x+ DSPLIB The TI C64x+ DSPLIB is an optimized DSP Function Library for C programmers using devices that include the C64x+ megamodule. It includes many C-callable, assembly-optimized, general-purpose signal-processing routines. These ...
Page 11 - Filtering and convolution
Introduction to the TI C64x+ DSPLIB 1-3 Introduction - Filtering and convolution: DSP_fir_cplx, DSP_fir_cplx_hM4X4, DSP_fir_gen, DSP_fir_gen_hM17_rA8X8, DSP_fir_r4, DSP_fir_r8, DSP_fir_r8_hM16_rM8A8X8, DSP_fir_sym, DSP_iir - Math: DSP_dotp_sqr, DSP_dotprod, DSP_maxval, DSP_maxidx, DSP_...
Page 12 - Features and Benefits; C and linear assembly source code
Features and Benefits 1-4 1.2 Features and Benefits - Hand-coded assembly-optimized routines - C and linear assembly source code - C-callable routines, fully compatible with the TI C6x compiler - Fractional Q.15-format operands supported on some benchmarks - Benchmarks (time and code) - Tested again...
Page 13 - Installing and Using DSPLIB; Chapter 2
2-1 Installing and Using DSPLIB This chapter provides information on how to install and rebuild the TI C64x+ DSPLIB. Topic Page 2.1 How to Install DSPLIB 2-2 2.2 Using DSPLIB 2-3 ...
Page 15 - Using DSPLIB; DSPLIB Arguments and Data Types; DSPLIB Types; Table 2−1. DSPLIB Data Types; DSPLIB Arguments
Using DSPLIB 2-3 Installing and Using DSPLIB 2.2 Using DSPLIB 2.2.1 DSPLIB Arguments and Data Types 2.2.1.1 DSPLIB Types Table 2−1 shows the data types handled by the DSPLIB. Table 2−1. DSPLIB Data Types Name Size (bits) Type Minimum Maximum short 16 integer −32768 32767 int 32 integer −2147483648 2...
Page 16 - Calling a DSPLIB Function From C; Link the code with dsp64plus.lib; Calling a DSP Function From Assembly; to completely prevent
Using DSPLIB 2-4 2.2.2 Calling a DSPLIB Function From C In addition to correctly installing the DSPLIB software, follow these steps to include a DSPLIB function in the code: - Include the function header file corresponding to the DSPLIB function - Link the code with dsp64plus.lib - Use a correct link...
Page 17 - Interrupt Behavior of DSPLIB Functions; How to Rebuild DSPLIB
How to Rebuild DSPLIB 2-5 Installing and Using DSPLIB 2.2.6 Interrupt Behavior of DSPLIB Functions All of the functions in this library are designed to be used in systems with interrupts. Thus, it is not necessary to disable interrupts when calling any of these functions. The functions in the library ...
Page 19 - DSPLIB Function Tables; Chapter 3
3-1 DSPLIB Function Tables This chapter provides tables containing all DSPLIB functions, a brief description of each, and a page reference for more detailed information. Topic Page 3.1 Arguments and Conventions Used 3-2 3.2 DSPLIB Functions 3-...
Page 20 - Table 3−1. Argument Conventions; A kernel function
Arguments and Conventions Used 3-2 3.1 Arguments and Conventions Used The following convention has been used when describing the arguments for each individual function: Table 3−1. Argument Conventions Argument Description x,y Argument reflecting input data vector r Argument reflecting output data vec...
Page 21 - DSPLIB Functions
DSPLIB Functions 3-3 DSPLIB Function Tables 3.2 DSPLIB Functions The routines included in the DSP library are organized into eight functionalcategories and listed below in alphabetical order. - Adaptive filtering - Correlation - FFT - Filtering and convolution - Math - Matrix functions - Miscellaneo...
Page 22 - Table 3−2. Adaptive Filtering; Table 3−3. Correlation
DSPLIB Function Tables 3-4 3.3 DSPLIB Function Tables Table 3−2. Adaptive Filtering Functions Description Page long DSP_firlms2(short *h, short *x, short b, int nh) LMS FIR 4-2 Table 3−3. Correlation Functions Description Page void DSP_autocor(short *r,short *x, int nx, int nr) Autocorrelation 4-4 v...
Page 23 - Table 3−5. Filtering and Convolution
DSPLIB Function Tables 3-5 DSPLIB Function Tables Table 3−4. FFT (Continued) Functions Page Description void DSP_ifft16x16(short *w, int nx, short *x, short *y) Complex out of place, Inverse FFT mixed radix with digit reversal. Input/Output data in Re/Im order. 4-28 void DSP_ifft16x16_imre(short *w, in...
Page 25 - Table 3−8. Miscellaneous; Table 3−9. Obsolete Functions
DSPLIB Function Tables 3-7 DSPLIB Function Tables Table 3−8. Miscellaneous Functions Description Page short DSP_bexp(int *x, short nx) Max Exponent of a Vector (for scaling) 4-76 void DSP_blk_eswap16(void *x, void *r, int nx) Endian-swap a block of 16-bit values 4-78 void DSP_blk_eswap32(void *x, void...
Page 26 - Differences Between the C64x and C64x+ DSPLIBs; Table 3−10 shows the optimized functions for the C64x+ DSPLIB.; Table 3−10. Functions Optimized in the C64x+ DSPLIB
Differences Between the C64x and C64x+ DSPLIBs 3-8 3.4 Differences Between the C64x and C64x+ DSPLIBs The C64x+ DSPLIB was developed by optimizing some of the functions of the C64x DSPLIB to take advantage of the C64x+ architecture. Table 3−10 shows the optimized functions for the C64x+ DSPLIB. There...
Page 29 - DSPLIB Reference; Chapter 4
4-1 DSPLIB Reference This chapter provides a list of the functions within the DSP library (DSPLIB) organized into functional categories. The functions within each category are listed in alphabetical order and include arguments, descriptions, algorithms, benchmarks, and special requirements. Topic Page ...
Page 30 - Adaptive Filtering
DSP_firlms2 4-2 4.1 Adaptive Filtering LMS FIR DSP_firlms2 Function long DSP_firlms2(short * restrict h, const short * restrict x, short b, int nh) Arguments h[nh] Coefficient Array x[nh+1] Input Array b Error from previous FIR nh Number of coefficients. Must be multiple of 4. return long Return val...
Page 31 - Implementation Notes; The loop is unrolled 4 times.; Benchmarks; Cycles
DSP_firlms2 4-3 C64x+ DSPLIB Reference Implementation Notes - Bank Conflicts: No bank conflicts occur. - Interruptibility: The code is interrupt-tolerant but not interruptible. - The loop is unrolled 4 times. Benchmarks Cycles 3 * nh/4 + 17 Codesize 148 bytes
Page 32 - Correlation; AutoCorrelation
DSP_autocor 4-4 4.2 Correlation AutoCorrelation DSP_autocor Function void DSP_autocor(short * restrict r, const short * restrict x, int nx, int nr) Arguments r[nr] Output array x[nx+nr] Input array. Must be double-word aligned. nx Length of autocorrelation. Must be a multiple of 8. nr Number of lags...
Page 33 - The inner loop is unrolled 8 times.
DSP_autocor 4-5 C64x+ DSPLIB Reference Implementation Notes - Bank Conflicts: No bank conflicts occur. - Interruptibility: The code is interrupt-tolerant but not interruptible. - The inner loop is unrolled 8 times. - The outer loop is unrolled 4 times. - The outer loop is conditionally executed in p...
Page 34 - Interruptibility: The code is interruptible.
DSP_autocor_rA8 4-6 AutoCorrelation DSP_autocor_rA8 Function void DSP_autocor_rA8(short * restrict r, const short * restrict x, int nx, int nr) Arguments r[nr] Output array. Must be double-word aligned. x[nx+nr] Input array. Must be double-word aligned. nx Length of autocorrelation. Must be a multip...
Page 36 - FFT; Complex Forward Mixed Radix 16 x 16-bit FFT; Pointer to complex Q.15 FFT coefficients.; Description; nx
DSP_fft16x16 4-8 4.3 FFT Complex Forward Mixed Radix 16 x 16-bit FFT DSP_fft16x16 Function void DSP_fft16x16(const short * restrict w, int nx, short * restrict x, short *restrict y) Arguments w[2*nx] Pointer to complex Q.15 FFT coefficients. nx Length of FFT in complex samples. Must be power of 2 or...
Page 37 - Stage; logN
DSP_fft16x16 4-9 C64x+ DSPLIB Reference Implementation Notes - Bank Conflicts: No bank conflicts occur. - Interruptibility: The code is interruptible. The routine uses log4(nx) − 1 stages of radix-4 transform and performs either a radix-2 or radix-4 transform on the last stage depending on nx. If ...
Page 39 - Complex Forward Mixed Radix 16 x 16-bit FFT, With Im/Re Order
DSP_fft16x16_imre 4-11 C64x+ DSPLIB Reference Complex Forward Mixed Radix 16 x 16-bit FFT, With Im/Re Order DSP_fft16x16_imre Function void DSP_fft16x16_imre(const short * restrict w, int nx, short * restrict x, short* restrict y) Arguments w[2*nx] Pointer to complex Q.15 FFT coefficients. nx Length...
Page 40 - The routine uses log
DSP_fft16x16_imre 4-12 The routine uses log4(nx) − 1 stages of radix-4 transform and performs either a radix-2 or radix-4 transform on the last stage depending on nx. If nx is a power of 4, then this last stage is also a radix-4 transform, otherwise it is a radix-2 transform. The conventional Cooley...
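The stage rule stated above can be sketched as a small planning helper. This is only an illustration: `fft_plan` and `fft_stage_plan` are hypothetical names, not DSPLIB symbols, and the sketch assumes that for an nx that is not a power of 4 the value log4(nx) is rounded up before subtracting 1 (so that the radix-4 stages followed by one radix-2 stage multiply out to nx).

```c
/* Hypothetical helper, not part of DSPLIB: stage layout of a mixed-radix FFT. */
typedef struct {
    int radix4_stages;  /* radix-4 stages that precede the last stage */
    int last_radix;     /* radix of the last stage: 4 or 2 */
} fft_plan;

/* Integer log2 for a power-of-2 argument. */
static int ilog2_pow2(unsigned n)
{
    int l = 0;
    while (n > 1) {
        n >>= 1;
        l++;
    }
    return l;
}

/* nx is assumed to be a power of 2 (which includes powers of 4). */
static fft_plan fft_stage_plan(unsigned nx)
{
    fft_plan p;
    int l2 = ilog2_pow2(nx);

    /* ceil(log4(nx)) - 1, computed as ceil(log2(nx) / 2) - 1 */
    p.radix4_stages = (l2 + 1) / 2 - 1;
    /* an even log2 means nx is a power of 4, so the last stage is radix-4 */
    p.last_radix = (l2 % 2 == 0) ? 4 : 2;
    return p;
}
```

For example, under these assumptions a 32-point transform gets two radix-4 stages followed by a radix-2 stage, while a 64-point transform gets two radix-4 stages followed by a third radix-4 stage.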
Page 42 - Complex Forward Mixed Radix 16 x 16-bit FFT With Rounding
DSP_fft16x16r 4-14 Complex Forward Mixed Radix 16 x 16-bit FFT With Rounding DSP_fft16x16r Function void DSP_fft16x16r(int nx, short * restrict x, const short * restrict w, const unsigned char * restrict brev, short * restrict y, int radix, int offset, int nmax) Arguments nx Length of FFT in comple...
Page 43 - truncation noise power by 3dB.
DSP_fft16x16r 4-15 C64x+ DSPLIB Reference void dft(int n, short x[], short y[]) { int k,i, index; const double PI = 3.14159265; short * p_x; double arg, fx_0, fx_1, fy_0, fy_1, co, si; for(k = 0; k<n; k++) { p_x = x; fy_0 = 0; fy_1 = 0; for(i=0; i<n; i++) { fx_0 = (double)p_x[0]; fx_1 = (doubl...
Page 44 - The twiddle factor array is composed of log
DSP_fft16x16r 4-16 The function takes the twiddle factors and input data, and calculates the FFT, producing the frequency domain data in the y[ ] array. Because the FFT allows every input point to affect every output point, it causes cache thrashing in a cache-based system. This is mitigated by allowing t...
Page 45 - Algorithm
DSP_fft16x16r 4-17 C64x+ DSPLIB Reference DSP_fft16x16r(N, &x[0], &w[0], brev,y,N/4,0, N) DSP_fft16x16r(N/4,&x[0], &w[2*3*N/4],brev,y,rad,0, N) DSP_fft16x16r(N/4,&x[2*N/4], &w[2*3*N/4],brev,y,rad,N/4, N) DSP_fft16x16r(N/4,&x[2*N/2], &w[2*3*N/4],brev,y,rad,N/2, N) DSP_...
Page 50 - Special Requirements; nx must be a power of 2 or 4.
DSP_fft16x16r 4-22 xl1_1 = x6; xl0_1 = x7; } yt2 = xl0_0 + xl1_1; yt3 = xl1_0 − xl0_1; yt6 = xl0_0 − xl1_1; yt7 = xl1_0 + xl0_1; if (radix == 2) { yt7 = xl1_0 − xl0_1; yt3 = xl1_0 + xl0_1; } y0[k] = yt0; y0[k+1] = yt1; k += n>>1; y0[k] = yt2; y0[k+1] = yt3; k += n>>1; y0[k] = yt4; y0[k+1...
Page 52 - Complex Forward Mixed Radix 16 x 32-bit FFT With Rounding; to completely prevent overflow.
DSP_fft16x32 4-24 Complex Forward Mixed Radix 16 x 32-bit FFT With Rounding DSP_fft16x32 Function void DSP_fft16x32(const short * restrict w, int nx, int * restrict x, int * restrict y) Arguments w[2*nx] Pointer to complex Q.15 FFT coefficients. nx Length of FFT in complex samples. Must be power of ...
Page 54 - Complex Forward Mixed Radix 32 x 32-bit FFT With Rounding; Pointer to complex 32-bit FFT coefficients.
DSP_fft32x32 4-26 Complex Forward Mixed Radix 32 x 32-bit FFT With Rounding DSP_fft32x32 Function void DSP_fft32x32(const int * restrict w, int nx, int * restrict x, int * restrict y) Arguments w[2*nx] Pointer to complex 32-bit FFT coefficients. nx Length of FFT in complex samples. Must be power of ...
Page 56 - Complex Forward Mixed Radix 32 x 32-bit FFT With Scaling
DSP_fft32x32s 4-28 Complex Forward Mixed Radix 32 x 32-bit FFT With Scaling DSP_fft32x32s Function void DSP_fft32x32s(const int * restrict w, int nx, int * restrict x, int * restrict y) Arguments w[2*nx] Pointer to complex 32-bit FFT coefficients. nx Length of FFT in complex samples. Must be power o...
Page 57 - Bank Conflicts: No bank conflicts occur.
DSP_fft32x32s 4-29 C64x+ DSPLIB Reference - The FFT coefficients (twiddle factors) are generated using the program tw_fft32x32 provided in the directory ‘support\fft’. The scale factor must be 1073741823.5. The input data must be scaled by 2^(log2(nx) − ceil[log4(nx) − 1]) to completely prevent overfl...
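The scaling exponent in the requirement above, log2(nx) − ceil[log4(nx) − 1], can be worked out with integer arithmetic. The following is a minimal sketch; `fft32x32s_prescale_exponent` is a hypothetical helper name, and the sketch only computes the exponent of 2, leaving it to the caller to apply the scaling to the input data.

```c
/* Hypothetical helper, not part of DSPLIB.
 * Returns e such that the required input scaling is 2^e,
 * with e = log2(nx) - ceil(log4(nx) - 1).
 * nx is assumed to be a power of 2. */
static int fft32x32s_prescale_exponent(unsigned nx)
{
    int l2 = 0;
    while (nx > 1) {        /* integer log2 of a power of 2 */
        nx >>= 1;
        l2++;
    }
    /* ceil(log4(nx) - 1) == ceil(log2(nx) / 2) - 1 */
    int ceil_log4_minus_1 = (l2 + 1) / 2 - 1;
    return l2 - ceil_log4_minus_1;
}
```

For nx = 16 this gives 4 − 1 = 3, and for nx = 32 it gives 5 − 2 = 3, so both sizes need scaling by 2^3 under this reading of the formula.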
Page 58 - Complex Inverse Mixed Radix 16 x 16-bit FFT With Rounding
DSP_ifft16x16 4-30 Complex Inverse Mixed Radix 16 x 16-bit FFT With Rounding DSP_ifft16x16 Function void DSP_ifft16x16(const short * restrict w, int nx, short * restrict x, short *restrict y) Arguments w[2*nx] Pointer to complex Q.15 FFT coefficients. nx Length of FFT in complex samples. Must be pow...
Page 60 - Complex Inverse Mixed Radix 16 x 16-bit FFT With Im/Re Order
DSP_ifft16x16_imre 4-32 Complex Inverse Mixed Radix 16 x 16-bit FFT With Im/Re Order DSP_ifft16x16_imre Function void DSP_ifft16x16_imre(const short * restrict w, int nx, short * restrict x, short* restrict y) Arguments w[2*nx] Pointer to complex Q.15 FFT coefficients. nx Length of FFT in complex sa...
Page 62 - Complex Inverse Mixed Radix 16 x 32-bit FFT With Rounding
DSP_ifft16x32 4-34 Complex Inverse Mixed Radix 16 x 32-bit FFT With Rounding DSP_ifft16x32 Function void DSP_ifft16x32(const short * restrict w, int nx, int * restrict x, int * restrict y) Arguments w[2*nx] Pointer to complex Q.15 FFT coefficients. nx Length of FFT in complex samples. Must be power o...
Page 64 - Complex Inverse Mixed Radix 32 x 32-bit FFT With Rounding
DSP_ifft32x32 4-36 Complex Inverse Mixed Radix 32 x 32-bit FFT With Rounding DSP_ifft32x32 Function void DSP_ifft32x32(const int * restrict w, int nx, int * restrict x, int * restrict y) Arguments w[2*nx] Pointer to complex 32-bit FFT coefficients. nx Length of FFT in complex samples. Must be power ...
Page 66 - Filtering and Convolution; Complex FIR Filter; nh
DSP_fir_cplx 4-38 4.4 Filtering and Convolution Complex FIR Filter DSP_fir_cplx Function void DSP_fir_cplx (const short * restrict x, const short * restrict h, short * restrict r, int nh, int nr) Arguments x[2*(nr+nh−1)] Complex input data. x must point to x[2*(nh−1)]. h[2*nh] Complex coefficients (i...
Page 69 - The number of output samples nr must be a multiple of 4.; Interruptibility: The code is fully interruptible.
DSP_fir_cplx_hM4X4 4-41 C64x+ DSPLIB Reference Special Requirements - The number of coefficients nh must be greater than or equal to 4 and a multiple of 4. - The number of output samples nr must be a multiple of 4. Implementation Notes - Bank Conflicts: No bank conflicts occur. - Interruptibility: The code...
Page 70 - FIR Filter
DSP_fir_gen 4-42 FIR Filter DSP_fir_gen Function void DSP_fir_gen (const short * restrict x, const short * restrict h, short * restrict r, int nh, int nr) Arguments x[nr+nh−1] Pointer to input array of size nr + nh − 1. h[nh] Pointer to coefficient array of size nh (coefficients must be in reverse ord...
Page 72 - Pointer to input array of size nr + nh − 1.
DSP_fir_gen_hM17_rA8X8 4-44 FIR Filter DSP_fir_gen_hM17_rA8X8 Function void DSP_fir_gen_hM17_rA8X8 (const short * restrict x, const short * restrict h, short * restrict r, int nh, int nr) Arguments x[nr+nh−1] Pointer to input array of size nr + nh − 1. h[nh] Pointer to coefficient array of size nh (c...
Page 74 - FIR Filter (when the number of coefficients is a multiple of 4)
DSP_fir_r4 4-46 FIR Filter (when the number of coefficients is a multiple of 4) DSP_fir_r4 Function void DSP_fir_r4 (const short * restrict x, const short * restrict h, short * restrict r, int nh, int nr) Arguments x[nr+nh−1] Pointer to input array of size nr + nh – 1. h[nh] Pointer to coefficient ar...
Page 76 - FIR Filter (when the number of coefficients is a multiple of 8)
DSP_fir_r8 4-48 FIR Filter (when the number of coefficients is a multiple of 8) DSP_fir_r8 Function void DSP_fir_r8 (short *x, short *h, short *r, int nh, int nr) Arguments x[nr+nh−1] Pointer to input array of size nr + nh – 1. h[nh] Pointer to coefficient array of size nh (coefficients...
Page 78 - FIR Filter (the number of coefficients is a multiple of 8); Pointer to input array of size nr + nh – 1.
DSP_fir_r8_hM16_rM8A8X8 4-50 FIR Filter (the number of coefficients is a multiple of 8) DSP_fir_r8_hM16_rM8A8X8 Function void DSP_fir_r8_hM16_rM8A8X8 (short *x, short *h, short *r, int nh, int nr) Arguments x[nr+nh−1] Pointer to input array of size nr + nh – 1. h[nh] Pointer to coefficient array of ...
Page 80 - Symmetric FIR Filter; Pointer to output array of size nr. Must be word aligned.
DSP_fir_sym 4-52 Symmetric FIR Filter DSP_fir_sym Function void DSP_fir_sym (const short * restrict x, const short * restrict h, short * restrict r, int nh, int nr, int s) Arguments x[nr+2*nh] Pointer to input array of size nr + 2*nh. Must be double-word aligned. h[nh+1] Pointer to coefficient array...
Page 81 - nr must be a multiple of 4.
DSP_fir_sym 4-53 C64x+ DSPLIB Reference y0 += (short) (x[j + i] + x[j + 2 * nh − i]) * h[i]; y0 += x[j + nh] * h[nh]; r[j] = (int) (y0 >> s); } } Special Requirements - nh must be a multiple of 8. The number of original symmetric coefficients is 2*nh+1. Only half (nh+1) are required. - nr must ...
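The fragment above shows only the tail of the symmetric FIR loop. A complete portable sketch built around those three statements might look like the following. `fir_sym_ref` is a hypothetical name, and the accumulator is assumed to start at zero with no rounding term, because the start of the original listing is not visible here; the optimized routine's alignment and multiple-of-8 / multiple-of-4 requirements are not needed by this plain C version.

```c
/* Hypothetical portable sketch of a symmetric FIR, not the DSPLIB routine.
 * x has nr + 2*nh samples, h has nh + 1 coefficients (half of the
 * 2*nh + 1 symmetric taps plus the center tap), r receives nr outputs. */
void fir_sym_ref(const short *x, const short *h, short *r,
                 int nh, int nr, int s)
{
    for (int j = 0; j < nr; j++) {
        long long y0 = 0;   /* assumption: accumulator starts at 0, no rounding */

        /* each coefficient h[i] multiplies the two samples mirrored
         * around the center tap */
        for (int i = 0; i < nh; i++)
            y0 += (short)(x[j + i] + x[j + 2 * nh - i]) * h[i];

        /* center tap */
        y0 += x[j + nh] * h[nh];

        /* shift the accumulator down by s bits and store */
        r[j] = (short)(y0 >> s);
    }
}
```

Folding the mirrored samples together before the multiply is what halves the number of multiplications relative to a general FIR with 2*nh+1 taps.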
Page 82 - IIR With 5 Coefficients
DSP_iir 4-54 IIR With 5 Coefficients DSP_iir Function void DSP_iir (short * restrict r1, const short * restrict x, short * restrict r2, const short * restrict h2, const short * restrict h1, int nr) Arguments r1[nr+4] Output array (used in actual computation. First four elements must have the previous...
Page 83 - nr is greater than or equal to 8.
DSP_iir 4-55 C64x+ DSPLIB Reference Special Requirements - nr is greater than or equal to 8. - Input data array x[ ] contains nr + 4 input samples to produce nr output samples. Implementation Notes - Bank Conflicts: No bank conflicts occur. - Interruptibility: The code is interrupt-tolerant but not i...
Page 84 - All-Pole IIR Lattice Filter
DSP_iirlat 4-56 All-Pole IIR Lattice Filter DSP_iirlat Function void DSP_iirlat(const short * restrict x, int nx, const short * restrict k, int nk, int* restrict b, short * restrict r) Arguments x[nx] Input vector (16-bit). nx Length of input vector. k[nk] Reflection coefficients in Q.15 format. nk ...
Page 85 - No special alignment requirements
DSP_iirlat 4-57 C64x+ DSPLIB Reference rt = rt − (short)(b[i] >> 15) * k[i]; b[i + 1] = b[i] + (short)(rt >> 15) * k[i]; } b[0] = rt; r[j] = rt >> 15; } } Special Requirements - nk must be >= 4. - No special alignment requirements - See Bank Conflicts for avoiding bank conflicts...
Page 86 - Math; Vector Dot Product and Square
DSP_dotp_sqr 4-58 4.5 Math Vector Dot Product and Square DSP_dotp_sqr Function int DSP_dotp_sqr(int G, const short * restrict x, const short * restrict y, int *restrict r, int nx) Arguments G Calculated value of G (used in the VSELP coder). x[nx] First vector array y[nx] Second vector array r Result...
Page 88 - Vector Dot Product; The input length must be a multiple of 4.
DSP_dotprod 4-60 Vector Dot Product DSP_dotprod Function int DSP_dotprod(const short * restrict x, const short * restrict y, int nx) Arguments x[nx] First vector array. Must be double-word aligned. y[nx] Second vector array. Must be double-word aligned. nx Number of elements of vector. Must be multi...
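As a point of comparison for the entry above, a plain C dot product with the same argument list can be sketched as follows. `dotprod_ref` is a hypothetical name used only for illustration; it imposes none of the alignment or length restrictions of the optimized routine and makes no claim about how the library version is implemented.

```c
/* Hypothetical portable reference, not the DSPLIB routine:
 * returns the sum of x[i] * y[i] over nx 16-bit elements. */
int dotprod_ref(const short *x, const short *y, int nx)
{
    int sum = 0;
    for (int i = 0; i < nx; i++)
        sum += x[i] * y[i];   /* 16 x 16-bit product accumulated in 32 bits */
    return sum;
}
```

A reference like this is mainly useful for checking the results of the optimized call on test vectors that satisfy the documented restrictions.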
Page 90 - Maximum Value of Vector
DSP_maxval 4-62 Maximum Value of Vector DSP_maxval Function short DSP_maxval (const short *x, int nx) Arguments x[nx] Pointer to input vector of size nx. nx Length of input data vector. Must be multiple of 8 and ≥ 32. return short Maximum value of a vector. Description This routine finds the element...
Page 91 - Index of Maximum Element of Vector
DSP_maxidx 4-63 C64x+ DSPLIB Reference Index of Maximum Element of Vector DSP_maxidx Function int DSP_maxidx (const short *x, int nx) Arguments x[nx] Pointer to input vector of size nx. Must be double-word aligned. nx Length of input data vector. Must be multiple of 16 and ≥ 48. return int Index for...
Page 92 - This code requires 40 bytes of stack space for a temporary buffer.
DSP_maxidx 4-64 Implementation Notes - Bank Conflicts: No bank conflicts occur. - Interruptibility: The code is interrupt-tolerant but not interruptible. - The code is unrolled 16 times to enable the full bandwidth of LDDW and MAX2 instructions to be utilized. This splits the search into 16 sub-range...
Page 93 - Minimum Value of Vector
DSP_minval 4-65 C64x+ DSPLIB Reference Minimum Value of Vector DSP_minval Function short DSP_minval (const short *x, int nx) Arguments x [nx] Pointer to input vector of size nx. nx Length of input data vector. Must be multiple of 4 and ≥ 20. return short Minimum value of a vector. Description This r...
Page 94 - 32-Bit Vector Multiply; nx Number
DSP_mul32 4-66 32-Bit Vector Multiply DSP_mul32 Function void DSP_mul32(const int * restrict x, const int * restrict y, int * restrict r, short nx) Arguments x[nx] Pointer to input data vector 1 of size nx. Must be double-word aligned. y[nx] Pointer to input data vector 2 of size nx. Must be double-wo...
Page 95 - nx must be a multiple of 8 and greater than or equal to 16.
DSP_mul32 4-67 C64x+ DSPLIB Reference e+=d; /* Xhigh*Yhigh + */ /* (Xhigh*Ylow+Xlow*Yhigh)>>16 */ *(r++)=e; } } Special Requirements - nx must be a multiple of 8 and greater than or equal to 16. - Input and output vectors must be double-word aligned. Implementation Notes - Bank Conflicts: No b...
Page 97 - Pointer to Q.15 input data vector of size nx.
DSP_recip16 4-69 C64x+ DSPLIB Reference 16-Bit Reciprocal DSP_recip16 Function void DSP_recip16 (short *x, short *rfrac, short *rexp, short nx) Arguments x[nx] Pointer to Q.15 input data vector of size nx. rfrac[nx] Pointer to Q.15 output data vector for fractional values. rexp[nx] Pointer to output...
Page 99 - Sum of Squares
DSP_vecsumsq 4-71 C64x+ DSPLIB Reference Sum of Squares DSP_vecsumsq Function int DSP_vecsumsq (const short *x, int nx) Arguments x[nx] Input vector nx Number of elements in x. Must be multiple of 4 and ≥ 8. return int Sum of the squares Description This routine returns the sum of squares of the ele...
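The sum-of-squares operation described above is simple enough to state as a portable reference. `vecsumsq_ref` is a hypothetical name; the sketch ignores the documented length restriction, which applies to the optimized routine only.

```c
/* Hypothetical portable reference, not the DSPLIB routine:
 * returns the sum of the squares of nx 16-bit elements. */
int vecsumsq_ref(const short *x, int nx)
{
    int sum = 0;
    for (int i = 0; i < nx; i++)
        sum += x[i] * x[i];
    return sum;
}
```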
Page 100 - Weighted Vector Sum
DSP_w_vec 4-72 Weighted Vector Sum DSP_w_vec Function void DSP_w_vec(const short * restrict x, const short * restrict y, short m, short* restrict r, short nr) Arguments x[nr] Vector being weighted. Must be double-word aligned. y[nr] Summation vector. Must be double-word aligned. m Weighting factor r...
Page 101 - Matrix; Matrix Multiplication; Number of rows in matrix x.
DSP_mat_mul 4-73 C64x+ DSPLIB Reference 4.6 Matrix Matrix Multiplication DSP_mat_mul Function void DSP_mat_mul(const short * restrict x, int r1, int c1, const short * restrict y, int c2, short * restrict r, int qs) Arguments x [r1*c1] Pointer to input matrix of size r1*c1. r1 Number of rows in matrix...
Page 103 - Matrix Transpose
DSP_mat_trans 4-75 C64x+ DSPLIB Reference Matrix Transpose DSP_mat_trans Function void DSP_mat_trans (const short *x, short rows, short columns, short *r) Arguments x[rows*columns] Pointer to input matrix. rows Number of rows in the input matrix. Must be a multiple of 4. columns Number of columns in ...
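A portable sketch of the transpose with the same argument list is given below. `mat_trans_ref` is a hypothetical name, and row-major storage of both matrices is an assumption of this sketch rather than something stated in the visible text.

```c
/* Hypothetical portable reference, not the DSPLIB routine.
 * Assumes row-major storage: x is rows x columns, r is columns x rows. */
void mat_trans_ref(const short *x, short rows, short columns, short *r)
{
    for (int i = 0; i < rows; i++)
        for (int j = 0; j < columns; j++)
            r[j * rows + i] = x[i * columns + j];   /* element (i, j) -> (j, i) */
}
```

With a 2 x 3 input {1, 2, 3, 4, 5, 6}, this sketch produces the 3 x 2 output {1, 4, 2, 5, 3, 6}.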
Page 104 - Miscellaneous; Block Exponent Implementation; nx must be a multiple of 8.
DSP_bexp 4-76 4.7 Miscellaneous Block Exponent Implementation DSP_bexp Function short DSP_bexp(const int *x, short nx) Arguments x[nx] Pointer to input vector of size nx. Must be double-word aligned. nx Number of elements in input vector. Must be multiple of 8. return short Return value is the maximu...
Page 113 - Float to Q15 Conversion
DSP_fltoq15 4-85 C64x+ DSPLIB Reference Float to Q15 Conversion DSP_fltoq15 Function void DSP_fltoq15 (float *x, short *r, short nx) Arguments x[nx] Pointer to floating-point input vector of size nx. x should contain the numbers normalized between [−1,1). r[nx] Pointer to output data vector of size n...
Page 114 - Loop is unrolled twice.
DSP_fltoq15 4-86 Implementation Notes - Loop is unrolled twice. - Bank Conflicts: No bank conflicts occur. - Interruptibility: The code is interrupt-tolerant but not interruptible. Benchmarks Cycles 3 * nx/2 + 14 Codesize 224 bytes
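The float-to-Q.15 conversion described in this entry can be illustrated with a short portable sketch. `fltoq15_ref` is a hypothetical name; saturating out-of-range inputs and truncating toward zero are assumptions of this sketch, since the full description is not visible here.

```c
/* Hypothetical portable sketch, not the DSPLIB routine.
 * Converts floats normalized to [-1, 1) into Q.15 values.
 * Assumptions: out-of-range inputs saturate; the cast truncates toward zero. */
void fltoq15_ref(const float *x, short *r, short nx)
{
    for (int i = 0; i < nx; i++) {
        float a = 32768.0f * x[i];         /* scale by 2^15 */
        if (a > 32767.0f)
            a = 32767.0f;                  /* clamp to the largest Q.15 value */
        if (a < -32768.0f)
            a = -32768.0f;                 /* clamp to the smallest Q.15 value */
        r[i] = (short)a;
    }
}
```

Under these assumptions 0.5 maps to 16384, −1.0 maps to −32768, and an input of exactly 1.0, which lies outside the documented range, saturates to 32767.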
Page 115 - Minimum Energy Error Search; Array of error coefficients.
DSP_minerror 4-87 C64x+ DSPLIB Reference Minimum Energy Error Search DSP_minerror Function int DSP_minerror (const short * restrict GSP0_TABLE, const short * restrict errCoefs, int * restrict max_index) Arguments GSP0_TABLE[9*256] GSP0 terms array. Must be double-word aligned. errCoefs[9] Array of error ...
Page 116 - The inner loop is completely unrolled.
DSP_minerror 4-88 Special Requirements Array GSP0_TABLE[] must be double-word aligned. Implementation Notes - Bank Conflicts: No bank conflicts occur. - Interruptibility: The code is interrupt-tolerant but not interruptible. - The load double-word instruction is used to simultaneously load four value...
Page 117 - Q15 to Float Conversion
DSP_q15tofl 4-89 C64x+ DSPLIB Reference Q15 to Float Conversion DSP_q15tofl Function void DSP_q15tofl (short *x, float *r, int nx) Arguments x[nx] Pointer to Q.15 input vector of size nx. r[nx] Pointer to floating-point output data vector of size nx containing the floating-point equivalent of vector ...
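The reverse conversion, from Q.15 to floating point, amounts to dividing by 2^15. The sketch below uses the hypothetical name `q15tofl_ref` and is meant only to show the arithmetic, not the library's implementation.

```c
/* Hypothetical portable sketch, not the DSPLIB routine.
 * Converts Q.15 values to their floating-point equivalents in [-1, 1). */
void q15tofl_ref(const short *x, float *r, int nx)
{
    for (int i = 0; i < nx; i++)
        r[i] = (float)x[i] / 32768.0f;   /* divide by 2^15 */
}
```

Every 16-bit value is exactly representable after this division, so 16384 becomes 0.5 and −32768 becomes −1.0 with no rounding error.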
Page 118 - Obsolete Functions; Pointer to complex input vector x of size nx
DSP_bitrev_cplx 4-90 4.8 Obsolete Functions 4.8.1 FFT Complex Bit-Reverse DSP_bitrev_cplx NOTE: This function is provided for backward compatibility with the C62x DSPLIB. It has not been optimized for the C64x architecture. You are advised to use one of the newly added FFT functions which have been op...
Page 120 - nx must be a power of 2.
DSP_bitrev_cplx 4-92 if (t){x[i3] = xj3; x[j3] = xi3;} } } Special Requirements - nx must be a power of 2. - The array index[] is generated by the routine bitrev_index provided in the directory ‘support\fft’. - If nx ≤ 4K, you can use the char (8-bit) data type for the “index” variable. This requires...
Page 122 - Loads input x and coefficient w as words.
DSP_radix2 4-94 xt = x[2*l] − x[2*i]; x[2*i] = x[2*i] + x[2*l]; yt = x[2*l+1] − x[2*i+1]; x[2*i+1] = x[2*i+1] + x[2*l+1]; x[2*l] = (c*xt + s*yt)>>15; x[2*l+1] = (c*yt − s*xt)>>15; } } ie = ie<<1; } } Special Requirements - 2 ≤ nx ≤ 32768 (nx is a power of 2) - Input x and coefficie...
Page 126 - Complex Forward FFT With Digit Reversal
DSP_fft 4-98 Complex Forward FFT With Digit Reversal DSP_fft Function void DSP_fft (const short * restrict w, int nx, short * restrict x, short * restrict y) Arguments w[2*nx] Pointer to vector of Q.15 FFT coefficients of size 2 * nx elements. Must be double-word aligned. nx Number of complex eleme...
Page 134 - nx must be a power of 4 and 4
DSP_fft 4-106 Special Requirements - In-place computation is not allowed. - nx must be a power of 4 and 4 ≤ nx ≤ 65536. - Input x[ ] and output y[ ] are stored on double-word aligned boundaries. - Input data x[ ] is stored in the order real0, img0, real1, img1, ... - The FFT coefficients (twiddle fa...
Page 135 - Complex Forward Mixed Radix 16- x 16-Bit FFT With Truncation
DSP_fft16x16t 4-107 C64x+ DSPLIB Reference Complex Forward Mixed Radix 16- x 16-Bit FFT With Truncation DSP_fft16x16t Function void DSP_fft16x16t(const short * restrict w, int nx, short * restrict x, short * restrict y) Arguments w[2*nx] Pointer to complex Q.15 FFT coefficients. nx Length of FFT in...
Page 147 - The following statements can be made based on above observations:
DSP_fft16x16t 4-119 C64x+ DSPLIB Reference The following statements can be made based on above observations: 1) Inner loop “i0” iterates a variable number of times. In particular, the number of iterations quadruples every time from 1..N/4. Hence, software pipelining a loop that iterates a variable nu...
Page 149 - Appendix A
A-1 Appendix A Performance/Fractional Q Formats This appendix describes performance considerations related to the C64x+ DSPLIB and provides information about the Q format used by DSPLIB functions. Topic Page A.1 Performance Considerations A-2 ...
Page 150 - A.1 Performance Considerations
Performance Considerations A-2 A.1 Performance Considerations The ceil( ) function is used in some benchmark formulas to accurately describe the number of cycles. It returns a number rounded up, away from zero, to the nearest integer. For example, ceil(1.1) returns 2. Although DSPLIB can be used as a first est...
Page 151 - A.2 Fractional Q Formats; ) and the finest fractional resolution is 2
Fractional Q Formats A-3 Performance/Fractional Q Formats A.2 Fractional Q Formats Unless specifically noted, DSPLIB functions use Q15 format, or to be more exact, Q0.15. In a Qm.n format, there are m bits used to represent the two’s complement integer portion of the number, and n bits used to represe...
Page 152 - Table A−3. Q.31 Low Memory Location Bit Fields; Table A−4. Q.31 High Memory Location Bit Fields
Fractional Q Formats A-4 A.2.3 Q.31 Format Q.31 format spans two 16-bit memory words. The 16-bit word stored in the lower memory location contains the 16 least significant bits, and the higher memory location contains the most significant 15 bits and the sign bit. The approximate allowable range of num...
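The two-word layout described above can be illustrated by splitting a 32-bit Q.31 value into its low and high 16-bit words and joining them again. The helper names below are hypothetical, and the sketch assumes a two's complement `int32_t`, which is what `<stdint.h>` guarantees for that type.

```c
#include <stdint.h>

/* Hypothetical helpers, not part of DSPLIB. */

/* Word stored at the lower memory location: the 16 least significant bits. */
static uint16_t q31_low_word(int32_t v)
{
    return (uint16_t)((uint32_t)v & 0xFFFFu);
}

/* Word stored at the higher memory location: sign bit plus the
 * 15 most significant magnitude bits. */
static uint16_t q31_high_word(int32_t v)
{
    return (uint16_t)((uint32_t)v >> 16);
}

/* Rebuild the Q.31 value from its two 16-bit words. */
static int32_t q31_join(uint16_t high, uint16_t low)
{
    return (int32_t)(((uint32_t)high << 16) | (uint32_t)low);
}
```

Working on the unsigned representation keeps the shifts well defined for negative values; the sign bit simply ends up as bit 15 of the high word.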
Page 153 - Appendix B
B-1 Appendix B Software Updates and Customer Support This appendix provides information about software updates and customer support. Topic Page B.1 DSPLIB Software Updates B-2 B.2 DSPLIB Customer Support B-2 ...
Page 154 - B.1 DSPLIB Software Updates
DSPLIB Software Updates B-2 B.1 DSPLIB Software Updates C64x DSPLIB software updates may be periodically released incorporating product enhancements and fixes as they become available. You should read the README.TXT available in the root directory of every release. B.2 DSPLIB Customer Support If you h...
Page 155 - Appendix C
C-1 Appendix C Glossary A address: The location of program code or data stored; an individually accessible memory location. A-law companding: See compress and expand (compand). API: See application programming interface. application programming interface (API): Used for proprietary application progr...
Page 157 - control register file:
Glossary C-3 Glossary compress and expand (compand): A quantization scheme for audio signals in which the input signal is compressed and then, after processing, is reconstructed at the output by expansion. There are two distinct companding schemes: A-law (used in Europe) and μ-law (used in the United...
Page 160 - instruction fetch packet:
Glossary C-6 H HAL: Hardware abstraction layer of the CSL. The HAL underlies the service layer and provides it a set of macros and constants for manipulating the peripheral registers at the lowest level. It is a low-level symbolic interface into the hardware providing symbols that describe peripheral r...
Page 161 - Internal peripherals:
Glossary C-7 Glossary interrupt service table (IST): A table containing a corresponding entry for each of the 16 physical interrupts. Each entry is a single-fetch packet and has a label associated with it. Internal peripherals: Devices connected to and controlled by a host device. The C6x internal per...
Page 163 - RTOS
Glossary C-9 Glossary reset: A means of bringing the CPU to a known state by setting the registers and control bits to predetermined values and signaling execution to start at a specified address. RTOS: Real-time operating system. S service layer: The top layer of the 2-layer chip support library arch...
Page 165 - Index
Index-1 Index A adaptive filtering functions 3-4 DSPLIB reference 4-2 address, defined C-1 A-law companding, defined C-1 API, defined C-1 application programming interface, defined C-1 argument conventions 3-2 arguments, DSPLIB 2-3 assembler, defined C-1 assert, defined C-1 B big endian, defined C-1...