Fir V2.07

Uploaded by

rajasekaran

DSP C5000

Chapter 14
Finite Impulse Response (FIR)
Filter Implementation

Copyright © 2003 Texas Instruments. All rights reserved.


Outline

Digital Filters and FIR filters

Implementation of FIR Filters on C54x

Implementation of FIR Filters on C55x

Comparison of C54x and C55x

SIEE, Slide 2 Copyrig


Outline of FIR Filters
 Generalities on Digital Filters
 FIR Filters with Matlab
 Implementation of FIR Filters



Digital Filters

[Figure: x(t) → analog anti-aliasing filter → ADC → xn → digital filter → yn → DAC → analog smoothing filter → y(t). The ADC and DAC run at the sampling frequency fS.]


Linear, Time-Invariant Digital Systems
 Linearity
$$\forall \lambda_1, \lambda_2 \in \mathbb{R}: \quad \lambda_1 x_1(n) + \lambda_2 x_2(n) \;\rightarrow\; \lambda_1 y_1(n) + \lambda_2 y_2(n)$$
 Time Invariance
$$x(n) \rightarrow y(n) \;\Rightarrow\; x(n - n_0) \rightarrow y(n - n_0)$$


Impulse Response
 Impulse sequence un:
$$u_n = \begin{cases} 1, & n = 0 \\ 0, & n \neq 0 \end{cases}$$
 The impulse response hn is the output of the digital filter when its input is un.


Input-Output Relationship, Convolution
 Any input sequence is a sum of shifted, weighted impulses:
$$x_n = \sum_{k=-\infty}^{+\infty} x_k\, u_{n-k}$$
 For a sequence that is non-zero only for n = -1 … 2:
$$x_n = x_{-1} u_{n+1} + x_0 u_n + x_1 u_{n-1} + x_2 u_{n-2}$$

Input-Output Relationship, Convolution
 Using linearity and time invariance:
$$y_n = \sum_{k=-\infty}^{+\infty} x_k\, \mathrm{output}(u_{n-k}) = \sum_{k=-\infty}^{+\infty} x_k h_{n-k}$$
$$y_n = \sum_{k=-\infty}^{+\infty} x_k h_{n-k} = \sum_{k=-\infty}^{+\infty} h_k x_{n-k}$$
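The convolution sum can be checked numerically. The sketch below is only an illustration for finite sequences that start at n = 0; the function name `convolve` and the sample values are assumptions, not part of the original slides.

```python
def convolve(x, h):
    """Return y[n] = sum over k of h[k] * x[n-k] for finite sequences starting at n = 0."""
    y = [0] * (len(x) + len(h) - 1)
    for n in range(len(y)):
        for k in range(len(h)):
            if 0 <= n - k < len(x):      # samples outside the sequence are zero
                y[n] += h[k] * x[n - k]
    return y

h = [3, 2, 1]                # illustrative impulse response
impulse = [1, 0, 0, 0]       # u_n
print(convolve(impulse, h))  # the impulse input reproduces h, followed by zeros
```

Swapping the two arguments gives the same output, which is the second equality of the convolution formula.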


Output for a Single Frequency Input
 Single frequency input → single frequency output
$$x_n = e^{j\omega_0 n T_e} \;\Rightarrow\; y_n = x_n H(\omega_0)$$
$$H(\omega_0) = \sum_{k=-\infty}^{+\infty} h_k\, e^{-j\omega_0 k T_e}$$
$$H(\omega_0) = |H(\omega_0)|\, e^{j\arg(H(\omega_0))} = A(\omega_0)\, e^{j\varphi(\omega_0)}$$


Frequency Transfer Function
 For a digital filter the frequency
transfer function is periodic.
$$H(\omega) = |H(\omega)|\, e^{j\arg(H(\omega))} = A(\omega)\, e^{j\varphi(\omega)}$$
$$h_n = \frac{1}{2\pi f_e} \int_{-\pi f_e}^{\pi f_e} H(\omega)\, e^{jn\omega T_e}\, d\omega$$
 Amplitude: $A(\omega) = |H(\omega)|$
 Phase: $\varphi(\omega) = \arg H(\omega)$
 Group delay: $\tau(\omega) = -\dfrac{\partial \varphi(\omega)}{\partial \omega}$

Relationship Between Fourier Transforms
of Input and Output
$$X(\omega) = \sum_{n=-\infty}^{+\infty} x_n e^{-jn\omega T_e} \qquad Y(\omega) = \sum_{n=-\infty}^{+\infty} y_n e^{-jn\omega T_e}$$
$$Y(\omega) = H(\omega) X(\omega)$$


Z Transfer Function
$$H(z) = \sum_{n=-\infty}^{+\infty} h_n z^{-n}$$
$$H(\omega) = \sum_{n=-\infty}^{+\infty} h_n e^{-jn\omega T_e} = H(z)\big|_{z = e^{j\omega T_e}}$$
$$Y(z) = X(z) H(z)$$


Basic Relationships of a Digital Filter
$$y_n = \sum_{k=-\infty}^{+\infty} x_k h_{n-k} = \sum_{k=-\infty}^{+\infty} h_k x_{n-k}$$
$$Y(\omega) = H(\omega) X(\omega)$$
$$Y(z) = X(z) H(z)$$


Rational z Transfer Function
$$H(z) = \frac{N(z)}{D(z)} = \frac{\sum_{i=0}^{Q} b_i z^{-i}}{1 + \sum_{k=1}^{P} a_k z^{-k}}$$
 Linear equation with constant coefficients.
$$y_n = \sum_{i=0}^{Q} b_i x_{n-i} - \sum_{k=1}^{P} a_k y_{n-k}$$


IIR and FIR Filters
 IIR = Infinite Impulse Response
 FIR = Finite Impulse Response
 FIR:
$$H(z) = \sum_{i=0}^{Q} b_i z^{-i} = \sum_{n=-\infty}^{+\infty} h_n z^{-n}, \qquad h_n = \begin{cases} b_n, & n \in [0, Q] \\ 0, & n \notin [0, Q] \end{cases}$$
 IIR:
$$H(z) = \frac{N(z)}{D(z)} \quad \text{with } D(z) \neq \text{constant.}$$

FIR and IIR
 FIR: output yn is a linear combination of a
finite number of input samples.
$$y_n = \sum_{i=0}^{Q} h_i x_{n-i} = \sum_{i=0}^{Q} b_i x_{n-i}, \qquad b_i = h_i.$$
 IIR: output yn is a linear combination of a
finite number of input and of output
samples. Recursive form.
$$y_n = \sum_{i=0}^{Q} b_i x_{n-i} - \sum_{k=1}^{P} a_k y_{n-k}$$


Causality and Stability
 A filter is causal if hn=0 for n < 0
 A filter is stable if the output is bounded
for any bounded input.
 Condition for stability is:
 All the poles of H(z) are inside the unit circle
 FIR filters are always stable.
 Or:
$$\sum_{n} |h_n| < A \quad \text{for some finite constant } A$$


Representation of Poles and Zeroes of H(z) in
the Complex Plane
[Figure: pole-zero plot in the complex plane; real part on the horizontal axis and imaginary part on the vertical axis, both from -1 to 1.]


Some Useful Matlab Functions
 Example for a FIR filter:
$$N(z) = b_0 + b_1 z^{-1} + b_2 z^{-2} + b_3 z^{-3}$$
 b = [b0 b1 b2 b3] = [1 1 1 1].
 Enter the filter coefficients vector b:
 b=[1 1 1 1]; a=1;
 Calculate transfer function Hf, its
amplitude and phase on 256 samples,
with fs=1:
 [Hf,f]=freqz(b,a,256,1);
 HfA=abs(Hf);
 Hfphi=angle(Hf);

Some Useful Matlab Functions
 Plot impulse response: stem(b)
 Plot amplitude and phase of transfer
function: plot(f,HfA) and plot(f,Hfphi)
[Figure: two plots against frequency (FS=1, from 0 to 0.5): phase of the transfer function and amplitude of the transfer function.]


Some Useful Matlab Functions
 Generate a test signal = sum of cosines:
 x=cos(2*pi*[0:99]*0.25)+2*cos(2*pi*[0:99]*0.1);
 Apply the filter to x. Output is y:
 y=filter(b,a,x);
 Plot the results: plot(x); plot(y)
 x is the sum of 2 frequencies: 0.25 and 0.1. The filter cancels the frequency 0.25, so y has only the frequency 0.1.
[Figure: input x (range -3 to 3) and output y (range -6 to 6) plotted against time, 0 to 100 samples.]
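Why the filter b = [1 1 1 1] removes the component at 0.25 can be seen by evaluating the transfer function at the two test frequencies. The sketch below is a plain-Python illustration; the helper name `freq_response` is an assumption and is not a Matlab function.

```python
import cmath
import math

def freq_response(b, f, fs=1.0):
    """H(f) = sum over n of b[n] * exp(-j*2*pi*f*n/fs) for an FIR filter."""
    return sum(bn * cmath.exp(-2j * math.pi * f * n / fs) for n, bn in enumerate(b))

b = [1, 1, 1, 1]
gain_025 = abs(freq_response(b, 0.25))  # zero up to rounding error: component cancelled
gain_010 = abs(freq_response(b, 0.10))  # a little above 3: component passed and amplified
print(gain_025, gain_010)
```

With an input amplitude of 2 at frequency 0.1, a gain slightly above 3 is consistent with the output range of about ±6 described for the plot.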

Calculation of a FIR using Matlab
 For given attenuation and frequency
response characteristics, the transfer
function can be calculated using
different methods:
 Mean square error, minimax (Chebyshev)
 Empirical window method
 Corresponding Matlab functions
 firls and remez.
 fir1 and fir2.


Example using Matlab
 Design a low pass filter:
 Sampling frequency = 9600 Hz
 Maximum attenuation (passband) = 0.1 dB
 Minimum attenuation (stopband) = 50 dB
 Limit frequencies of passband and
stopband = 1200 Hz and 2600 Hz.
[Figure: attenuation template in dB versus f in Hz, with band edges at 1200 Hz and 2600 Hz.]

Example using Matlab
 Vector of band-edge frequencies (normalized to half the sampling frequency):
 F=[0 1200 2600 4800]/4800;
 Vector of required amplitudes:
 A=[1 1 0 0];
 Least square calculation of filter:
 Bls=firls(23,F,A);
 Mini Max calculation of filter:
 Bre=remez(21,F,A);
 Window method (Hamming), with the cutoff in the middle of the transition band:
 Bwin=fir1(25,(1200+2600)/9600);


Results of Matlab Example
 The minimum orders to satisfy the
constraints are 23 for LS, 21 for
minimax and 25 for the window
method.
[Figure: responses of the three designs (least square method, window method, minimax method) from 0 to 5000 Hz; the vertical axis runs from -20 to 140.]


Results of Matlab Example
 Impulse Response
[Figure: impulse response hn plotted against n, for n = 0 to 25; the vertical axis runs from -0.1 to 0.4.]


FIR Filters with Constant Group Delay or
Linear Phase
 For many applications, it is desirable to
use a filter with a constant group delay
(independent of the frequency).
 The phase will be linear or affine.
 2 possible cases:
 symmetrical or antisymmetrical FIR.
 Constant group delay = TS (N-1)/2
 Symmetrical: h(n)=h(N-1-n)
 Antisymmetrical: h(n)=-h(N-1-n)

FIR filters with Constant Group Delay or
Linear Phase
 Symmetrical case: linear phase
$$\varphi(f) = kf$$
 Antisymmetrical case: affine phase
$$\varphi(f) = kf + \frac{\pi}{2}$$
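The linear and affine phase properties can be verified numerically. In the sketch below (function name and coefficient values are illustrative assumptions), the delay term of (N-1)/2 samples is removed from H; what remains is purely real for a symmetrical filter and purely imaginary for an antisymmetrical one, which corresponds to the constant phase offset of π/2.

```python
import cmath

def remove_delay(h, w):
    """Return H(w) * exp(+j*w*(N-1)/2), with w in radians per sample."""
    n_taps = len(h)
    hw = sum(hn * cmath.exp(-1j * w * n) for n, hn in enumerate(h))
    return hw * cmath.exp(1j * w * (n_taps - 1) / 2)

sym = [1, 3, 3, 1]      # h(n) = h(N-1-n)
anti = [1, 3, -3, -1]   # h(n) = -h(N-1-n)

for w in (0.3, 1.0, 2.5):
    # Imaginary part vanishes for sym, real part vanishes for anti.
    print(w, remove_delay(sym, w), remove_delay(anti, w))
```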


Fixed Point Implementation of FIR Filters
Numerical Issues
 Fixed point implementation:
 16 bits for data and coefficients
 Accumulators have size 40 bits
 Fixed point representation of data
 Size B = 16 bits, Format Qk: k fractional bits
 Quantization of coefficients
 Maximum magnitude coefficient = hmax
 Number of bits of the integer part of
coefficients is Bi:
 Bi = log2(hmax)
 Coefficients in Qk’ with k = 16-Bi

Matlab Example
 The coefficients Bre can be quantized
using 16-bit fixed point with 15 fractional
bits:
 Bre=round(Bre*2^15);
 To store the result in a text file for CCS:
 fp=fopen('coef.asm','wt');
 for i=1:22
 fprintf(fp,' .word %d \n',Bre(i));
 end
 fclose(fp);
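The quantization step can also be sketched outside Matlab. The Python version below is an illustration only: the helper names are assumptions, and it adds saturation, which the Matlab lines above do not perform. Note that Python's `round` rounds exact halves to the nearest even integer, whereas Matlab's rounds them away from zero; the two differ only for values falling exactly on a half.

```python
def to_q15(value):
    """Round a real coefficient to a 16-bit Q15 integer, saturating at the 16-bit limits."""
    scaled = int(round(value * 2**15))
    return max(-32768, min(32767, scaled))

def word_lines(coeffs):
    """Format quantized coefficients as assembler .word directives."""
    return ["    .word {}".format(to_q15(c)) for c in coeffs]

# Illustrative values, not the coefficients of the filter designed above.
for line in word_lines([0.5, -0.25, 1.0 / 32]):
    print(line)
```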


Matlab Example
 File coef.asm
 Can be edited to be used with CCS.
        .word 39
        .word -92
        .word -242
        .word 25
        .word 668
        .word 579
        .word -978
        .word -2229
        .word 86
        .word 6374
        .word 12127
        .word 12127
        .word 6374
        .word 86
        .word -2229
        .word -978
        .word 579
        .word 668
        .word 25
        .word -242
        .word -92
        .word 39


FIR Implementation, Numerical issues,
FRCT bit
 Common case:
 Data and coefficients in Q15 format
 Product h(i)x(n-i) in Q30 (2 sign bits)
 By shifting products 1 bit left, the products
are in Q31 format with only 1 sign bit.
 If the FRCT bit (Fraction) is set to 1,
products are automatically shifted 1 bit
left.
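The effect of the 1-bit left shift can be followed with integers. This is a minimal sketch with hypothetical helper names; it models only the arithmetic, not the accumulator guard bits or overflow handling of the DSP.

```python
def q15_mul(a, b, frct=True):
    """Multiply two Q15 integers. The raw product is Q30; with frct it is shifted to Q31."""
    product = a * b
    return product << 1 if frct else product

def high_word(acc):
    """Bits 31..16 of the result, i.e. what storing the high part of the accumulator keeps."""
    return acc >> 16

half = 16384  # 0.5 in Q15
print(high_word(q15_mul(half, half)))              # 8192, i.e. 0.25 read as Q15
print(high_word(q15_mul(half, half, frct=False)))  # 4096, i.e. 0.25 read as Q14
```

With FRCT set, the high word of the product can be used directly as a Q15 value, which is why the filter programs store the high part of the accumulator.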


Structures for FIR Implementation
 Common structures for FIR filters
 Transversal structures
 Lattice structure
 Useful in some adaptive situations.
 Transversal structures using:
 Linear buffers
 Circular buffers
 Special case for symmetrical or
antisymmetrical FIRs.


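Transversal Structures of FIR
 Structure with a delay line
[Figure: the input xn feeds a chain of unit delays giving xn-1, xn-2, …, xn-N+1; the taps are multiplied by b0, b1, b2, b3, …, bN-1 and the products are summed to give yn.]
 Transposed structure
[Figure: the input xn is multiplied by bN-1, bN-2, …, b3, b2, b1, b0; the products are summed through a chain of unit delays to give yn.]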


Implementation of a FIR with a Delay Line
 Most common structure used in DSP.
 The delay line can be implemented using a
linear or a circular buffer.
 Basic operations:
 Read a new data value x(n) every TS
 ACCU=0
 for i=0 to N-1:
 Multiply h(i) by x(n-i) and add it to
accumulator
 Output y(n)

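The basic operations of the delay-line structure can be sketched as follows. This is an illustrative model with a linear buffer (the function name `fir_step` is an assumption); the N-1 shifts are done after the output is computed.

```python
def fir_step(delay_line, h, sample):
    """Process one input sample with an N-tap FIR using a linear delay line."""
    delay_line[0] = sample                       # store x(n) in the first position
    acc = 0                                      # ACCU = 0
    for i in range(len(h)):
        acc += h[i] * delay_line[i]              # h(i) * x(n-i)
    for i in range(len(delay_line) - 1, 0, -1):  # N-1 shifts, oldest sample dropped
        delay_line[i] = delay_line[i - 1]
    return acc

h = [1, 2, 3]
line = [0] * len(h)
print([fir_step(line, h, x) for x in [1, 0, 0, 0]])  # impulse in, coefficients out
```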


Implementation of FIR Filters on C54x
 Implementation of General Transversal FIR filters
 Using linear buffers
 Using circular buffers
 Implementation of Symmetrical FIR filters


Operations using a Linear Buffer for a FIR
with N Coefficients
 Length of the delay line = N samples
 Read a new sample x(n) and store it in the
delay line in the first position.
 ACCU=0
 for i=0 to N-1
 Read h(i) and x(n-i)
 Multiply h(i) by x(n-i) and add it to ACCU
 Output y(n)
 N-1 Shifts in the delay line.



Linear Buffer, MACD Mode
 Instead of shifting N-1 samples at the
end, do the shift in the loop one by one.
 Read a new sample xn and store it in the
delay line in the first position.
 ACCU=0
 for i=N-1 to 0
 Read h(i) and x(n-i)
 Multiply h(i) by x(n-i) and add it to ACCU
 Shift x(n-i) in the delay line
 Output y(n)

MACD Instruction
 MACD:
 Multiply Accumulate and Delay move.
 MACD Smem, pmad, src
 src=src+Smem*pmad;
 T=Smem;
 (Smem+1)=Smem
 If MACD used in a loop with RPT the
program memory (pmad) address is
automatically incremented.
 MACD alone = 3 cycle times
 In a RPT loop 1 cycle time

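The MACD behaviour in a repeat loop can be modelled in a few lines. This is a sketch under assumptions: `data` stands for the N+1 data-memory cells (x(n) first, one spare cell at the end), `coeffs_reversed` stands for the program-memory table b(N-1) … b(0), and the function name is hypothetical. Each iteration multiplies, accumulates, and copies the sample one cell further, starting from the oldest sample.

```python
def macd_fir(data, coeffs_reversed, sample):
    """One output sample using a MACD-style loop: multiply-accumulate, then delay move."""
    n_taps = len(coeffs_reversed)
    data[0] = sample                  # new x(n) in the first position
    acc = 0                           # RPTZ clears the accumulator
    ptr = n_taps - 1                  # pointer starts on the oldest sample x(n-N+1)
    for pmad in range(n_taps):        # program-memory address auto-increments
        acc += data[ptr] * coeffs_reversed[pmad]
        data[ptr + 1] = data[ptr]     # (Smem+1) = Smem
        ptr -= 1                      # post-decrement of the data pointer
    return acc

b = [1, 2, 3]
cells = [0] * (len(b) + 1)
print([macd_fir(cells, b[::-1], x) for x in [1, 0, 0, 0]])
```

Because the copy happens inside the loop, no separate block of N-1 shifts is needed after the output is computed.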


Implementing a FIR with MACD
 Memory organization of data and coefficients

Program Memory            Data Memory
Address     Content       Address     Content
i=pmad      b(N-1)        k=Smem      x(n)
i+1         b(N-2)        k+1         x(n-1)
i+2         b(N-3)        k+2         x(n-2)
…           …             …           …
i+N-1       b(0)          k+N-1       x(n-N+1)
                          k+N         dummy place for copy of x(n-N+1)


Initialization of Registers
 STM Stores #value to the MMR early
in the pipeline to avoid latencies.
 2 words, 2 cycles.
 Initialization of FRCT bit (fractional
mode):
 Instructions SSBX (Set Status Bit) and
RSBX (Reset Status Bit).
 Initialization of ACCU
 Using RPTZ: RePeaT after initializing
ACCU at 0
 Or via LD #0,A

RPT, RPTZ Instructions
 RPT #n
 Repeat next instruction n+1 times.
Repetition counter set to n and decreases
until 0.
 1 or 2 cycles, not interruptible.
 RPTZ src, #n
 Same as repeat, except that src ACCU is
cleared to zero before repeat.
 2 cycles , not interruptible.
 Some instructions execute faster when
in repeat mode (pipeline).

Implementing a FIR Filter with MACD
        .bss adr_debut_dat,N+1
adr_fin_dat .set adr_debut_dat+N-1
        .text
* Initialization of AR1 and FRCT
        STM #adr_fin_dat, AR1
        SSBX FRCT
* Filter loop
        RPTZ A, #N-1
        MACD *AR1-, adr_coef, A

 Test with CCS
 Filter with N=32 coefficients all equal to 1/32
 Create a file fircoef.asm, address of coefficients in
program mem = adr_coef


Implementing a FIR Filter with MACD
 File containing coefficients fircoef.asm
.global adr_coef
.sect ".coef"
adr_coef .word 0X400, 0X400
.word 0X400,0X400,0X400,0X400,0X400
.word 0X400,0X400,0X400,0X400,0X400
.word 0X400,0X400,0X400,0X400,0X400
.word 0X400,0X400,0X400,0X400,0X400
.word 0X400,0X400,0X400,0X400,0X400
.word 0X400,0X400,0X400,0X400,0X400



Implementing a FIR Filter with MACD
 File firmacd.asm with the program
 2 files to compile and link:
 fircoef.asm and firmacd.asm
 Test by associating files on the ports
DRR0 and DXR0
 File infir.dat attached to DRR0
 File outfir.dat attached to DXR0



Implementing a FIR Filter with MACD
 Program file firmacd.asm: initializations
.mmregs
.global adr_debut_dat
.global adr_fin_dat
.global adr_coef
N .set 32
.bss adr_debut_dat,N+1
adr_fin_dat .set adr_debut_dat+N-1

.text
* Initialization of DP and FRCT
LD #0, DP
SSBX FRCT
* Initialization of AR0, AR1, AR2
STM #(adr_debut_dat),AR2
STM #(adr_debut_dat-1),AR1
STM #N, AR0

Implementing a FIR Filter with MACD
 Program file firmacd.asm: endless loop
debut:
* set AR1 at adr_fin_dat
        MAR *AR1+0
* Read x(n) at DRR
        LDM DRR0, A
        STL A,*AR2
* Endless filter loop
        RPTZ A, #N-1
        MACD *AR1-, adr_coef, A
* Write y(n) in DXR
* by saving the high part of ACCU in DXR
        STH A,DXR0
* Go back to the beginning of the loop
        B debut
 See files firmacd.asm and fircoef.asm for the test in directory tutorial.


FIR with MACD, Test with CCS
 Create project, create command file,
compile and link.
 To test the impulse response:
 Create a file infir.dat with:
 A value 0.5 (0x4000) then zeros (at least 40)
 Set 2 probe points
 1 at reading of DRR: LDM DRR
 1 at end of loop: B debut
 Attach files to probe points
 infir.dat at first probe point (read value stored
at address 0x20 DRR)
 outfir.dat at second probe point (data at
address 0x21 DXR is stored in the file)

Results
 Let program run until end of file
infir.dat
 Load file outfir.dat at some address in
the DSP data memory (File-Data-Load)
 Plot the content of this memory area
(View-Graph-Time/Frequency).
 Plot a time graph (Single Time)
 Plot a frequency graph (FFT: Magnitude
and Phase)



Results for the impulse response and its FFT



Second Test
 New test with a sine input.
 Replace infir.dat by file insinus.dat
containing 80 samples of a sine with 40
samples per period of sine.
 Name outsine.dat the result file.
 Repeat the same operations as in the
preceding test.



Second test
 Observe that the output is attenuated and is phase
shifted by values corresponding to H(f) at fS/40.


Implementation using a Circular Buffer
 A circular buffer of length N is a block
of contiguous memory words addressed
by a pointer using a modulo N
addressing mode.
 The 2 extreme words of the memory block
are considered as contiguous.
 Characteristics of a circular buffer:
 Instead of moving the N data in memory,
just modify the pointers.
 When a new data x(n) arrives, the pointer
is incremented and the new data is written
in place of the oldest one.

Trace of Memory and Pointer in a Circular
Buffer of Length 3

Time n    Time n+1   Time n+2   Time n+3
x(n-1)    x(n-1)     x(n+2)     x(n+2)
x(n)      x(n)       x(n)       x(n+3)
x(n-2)    x(n+1)     x(n+1)     x(n+1)


FIR with Circular Buffers
 2 circular buffers
 1 for data
 1 for coefficients
[Figure: data memory buffer from adr_deb_data to adr_fin_data, addressed by pnt_data; coefficient memory buffer from adr_deb_coef to adr_fin_coef holding b(N-1), b(N-2), …, b(0), addressed by pnt_coef.]


Operation of FIR with Circular Buffer
 Read a new input sample x(n)
 Store it at address of pnt_data
 ACCU=0
 for i=0 to N-1
 multiply data pointed by pnt_data by
coefficient pointed by pnt_coef. Add
product to ACCU
 decrement pointers pnt_data and pnt_coef
 end
 output y(n) from ACCU
 increment pnt_data by 1
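The pointer handling of the circular-buffer version can be sketched as below. The class name is an assumption; `pnt_data` and `pnt_coef` follow the names used on the slide, the coefficient buffer is stored as b(N-1) … b(0) with `pnt_coef` starting on b(0), and modulo-N arithmetic stands in for the hardware circular addressing.

```python
class CircularFir:
    """FIR filter using two circular buffers and modulo-N pointer updates."""

    def __init__(self, b):
        self.n = len(b)
        self.coef = list(reversed(b))    # b(N-1) ... b(0)
        self.data = [0] * self.n
        self.pnt_data = 0
        self.pnt_coef = self.n - 1       # points at b(0)

    def step(self, sample):
        self.data[self.pnt_data] = sample            # overwrite the oldest sample
        acc = 0
        for _ in range(self.n):
            acc += self.data[self.pnt_data] * self.coef[self.pnt_coef]
            self.pnt_data = (self.pnt_data - 1) % self.n
            self.pnt_coef = (self.pnt_coef - 1) % self.n
        # After N decrements both pointers are back where they started.
        self.pnt_data = (self.pnt_data + 1) % self.n  # next write position
        return acc

fir = CircularFir([1, 2, 3])
print([fir.step(x) for x in [1, 0, 0, 0, 5]])
```

No sample is ever moved in memory; only the two pointers change.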

Instruction MAC with 2 operands in Indirect
Addressing Mode
 MAC: Multiply and Accumulate
 MAC Xmem, Ymem, src[, dst]
 dst=src+Xmem*Ymem
 T=Xmem
 With Xmem, Ymem use only AR2 to AR5
 Can be executed in 1 cycle time.
 Dual operand instructions indirect
addressing restricted to:
 AR2, AR3, AR4, AR5
 none, +, -, +0%


Circular Buffer with C54x
 Circular indirect addressing mode:
 *ARi-%, *ARi+%, *ARi-0%, *ARi+0%,
*ARi(lk)%
 In dual operand mode Xmem, Ymem:
 *ARi+0% only valid mode
 To perform a decrement, store a negative value
in AR0.
 BK register:
 Stores the size N of the circular buffer.
 Must be initialized before use.
 There may be several circular buffers at
different addresses at the same time but
with the same length.

Limitations on Start Addresses of Circular
Buffers
 If N is written on nb bits in binary, the
start address must have its nb LSB at 0:
 Examples:
 for N=32, 6 LSB of start address =0
 for N=30, 5 LSB of start address =0
 To access a circular buffer:
 Initialize BK with N (nb bits)
 Choose 1 ARi as a pointer
 The effective start address of the buffer is the
value in ARi with its nb LSB at 0.
 The end address = start address +N-1.
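The alignment rule can be computed directly. The helper names below are hypothetical and the address value is only an example; the sketch applies the rule as stated (nb = number of bits needed to write N).

```python
def circular_alignment(n):
    """Return (nb, alignment): nb bits are needed to write n, start must be a multiple of 2**nb."""
    nb = n.bit_length()
    return nb, 1 << nb

def buffer_bounds(ari, n):
    """Effective start and end addresses for a pointer value ari and buffer length n."""
    nb, _ = circular_alignment(n)
    start = ari & ~((1 << nb) - 1)    # ARi with its nb LSBs cleared
    return start, start + n - 1

print(circular_alignment(32))           # (6, 64)
print(circular_alignment(30))           # (5, 32)
print(buffer_bounds(0x0462, 30))        # start 0x0460, end 0x047D
```

This is why the N=32 example needs its buffers on a multiple of 64 rather than 32.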


Circular buffer on C54x
[Figure: circular buffer in data memory for N = 30 (binary 11110, so nb = 5), with N held in BK. Start_address = xxxxxxxxxxx00000; ARi = xxxxxxxxxxx00010 points inside the buffer; End_address = xxxxxxxxxxx11101 (start address + N-1).]


Implementation of FIR Filter
with 2 Circular Buffers
 Same filter as in the preceding example,
coefficients in section .coef (in program
memory) in file fircoef.asm.
 N=32
 2 buffers are allocated in data memory
for the coefficients and the data of the
filters
 Start addresses must be multiple of 64.
 First step of program after initialization:
 Transfer coefficients from program to data
memory from adr_coef to adr_debut_coef.

Move Instructions
 MVPD #pmad, Smem
 Copy values from program to data memory
 In RPT mode pmad is automatically
incremented.

Transfer              Instructions
Program <-> Data      MVPD, MVDP, READA, WRITEA
MMR <-> Data          MVMD, MVDM
Data <-> Data         MVKD, MVDK, MVDD
MMR <-> MMR           MVMM


Implementation of FIR with 2 Circular
Buffers, Initializations
.mmregs
.global adr_debut_dat
.global adr_fin_dat
.global adr_debut_coef
.global adr_fin_coef
.global adr_coef

N .set 32
adr_debut_dat .usect "buf_data", N
adr_debut_coef .usect "buf_coef", N
adr_fin_dat .set adr_debut_dat+N-1
adr_fin_coef .set adr_debut_coef+N-1

.text
* Initialization of BK,AR0,FRCT
STM #N, BK
STM #-1, AR0
SSBX FRCT
* Initialization of AR2, AR3
STM #(adr_debut_dat),AR2
STM #(adr_fin_coef),AR3

Implementation of FIR with 2 Circular
Buffers, Program
* Transfer of coefficients from
* program to data memory
        STM #adr_debut_coef, AR4
        RPT #N-1
        MVPD adr_coef, *AR4+
* Endless loop
debut:
* Read x(n) at DRR
        LDM DRR0, A
        STL A, *AR2
* Calculation of y(n)
        RPTZ A, #N-1
        MAC *AR2+0%, *AR3+0%, A
* Write y(n) in DXR
* by saving high part of ACCU
        STH A, DXR0
* Go back to the beginning of the loop
* (circular increment of the data pointer)
        MAR *AR2+%
        B debut
 See files fircirc.asm and fircoef.asm for the test.

Command File for Circular Buffer
Addressing Constraint
 The addresses adr_debut_dat and
adr_debut_coef have to be aligned with
a multiple of 64 in the example.
 adr_debut_dat is the start address of
uninitialized section buf_data.
 adr_debut_coef is the start address of
uninitialized section buf_coef.
 To align the 2 sections on a multiple of 64,
in the command file add align(64) after the
name of the sections in the SECTIONS
directive, for example:
 buf_data align(64) > DATA page 1

Implementation of a Symmetrical FIR filter
 The symmetry of coefficients is used to decrease the
computational load:
 b(n)=b(N-1-n)
 N cycles for a general FIR filter with N
coefficients (in good conditions).
 N/2 cycles for a symmetrical FIR filter.
 Use of specific instruction FIRS.
$$y(n) = \sum_{i=0}^{N/2-1} b(i)\,[x(n-i) + x(n-N+1+i)] \qquad N \text{ even}$$
$$y(n) = \sum_{i=0}^{(N-1)/2-1} b(i)\,[x(n-i) + x(n-N+1+i)] + b\!\left(\tfrac{N-1}{2}\right) x\!\left(n - \tfrac{N-1}{2}\right) \qquad N \text{ odd}$$
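The folded sums can be compared with the direct form in a short sketch. Function names and values are illustrative assumptions; `window[i]` holds x(n-i) and `b_half` holds the first half of the coefficients (plus the centre coefficient when N is odd).

```python
def fir_direct(b, window):
    """Direct form: sum of b(i) * x(n-i), with window[i] = x(n-i)."""
    return sum(bi * xi for bi, xi in zip(b, window))

def fir_folded(b_half, window):
    """Folded form for symmetrical coefficients: add the paired samples, then multiply."""
    n = len(window)
    half = n // 2
    acc = sum(b_half[i] * (window[i] + window[n - 1 - i]) for i in range(half))
    if n % 2:                          # N odd: centre tap has no partner
        acc += b_half[half] * window[half]
    return acc

b_even = [1, 2, 3, 3, 2, 1]
w_even = [5, -1, 4, 0, 2, 7]
print(fir_direct(b_even, w_even), fir_folded(b_even[:3], w_even))
```

Both forms give the same output, but the folded form uses about half as many multiplications.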


FIRS Instruction to Work with RPT(Z)
 FIRS Xmem, Ymem, pmad
 Xmem, Ymem corresponds to:
 x(n-i), x(n-N+1+i)
 Coefficients in program memory pmad
 operations of FIRS:
 pmad → PAR
 while RC ≠ 0
 B = B + A(32:16) x Pmem addressed by PAR
 A = (Xmem+Ymem)<<16
 PAR=PAR+1
 RC=RC-1

Using FIRS for a Symmetrical FIR Filter
 3 arrays:
 N/2 first coefficients,
 N/2 newest data and N/2 oldest data.
 Example for N = 6, with 2 circular buffers in data memory:

Program Memory           Data Memory
adr_debut_coef:          adr_debut_dat0:
  b(0)                     x(n-2)
  b(1)                     x(n)    <- AR2
  b(2)                     x(n-1)
                         adr_debut_dat1:
                           x(n-3)
                           x(n-5)  <- AR3
                           x(n-4)

 The coefficients are addressed by PAR.

Using FIRS for a Symmetrical FIR Filter
 BK = N/2
 At the beginning AR2 and AR3 point to:
 the newest data x(n)
 and the oldest data x(n-N+1)
[Figure: contents of the two circular buffers and positions of AR2 and AR3, at the beginning and after N/2 + 1 incrementations.]


Using FIRS for a Symmetrical FIR Filter
 FIRS is repeated N/2 times
 The first sum x(n)+x(n-N+1) is done
before entering the loop.
 N/2 iterations (AR2 and AR3 incremented
by 1):
 At the first iteration AR2 points on x(n-1) and
AR3 on x(n-N+2)
 After N/2 iterations: AR2 is decremented by 2
and AR3 by 1.
 The oldest sample x(n-N/2+1) of 1st buffer is
stored in 2nd buffer in place of x(n-N+1).
Then AR is incremented by 1.
 New sample x(n+1) is stored in place of x(n).

Symmetrical FIR Implementation with FIRS,
Initializations
        .mmregs
        .global adr_debut_coef
        .global adr_debut_dat0
        .global adr_debut_dat1
N       .set 32
Nsur2   .set 16
adr_debut_coef .set adr_coef
adr_debut_dat0 .usect "buf_data0", N
adr_debut_dat1 .usect "buf_data1", N

        .text
* Initialization of BK, AR0,FRCT
        STM #Nsur2, BK
        STM #-2, AR0
        SSBX FRCT
* Initialization of AR2, AR3
        STM #(adr_debut_dat0),AR2
        STM #(adr_debut_dat1),AR3


Symmetrical FIR Implementation using
FIRS, Program
* Endless loop
debut:
* Read x(n) at DRR
        LDM DRR0, A
        STL A, *AR2
* Calculation of y(n)
* Calculation of the first sum
        ADD *AR2+0%,*AR3+0%,A
* Repeat N/2 times FIRS
        RPTZ B, #(Nsur2-1)
        FIRS *AR2+0%, *AR3+0%, adr_coef
* Write y(n) at DXR
* by saving high part of ACCU in DXR
        STH B, DXR0
* Transfer of the oldest value of first array
* to the oldest value of the 2nd array
        MAR *+AR2(-2)%
        MAR *AR3-%
        MVDD *AR2, *AR3+0%
* Go back to the beginning of the loop
        B debut
 See files firsym.asm and fircoef.asm for the test.

Tutorial
 The listing files for the preceding examples
can be found in directory tutorial:
 Tutorial > Dsk5416 > Chapter 14 > Labs_fir


Implementation of FIR Filters on C55x
 Implementation of block filters
 Implementation of symmetrical or
antisymmetrical FIR filters


Implementation of FIR Filters using C55x
 2 MAC units accessed using 3 data buses
D, B, C make it possible to:
 Calculate 2 output samples y at a time using
same set of coefficients and different data x.
 Calculate 2 output samples y at a time using
same input data x but 2 sets of coefficients.
[Figure: the data read buses feed two MAC units, which accumulate into AC0 and AC1.]

Using the 2 MAC Units
 Use of block filtering in order to
calculate 2 output samples at a time.
 C55x, two outputs accumulated in parallel in AC0 and AC1:
 yn = b0 xn + b1 xn-1 + b2 xn-2 + b3 xn-3
 yn+1 = b0 xn+1 + b1 xn + b2 xn-1 + b3 xn-2
        MAC *AR2+, *CDP+, AC0 :: MAC *AR3+, *CDP+, AC1
 C54x, one output at a time:
 yn = b0 xn + b1 xn-1 + b2 xn-2 + b3 xn-3
        MAC *AR2+, *AR3+, A


Block Filter
 Calculate a block of M output samples:
 Avoids interrupts sample by sample
 Allows calculation of 2 samples at a time
$$y_{n-m} = \sum_{i=0}^{N-1} b_i\, x_{n-m-i}, \qquad m \in [0, M-1].$$
 M+N-1 inputs necessary to calculate M output
samples.
 Because of N-1 initial conditions.
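The block computation with two accumulators can be sketched as follows. This is a plain-Python model, not C55x code: `x_mem` is assumed to hold the input samples newest first (x_mem[0] = xn, x_mem[1] = xn-1, …), the function name is hypothetical, and the two sums in the inner loop stand for the two MAC units sharing one coefficient fetch.

```python
def block_fir_dual_mac(b, x_mem, m_outputs):
    """Compute y(n-m) for m = 0..M-1, two outputs per pass over the coefficients."""
    n_taps = len(b)
    assert m_outputs % 2 == 0
    assert len(x_mem) >= m_outputs + n_taps - 1   # M+N-1 inputs are needed
    y = []
    for m in range(0, m_outputs, 2):
        ac0 = ac1 = 0
        for i in range(n_taps):
            coef = b[i]                        # one coefficient read, used twice
            ac0 += coef * x_mem[m + i]         # contributes to y(n-m)
            ac1 += coef * x_mem[m + 1 + i]     # contributes to y(n-m-1)
        y.extend([ac0, ac1])
    return y

b = [1, 2, 3, 4]
x_mem = [6, 5, 4, 3, 2, 1, 0]                  # xn, xn-1, ..., xn-6
print(block_fir_dual_mac(b, x_mem, 4))
```

Each coefficient is read once per pair of outputs, which is where the gain of 2 taps per cycle comes from.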


Block Filter, example N=4, M=3

Coefficients       Input data
CDP -> b0          AR2 -> xn
       b1          AR3 -> xn-1
       b2                 xn-2
       b3                 xn-3
                          xn-4
                          xn-5

yn = b0xn+b1xn-1+b2xn-2+b3xn-3
yn-1 = b0xn-1+b1xn-2+b2xn-3+b3xn-4
yn-2 = b0xn-2+b1xn-3+b2xn-4+b3xn-5

Block Filter Example
 Double loop:
 On coefficients and on m
 Coefficients accessed by CDP:
 CDP (Cmem) modifications limited to:
*CDP, *CDP+, *CDP-, *(CDP+T0).
 CDP uses B bus only for dual-MAC.
Because B bus is internal only, coefficients
must also be internal.
 Place data operands carefully to avoid
memory conflicts (SA/DARAM).



Using Dual MAC
yn = b0 xn + b1 xn-1 + b2 xn-2 + b3 xn-3
yn+1 = b0 xn+1 + b1 xn + b2 xn-1 + b3 xn-2
[Figure: the coefficients b0 … b3, addressed by CDP, and the input data xn … xn-5, addressed by AR2 and AR3, are read over the B, C and D buses into the two MAC units, which accumulate into AC0 and AC1.]
        MAC *AR2+, *CDP+, AC0 :: MAC *AR3+, *CDP+, AC1


Initialization of Pointers
 Use AMOV to do transfers during the
“AD” pipeline phase.
 Init AR2 to point to the 1st value of
input data : (x)
 Init AR3 to point to the 2nd value of
input data (x+1)
 Init CDP to point to coefficient array (a)
AMOV #x,XAR2
AMOV #(x+1),XAR3
AMOV #a0,XCDP



Inner Loop on Coefficients
        RPT #3
        MAC *AR2+,*CDP+,AC0
        :: MAC *AR3+,*CDP+,AC1

Pointers at the end of the repeat instruction, and after reinitialization:

Coefficients                   Input data
b0   <- CDP (reinitialized)    xn
b1                             xn-1
b2                             xn-2   <- AR2 (reinitialized)
b3                             xn-3   <- AR3 (reinitialized)
     <- CDP (end of RPT)       xn-4   <- AR2 (end of RPT)
                               xn-5   <- AR3 (end of RPT)

Reinitialization of pointers for next output sample:
        ASUB #2,AR2
        ASUB #2,AR3
        MOV #a0,CDP


Circular Addressing Mode for Coefficients
 Initialize size of the circular buffer: BK
 Set up Buffer Start Address: BSA and
Xeven
 Set up ARi or CDP
 No memory alignment constraint
[Figure: buffer b0 … b3 in memory; Xeven:BSAxx gives its start address, BKzz its length, and ARn/CDP the offset of the current element.]


Circular Buffer Addressing Mode

Buffer Start Address = Xeven[22:16] BSAxx[15:0]

Offset into Buffer = + ARn/CDP

Calculated Address = Xeven[22:16] BSAxx + ARn/CDP

Buffer Length = BKzz[15:0]



Circular Buffer Addressing Mode

Offset    Xeven          Buffer Start Address    Block Size Register
AR0       XAR0[22:16]    BSA01                   BK03
AR1       XAR0[22:16]    BSA01                   BK03
AR2       XAR2[22:16]    BSA23                   BK03
AR3       XAR2[22:16]    BSA23                   BK03
AR4       XAR4[22:16]    BSA45                   BK47
AR5       XAR4[22:16]    BSA45                   BK47
AR6       XAR6[22:16]    BSA67                   BK47
AR7       XAR6[22:16]    BSA67                   BK47
CDP       XCDP[22:16]    BSAC                    BKC

The even XARn (i.e. 0,2,4,6) determines the 64K Page

Selecting Circular or Linear Addressing
Mode
 Use the LSBs of Status word ST2_55:

Bit       15-9                  8       7       6       5       4       3       2       1       0
ST2_55    other bits or rsvd    CDPLC   AR7LC   AR6LC   AR5LC   AR4LC   AR3LC   AR2LC   AR1LC   AR0LC

 0 = linear mode (default), 1 = circular mode
 Set or reset status bits:
        BSET AR5LC ;AR5 in circular mode
        BCLR AR3LC ;AR3 in linear mode


Circular Buffer Exercise
Use AR4 as a circular pointer to x{5}:

Offset   0  1  2  3  4
x        7  1  9  6  2

        .sect "data"
x       .int 7,1,9,6,2 ;init data
        .sect "code"
        AMOV #x,XAR4 ;init XAR
        MOV #x,BSA45 ;init start addr
        MOV #5,BK47 ;init length
        MOV #0,AR4 ;init AR4 to top
        BSET AR4LC ;set AR4 to circ
        MOV #3,T0 ;index
        MOV *(AR4+T0),AC0 ;AC0 = 7, AR4 = 3
        MOV *+AR4(#4h),AC1 ;AC1 = 9, AR4 = 2
        MOV *AR4(T0),AC2 ;AC2 = 7, AR4 = 2
Results are cumulative.
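The three addressing forms of the exercise can be traced with a small model. This is a sketch with hypothetical names: the pointer is treated as an offset into the buffer and wrapped modulo the buffer length, which matches the exercise for steps no larger than the buffer; it does not model the full C55x address generation.

```python
class CircularPointer:
    """Offset-based model of an AR register used in circular mode."""

    def __init__(self, buffer, length):
        self.buffer = buffer
        self.bk = length      # block size register
        self.ar = 0           # offset into the buffer

    def post_modify(self, step):
        """*(AR+T0): read at the current offset, then add step."""
        value = self.buffer[self.ar]
        self.ar = (self.ar + step) % self.bk
        return value

    def pre_modify(self, offset):
        """*+AR(#K): add offset first, then read."""
        self.ar = (self.ar + offset) % self.bk
        return self.buffer[self.ar]

    def indexed(self, offset):
        """*AR(T0): read at current offset plus offset, pointer unchanged."""
        return self.buffer[(self.ar + offset) % self.bk]

ar4 = CircularPointer([7, 1, 9, 6, 2], 5)
ac0 = ar4.post_modify(3)
ac1 = ar4.pre_modify(4)
ac2 = ar4.indexed(3)
print(ac0, ac1, ac2, ar4.ar)
```

The values obtained agree with the answers filled in on the exercise slide.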


Circular Buffer for Coefficients
 Table of coefficients b0 … b3:
 Circular buffer addressed by CDP.
 Initialize XCDP: 7 MSB
 Initialize CDP to 0: offset in the buffer
 Set up CDP in circular addressing mode

s1:     AMOV #x,XAR2
        AMOV #a0,XCDP
        AMOV #(x+1),XAR3
        MOV #a0,BSAC
        MOV #0,CDP
        MOV #4,BKC
        BSET CDPLC

Store Results, 32-bit Moves
 Assuming fractional mode, 2 results are
in high parts of AC0 and AC1
 AC0 and AC1 can be saved separately:
        MOV HI(AC0), *AR4+
        MOV HI(AC1), *AR4+
 AC0, AC1 can be saved at the same time:
        MOV pair(hi(AC0)),dbl(*AR4+)
 Pairs: (AC0,AC1), (AC2,AC3)
 ARi incremented by 2
 Even align y

Block Filter Inner Loop
s1:     AMOV #x,XAR2
        AMOV #a0,XCDP
        AMOV #(x+1),XAR3
        AMOV #y,XAR4
        MOV #a0,BSAC
        MOV #0,CDP
        MOV #4,BKC
        BSET CDPLC
        MOV #0,AC0
        MOV #0,AC1
        RPT #3
        MAC *AR2+,*CDP+,AC0
        :: MAC *AR3+,*CDP+,AC1
        ASUB #2,AR2
        ASUB #2,AR3
e1:     MOV pair(hi(AC0)),dbl(*AR4+)


Outer Loop Using RPTB or RPTBlocal
 Use RPTB Repeat Block instruction
 We must specify:
 Start address of the block: next instruction
 End address: label specifies last instruction
 The number of repetitions counter:
 BRC0: loop counter initialized with count-1
 Min count = 2
 RPTBlocal: executes from the IBU
 56 bytes maximum (if > 56 Bytes use RPTB)
 Reduces power consumption

Outer Loop on m: Calculate M yn-m
s1: AMOV #x,XAR2
AMOV #a0,XCDP
AMOV #(x+1),XAR3
AMOV #y,XAR4
MOV #a0,BSAC
MOV #0,CDP
MOV #4,BKC
BSET CDPLC
MOV #((samps-taps)/2),BRC0
RPTBLOCAL e1
MOV #0,AC0
MOV #0,AC1
RPT #3
MAC *AR2+,*CDP+,AC0
:: MAC *AR3+,*CDP+,AC1
ASUB #2,AR2
ASUB #2,AR3
e1: MOV pair(hi(AC0)),dbl(*AR4+)

More Nested loops ?
 Nesting RPTB or RPTBlocal:
 2 levels supported using BRC0 (outer) and
BRC1/BRS1 (inner)
 No saving of registers required for nested
block repeat.
        MOV #outer_cnt,BRC0 ;load outer loop count
        MOV #inner_cnt,BRC1 ;load BRC1, auto-load BRS1
        RPTBLOCAL outer ;use BRC0
        ...
        RPTBLOCAL inner ;BRC1: decrements, BRS1-no change
        ...
inner:  last_inner
        ...
outer:  last_outer


Laboratory on Block Filter
 Implement a block FIR with 16 coefficients
and input block size = 200.
 Implement subroutine
[Figure: C5510 memory map for the laboratory. Labels in the figure: ROM 64Kx8 at FF_0000h with table{16} and code, vectors at FF_FF00h; SARAM0 8Kx8 at 1_0000h with a{16}; DARAM2 8Kx8 at 4000h with x{200}; DARAM3 8Kx8 at 6000h with SP/SSP; CE0 16Kx8 at 5_0000h with y; accumulator AC0.]
All addresses and lengths are shown in bytes


Using the Stack and Subroutines
 Subroutines require call and ret.
 During a call the return address is
stored in the Stack SP.
 Let us call fir the subroutine:
 call fir

SIEE, Slide 95 Copyrig


Initialize the Stack
 Declare an uninitialized section (.usect) of
appropriate length to reserve space.
 Initialize stack pointer to point to the
top of stack +1.
 Recommendation: place the stack in
internal memory and align on a 4-byte
boundary:
 ALIGN= specifies bytes
size    .set 100h
stack   .usect "STK",size
        AMOV #(stack+size),XSP
[Figure: memory map showing the STK section and the position of SP.]


The System Stack SSP
 When a call occurs PC[15:0] is pushed
on the stack
 The upper 8 bits PC[23:16] are pushed
on the system stack accessed by SSP
System Stack Pointer.
 CFCT is used to store the active loop
context.
 XSP and XSSP share the same upper 7
bits.
 Place SP and SSP with care to avoid
dual-access delays.

Data Types
 Byte: 8 bits
 Word: 16 bits
 Long: 32 bits
 Long access assumes address points to MSW
 LSW read from same address with LSB toggled.
 Ptr=100h, MSW=100h, LSW = 101h
 Ptr=101h, MSW=101h, LSW = 100h
 To ensure proper alignment:
 Constants (int, long) are automatically aligned on
type boundaries
 Variables:
 16 bits: no problem
 32 bits: use the even-align flag:
 .usect "vars",Nwords,,1

Solution: Declarations
.sect "indata"
x0 .copy in7.dat
.def start
.cpl_off
.arms_off
.c54cm_off
stklen .set 100
a0 .usect "coeffs",16,1,1
y0 .usect "results",200,1,1
BOS .usect "STK", stklen,1,1
BOSS .usect "SSTK",stklen,1,1
.sect "init"
table .int 7FCh, 7FDh, 7FEh, 7FFh
.int 800h, 801h, 802h, 803h
.int 803h, 802h, 801h, 800h
.int 7FFh, 7FEh, 7FDh, 7FCh

Solution: Code
        .sect "code"
        .DP a0
start:  AMOV #BOS+stklen,XSP ;set up Stack +
        MOV #BOSS+stklen,SSP ;System Stack Ptrs
        CALL copy ;copy coeffs
        BSET FRCT ;turn on mult. shift
        BSET M40 ;turn on 40 bit math
        BSET SXMD ;turn on sign exten.
        CALL fir ;perform fir
        nop
here:   B here ;stop


Solution: Subroutine copy
copy:   AMOV #table,XAR2 ;load pointers
        AMOV #a0,XAR3
        RPT #7
        MOV dbl(*AR2+),dbl(*AR3+) ;move from table to a
        RET


Solution: Subroutine fir
fir:    MOV #92,BRC0 ;block repeat count
        AMOV #x0,XAR2 ;initialize pointers
        AMOV #x0+1,XAR3 ;for data,
        AMOV #y0,XAR4 ;results
        AMOV #a0,XCDP ;and coefficients
        MOV #a0,BSAC ;buffer start address
        MOV #16,BKC ;buffer size
        MOV #0, CDP ;index
        BSET CDPLC ;turn on circ adr CDP

        RPTBlocal end
        MPYM *AR2+,*CDP ,AC0 ;AC0 1st product
        MPYM *AR3+,*CDP+,AC1 ;AC1 gets 2nd prd
        RPT #14
        MAC *AR2+,*CDP+,AC0 ;form results
        :: MAC *AR3+,*CDP+,AC1
        MOV pair(hi(AC0)),dbl(*AR4+) ;store AC0/AC1
        ASUB #14,AR2 ;wrap data pointers
end     ASUB #14,AR3 ;next calculation
        RET


Implementation of Symmetrical and
Anti-symmetrical FIR filters on ‘C55x
[Figure: coefficient sets b0 … b7 of a symmetrical filter and of an anti-symmetrical filter.]
 These filters may be “folded” and performed with N adds and N/2 MACs
 Filters need to be designed as even length
$$y(n) = \sum_{i=0}^{N/2-1} b(i)\,[x(n-i) + x(n-N+1+i)] \qquad N \text{ even (symmetrical case; the sum becomes a difference in the anti-symmetrical case).}$$


Instructions FIRSADD and FIRSSUB
 FIRSADD Xmem,Ymem, coef,Acx,Acy
 Acy = Acy + (Acx x (*CDP))
 || Acx = Xmem + Ymem
 For symmetrical FIR
 FIRSSUB Xmem,Ymem, coef,Acx,Acy
 Acy = Acy + (Acx x (*CDP))
 || Acx = Xmem - Ymem
 For anti-symmetrical FIR
 If performing a block FIR, dual MAC has
better performance than FIRS.
 A design consideration for migration from
‘C54x.

Comparison of C54x and C55x
 2 MAC in ‘C55x versus 1 for C54x
 Well suited for block filtering and 2 taps
per cycle time instead of 1 (for large N).
 Circular addressing modes:
 3 BK registers in C55X instead of 1 in
‘C54x: allows for several simultaneous
circular buffers with different size.
 In C54x, circular addressing mode is
specified in indirect addressing type % in
the instructions.
 In C55x, the mode is set in status register
ST2_55 for each register (linear or
circular). No memory alignment constraint.

Comparison of C54x and C55x
Symmetrical and Anti-symmetrical
FIR Filters
 In C54x, instruction FIRS:
 Allows 2 taps/cycle for a symmetrical FIR
 In C55x, instructions FIRSADD +
FIRSSUB:
 Allow us to efficiently implement
symmetrical and anti-symmetrical FIRs.
 Despite the 2 MACs, as there is only 1 ALU,
again 2 taps/cycle for symmetrical or anti-
symmetrical FIRs.



Follow On Activities on 5416 DSK
 Laboratory 3 for TMS320C5416 DSK
 To determine by practical experiment the best
FIR window functions for audio.
 Laboratory 4 for TMS320C5416 DSK
 To determine by experiment how many FIR
coefficients are required for acceptable audio
quality.
 Application 4 for TMS320C5416 DSK
 Electronic Crossover for multiple loudspeaker
system. Divides audio signal into treble and bass at
16 different selectable frequencies using FIR
filters.



Follow on activities on 5510 DSK
 Application “delays and echo” for
TMS320C5510 DSK
 Simulates delays in communications
networks and reflection of sound heard in a
canyon. Introduces circular buffers and the
configuration used for a Finite Impulse
Response (FIR) filter.
