Implementation of High Performance FIR Filter Using High Speed & Low Area Multiplier
I. INTRODUCTION
The multiplier [1]-[3], [5] is one of the key hardware blocks in most high-performance systems,
such as digital signal processors and microprocessors [2]. With the fast advances in technology,
many researchers are working on more efficient multipliers [5]. The key requirement is not only
higher speed and lower power consumption but also reduced silicon area, which makes such
multipliers well suited to various complex and portable VLSI circuit [6] implementations. In
reality, however, area and speed are two conflicting performance factors: increased speed
generally comes at the cost of larger area. In this paper, we seek a better trade-off between the
two by realizing a marginally decreased delay, which improves the speed performance [3], through
a small rise in area, that is, an increase in the number of transistors [6]. The new design lowers
the delay of the widely used Wallace tree multiplier [7]. Structural optimization is performed on
the conventional multiplier in such a way that the latency of the total circuit is reduced
significantly. The Wallace tree basically multiplies two unsigned integers [7]. In this project we
compare the working and the characteristics of different multipliers [8] individually and then
choose the most suitable multiplier by implementing each of them separately in an FIR filter.
Parallel multipliers such as the radix-2 and radix-4 Booth multipliers and the Wallace multiplier
[8] perform the computation using fewer adders and thus fewer iterative steps, and therefore
require less area than a serial multiplier. Here we compare the Booth and Wallace multipliers
[13] [15] to find the more efficient one. Area is a very important factor because in the
fabrication of chips [1] and high

Manuscript received .
Bhawana Datwani, M.Tech. Scholar, Department of ECE, Jagannath University, Jaipur, India,
(e-mail: bhawana2190@gamil.com).
Himanshu Joshi, Assistant Professor, Department of ECE, Jagannath University, Jaipur,
(Himanshu.joshi@jagannathuniversity.org)

    y[n] = \sum_{k=0}^{N-1} a[k] \, x[n-k]        (1)

Here, x[n] and y[n] are the filter input and filter output, respectively, a[k] are the filter
coefficients, and N is the number of filter coefficients. The symbol \sum denotes summation from
k = 0 to k = N-1, where N is also the number of feed-forward taps in the FIR filter. The transfer
function of the FIR filter can be represented as [3] [9]:

    H(z) = \sum_{k=0}^{N-1} a[k] \, z^{-k}        (2)

The response realized in the time domain (the impulse response) is of more concern for FIR filter
realization, in both hardware and software. The transfer function is obtained as the z-transform
of the FIR filter's impulse response [9].

III. DIGITAL ADDERS
In digital electronics, an adder is a digital circuit that performs addition of two numbers. As
described in [10], in many computers and other kinds of processors, adders are used not only in
the ALU(s) but also in other parts of the processor, where they calculate addresses, table
indices, and more.
A. Ripple Carry Adder
A ripple carry adder is a digital circuit that produces the arithmetic sum of two binary numbers.
It is constructed by cascading full adders [12], with the carry output of each full adder
connected to the carry input of the next full adder in the chain. Figure 1 shows the
interconnection of four full adder (FA) circuits to provide a 4-bit ripple carry adder [9] [12].
As can be seen from the figure, the inputs enter from the right side because the first cell
traditionally represents the least significant bit (LSB). Bits a0 and b0 in the figure represent
the least significant bits of the numbers to be added, and s0-s3 are the sum outputs.
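The carry chain of the ripple carry adder described above can be illustrated with a short behavioural sketch in Python. This is only a software model under stated assumptions, not the VHDL used in this work: the function names (full_adder, ripple_carry_add, to_bits, from_bits) are hypothetical, and bits are stored LSB first so that index 0 corresponds to a0, b0 and s0.

```python
def full_adder(a, b, cin):
    """One-bit full adder: returns (sum_bit, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def ripple_carry_add(a_bits, b_bits, cin=0):
    """Add two equal-length bit lists given LSB first (a0, a1, ...).

    The carry out of each stage feeds the carry in of the next stage,
    which is what makes the carry "ripple" through the chain.
    """
    if len(a_bits) != len(b_bits):
        raise ValueError("operands must have the same number of bits")
    sum_bits = []
    carry = cin
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        sum_bits.append(s)
    return sum_bits, carry


def to_bits(value, width):
    """Unsigned integer -> list of bits, LSB first (helper for the demo)."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    """List of bits, LSB first -> unsigned integer (helper for the demo)."""
    return sum(bit << i for i, bit in enumerate(bits))


if __name__ == "__main__":
    s, carry_out = ripple_carry_add(to_bits(11, 4), to_bits(6, 4))
    print(from_bits(s), carry_out)
```

Because each stage must wait for the carry of the previous stage, the delay of this structure grows linearly with the number of bits.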
209 www.erpublication.org
International Journal of Engineering and Technical Research (IJETR)
ISSN: 2321-0869 (O) 2454-4698 (P), Volume-5, Issue-2, June 2016
The array multiplier is well recognized because of its regular structure. The multiplication
process is based on repeated addition and shifting. Each partial product is generated by
multiplying the multiplicand by one multiplier digit; the partial products are then shifted
according to their bit positions and added [9] [13]. The summation can be performed with a normal
carry-propagate adder. In total, N-1 adders are required, where N is the number of multiplier
bits. The n-operand array consists of n-2 compressors.

B. Radix-2 Booth Multiplier
This is a technique that allows for smaller, faster multiplication circuits by recoding the
numbers that are multiplied. The number of partial products is reduced by a factor of 2, which
implies that only half of the partial products are needed during computation. Booth's algorithm
multiplies signed binary numbers in 2's complement [9] [13]-[14]. Let R and M be the multiplicand
and multiplier, respectively, and let n and q represent the number of bits in R and M. Take the
2's complement of R, which is denoted -R. For the calculation, make a table of the variables U, V,
X and X-1, respectively.
Step 1:
a) Fill the M value in the table.
b) Fill 0 for the M-1 value; it should be the previous first least significant bit of M.
c) Fill 0 in the U and V rows, which will show the product of M and X at the end of the
multiplication operation.
d) Take n rows for every cycle, because we are multiplying n-bit numbers.

Table 1 Making of Booth table

Step 2: Booth's algorithm requires evaluation of the multiplier bits and shifting of the partial
product. Use the least significant bit of the multiplier M and the previous least significant bit
of the multiplier, M-1, to determine the arithmetic action.

Table 2 Shift in Booth table

Table 3 Partial Products generation

Table 4 Final Shift

Arithmetically shift the value calculated in steps 1-4 by a single place to the right.
b) Take U and V together and perform an arithmetic right shift, which preserves the sign bit of
the 2's complement number. Hence a positive number remains positive and a negative number remains
negative.
c) Circularly right shift M, so that two copies of the M value are not needed.
d) Repeat the same steps until the n cycles are completed. The answer is then shown in the last
rows of U and V.

C. Radix-4 Booth Multiplier
The shortcomings of Radix-2 can be overcome by Radix-4 [13] [14] [15], which handles more than one
bit of the multiplier in each cycle. The modified Booth's algorithm starts by appending a zero to
the right of the LSB of the multiplier. Applied to a parallel multiplier, this recoding scheme
halves the number of partial products, so the multiplication time and hardware requirement can be
reduced [8] [14].
The Radix-4 Booth algorithm examines strings of three bits according to the following algorithm
[14]:
a) Extend the sign bit by 1 position if required to ensure that n is even.
b) Append a 0 to the right of the LSB of the multiplier.
c) According to the value of each vector, each partial product will be 0, +y, -y, +2y or -2y.
The negative values of y are obtained by taking the 2's complement. The multiplication of y by 2
is performed by left shifting y by one bit. As a result, in implementing n-bit parallel
multipliers, only n/2 partial products are created [9] [14].
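The two Booth procedures described above can be sketched as a behavioural model in Python. This is an illustration under stated assumptions rather than the VHDL design evaluated in this paper: the function names are hypothetical, y denotes the multiplicand and x the multiplier, the variables u, v and prev play the roles of the U and V rows and the extra bit to the right of the LSB, and the one-bit guard extension of u is an added assumption so that subtracting the most negative multiplicand does not overflow.

```python
def _check_range(value, n):
    """Raise if value does not fit in an n-bit 2's complement word."""
    if not -(1 << (n - 1)) <= value <= (1 << (n - 1)) - 1:
        raise ValueError(f"{value} does not fit in {n} bits")


def booth_radix2(y, x, n):
    """Radix-2 Booth: n cycles of add/subtract followed by an arithmetic shift."""
    _check_range(y, n)
    _check_range(x, n)
    u_bits = n + 1                      # assumption: one guard bit in U
    u_mask = (1 << u_bits) - 1
    u, v, prev = 0, x & ((1 << n) - 1), 0
    for _ in range(n):
        lsb = v & 1
        if lsb == 1 and prev == 0:      # pattern 10: subtract the multiplicand
            u = (u - y) & u_mask
        elif lsb == 0 and prev == 1:    # pattern 01: add the multiplicand
            u = (u + y) & u_mask
        # Arithmetic right shift of (U, V, prev) taken together.
        prev = lsb
        v = (v >> 1) | ((u & 1) << (n - 1))
        sign = u >> (u_bits - 1)
        u = (u >> 1) | (sign << (u_bits - 1))
    product = (u << n) | v
    if u >> (u_bits - 1):               # negative result: undo the wrap-around
        product -= 1 << (u_bits + n)
    return product


def booth_radix4_digits(x, n):
    """Recode an n-bit 2's complement multiplier into radix-4 Booth digits."""
    _check_range(x, n)
    if n % 2:
        n += 1                          # step a): extend the sign so n is even
    # Python's >> on negative ints is arithmetic, so sign extension is implicit.
    bits = [0] + [(x >> i) & 1 for i in range(n)]   # step b): append 0 right of LSB
    # Step c): each overlapping group of three bits selects 0, +-1 or +-2.
    return [bits[i] + bits[i + 1] - 2 * bits[i + 2] for i in range(0, n, 2)]


def booth_radix4(y, x, n):
    """Radix-4 Booth: one partial product (0, +-y, +-2y) per recoded digit."""
    _check_range(y, n)
    partial = {0: 0, 1: y, -1: -y, 2: y << 1, -2: -(y << 1)}
    product = 0
    for i, digit in enumerate(booth_radix4_digits(x, n)):
        product += partial[digit] << (2 * i)
    return product


if __name__ == "__main__":
    print(booth_radix4_digits(7, 4))    # [-1, 2], i.e. 7 = -1 + 2*4
    print(booth_radix2(3, -3, 4), booth_radix4(3, -3, 4))
```

In this sketch the radix-2 loop runs n times, whereas the radix-4 recoding returns only n/2 digits for even n, which mirrors the halving of the partial products noted above.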
Figure 5 Structure of Wallace Multiplier

V. RESULTS
After analyzing the multipliers and comparing their characteristics in terms of multiplication
speed, number of computations required and amount of hardware, we find that the Wallace multiplier
is much better than the Booth multipliers. By implementing the Radix-2 and Radix-4 Booth
multipliers and the Wallace multiplier, we observe that the computation speed of the Wallace
multiplier is faster.
In this project these multipliers are implemented with FIR filters to compare characteristics such
as the speed, power consumption, computations and hardware requirement of the system. All the
multipliers were coded separately in VHDL and simulated to obtain accurate output waveforms. We
then implemented these multipliers separately with FIR filters using computation techniques such
as FFT and DFT. This code was also written in VHDL and simulated to obtain the RTL circuit of each
system. We also obtained the lookup table, which gives the exact number of inputs, outputs, slices
required, etc. for the system. The Xilinx Estimator analyzes these simulated results and
determines the power consumption of each system. These results are given below.

Table 5 Area & Delay of Radix-4 Booth multiplier
No. of slices          204
No. of 4-input LUTs    391
No. of bonded I/Os     65
Delay (ns)             6.177

Table 6 Area & Delay of Wallace Multiplier
No. of slices          9
No. of 4-input LUTs    16
No. of bonded I/Os     16
Delay (ns)             5.895

Table 7 FIR using Different multipliers
Type of multiplier         No. of slices   No. of 4-input LUTs   No. of bonded I/Os   No. of slice FFs   Delay
Radix-4 Booth Multiplier   157             297                   26                   45                 15.44 ns
Wallace Multiplier         27              47                    21                   23                 8.511 ns

VI. CONCLUSION
This paper presents a clear model of different multipliers and their implementation in a
tapped-delay FIR filter. We found that Wallace multipliers are a much better option than the
serial multiplier. We concluded this from the results for delay and total area. In the case of
Wallace multipliers, the total area is much less than that of Booth multipliers; hence the power
consumption is also less. This is clearly depicted in our results. This speeds up the calculation
and makes the system faster. While comparing the radix-2 and radix-4 Booth multipliers, we found
that radix-4 consumes less power than radix-2. We found that the Wallace multiplication method is
better than the other multipliers in terms of speed, area and power. So by using the Wallace
multiplier we can achieve fast and efficient multiplication.

VII. FUTURE WORK
One possible direction is to increase the number of bits of the multiplier. We have only
considered 8 bits for encoding, as it is a simple and popular choice. Recoding a higher number of
bits further reduces the number of LUTs and thus has the potential to reduce the area.

REFERENCES
[1] C. H. Chang, J. Gu, M. Zhang, "Ultra low-voltage low-power CMOS 4-2 and 5-2 compressors for
fast arithmetic circuits," IEEE Transactions on Circuits and Systems.
[2] Tan, Li, Digital Signal Processing: Fundamentals and Applications, Edition 2007.
[3] M. Gnanasekaran, M. Manikandan, "Low delay-High compact FIR filter using Reduced Wallace
Multiplier," 2nd International Conference on Current Trends in Engineering and Technology,
ICCTET14, IEEE, 2014.
[4] Proakis, J.G., Manolakis, D.G., Digital Signal Processing, 3rd Edition, PHI publication, 2004.
[5] Aparna, P.R., Thomas, N., "Design and implementation of a high performance multiplier using
HDL," International Conference on Computing, Communication and Applications (ICCCA), 2012
(978-1-4673-0270-8).
[6] H. B. Bakoglu and J. D. Meindl, "Optimal Interconnection Circuits for VLSI," IEEE Transactions
on Electron Devices, Vol. ED-32, No. 5, pp. 903-909, May 1985.
[7] Vinoth, C.; Bhaaskaran, V.S.K.; Brindha, B.; Sakthikumaran, S.; Kavinilavu, V.; Bhaskar, B.;
Kanagasabapathy, M.; and Sharath, B.; "A novel low power and high speed Wallace tree multiplier
for RISC processor," 3rd International Conference on Electronics Computer
[8] Chen Ping-hua and Zhao Juan, "High-speed Parallel 32×32-b Multiplier Using a Radix-16 Booth
Encoder," Third International Symposium on Intelligent Information Technology Application
[9] D. Jaya Kumar, Dr. E. Logashanmuga, "Performance Analysis of FIR filter using Booth
Multiplier," 2nd International Conference on Current Trends in Engineering and Technology,
ICCTET14, IEEE, 2014.
[10] H. Chang, J. Gu, M. Zhang, "Ultra low-voltage low-power CMOS 4-2 and 5-2 compressors for fast
arithmetic circuits," IEEE Transactions on Circuits and Systems I: Regular Papers, Volume 51,
Issue 10, pp. 1985-1997, Oct. 2004. Rashmi Ranjan et al., International Journal of Computer
Science & Engineering Technology (IJCSET)
[11] Aparna, P.R., Thomas, N., "Design and implementation of a high performance multiplier using
HDL," International Conference on Computing, Communication and Applications (ICCCA), 2012
(978-1-4673-0270-8).
[12] Rashmi Solanki, Prashant Gurjar, Pooja Kansliwal, Mahendra Vucha, "VLSI Implementation of
Adders for High Speed ALU," International Journal of Computer Application, September 2011.