
INTERNATIONAL JOURNAL OF SCIENTIFIC PROGRESS AND RESEARCH (IJSPR) ISSN: 2349-4689

Issue 159, Volume 59, Number 01, May 2019

Review Paper on Matrix Multiplication using IEEE 754 Floating Point and Different Types of Adders
Manjusha Kumari1, Vijay Yadav2
1M. Tech. Scholar, 2Assistant Professor
Department of Electronics and Communication, LNCT, Bhopal

Abstract - Due to the advancement of new technology in the fields of VLSI and embedded systems, there is an increasing demand for high-speed, low-power processors. The speed of a processor depends greatly on the performance of its multiplier as well as its adder. In spite of the complexity involved in floating point arithmetic, its implementation is increasing day by day, which makes high-speed adder architectures important. Several adder architectures have been developed to increase the efficiency of the adder. In this paper, we present a study of matrix multiplication using an IEEE 754 floating point multiplier and different types of adders.

Keywords - IEEE 754, Single Precision Floating Point (SPFP), Double Precision Floating Point (DPFP)

I. INTRODUCTION

Matrix multiplication is commonly used in most signal processing algorithms. It is also a frequently used kernel operation in a wide variety of graphics, image processing and robotic applications. The matrix multiplication operation involves a large number of multiplications as well as accumulations. Multipliers have large area, longer latency and considerable power consumption compared to adders. The registers required to store the intermediate product values are also a major power-intensive component [1]. These components pose a major challenge for designing VLSI structures for large-order matrix multipliers with optimized speed and chip area. However, area, speed and power are usually conflicting hardware constraints, such that improving one factor degrades the other two [2].

Real numbers represented in binary format are known as floating point numbers. The IEEE-754 standard classifies floating point formats into binary and decimal interchange formats. Floating point multipliers are very important in DSP applications. This paper focuses on the double precision normalized binary interchange format. Figure 1 shows the IEEE-754 double precision binary format representation. The sign (s) is represented with one bit, while the exponent (e) and the fraction (m, or mantissa) are represented with eleven and fifty-two bits respectively. For a number to be a normalized number, it must have a 'one' in the MSB of the significand, and its biased exponent must be greater than zero and smaller than 2047. The real number is represented by equations (1) and (2):

Z = (-1)^S × 2^(E - Bias) × (1.M)                                  ...(1)

Value = (-1)^SignBit × 2^(Exponent - 1023) × (1.Mantissa)          ...(2)

Biasing keeps the exponent values within an unsigned range suitable for high-speed comparison.

Figure 1: IEEE 754 single precision and double precision floating point format.

• IEEE 754 standard floating point multiplication algorithm

A brief overview of floating point multiplication is given below [3-4]:

• The sign bits S1 and S2 are XORed together; the result is the sign bit of the final product.
• The exponent bits E1 and E2 are added together, and the bias value is subtracted from the sum; this gives the exponent field of the final product.
• The significand bits Sig1 and Sig2 of both operands are multiplied, including their hidden bits.
• The product found in step 3 is normalized and the exponent is adjusted accordingly. After normalization, the leading '1' becomes the hidden bit.
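The four steps above can be sketched in software. The following Python sketch is an illustration of the algorithm only, not the hardware design discussed later: it assumes normalized inputs and truncates instead of rounding, ignoring subnormals, overflow and the other special cases a full IEEE 754 multiplier must handle.

```python
import struct

BIAS = 127          # single-precision exponent bias
MANT_BITS = 23      # explicit mantissa bits (hidden bit makes 24)

def fields(x: float):
    """Unpack a float into IEEE 754 single-precision fields."""
    bits = struct.unpack('>I', struct.pack('>f', x))[0]
    sign = bits >> 31
    exp = (bits >> 23) & 0xFF
    mant = bits & 0x7FFFFF
    return sign, exp, mant

def fp32_multiply(a: float, b: float) -> float:
    """Multiply two normalized floats following the four steps above."""
    s1, e1, m1 = fields(a)
    s2, e2, m2 = fields(b)

    # Step 1: XOR the sign bits.
    sign = s1 ^ s2

    # Step 2: add the exponents, subtract the bias once.
    exp = e1 + e2 - BIAS

    # Step 3: multiply the 24-bit significands including the hidden '1'.
    sig1 = (1 << MANT_BITS) | m1
    sig2 = (1 << MANT_BITS) | m2
    product = sig1 * sig2          # up to 48 bits wide

    # Step 4: normalize -- the product of two values in [1, 2) lies in
    # [1, 4), so at most one right shift is needed; bump the exponent.
    if product >> (2 * MANT_BITS + 1):
        product >>= 1
        exp += 1
    mant = (product >> MANT_BITS) & 0x7FFFFF   # drop hidden bit, truncate

    bits = (sign << 31) | (exp << 23) | mant
    return struct.unpack('>f', struct.pack('>I', bits))[0]
```

For example, `fp32_multiply(1.5, 2.5)` returns 3.75, matching the hardware steps field by field.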

www.ijspr.com IJSPR | 86

The above multiplication algorithm is shown in Figure 2.

II. LITERATURE REVIEW

Lakshmi kiran Mukkara et al. [1]: for the implementation of low power VLSI architectures in digital image processing applications, matrix multiplication is a key arithmetic operation. Constructing matrix multiplication designs with low power, high speed and low area is complex. In this paper, a simple novel VLSI architecture for floating point matrix multiplication (FPMM) is presented. It is designed by combining pseudocode for matrix multiplication, a CSD multiplication algorithm for power reduction, a conventional floating point number format, and pipelining for improved speed. The FPMM design is applicable to matrices of any arbitrary size that follow the matrix multiplication rules.

Soumya Havaldar et al. [2] give an FPGA-based high-speed IEEE-754 double precision floating point multiplier in Verilog. The paper implements a DPFP multiplier using a parallel adder; the high-speed double precision floating point multiplier is implemented on a Virtex-6 FPGA. In addition, the proposed design is compliant with the IEEE-754 format and handles overflow, underflow, rounding and various exception conditions. The design achieved an operating frequency of 414.714 MHz with an area of 648 slices.

Ragini Parte et al. [3]: IEEE floating point arithmetic has immense application in DSP, digital computers and robots because of its ability to represent small numbers and large numbers, as well as signed and unsigned numbers. In spite of the complexity involved in floating point arithmetic, its use is increasing day by day. The authors analyze the effects of using three different types of adders while computing single precision and double precision floating point multiplication. They also present multiplication of the significand bits by a decomposition-of-operands strategy for the IEEE 754 standard.

Ross Thompson et al. [4]: IEEE-754 specifies interchange and arithmetic formats, as well as methods, for binary and decimal floating point arithmetic in computing. An implementation of a floating point system using this standard can be done entirely in software, entirely in hardware, or in any combination of the two. This work proposes a VHDL implementation of an IEEE-754 floating point multiplier unit. In the proposed work the pack, unpack and rounding modes were implemented in the VHDL language and verified by simulation.

In this thesis work, a DPFP multiplier along with an SPFP multiplier has been implemented with four ordinary adders (PA, CSKA, CSA and CSABEC) and their results compared. CSA is known to be the fastest among all conventional adders. However, CSA occupies more area, as it has two parallel circuits to add the same bits but with different carry inputs; CSA computes the sum without waiting for the intermediate carries to propagate stage by stage, and finally a multiplexer selects and delivers the correct output. CSABEC is a modified version of CSA in which one of the parallel circuits is replaced by an arrangement of Binary to Excess-1 Converter (BEC) circuits. It has turned out to be a good approach to reducing the area.

M. K. Jaiswal et al. [5] present a nonintrusive concurrent error detection (CED) technique for protecting the control logic of a contemporary floating point unit (FPU). The proposed method relies on the observation that control logic errors lead to extensive datapath corruption and affect, with high probability, the exponent portion of the IEEE-754 floating point representation. Consequently, exponent monitoring can be used to detect errors in the control logic of the FPU. Predicting the exponent involves relatively simple operations; therefore, the technique incurs significantly lower overhead than the traditional approach of duplicating the FPU control logic. Indeed, experimental results on the OpenSPARC T1 processor using SPEC2006FP benchmarks show that, compared to control logic duplication, which incurs an area overhead of 17.9 percent of the FPU size, the technique incurs an area overhead of only 5.8 percent yet still achieves detection of more than 93 percent of transient errors in the FPU control logic. Moreover, the proposed method offers the additional benefit of also detecting 98.1 percent of the datapath errors that affect the exponent, which cannot be detected by duplication of the control logic. Finally, when combined with a conventional residue-code-based method for the significand, the technique leads to a complete CED solution for the entire FPU, providing coverage of 94.1 percent of all errors at an area cost of 16.32 percent of the FPU size.

Table 1: Summary of Literature Review

- "Design of Vedic IEEE 754 Floating Point Multiplier" - Approach: IEEE 754 floating point multiplier using a Vedic multiplier; Software: Xilinx 12.1i; Parameters: Slice = 305, LUT = 598; Published: IEEE, 2016.
- "Analysis of Effects of using Exponent Adders in IEEE-754 Multiplier by VHDL" - Approach: IEEE 754 floating point multiplier using a carry select adder; Software: Verilog using ModelSim SE 6.3; Parameters: Slice = 353, LUT = 610; Published: IEEE, 2015.
- "An IEEE 754 Double-Precision Floating-Point Multiplier for Denormalized and Normalized Floating-Point Numbers" - Approach: IEEE 754 floating point multiplier for denormalized and normalized floating point numbers; Software: Xilinx Spartan 3E xa3s250e-4vqg100; Parameters: Slice = 453, LUT = 678; Published: IEEE, 2015.
- "High Performance FPGA Implementation of Double Precision Floating Point Adder/Subtractor" - Approach: IEEE 754 floating point multiplier using adder and subtractor; Software: Xilinx ISE 12.2; Parameters: Slice = 543, LUT = 792; Published: IEEE, 2011.

III. DIFFERENT TYPES OF ADDER

Parallel Adder:-

A parallel adder can add all bits in parallel, i.e. simultaneously, hence increasing the addition speed. In this adder, multiple full adders are used to add the corresponding bits of two binary numbers together with the carry bit of the previous adder. Each stage produces sum bits and a carry bit for the next-stage adder. The carries produced by the multiple adders ripple, i.e. the carry bit produced by an adder works as one of the inputs of the adder in its succeeding stage. Hence it is also known as a Ripple Carry Adder (RCA). A generalized diagram of the parallel adder is shown in figure 3.

Figure 3: Parallel Adder (n=7 for SPFP and n=10 for DPFP)

An n-bit parallel adder has one half adder and n-1 full adders if the last carry bit is required. But in the 754 multiplier's exponent adder the last carry out is not required, so we can use an XOR gate instead of the last full adder. This not only reduces the area occupied by the circuit but also reduces the delay involved in the calculation. For the SPFP and DPFP multipliers' exponent adders, we simulate 8-bit and 11-bit parallel adders respectively, as shown in figure 4.

Figure 4: Modified Parallel Adder (n=7 for SPFP and n=10 for DPFP)

Carry Skip Adder:-

This adder gives the advantage of less delay over the ripple carry adder. It uses carry-skip logic, i.e. any desired carry can skip any number of adder stages. The carry skip logic circuitry uses two gates, an AND gate and an OR gate. Because the carry need not ripple through each stage, the delay parameter is improved. It is also known as a carry bypass adder. A generalized figure of the Carry Skip Adder is shown in figure 5.

Figure 5: Carry Skip Adder

Carry Select Adder:-

The carry select adder uses a multiplexer along with RCAs, in which the carry is used as a select input to choose the correct output sum bits as well as the carry bit; hence the name carry select adder. In this adder, two RCAs are used to calculate the sum bits simultaneously for the same bits, assuming two different carry inputs, '1' and '0'. It is the responsibility of the multiplexer to choose the correct output bits out of the two once the correct carry input is known to it. The multiplexer delay is included in this adder. A generalized figure of the carry select adder is shown in figure 6.

Adders are the basic building blocks of most ALUs (arithmetic logic units) used in digital signal processing and various other applications. Many types of adders are available in today's scenario and many more are being developed day by day. The half adder and the full adder are the two basic types of adders; almost all other adders are made with different arrangements of these two basic adders.

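The ripple-carry and carry-select behaviour described in this section can be sketched in software. The following Python model is an illustrative bit-level sketch with names of our own choosing, not the synthesized hardware design: it builds an n-bit ripple-carry adder from full adders and a carry-select stage that precomputes both possible sums and uses the incoming carry as the multiplexer select.

```python
def full_adder(a: int, b: int, cin: int):
    """One full adder: returns (sum_bit, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout

def ripple_carry_add(a_bits, b_bits, cin=0):
    """n-bit ripple-carry adder; bit 0 is the LSB. The carry produced
    by each stage feeds the next stage, as described above."""
    out, carry = [], cin
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        out.append(s)
    return out, carry

def carry_select_add(a_bits, b_bits, cin=0):
    """Carry-select block: compute the sum twice, once assuming
    carry-in 0 and once assuming carry-in 1, then let the real
    carry-in act as the multiplexer select line."""
    sum0, c0 = ripple_carry_add(a_bits, b_bits, 0)
    sum1, c1 = ripple_carry_add(a_bits, b_bits, 1)
    return (sum1, c1) if cin else (sum0, c0)

def to_bits(x, n):
    """Integer to n-bit little-endian bit list."""
    return [(x >> i) & 1 for i in range(n)]

def from_bits(bits):
    """Little-endian bit list back to an integer."""
    return sum(b << i for i, b in enumerate(bits))
```

For example, adding 11 and 6 in 4 bits with `ripple_carry_add(to_bits(11, 4), to_bits(6, 4))` gives sum bits for 1 with a carry-out of 1, i.e. 17 in 5 bits.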

The half adder is used to add two bits and produces sum and carry bits, whereas the full adder adds three bits simultaneously and produces sum and carry bits.

Figure 6: Carry Select Adder

IV. PROPOSED DESIGN

In the IEEE 754 standard floating point representation, the 8-bit exponent field in single precision floating point (SP FP) representation and the 11-bit field in double precision floating point (DP FP) representation need to be added to another 8-bit and 11-bit exponent respectively in order to multiply floating point numbers represented in the IEEE 754 standard, as explained earlier. Ragini et al. [3] used a parallel adder for adding the exponent bits in the floating point multiplication algorithm. We propose the use of an 8-bit modified CSA with dual RCA and an 8-bit modified CSA with RCA and BEC for adding the exponent bits. We have found an area improvement of the 8-bit modified carry select adder with RCA and BEC over the 8-bit modified CSA with dual RCA.

• Sign bit calculation

To calculate the sign bit of the resultant product, the same strategy works for both the SP FP and DP FP multipliers: we simply XOR together the sign bits of the two operands. If the resultant bit is '1', the final product is a negative number; if it is '0', the final product is a positive number.

• Exponent bit calculation

The exponent bits of both operands are added together, and then the bias value (127 for SPFP and 1023 for DPFP) is subtracted from the result of the addition. This result may not yet be the exponent of the final product: after the significand multiplication, normalization has to be performed, and the exponent is adjusted according to the normalized value. The adjusted exponent forms the exponent bits of the final product.

• Significand bit calculation

The significand bits, including the one hidden bit, need to be multiplied, but the problem is the length of the operands. The operand width becomes 24 bits in the case of the SP FP representation and 53 bits in the case of the DP FP representation, which results in 48-bit and 106-bit product values respectively. In this paper we use the technique of breaking the operands into different groups and then multiplying them. We get many product terms, which are added together carefully, shifting each according to which part of one operand is multiplied by which part of the other operand. We have decomposed the significand bits of both operands into four groups, and each group of one operand is multiplied by each group of the second operand.

Matrix Multiplication

In this design we have reduced the resource utilization, in terms of the number of multipliers and registers, in exchange for increased completion time. This design is particularly useful where resources are limited and the design can be traded off against increased completion time. The basic working model for a 3 × 3 matrix-matrix multiplication is shown in figure 7.

Figure 7: Proposed PPI-So design for n=3

Considering the matrix-matrix multiplication of two n×n matrices, the calculation is performed using n multipliers, n registers and n-1 adders, and n² cycles are required to perform the matrix multiplication operation. Each multiplier has two input ports, one each from matrices A and B. In each cycle, n multiplications are performed and the products are fed to the adder block to give a single element of the output matrix C. The data flow to the multipliers is such that multiplier k is fed from column k of matrix A and row k of matrix B, where 1 ≤ k ≤ n. At each multiplier, an element from matrix A is repeated for n consecutive cycles, whereas the elements from matrix B are cycled back after n cycles. The partial products are then fed to the adder, which computes the final result.

For a better understanding of the process, let us consider the matrix multiplication for n = 3 (as shown in figure 7). In this case, 3 multipliers and 3 registers are used to calculate and store the partial products respectively. These partial products are then fed to the adder block to compute the final result.
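The cycle behaviour just described can be modelled in software. The Python sketch below is our own behavioural model of the dataflow as we read it, with hypothetical names, not the RTL of the proposed design: each element of A is held for n consecutive cycles while the elements of the corresponding row of B cycle past it, the n products are summed into one element of C per cycle, and the full product therefore takes n² cycles.

```python
def matmul_cycles(A, B):
    """Behavioural model of the n-multiplier design described above.
    Multiplier k is fed from column k of A and row k of B; each cycle
    the n partial products are summed into one element of C in
    row-major order, so the full product takes n*n cycles."""
    n = len(A)
    C = [[0] * n for _ in range(n)]
    cycles = 0
    for i in range(n):            # A[i][k] is held for n cycles ...
        for j in range(n):        # ... while B[k][j] cycles past it
            partial = [A[i][k] * B[k][j] for k in range(n)]  # n multipliers
            C[i][j] = sum(partial)                           # adder block
            cycles += 1
    return C, cycles
```

For n = 3 this model reproduces the 9-cycle schedule discussed in the text.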


The first multiplier receives input from the first column of matrix A (ak1) and the first row of matrix B (b1k), where 1 ≤ k ≤ 3. Each element of matrix A at the first multiplier is repeated for 3 cycles, while the elements of B are cycled back after 3 cycles. The other two multipliers receive the elements of A and B in a similar order to the first multiplier. After the multiplication, the partial products are fed to the adder, which computes the elements of the output matrix C in row-major order. Thus the entire matrix multiplication operation is performed in n² = 9 cycles.

V. CONCLUSION

IEEE 754 standardizes two basic formats for representing floating point numbers, namely single precision floating point and double precision floating point. Floating point arithmetic has vast applications in many areas like robotics and DSP. The delay and the area required by the hardware are the two key factors that need to be considered. Here we present a single precision floating point multiplier using two different adders, namely a modified CSA with dual RCA and a modified CSA with RCA and BEC. Of the two adders, the modified CSA with RCA and BEC has the least maximum combinational path delay (MCPD); it also takes the least number of slices, i.e. occupies the least area.

REFERENCES

[1] Lakshmi kiran Mukkara and K. Venkata Ramanaiah, "A Simple Novel Floating Point Matrix Multiplier VLSI Architecture for Digital Image Compression Applications", 2nd International Conference on Inventive Communication and Computational Technologies (ICICCT 2018), IEEE.
[2] Soumya Havaldar and K. S. Gurumurthy, "Design of Vedic IEEE 754 Floating Point Multiplier", IEEE International Conference on Recent Trends in Electronics, Information and Communication Technology, May 20-21, 2016, India.
[3] Ragini Parte and Jitendra Jain, "Analysis of Effects of using Exponent Adders in IEEE-754 Multiplier by VHDL", 2015 International Conference on Circuit, Power and Computing Technologies (ICCPCT), IEEE, 2015.
[4] Ross Thompson and James E. Stine, "An IEEE 754 Double-Precision Floating-Point Multiplier for Denormalized and Normalized Floating-Point Numbers", International Conference on IEEE, 2015.
[5] M. K. Jaiswal and R. C. C. Cheung, "High Performance FPGA Implementation of Double Precision Floating Point Adder/Subtractor", International Journal of Hybrid Information Technology, vol. 4, no. 4, October 2011.
[6] B. Fagin and C. Renard, "Field Programmable Gate Arrays and Floating Point Arithmetic", IEEE Transactions on VLSI, vol. 2, no. 3, pp. 365-367, 1994.
[7] N. Shirazi, A. Walters and P. Athanas, "Quantitative Analysis of Floating Point Arithmetic on FPGA Based Custom Computing Machines", Proceedings of the IEEE Symposium on FPGAs for Custom Computing Machines (FCCM'95), pp. 155-162, 1995.
[8] Malik and S.-B. Ko, "A Study on the Floating-Point Adder in FPGAs", Canadian Conference on Electrical and Computer Engineering (CCECE-06), May 2006, pp. 86-89.
[9] D. Sangwan and M. K. Yadav, "Design and Implementation of Adder/Subtractor and Multiplication Units for Floating-Point Arithmetic", International Journal of Electronics Engineering, 2010, pp. 197-203.
[10] L. Louca, T. A. Cook and W. H. Johnson, "Implementation of IEEE Single Precision Floating Point Addition and Multiplication on FPGAs", Proceedings of the IEEE Symposium on FPGAs for Custom Computing Machines (FCCM'96), 1996, pp. 107-116.
[11] Jaenicke and W. Luk, "Parameterized Floating-Point Arithmetic on FPGAs", Proc. of IEEE ICASSP, vol. 2, 2001, pp. 897-900.
[12] Lee and N. Burgess, "Parameterisable Floating-point Operations on FPGA", Conference Record of the Thirty-Sixth Asilomar Conference on Signals, Systems and Computers, 2002.
[13] M. Al-Ashrafy, A. Salem and W. Anis, "An Efficient Implementation of Floating Point Multiplier", Saudi International Electronics, Communications and Photonics Conference (SIECPC), April 24-26, 2011, pp. 1-5.
