VLSI Design and Implementation of Low Power MAC Unit With Block Enabling Technique

In the majority of digital signal processing (DSP) applications the critical operations are the multiplication and accumulation. Real-time signal processing requires high speed and high throughput Multiplier-Accumulator (MAC) unit that consumes low power. Power is reduced by 27% using the block enabling technique compared to the normal design.

European Journal of Scientific Research ISSN 1450-216X Vol.30 No.4 (2009), pp.620-630 EuroJournals Publishing, Inc. 2009 http://www.eurojournals.com/ejsr.htm

VLSI Design and Implementation of Low Power MAC Unit with Block Enabling Technique
Shanthala S
Asst. Professor, Bangalore Institute of Technology, Bangalore
Research Scholar, EC Research Centre, NMAM Institute of Technology, Nitte-574110, India
E-mail: shanthala_wg@yahoo.com

S. Y. Kulkarni
Principal, NMAM Institute of Technology, Nitte-574110, Karnataka, India
E-mail: sy_kul@yahoo.com

Abstract
In the majority of digital signal processing (DSP) applications, the critical operations are multiplication and accumulation. Real-time signal processing requires a high-speed, high-throughput Multiplier-Accumulator (MAC) unit that consumes low power, which is always key to achieving a high-performance digital signal processing system. The purpose of this work is the design and implementation of a low power MAC unit that uses a block enabling technique to save power. First, a 1-bit MAC unit is designed with appropriate geometries that give optimized power, area and delay. The delay of the pipeline stages in the MAC unit is estimated, and based on it a control unit is designed to control the data flow between the MAC blocks for low power. Similarly, the N-bit MAC unit is designed and controlled for low power using control logic that enables the pipelined stages at the appropriate time. The adder cell designed has the advantages of high operational speed, small transistor count and low power. The MAC is implemented in a 0.18 um CMOS technology using the CADENCE VIRTUOSO tool. This paper also investigates various multiplier and adder architectures that are suitable for implementing high-throughput signal processing while achieving low power consumption. The whole MAC chip is operated at 125 MHz using a 1.8 V power supply. The power is reduced by 27% using the block enabling technique compared to the normal design.

Keywords: Low Power, MAC, clock gating, block enable, multiplier.

1. Introduction
In the majority of digital signal processing (DSP) applications, the critical operations usually involve many multiplications and/or accumulations. For real-time signal processing, a high-speed, high-throughput Multiplier-Accumulator (MAC) is always key to achieving a high-performance digital signal processing system. In the last few years, the main consideration in MAC design has been to enhance its speed, because speed and throughput rate are always a concern in a digital signal processing system. In the era of personal communication, however, low power design has become another main design consideration, because the battery energy available to portable products limits the power consumption of the system. Therefore, the main motivation of this work is to investigate various pipelined multiplier/accumulator architectures and circuit design techniques that are suitable for implementing high-throughput signal processing algorithms while achieving low power consumption. A conventional MAC unit consists of a (fast) multiplier and an accumulator that contains the sum of the previous consecutive products. The function of the MAC unit is given by the following equation:

$$F = \sum_{i} A_i B_i \tag{1.1}$$
Figure 1: Basic structure of MAC

Figure 2: MAC architecture (blocks labelled in the figure: 8-bit inputs, 8-bit Wallace tree multiplier, 17-bit register, 17-bit accumulator, 18-bit accumulator registers, output)

The main goal of DSP processor design is to enhance the speed of the MAC unit while limiting its power consumption. In a pipelined MAC circuit, the delay of a pipeline stage is the delay of a 1-bit full adder (Jou, Chen, Yang and Su, 1995). Estimating this delay helps identify the overall delay of the pipelined MAC. In this work, a 1-bit full adder is designed and its area, power and delay are calculated; based on these, the pipelined MAC unit is designed for low power.
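As a rough behavioural illustration of Equation 1.1, the following Python sketch accumulates products of 8-bit operands into an 18-bit accumulator, matching the widths labelled in Figure 2. The function name `mac`, the wrap-around on overflow and the overflow flag are assumptions made for illustration; the paper does not state how overflow is handled.

```python
def mac(pairs, operand_bits=8, acc_bits=18):
    """Accumulate the products A_i * B_i in a fixed-width accumulator.

    Returns (accumulator value, overflow flag).
    """
    operand_max = (1 << operand_bits) - 1
    acc_mask = (1 << acc_bits) - 1
    acc = 0
    overflowed = False
    for a, b in pairs:
        if not (0 <= a <= operand_max and 0 <= b <= operand_max):
            raise ValueError("operand does not fit in the operand width")
        total = acc + a * b
        if total > acc_mask:
            overflowed = True
        # Assumption: the accumulator wraps around when it overflows
        acc = total & acc_mask
    return acc, overflowed


print(mac([(3, 5), (10, 20), (255, 255)]))  # (65240, False)
```

Under this model, four worst-case products (255 x 255 = 65025 each) still fit in 18 bits, while a fifth sets the overflow flag, so longer accumulations stay in range only for smaller operand values.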

2. Multiplier and Accumulator Unit

A MAC is composed of an adder, a multiplier and an accumulator. The adders implemented are usually Carry-Select or Carry-Save adders, as speed is of utmost importance in DSP (Chandrakasan, Sheng, & Brodersen, 1992 and Weste & Harris, 3rd Ed). One implementation of the multiplier could be as a parallel array multiplier. The inputs for the MAC are fetched from memory and fed to the multiplier block, which performs the multiplication and passes the result to the adder; the adder accumulates the result and then stores it in a memory location. This entire process is to be achieved in a single clock cycle (Weste & Harris, 3rd Ed). Figure 2 shows the architecture of the MAC unit designed in this work. The design consists of one 17-bit register, one 8-bit Wallace tree multiplier, a 17-bit accumulator using ripple carry and two 18-bit accumulator registers.

To multiply the values of A and B, a Wallace tree multiplier is used instead of a conventional multiplier because it can increase the speed of the MAC unit. A Ripple Carry Adder (RCA) is used as the accumulator in this design. With the Wallace tree multiplier approach, a carry save adder in the final stage of the Wallace tree multiplier and a ripple carry adder as the accumulator, this MAC unit design not only reduces the standby power consumption but also enhances the speed of the MAC unit, giving better system performance. The operation of the designed MAC unit is given in Equation 2.1. The product Ai x Bi is always fed back into the 17-bit ripple carry accumulator and then added to the next product Ai x Bi. This MAC unit is capable of multiplying and adding to the previous product consecutively up to eight times.

$$\text{Output} = \sum_{i} A_i B_i \tag{2.1}$$

In this paper, an 8x8 multiplier unit is designed that can perform accumulation on 17-bit numbers. The MAC unit has an 18-bit output, and its operation is to add the multiplication results repeatedly. The total design area is inspected by observing the total transistor count. The power-delay product is calculated by multiplying the power consumption by the time delay.

2.1. Wallace tree Multiplier
The design analysis starts with the elementary algorithm for multiplication by a Wallace tree multiplier. Figure 3 shows the algorithm for 8 x 8 bit multiplication performed by the Wallace tree multiplier. There are five stages to go through to complete the multiplication process (Weste & Harris, 3rd Ed). Each stage uses half adders and full adders, denoted in the figure by a red circle for the 1-bit half adder and a blue circle for the 1-bit full adder. First, the partial products are reduced using half adders and full adders, combined to build a carry save adder (CSA), until just two rows of partial products are left. Next, the remaining two rows are added using a fast carry propagate adder. In this project, a CSA (carry save adder) using a ripple carry adder is used to get the final product. Second, the schematic of the conventional 8 bit x 8 bit high speed Wallace tree multiplier is designed by referring to the algorithm.

Figure 3: Algorithm for 8 bits x 8 bits Wallace tree multiplier (Harun, 2007)
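The reduction described in Section 2.1 can be sketched in Python as follows. This is a simplified column-compression model, not the authors' schematic: it uses only full adders (3:2 compressors) and no half adders, so the number of reduction passes will not necessarily match the five stages of Figure 3. The names `full_adder` and `wallace_multiply` are hypothetical.

```python
def full_adder(a, b, c):
    """1-bit full adder: returns (sum, carry)."""
    return a ^ b ^ c, (a & b) | (b & c) | (a & c)


def wallace_multiply(x, y, bits=8):
    """Multiply two unsigned `bits`-wide integers by carry-save reduction."""
    # columns[k] holds every partial-product bit of weight 2**k
    columns = [[] for _ in range(2 * bits)]
    for i in range(bits):
        for j in range(bits):
            columns[i + j].append(((x >> i) & 1) & ((y >> j) & 1))

    # Reduce until no column holds more than two bits (two rows left)
    while any(len(col) > 2 for col in columns):
        nxt = [[] for _ in range(len(columns) + 1)]
        for k, col in enumerate(columns):
            full = len(col) - len(col) % 3
            for t in range(0, full, 3):
                s, c = full_adder(col[t], col[t + 1], col[t + 2])
                nxt[k].append(s)       # sum bit stays in this column
                nxt[k + 1].append(c)   # carry bit moves one column up
            nxt[k].extend(col[full:])  # leftover bits pass through
        columns = nxt

    # Final carry-propagate addition of the two remaining rows
    row0 = sum(col[0] << k for k, col in enumerate(columns) if len(col) > 0)
    row1 = sum(col[1] << k for k, col in enumerate(columns) if len(col) > 1)
    return row0 + row1


assert wallace_multiply(200, 123) == 200 * 123
assert wallace_multiply(255, 255) == 65025
```

Carries are only propagated in the last line, where the two remaining rows are added; every earlier pass saves its carries into the next column, which is the property the carry save adder is chosen for.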

2.2. Carry Save Adder
When three or more operands are to be added simultaneously using two-operand adders, the time-consuming carry propagation must be repeated several times. If the number of operands is k, then carries have to propagate (k-1) times (Weste & Harris, 3rd Ed). In carry save addition, we let the carry propagate only in the last step, while in all the other steps we generate the partial sum and the sequence of carries separately. A CSA is capable of reducing the number of operands to be added from 3 to 2 without any carry propagation. A CSA can be implemented in different ways. In the simplest implementation, the basic element of the carry save adder is the combination of two half adders or a 1-bit full adder (Weste & Harris, 3rd Ed).

2.3. Block Enabling Technique
In any MAC unit, data flows from the input register to the output register through multiple stages, such as the multiplier stage, the adder stage and the accumulator stage, as shown in Figure 4. Within the multiplier stage there are further multiple stages of addition. During each multiplication and addition operation, the blocks in the pipeline need not be on or enabled until the actual data arrives from the previous stage. In the block enabling technique, we find the delay of each stage, and every block is enabled only after the expected delay. For the entire duration until its inputs are available, each successive block is disabled, thus saving power. In the next section, we design a 1-bit MAC unit with a pipeline structure and find its power consumption.
Figure 4: General Block Diagram of a Pipeline MAC with block enabling Technique (cs: control signal; the figure labels the control signals cs_1 to cs_5)

2.4. Pipelined block enabled logic
Figure 5 shows a three-stage pipelined MAC with block enable logic. In this logic, depending upon the delay of the individual blocks, the control logic enables the clock, power and logic pins of each block, thus saving power. Figure 6 shows the block schematic of the 1-bit full adder circuit with enable. Each of the blocks in the MAC unit has an enable signal to save power.
Figure 5: MAC with control logic (labels in the figure: inputs a, b and reset; control logic signals en_0, En_1 and En_2; Adder Enable; Register Enable)

Figure 6: 1bit full adder with enable (inputs a, b, c and enable; outputs sum and carry)

2.5. Accumulator Register
Figure 7 shows the 1-bit register file cell, which may be represented by a D flip-flop and two gates. Note that, in addition to the clock signal, the cell has three inputs and one output: write select, read select and the D input, and the Q output. Within this cell, the D flip-flop stores the value of the input signal whenever write select is equal to 1; whenever read select is equal to 1, the D flip-flop passes its stored value to the output through a tristate buffer.

Figure 7: 1 bit Register cell

From the observations made, we find that the basic building blocks for any MAC unit are the multiplier, the adder and the register. Multiplier and adder blocks require full adders, and registers require flip-flops or latches. The objective of this work is to find the total area, power and delay of the MAC unit, which forms the critical part of any DSP application. At the micro level, the power, delay and area of the basic blocks are calculated based on the experimental setup. Based on the results obtained, the causes of power and delay are identified at the micro level and remedies are taken to minimize the power. This power reduction technique is then extended to the macro level, which in our design is the MAC architecture. Section 3 discusses these results.

2.6. Full adder design
Different ways of realizing the full adder (Jou, Chen, Yang & Su, 1997; Suzuki, Ohkubo, Shinbo, Yamanaka, Shimizu, Sasaki & Nakagome, 1993; Lu & Samueli, 1993) are tabulated in Table 1, and their results are compared. From the table it is clear that the mux based full adder implementation consumes the least power and also has the minimum delay. In this work, the mux based full adder is therefore chosen for implementation. The mux based full adder has a delay of 0.0012 ns, which means that it takes 0.0012 ns to produce the outputs after the input is applied. The blocks connected to the output of the full adder can be kept disabled until then, which saves power. Using this design, a 1-bit pipelined MAC unit is realized. The basic building blocks of the MAC unit are the flip-flop to store 1-bit data, the 1-bit adder and the AND gate for control activity. These basic building blocks are taken independently and analyzed for delay and power. The pipelined MAC incorporates an enable pin to reduce power consumption, i.e. at any given point in time only one of the blocks is enabled, to ensure data flow from one stage to the next.
For example, while the adder block is computing, the register block is disabled to save power, and during the loading operation the adder block is disabled to save power. This is controlled by an external signal E, which enables the corresponding block or disables it to keep it idle. Low power techniques as discussed in (Anantha, Samuel & Borderson, 1992) are considered in this work for reducing power.
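To make the idea of a mux based full adder with an enable pin concrete, here is a minimal logic-level Python sketch. It shows one common way to build a full adder from 2:1 multiplexers; the paper does not give its circuit, so this is not necessarily the authors' design. The names `mux2` and `mux_full_adder`, and the choice to drive both outputs to 0 when the block is disabled, are assumptions.

```python
def mux2(sel, d0, d1):
    """2:1 multiplexer: returns d1 when sel is 1, otherwise d0."""
    return d1 if sel else d0


def mux_full_adder(a, b, cin, enable=1):
    """1-bit full adder built from 2:1 muxes: returns (sum, carry)."""
    if not enable:
        # Assumption: a disabled block drives zeros on its outputs
        return 0, 0
    p = mux2(a, b, 1 - b)          # p = a XOR b
    s = mux2(p, cin, 1 - cin)      # sum = p XOR cin
    carry = mux2(p, a, cin)        # carry = cin if a != b, else a
    return s, carry


# Exhaustive check against integer addition
for a in (0, 1):
    for b in (0, 1):
        for cin in (0, 1):
            s, carry = mux_full_adder(a, b, cin)
            assert 2 * carry + s == a + b + cin
```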

Table 1: Full adder design comparison

Full adder using           No. of transistors   Area (um2)   Power (uW)   Delay (ns)
Only nand                  36                   507.592      0.01293      0.00987
Only mux                   22                   324.225      0.0001459    0.0012
Exor, and, or              30                   408.127      3.58         0.0065
Conventional cmos logic    28                   387.548      0.0328       0.01055
Quasi domino               23                   375.124      0.01645      0.00767
Static and dynamic         22                   367.721      46.65        0.0109
Exor & AND                 30                   413.402      3.58         0.0065
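The comparison in Table 1 can be reproduced with a short Python sketch that ranks the candidate full adders by power, delay and power-delay product. The power and delay values are copied from Table 1; the dictionary layout and the helper name `pdp_fj` are illustrative assumptions. Since 1 uW x 1 ns = 1 fJ, the product of the two table columns is directly in femtojoules.

```python
# (power in uW, delay in ns) for each full adder style, from Table 1
full_adders = {
    "Only nand": (0.01293, 0.00987),
    "Only mux": (0.0001459, 0.0012),
    "Exor, and, or": (3.58, 0.0065),
    "Conventional cmos logic": (0.0328, 0.01055),
    "Quasi domino": (0.01645, 0.00767),
    "Static and dynamic": (46.65, 0.0109),
    "Exor & AND": (3.58, 0.0065),
}


def pdp_fj(power_uw, delay_ns):
    """Power-delay product in fJ (uW * ns = 1e-6 W * 1e-9 s = 1e-15 J)."""
    return power_uw * delay_ns


lowest_power = min(full_adders, key=lambda n: full_adders[n][0])
lowest_delay = min(full_adders, key=lambda n: full_adders[n][1])
lowest_pdp = min(full_adders, key=lambda n: pdp_fj(*full_adders[n]))

print(lowest_power, lowest_delay, lowest_pdp)
print(f"{pdp_fj(*full_adders['Only mux']):.3e} fJ")
```

All three criteria select the mux based design, which is consistent with the choice made in Section 2.6.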

2.7. AND Gate
The enabling and disabling of the MAC blocks is controlled using an AND gate. The results tabulated in Table 2 were obtained by conducting experiments using Cadence tools with a 180 nm technology library. The width of the transistors is varied to find its effect on delay and power. Table 3 lists the effect of width variation on power. It is observed that the delay of the AND gate is not constant and varies with the input signals. From Table 2, we observe that the delay generally reduces as the width ratio increases (the 0.5 ratio gives the lowest delay, with 0.4 close behind); we select 0.4 as the AND gate geometry. As the AND gate has a delay, the blocks connected to its output are disabled until this time and are enabled only after the outputs are available, hence saving power. From Table 3 we find that the power also varies with the input; with input a pulsed, the power is maximum for the 0.4 geometry.
Table 2: Delay variations for AND gate

Wn / Wp   Delay td (s), i/p a = pulse, b = 1   Delay td (s), i/p a = 1, b = pulse
0.2       2.187E-10                             2.156E-10
0.3       2.225E-10                             2.192E-10
0.4       1.805E-10                             1.776E-10
0.5       1.745E-10                             1.720E-10

Table 3: Power variations for AND gate

Wn / Wp   Power (W), i/p a = pulse, b = 1   Power (W), i/p a = 1, b = pulse
0.2       4.411E-11                          1.275E-08
0.3       4.457E-11                          2.106E-08
0.4       5.439E-11                          8.919E-09
0.5       2.211E-11                          1.301E-08
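The role of the AND gate in Section 2.7 can be illustrated with a small Python sketch that gates a sampled clock with an enable signal and counts the rising edges that reach the block. The waveforms and the names `gated_clock` and `rising_edges` are made up for illustration and are not taken from the paper's simulations.

```python
def gated_clock(clk, enable):
    """AND each clock sample with the enable sample of the same time step."""
    return [c & e for c, e in zip(clk, enable)]


def rising_edges(signal):
    """Count 0 -> 1 transitions in a sampled binary waveform."""
    return sum(1 for prev, cur in zip(signal, signal[1:]) if prev == 0 and cur == 1)


# Hypothetical waveforms: 8 clock cycles, block enabled only for the last 4
clk = [0, 1] * 8
enable = [0] * 8 + [1] * 8
gclk = gated_clock(clk, enable)

print(rising_edges(clk), rising_edges(gclk))  # 8 4
```

In this toy waveform the gated block sees half as many clock edges, which is the source of the dynamic power saving. The enable here changes while the clock is low; in a real circuit an enable that changes while the clock is high could produce a glitch on a plain AND gate.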

2.8. 1-Bit Register
The register forms one of the basic units of the MAC unit. As the register stores data, there is a possibility of leakage current, which affects power dissipation. The clock connected to the register cell also keeps changing and hence affects the dynamic power dissipation. In this work, the basic register cell is analyzed for its power consumption. The register cell is enabled with clock gating, and the power and delay are calculated. Table 4 shows the power and delay of the basic register cell calculated using Cadence tools. We find that the power is reduced with enable. Knowing the delay, we enable the blocks connected to the output of the 1-bit register only after the output is available. This helps in saving power.

Table 4: Power and delay results for 1-Bit register cell (data i/p di = pulse)

Wn / Wp   Power (W), with enable   Power (W), without enable   Delay td (s), with enable   Delay td (s), without enable
0.2       4.029E-09                4.078E-09                   7.881E-10                   7.882E-10
0.3       2.485E-08                3.628E-09                   6.996E-10                   6.985E-10
0.4       3.684E-09                3.712E-09                   6.369E-10                   6.371E-10
0.5       4.636E-09                4.644E-09                   6.167E-10                   6.152E-10

2.9. 1-Bit Full Adder
The mux based full adder is designed in this work using 180 nm technology, and the results are obtained using Cadence tools. Table 5 depicts the results for power and Table 6 depicts the results for the delay of the 1-bit full adder. We find that the power increases with increasing width and then suddenly reduces. This is because, as the width ratios of the transistors are increased, mobility variations cause threshold variations and hence the power reduces. Hence in this work a Wn / Wp ratio of 0.3 is chosen for better results. At this ratio the delay of the full adder is 0.3973 ns and 0.4317 ns for the two input conditions (Table 6); the maximum delay is selected, and based upon this delay the blocks connected to the output of the full adder are enabled.
Table 5: 1 Bit Full adder Power

Wn / Wp   Power (W), a = pulse, b = 1, Cin = 0   Power (W), a = 1, b = pulse, Cin = 0
0.2       3.357E-10                               3.107E-10
0.3       3.400E-10                               3.797E-10
0.4       4.430E-10                               4.942E-10
0.5       3.225E-10                               3.602E-10

Table 6: 1 Bit Full Adder Delay

Wn / Wp   Delay td (s), a = pulse, b = 1, Cin = 0   Delay td (s), a = 1, b = pulse, Cin = 0
0.2       4.496E-10                                  4.912E-10
0.3       3.973E-10                                  4.317E-10
0.4       3.663E-10                                  3.974E-10
0.5       3.569E-10                                  3.866E-10

2.10. 1-Bit MAC
Using the basic building blocks discussed, a 1-bit MAC unit is designed with clock gating and an enable pin. From the result analysis carried out using the experimental setup, we find that the AND gate delay is 0.225 ns, the full adder delay is 0.4317 ns and the register delay is 0.6996 ns. If all the blocks are enabled simultaneously when the input is applied at 0 ns, the FA block computes results on unknown data until 0.225 ns, and the register block receives unknown data for 0.6567 ns; power is wasted because this data is not the actual data. Hence, in this work, we have incorporated a control signal that enables each block only after valid data is available at its inputs. We call this the block enable technique. Based on the delay of each block, a control signal is generated to enable the blocks. Tables 7 and 8 depict the power and delay results obtained for the MAC unit with and without enable. From Table 7, we find that the power varies with the input: the MAC unit with all 0s consumes less power than with all 1s. The power with enable is found to be more than the power without enable, because the extra control logic added for the block enable technique consumes additional power. However, if we neglect the power consumed by the control unit, a hand calculation based on the power report shows that the enabling technique reduces the power consumption by 27% of the actual power.
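The enable timing implied by these delays can be worked out with a few lines of Python: each block is enabled once all the blocks before it have produced their outputs. The stage delays are the ones quoted in Section 2.10; the function name `enable_times` and the simple cumulative-sum schedule are assumptions made for illustration, not the authors' control logic.

```python
from itertools import accumulate


def enable_times(stage_delays_ns):
    """Enable time of each stage: the sum of the delays of all earlier stages."""
    return [0.0] + list(accumulate(stage_delays_ns))[:-1]


# Stage delays in ns, as quoted in Section 2.10
stages = [("AND gate", 0.225), ("full adder", 0.4317), ("register", 0.6996)]
times = enable_times([delay for _, delay in stages])

for (name, _), t in zip(stages, times):
    print(f"{name}: enable at {t:.4f} ns")
```

With these numbers the full adder is enabled at 0.225 ns and the register at 0.6567 ns, which matches the intervals during which the text says those blocks would otherwise process unknown data.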
Table 7: 1 Bit MAC Power calculations with input variations

Wn / Wp   Power (W), i/p = all 0s,   Power (W), i/p = all 0s,   Power (W), i/p = all 1s,   Power (W), i/p = all 1s,
          with enable                without enable             with enable                without enable
0.2       7.718E-09                  4.47E-10                   7.795E-09                  2.571E-08
0.3       7.186E-09                  4.562E-10                  7.335E-09                  1.241E-04
0.4       5.873E-09                  6.027E-10                  6.006E-09                  8.979E-10
0.5       7.717E-09                  4.576E-10                  7.812E-09                  6.828E-05

Table 8: 1 Bit MAC Power and delay with pulse input (i/p a = pulse)

Wn / Wp   Power (W), with enable   Power (W), without enable   Delay td (s), with enable   Delay td (s), without enable
0.2       2.803E-5                 4.786E-10                   2.922E-8                    2.928E-9
0.3       5.263E-6                 5.061E-10                   2.882E-8                    3.235E-9
0.4       2.397E-6                 6.488E-10                   2.868E-8                    3.446E-9
0.5       4.445E-6                 4.862E-10                   2.863E-8                    3.492E-9

After analyzing the basic building blocks at the micro level, we find that appropriate transistor widths are very important in determining power and delay. In this work we have identified that Wn / Wp can be taken to be 0.3 or 0.4. Using these results, we build the macro blocks (the multiplier block, the adder block and the register block) for constructing the MAC unit. The analysis carried out for the 1-bit MAC unit is extended to the macro model, and the resulting power and delay analysis is discussed in the next section.

2.11. Multiplier (8 X 8)
Table 9 depicts the power dissipation of the multiplier block discussed in this paper. The power consumption of the multiplier block is calculated for all ones and all zeros. One of the operands of the multiplier is held constant and the other operand is set to a pulse; the power and delay are calculated for variations in the width of the transistors, and the results are tabulated. From Table 9, we find that the power consumption is lowest at a width ratio of 0.3. A further increase in width affects the power due to threshold voltage variations. Table 10 tabulates the power and delay values for an 18-bit register, obtained using Cadence tools for 180 nm technology by varying the width ratios of the transistors used in the schematic.
Table 9: Multiplier Power and Delay

Wn / Wp   Power (W), i/p = pulse   Delay td (s), i/p = pulse
0.2       3.950E-7                 9.077E-09
0.3       1.339E-9                 5.875E-10
0.4       9.400E-8                 5.511E-10
0.5       1.379E-7                 5.471E-10

Table 10: 18 Bit Register Power and Delay

Wn / Wp   Power (W), i/p = all 0s   Power (W), i/p = all 1s   Delay td (s), i/p = all 0s   Delay td (s), i/p = all 1s
0.2       4.337E-9                  5.225E-9                  1.0737E-8                    6.0737E-8
0.3       4.558E-9                  4.992E-9                  6.149E-10                    4.561E-8
0.4       6.158E-9                  1.401E-8                  5.552E-10                    5.551E-10
0.5       4.789E-9                  1.766E-8                  5.288E-10                    5.287E-10

Table 11: Various Blocks delay, power, speed and power delay product

Blocks                         Power (W)    Delay (s)   Speed (Hz)   Power delay product (fJ)
1bit full adder                0.145n       0.0012n     833.33G      0.000000174
1bit D-flip flop               0.0596n      0.01425n    70.175G      0.000000849
1 bit register cell            0.2804n      0.0405n     24.69G       0.0000113
2x1 mux                        0.00434n     0.00458n    218.34G      0.0000000198
17 bit accumulator             0.0122u      0.04438n    22.53G       0.000541
18 bit accumulator register    0.987n       0.0724n     13.81G       0.000714
8x8 Wallace tree multiplier    0.03324u     0.1152n     8.68G        0.0038
MAC unit                       0.007698m    0.437n      2.288G       3.364

3. Results
3.1. Power Consumption
Table 11 shows that the total power consumption using TSMC 0.18 um is about 0.007698 mW for the MAC unit with enable. The delay time is measured as the time difference between the rising edge of the clock input and the rising edge of the output waveform. Table 11 also lists the delay of each block used to design the MAC unit in TSMC 0.18 um. The delay of the MAC unit is 0.437 ns. The design speed is calculated as the reciprocal of the delay, i.e. speed = 1 / delay time. The total speed of the MAC unit using TSMC 0.18 um is 2.288 GHz.

3.2. Power-Delay Product
The power-delay product is simply the product of the power consumption and the time delay. The smaller the value of the power-delay product, the better the performance of the design. Since this MAC unit has an almost negligible power-delay product, it performs well in terms of both speed and power dissipation. Based on Table 11, the total power-delay product of the MAC unit using TSMC 0.18 um is 3.364 fJ.
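The two derived figures in this section follow directly from the measured power and delay, as the short Python check below shows. The input values are the MAC unit entries of Table 11; the function names `speed_hz` and `pdp_fj` are illustrative assumptions.

```python
def speed_hz(delay_s):
    """Design speed as the reciprocal of the delay."""
    return 1.0 / delay_s


def pdp_fj(power_w, delay_s):
    """Power-delay product, converted from joules to femtojoules."""
    return power_w * delay_s * 1e15


mac_power_w = 0.007698e-3   # 0.007698 mW, from Table 11
mac_delay_s = 0.437e-9      # 0.437 ns, from Table 11

print(f"speed = {speed_hz(mac_delay_s) / 1e9:.3f} GHz")
print(f"PDP   = {pdp_fj(mac_power_w, mac_delay_s):.3f} fJ")
```

Both results agree with the tabulated values of 2.288 GHz and 3.364 fJ.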

4. Conclusion
An 8x8 multiplier-accumulator (MAC) is presented in this work. A full-adder circuit based on muxes is used for the MAC architecture. Compared to the other full-adder circuits, the mux based full adder has the highest operational speed and a lower transistor count. The basic building blocks of the MAC unit are identified, and each block is analyzed for its performance. Power and delay are calculated for the blocks. A 1-bit MAC unit is designed with enable to reduce the total power consumption, based on the block enable technique. Using this block, the N-bit MAC unit is constructed and its total power consumption is calculated. With the power reduction techniques adopted in this work, 27% of the power is saved. The MAC unit designed in this work can be used in filter realizations for high speed DSP applications. Table 12 summarizes the results obtained. A full custom design has been carried out for the proposed work and verified using Cadence tools. The final GDS II is also generated and is shown in Figure 8.

Figure 8: MAC layout

Table 12: Characteristics of MAC

Supply voltage          1.8V
Power consumption       0.007698mW
Delay                   0.437ns
Speed                   2.288GHz
Power-Delay Product     3.364fJ
IC process              180nm CMOS

References
[1] S.J. Jou, C.Y. Chen, E.C. Yang, and C.C. Su (1995), A pipelined multiplier-accumulator using a high speed, low power static and dynamic full adder design, IEEE Custom Integrated Circuit Conference, 1995, pp. 593-596.
[2] Anantha P. Chandrakasan, Samuel Sheng, Robert W. Brodersen (1992), Low-Power CMOS digital design, IEEE Journal of Solid-State Circuits, Vol. 27, No. 4, April 1992.
[3] Neil H.E. Weste and David Harris, CMOS VLSI Design: A Circuits and Systems Perspective, Addison-Wesley Publishing Company, 3rd ed.
[4] S.J. Jou, C.Y. Chen, E.C. Yang, and C.C. Su (1997), A pipelined multiplier-accumulator using a high speed, low power static and dynamic full adder design, IEEE Journal of Solid-State Circuits, vol. 32, no. 1, Jan. 1997, pp. 114-118.
[5] M. Suzuki, N. Ohkubo, T. Shinbo, T. Yamanaka, A. Shimizu, K. Sasaki, and Y. Nakagome (1993), A 1.5ns 32-bit CMOS ALU in double pass-transistor logic, IEEE Journal of Solid-State Circuits, vol. 28, no. 11, November 1993, pp. 1145-1151.
[6] F. Lu and H. Samueli (1993), A 200-MHz CMOS pipelined multiplier-accumulator using a quasi-domino dynamic full adder cell design, IEEE J. Solid-State Circuits, vol. 28, pp. 123-132, Feb 1993.
[7] P.C. Anantha, S. Samuel and R.W. Borderson (1992), Low power CMOS digital design, IEEE J. Solid-State Circuits, vol. 27, pp. 473-483, April 1992.
[8] Tajul Hamimi Harun (2007), High Speed 8-bits x 8-bits Wallace Tree Multiplier, Chapter 3, dspace.unimap.edu.my/bitstream/123456789/1937/5/Methodology.pdf, May 2007.

Low Power Mac For Digital Fir
No ratings yet
Low Power Mac For Digital Fir
4 pages
Full Adder Finfet 2
No ratings yet
Full Adder Finfet 2
8 pages
DSD Lecture Notes
No ratings yet
DSD Lecture Notes
120 pages
Modified Low Power Low Area Array Multiplier With SOC Encounter
No ratings yet
Modified Low Power Low Area Array Multiplier With SOC Encounter
4 pages
Title Design and Implementation of PRBS Generator Using VHDL
No ratings yet
Title Design and Implementation of PRBS Generator Using VHDL
7 pages
Review of MAC Unit For Complex Numbers
No ratings yet
Review of MAC Unit For Complex Numbers
3 pages
Speed Enhanced Multiprecision Multiplier Using Compressing Techniques
No ratings yet
Speed Enhanced Multiprecision Multiplier Using Compressing Techniques
3 pages
Mac
No ratings yet
Mac
20 pages
PaperID 74S201921
No ratings yet
PaperID 74S201921
7 pages
Thesis
0% (1)
Thesis
76 pages
Approaches To Low-Power Implementations of DSP Systems
No ratings yet
Approaches To Low-Power Implementations of DSP Systems
22 pages
Low Power MAC Unit For DSP Processor
No ratings yet
Low Power MAC Unit For DSP Processor
3 pages
DSP Arch
No ratings yet
DSP Arch
10 pages
Ade Encoder Decoder
No ratings yet
Ade Encoder Decoder
4 pages
101 2024 0 B Theory
No ratings yet
101 2024 0 B Theory
24 pages
Design of Modified Low Power Booth Multiplier
No ratings yet
Design of Modified Low Power Booth Multiplier
6 pages
A Reconfigurable Architecture of A High Performance 32-Bit MAC Unit For Embedded DSP
No ratings yet
A Reconfigurable Architecture of A High Performance 32-Bit MAC Unit For Embedded DSP
4 pages
Design FF Low Power Multiplier Unit Using Wallace Tree Algorithm IJERTV9IS020069
No ratings yet
Design FF Low Power Multiplier Unit Using Wallace Tree Algorithm IJERTV9IS020069
5 pages
VLSI Design and Implementation of Low Power MAC Unit With Block Enabling Technique
0% (1)
VLSI Design and Implementation of Low Power MAC Unit With Block Enabling Technique
5 pages
Chapter Five: Analysis and Synthesis of Combinational Logic Circuits
No ratings yet
Chapter Five: Analysis and Synthesis of Combinational Logic Circuits
38 pages
Thesis Phase 1 Report
No ratings yet
Thesis Phase 1 Report
7 pages
SDD HSC Notes-1
No ratings yet
SDD HSC Notes-1
18 pages
Ece5015 Digital-Ic-Design Eth 1.0 40 Ece5015
No ratings yet
Ece5015 Digital-Ic-Design Eth 1.0 40 Ece5015
2 pages
Implementation of MAC Unit Using Booth Multiplier & Ripple Carry Adder
No ratings yet
Implementation of MAC Unit Using Booth Multiplier & Ripple Carry Adder
3 pages
Assignment 1
No ratings yet
Assignment 1
7 pages
Implementation of Low Power and High Speed Multiplier-Accumulator Using SPST Adder and Verilog
No ratings yet
Implementation of Low Power and High Speed Multiplier-Accumulator Using SPST Adder and Verilog
8 pages
Architectures For Programmable Digital Signal Processing Devices
No ratings yet
Architectures For Programmable Digital Signal Processing Devices
24 pages
DIGITAL SYSTEMS LAB MCQs
No ratings yet
DIGITAL SYSTEMS LAB MCQs
5 pages
EC2203 Digital Electronics Question Bank
No ratings yet
EC2203 Digital Electronics Question Bank
16 pages
Lab2 Verilog
No ratings yet
Lab2 Verilog
5 pages
RC DFT Guide
0% (1)
RC DFT Guide
58 pages
Multiplier and Accumulator Unit
100% (4)
Multiplier and Accumulator Unit
4 pages
CSIR JRF Guidelines
No ratings yet
CSIR JRF Guidelines
13 pages
DSP Notes Unit1 and 2
No ratings yet
DSP Notes Unit1 and 2
45 pages
Draw A Neat Circuit of BCD Adder Using IC 7483 and Explain
No ratings yet
Draw A Neat Circuit of BCD Adder Using IC 7483 and Explain
3 pages
What's New in .NET 8? A Complete Guide to the Latest Features
From Everand
What's New in .NET 8? A Complete Guide to the Latest Features
Nitika
No ratings yet
Digital Modulations using Matlab
From Everand
Digital Modulations using Matlab
Mathuranathan Viswanathan
4/5 (6)
Digital Electronics Lab Manual
100% (1)
Digital Electronics Lab Manual
6 pages
Preliminary Specifications: Programmed Data Processor Model Three (PDP-3) October, 1960
From Everand
Preliminary Specifications: Programmed Data Processor Model Three (PDP-3) October, 1960
Digital Equipment Corporation
No ratings yet


European Journal of Scientific Research ISSN 1450-216X Vol.30 No.4 (2009), pp.620-630 EuroJournals Publishing, Inc. 2009 http://www.eurojournals.com/ejsr.

Figure 2: MAC architecture

2. Multiplier and Accumulator Unit

Shanthala S and S. Y. Kulkarni

2.4. Pipelined block enabled logic


Figure 6: 1-bit full adder with enable


Power (µW) 0.01293 0.0001459 3.58 0.0328 0.01645 46.65 3.58

Delay (ns) 0.00987 0.0012 0.0065 0.01055 0.00767 0.0109 0.0065

Power variations for AND gate

Wn / Wp 0.2 0.3 0.4 0.5

1 Bit Full Adder Delay


Wn / Wp 0.2 0.3 0.4 0.5

1 Bit MAC Power and delay with pulse input

Wn / Wp 0.2 0.3 0.4 0.5

Wn / Wp 0.2 0.3 0.4 0.5


Table 12: Characteristics of MAC
