LPV_07

The document discusses low power design techniques in CMOS circuits, focusing on low power arithmetic components like adders, multipliers, and division. It details various types of adders, including ripple carry, carry look-ahead, carry select, and conditional sum adders, along with their transistor counts and performance metrics. Additionally, it highlights the importance of factors influencing power dissipation and provides insights into design considerations for efficient low power arithmetic operations.

Low Power Design

Low Power Architecture & Systems: power and performance management, switching activity reduction, parallel architecture with voltage reduction, flow graph transformation, low power arithmetic components, low power memory design.
Low Power Arithmetic
Components
Rabaey
Outline
• Introduction
• Low Power Arithmetic Components
• Types of Adders
Introduction
• There are four factors which influence the power
dissipation of CMOS circuits. They are:
1. Technology
2. Circuit design style
3. Architecture and
4. Algorithm
Low Power Arithmetic
Components
1. ADDERS
2. Multipliers
3. Division
Adder Types:
• In choosing an adder for a particular application,
the following things must be considered:
• Speed
• Size
• Dynamic power consumption
Half and Full Adders
Problems
• Design a half adder and full adder using basic gates
Half Adder
• Sum = A’B + AB’
      = A ⊕ B
• Cout = AB
Full Adder
• Sum = A ⊕ B ⊕ C
      = ABC + AB’C’ + A’BC’ + A’B’C

• Carry= AB+AC+BC
Static CMOS Full Adder
• Sum = ABC + AB’C’ + A’BC’ + A’B’C
• Carry= AB+AC+BC
=AB+C(A+B)
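
The half adder and full adder equations can be checked exhaustively against ordinary addition. The sketch below is only an illustration: the function names are made up for this example, and the sum-of-products form is written as the four minterms that contain an odd number of ones.

```python
from itertools import product

def half_adder(a, b):
    """Return (sum, carry) for two input bits."""
    return a ^ b, a & b

def full_adder(a, b, c):
    """Return (sum, carry) using Sum = A xor B xor C, Carry = AB + C(A+B)."""
    return a ^ b ^ c, (a & b) | (c & (a | b))

def sum_of_products(a, b, c):
    """Sum written as the four minterms with an odd number of ones."""
    na, nb, nc = 1 - a, 1 - b, 1 - c
    return (a & b & c) | (a & nb & nc) | (na & b & nc) | (na & nb & c)

def check_all():
    """Compare every input combination against integer addition."""
    for a, b in product((0, 1), repeat=2):
        s, carry = half_adder(a, b)
        assert 2 * carry + s == a + b
    for a, b, c in product((0, 1), repeat=3):
        s, carry = full_adder(a, b, c)
        assert 2 * carry + s == a + b + c
        assert s == sum_of_products(a, b, c)
        assert carry == (a & b) | (a & c) | (b & c)   # AB + AC + BC
    return True
```

Calling `check_all()` confirms that the factored carry AB + C(A+B) and the three-term carry AB + AC + BC agree on all eight input combinations.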
[Figure: static CMOS full adder, sum circuit]
[Figure: static CMOS full adder, carry circuit]

• Requires 30 transistors
Precharged circuits

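[Figure: precharged gate. A PMOS transistor Mp and an NMOS transistor Me, both driven by Clk, surround a pull-down network (PDN) with inputs In1, In2, In3 (A, B, C in the example); CL is the output load. Two-phase operation, precharge phase (Clk = 0): Mp on, Me off, Out = 1.]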
Precharged Circuits

[Figure: precharged gate in the evaluate phase (Clk = 1): Mp off, Me on; PDN inputs A, B, C; the output node is labeled ((AB)+C). Two-phase operation: Precharge (Clk = 0), Evaluate (Clk = 1).]
Precharged Circuits
• When the clock is low, the output is always high
(Precharge phase)
• When the clock is high, the output is controlled by
the input (Evaluate phase)
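
The two-phase behavior can be sketched as a small behavioral model. This is an illustration only, not a circuit simulation: the class name `DynamicGate` is hypothetical, and the pull-down function (A AND B) OR C is an assumption read off the figure label. The model also shows that once the output has been discharged during evaluation, it stays low until the next precharge.

```python
class DynamicGate:
    """Behavioral model of a precharged (dynamic) gate."""

    def __init__(self, pdn):
        self.pdn = pdn   # returns a true value when the pull-down network conducts
        self.out = 1     # output node, assumed to start precharged

    def step(self, clk, *inputs):
        if clk == 0:
            self.out = 1            # precharge: Mp on, Me off
        elif self.pdn(*inputs):
            self.out = 0            # evaluate: Me on, PDN discharges the output
        return self.out             # otherwise the node keeps its charge

# Assumed pull-down network: conducts when (A AND B) OR C is true.
gate = DynamicGate(lambda a, b, c: (a and b) or c)
```

For instance, `gate.step(0, 1, 1, 0)` returns 1 (precharge) and a following `gate.step(1, 1, 1, 0)` returns 0, because A and B together open a path to ground.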
NORA (NO Race dynamic CMOS
logic)
• Alternating stages of P- and N-type logic trees form the carry and sum outputs.
• The P-type stage that forms the carry output is dynamically precharged high,
• while the N-type transistor tree that computes the sum is dynamically pre-discharged low.
• This precharge and pre-discharge process requires a two-phase complementary clock, denoted by phi and phi’.
• Requires 22 transistors
CVSL (Cascode Voltage Switch
Logic)
• In the CVSL full adder, the outputs and their complements are all precharged high while the clock is low.
• When the clock signal goes high, the complementary cascoded differential N-type transistor trees pull either the output or its complement low.
• The complement of the clock is not necessary.
• Requires 24 transistors
DCVS (Differential Cascode Voltage
Switch) Logic
• DCVS replaces the P-type transistors in CVSL with a cross-coupled pair of P-type transistors.
• The cross-coupled P-type transistors act as a differential pair.
• When the output on one side is pulled low, the opposite P-type transistor turns on and pulls the output on that side high.
• Requires 22 transistors
Other types of adders
• CNTL (CMOS Non Threshold Logic)
▫ Extra N-type transistors
▫ Voltage swing at the output
▫ Requires 34 transistors
• ECDL (Enable/Disable CMOS Differential Logic)
▫ Requires 35 transistors
• ESCL (Enhancement Source Coupled Logic)
▫ Requires 24 transistors
Full adder transistor count and area

Logic style   Transistors   Rank   Area (µm²)   Rank
CMOS          30            5      21,294       3
NORA          22            1      14,319       1
CVSL          24            3      25,740       4
DCVS          22            1      21,080       2
CNTL          34            6      40,020       7
ECDL          35            7      34,170       6
ESCL          24            3      26,522       5
Delays

Logic style   Simulated (ns)   Rank   Measured (ns)   Rank
CMOS          46.3             3      60              3
NORA          45.9             2      47.2            1
CVSL          45.4             1      49.2            2
DCVS          61.5             5      72.6            4
CNTL          54.1             4      87              6
ECDL          79.7             6      85              5
ESCL          94.1             7      166             7
Types of Adders

• Various types of adders available are as follows:


▫ Ripple carry adder
▫ Carry look ahead adder
▫ Carry Skip adder
▫ Carry select adder
▫ Conditional sum adder
Ripple carry adder
• It is possible to create a logical circuit using
multiple full adders to add N-bit numbers. Each
full adder inputs a Cin, which is the Cout of the
previous adder. This kind of adder is a ripple
carry adder, since each carry bit "ripples" to the
next full adder.
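
The chaining can be sketched bit by bit. The code below is a minimal behavioral model, assuming unsigned n-bit operands; the function names are illustrative. The loop makes the source of the delay visible: bit i cannot be computed until the carry from bit i-1 is known.

```python
def full_adder(a, b, cin):
    """One-bit full adder: returns (sum, carry out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a | b))
    return s, cout

def ripple_carry_add(x, y, n, cin=0):
    """Add two n-bit unsigned integers; returns (n-bit sum, carry out)."""
    total, carry = 0, cin
    for i in range(n):
        # The carry out of stage i becomes the carry in of stage i + 1.
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        total |= s << i
    return total, carry
```

For example, `ripple_carry_add(0b1011, 0b0110, 4)` adds 11 and 6 in four bits and returns `(1, 1)`: the sum bits 0001 with a carry out of 1, which together represent 17.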
Ripple carry adder
• Multiple full adders with carry-ins and carry-outs chained together
• Small layout area
• Large delay time
Carry Look Ahead Adder
• The OR gate used in the full adder to generate the carry is eliminated, resulting in an adder with only eight gates.
• The carry input for each full adder is computed by a tree of look-ahead logic blocks.
• Each block receives Pi (propagate) and Gi (generate) signals from the full adders and computes the carries.
• Reduces delay in computation.
Carry Look Ahead Adder
• Why is a Carry Look Ahead Adder important?
- The CLA is used in most ALU designs
- It is faster than a ripple carry adder, especially when adding a large number of bits.
• The Carry Look Ahead Adder generates the carries before the sums are produced, using the propagate and generate logic, which makes addition much faster.
Equations for Logic of 4-bit CLA

Gi = Ai.Bi        Pi = Ai ⊕ Bi

C1 = G0 + P0.C0
C2 = G1 + P1.C1 = G1 + P1.G0 + P1.P0.C0
C3 = G2 + P2.G1 + P2.P1.G0 + P2.P1.P0.C0
C4 = G3 + P3.G2 + P3.P2.G1 + P3.P2.P1.G0 + P3.P2.P1.P0.C0

Si = Ai ⊕ Bi ⊕ Ci = Pi ⊕ Ci
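
These equations translate directly into code. The following sketch (the function name `cla4` is illustrative) computes every carry from the G, P and C0 signals alone, with no carry depending on another carry, and forms the sum bits as Pi XOR Ci.

```python
def cla4(a, b, c0=0):
    """4-bit carry look-ahead addition; returns (4-bit sum, C4)."""
    A = [(a >> i) & 1 for i in range(4)]
    B = [(b >> i) & 1 for i in range(4)]
    G = [A[i] & B[i] for i in range(4)]   # generate
    P = [A[i] ^ B[i] for i in range(4)]   # propagate

    # Expanded carry equations: each uses only G, P and C0.
    c1 = G[0] | (P[0] & c0)
    c2 = G[1] | (P[1] & G[0]) | (P[1] & P[0] & c0)
    c3 = G[2] | (P[2] & G[1]) | (P[2] & P[1] & G[0]) | (P[2] & P[1] & P[0] & c0)
    c4 = (G[3] | (P[3] & G[2]) | (P[3] & P[2] & G[1])
          | (P[3] & P[2] & P[1] & G[0]) | (P[3] & P[2] & P[1] & P[0] & c0))

    carries = [c0, c1, c2, c3]
    s = sum((P[i] ^ carries[i]) << i for i in range(4))   # Si = Pi xor Ci
    return s, c4
```

As a spot check, `cla4(0b1111, 0b0001)` returns `(0, 1)`, since 15 + 1 overflows four bits.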
4-bit Carry-Look Ahead Adder

Ci+1 = Gi + Pi.Ci

Gi = Ai.Bi

Pi = Ai ⊕ Bi


16-bit Carry-Look Ahead Adder using 4-bit
Carry-Look-Ahead Adders

PG = P3.P2.P1.P0
GG = G3 + P3.G2 + P3.P2.G1 + P3.P2.P1.G0
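
The group signals let each 4-bit slice hand a carry to the next slice as GG + PG.Cin. The sketch below is a behavioral model under stated simplifications: the function names are illustrative, the group carries are computed one after another in a loop (a hardware second-level look-ahead unit would compute them in parallel), and the carries inside a slice use the recurrence Ci+1 = Gi + Pi.Ci, which is logically equal to the expanded form.

```python
def group_pg(a4, b4):
    """Per-bit P and G lists plus group PG and GG for one 4-bit slice."""
    P = [((a4 >> i) ^ (b4 >> i)) & 1 for i in range(4)]
    G = [((a4 >> i) & (b4 >> i)) & 1 for i in range(4)]
    PG = P[3] & P[2] & P[1] & P[0]
    GG = G[3] | (P[3] & G[2]) | (P[3] & P[2] & G[1]) | (P[3] & P[2] & P[1] & G[0])
    return P, G, PG, GG

def cla16(a, b, c0=0):
    """16-bit addition from four 4-bit slices; returns (16-bit sum, carry out)."""
    total, group_carry = 0, c0
    for k in range(4):
        P, G, PG, GG = group_pg((a >> (4 * k)) & 0xF, (b >> (4 * k)) & 0xF)
        c = group_carry
        for i in range(4):
            total |= (P[i] ^ c) << (4 * k + i)   # Si = Pi xor Ci
            c = G[i] | (P[i] & c)                # Ci+1 = Gi + Pi.Ci
        # Carry into the next slice from the group signals only.
        group_carry = GG | (PG & group_carry)
    return total, group_carry
```

For example, `cla16(0xFFFF, 0x0001)` returns `(0, 1)`: every slice propagates, so the carry generated in the lowest slice reaches the carry out.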
Carry Save Adder Design
Single Bit Carry Save Adder Block
[Figure: a full adder block (inputs Xi, Yi, Cin; outputs Si, Cout) beside a carry-save adder block (inputs Xi, Yi, Zi; outputs Ci, Si).]
Carry Save Adder Design
Example of Carry Save Addition

X:       10011
Y:     + 11001
Z:     + 01011
S:       00001
C:      11011
Sum:    110111
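
The example can be reproduced with bitwise operations. In this sketch (the function name is illustrative) every bit position acts as an independent full adder: S is the bitwise XOR of the three operands, C is their bitwise majority, and because C carries one position more weight than S, the final result is S plus C shifted left by one.

```python
def carry_save(x, y, z):
    """Reduce three operands to a sum word and a carry word, with no carry propagation."""
    s = x ^ y ^ z                       # per-bit sum
    c = (x & y) | (x & z) | (y & z)     # per-bit carry (majority)
    return s, c

s, c = carry_save(0b10011, 0b11001, 0b01011)
total = s + (c << 1)   # a single carry-propagate addition finishes the job
```

With the operands above, `s` is 0b00001, `c` is 0b11011 and `total` is 0b110111 (19 + 25 + 11 = 55).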
Carry Save Adder Design
3 Operand Carry-Save Addition
[Figure: sixteen carry-save adder blocks, one per bit position, take Xi, Yi, Zi (i = 0 to 15) and produce Ci and Si. A 16-bit carry look-ahead block combines these into Sum[17:1]; Sum[0] comes from the lowest carry-save block.]

[Figure: block symbols. Carry look-ahead adder block: inputs X[15:0], Y[15:0], output Sum[16:0]. Carry-save adder block: inputs X[15:0], Y[15:0], Z[15:0], output Sum[17:0].]


How do they differ?

They differ in the way the carry is computed. The carry takes the longest to compute, so computing it faster gives a faster adder.
Analysis of the Carry-Lookahead
Adder
• n-bit adder, m-bit blocks, n/m blocks
• Delay through the adder: 2 * delay through the lookahead block + delay through the super-lookahead block
▫ Lookahead block: 2 log m
▫ Super-block: 2 log(n/m) = 2 log n - 2 log m
▫ Total: 2 log n + 2 log m
• Logic: scales like the lookahead blocks
▫ Size p block: O(p^3) from before
▫ Two sizes of blocks: n/m blocks of size m, one block of size n/m
▫ Total: n/m * m^3 = nm^2, plus (n/m)^3
▫ Choose m to minimize max(nm^2, (n/m)^3)
▫ Solution at m = n^(2/5). Total is n + n^3/5
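
The choice of block size can be checked numerically. The sketch below simply searches all integer block sizes for the one that minimizes the larger of the two logic terms; the function name is illustrative, and the search ignores whether m divides n evenly.

```python
def best_block_size(n):
    """Integer m in [1, n] that minimizes max(n * m**2, (n / m)**3)."""
    return min(range(1, n + 1), key=lambda m: max(n * m**2, (n / m)**3))
```

For n = 32 the search returns 4 and for n = 1024 it returns 16, matching n^(2/5) in both cases; at those points the two terms nm^2 and (n/m)^3 are equal.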
Carry Select Adder
• Consists of k-bit wide blocks, each block consisting of a pair of ripple carry adders (one assuming a carry input and one assuming no carry input) and a multiplexer controlled by the incoming carry signal.
Carry Select Adder
• “Combinational Speculative Execution”
• Basic intuition:
▫ Adders spend time waiting to see what
carry-in is
• Therefore
▫ Go ahead and guess each way
▫ Pick the right answer when the carry
comes by
Carry-Select adder
• Each block is doubled
▫ One block computes with carry-in = 0, the other with carry-in = 1
▫ Actual carry-in (carry-out from previous block) selects the result:
  - m sum bits
  - 1 carry-out bit

[Figure: Block 0 followed by Block 1. Each doubled m-bit block feeds multiplexers (inputs 1 and 0) that pick the m sum bits and the carry-out.]
Analysis of Carry-Select Adder
• Delay analysis: the worst-case path is through Block0, then through the control inputs of the multiplexer chain
• O(m) gates in Block0
• O(p = n/m) gates in multiplexer chain

[Figure: block chain Block p (copies for carry-in 1 and 0), ..., Block 2 (1 and 0), Block 1 (1 and 0), and a single Block 0]

• Choose m to minimize max(n/m, m)
• Minimum is at m = √n
Twelve-bit Carry-Select
Example
• Problem: add -3 (0xffd, 111111111101) to 17 (0x011, 000000010001)
• Use 4-bit carry select blocks

[Figure: three 4-bit carry-select blocks, with their multiplexers, adding 0xffd and 0x011.]
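
The example can be worked through with a short behavioral model. This is a sketch under stated assumptions: the function names are illustrative, n is taken to be a multiple of m, and plain integer addition stands in for each m-bit ripple-carry block. Every block except the first is evaluated for both possible carry-ins, and the incoming carry then acts as the multiplexer select.

```python
def block_add(a, b, cin, m):
    """Stand-in for one m-bit ripple-carry block: returns (sum, carry out)."""
    t = a + b + cin
    return t & ((1 << m) - 1), t >> m

def carry_select_add(x, y, n, m, cin=0):
    """n-bit carry-select addition with m-bit blocks; returns (sum, carry out)."""
    mask = (1 << m) - 1
    total, carry = 0, cin
    for k in range(0, n, m):
        a, b = (x >> k) & mask, (y >> k) & mask
        if k == 0:
            s, carry = block_add(a, b, carry, m)      # first block is not doubled
        else:
            guess0 = block_add(a, b, 0, m)            # assumes carry-in = 0
            guess1 = block_add(a, b, 1, m)            # assumes carry-in = 1
            s, carry = guess1 if carry else guess0    # multiplexer
        total |= s << k
    return total, carry
```

`carry_select_add(0xffd, 0x011, 12, 4)` returns `(0x00e, 1)`: the 12-bit result is 14, and the carry out of the top block is discarded in two's complement arithmetic.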
Hardware for the Carry Select Adder
• √n blocks, each of √n gates
• Additional hardware is √n multiplexers + an additional adder for each block but the first
• n - √n additional adder bits
• Therefore √n + 2n - √n = 2n gates
• Exactly twice the size of an ordinary adder, but delay is √n instead of n
One-level k-bit Carry-Select Adder
Two-level k-bit Carry Select Adder
Conditional Sum Adder
• Extension of carry-select adder
• Carry select adder
▫ One-level using k/2-bit adders
▫ Two-level using k/4-bit adders
▫ Three-level using k/8-bit adders
▫ Etc.
• Assuming k is a power of two, eventually have an extreme where there are log2 k levels using 1-bit adders
▫ This is a conditional sum adder
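
The level-by-level construction can be sketched as follows. This is a behavioral illustration with hypothetical names, assuming k is a power of two. Each block stores its (sum, carry-out) for both possible carry-ins; at each level, adjacent blocks are merged by letting the lower block's carry-out select which result of the upper block to keep.

```python
def conditional_sum_add(x, y, k, cin=0):
    """k-bit conditional sum addition (k a power of two); returns (sum, carry out)."""
    # Level 0: 1-bit blocks, each holding (sum, carry) for carry-in 0 and carry-in 1.
    blocks = []
    for i in range(k):
        a, b = (x >> i) & 1, (y >> i) & 1
        blocks.append({0: (a ^ b, a & b), 1: (a ^ b ^ 1, a | b)})

    width = 1
    while len(blocks) > 1:
        merged = []
        for j in range(0, len(blocks), 2):
            lo, hi = blocks[j], blocks[j + 1]
            pair = {}
            for c in (0, 1):
                lo_sum, lo_carry = lo[c]
                hi_sum, hi_carry = hi[lo_carry]   # low block's carry selects the high result
                pair[c] = (lo_sum | (hi_sum << width), hi_carry)
            merged.append(pair)
        blocks, width = merged, width * 2

    return blocks[0][cin]   # the real carry-in makes the final selection
```

For a 16-bit adder the loop runs four times (log2 16 levels), for example `conditional_sum_add(0xFFFF, 0x0001, 16)` returns `(0, 1)`.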
[Figure: Conditional sum adder, top-level block for one bit position]
[Figure: Three levels of a conditional sum adder. 1-bit conditional sum blocks for inputs x_i, y_i through x_i+3, y_i+3 each produce results for c=0 and c=1; at each level, branch points select and concatenate the results of adjacent blocks, and the block carry-in determines the final selection.]
[Figure: 16-bit conditional sum adder example]
[Figure: Conditional sum adder metrics]
Carry Skip Adder

• Module bypasses the carry-in based on the Propagate signals.
• Uses the idea that if corresponding bits in the two words to be added are not equal, a carry signal into that bit position will be passed to the next bit position.
• Improves on the delay of the Ripple Carry Adder
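
The bypass idea can be sketched in a behavioral model. The code below is an illustration with hypothetical names, assuming constant-width blocks and n a multiple of m. Inside a block the carry ripples as usual; if every bit pair in the block differs (all propagate signals are 1), the block's carry-out is taken directly from its carry-in instead of waiting for the ripple.

```python
def carry_skip_add(x, y, n, m, cin=0):
    """n-bit carry-skip addition with m-bit blocks; returns (sum, carry out)."""
    mask = (1 << m) - 1
    total, carry = 0, cin
    for k in range(0, n, m):
        a, b = (x >> k) & mask, (y >> k) & mask
        block_propagate = (a ^ b) == mask      # every bit position propagates
        c, s = carry, 0
        for i in range(m):                     # ripple inside the block
            ai, bi = (a >> i) & 1, (b >> i) & 1
            s |= (ai ^ bi ^ c) << i
            c = (ai & bi) | (c & (ai ^ bi))
        total |= s << k
        # Skip multiplexer: bypass the ripple chain when the whole block propagates.
        carry = carry if block_propagate else c
    return total, carry
```

The result is the same as for a ripple-carry adder; the benefit in hardware is timing, because a carry entering a fully propagating block reaches the next block through one multiplexer rather than m full adders.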
16 Bit constant block width carry
skip adder
• Single level
16 Bit Variable block width carry
skip adder
• Multiple level
A few comparisons…
• In terms of area efficiency the
▫ Ripple Carry adder is the most efficient.
▫ Carry Look ahead adder is the least.
• In terms of speed
▫ Carry Look ahead adder is the fastest.
▫ Ripple carry adder has the largest delay
• In terms of power
▫ Ripple carry adder has the least power consumption.
▫ Manchester consumes more power due to precharge
phase.
Worst case delay

Adder type                    Adder size (bits)
                              16     32     64
Ripple carry adder            36     68     132
Constant width carry skip     23     33     39
Variable width carry skip     17     19     23
Carry look ahead adder        10     14     14
Carry select                  14     14     14
Conditional sum               12     15     18
Number of gates

Adder type                    Adder size (bits)
                              16     32     64
Ripple carry adder            144    288    576
Constant width carry skip     156    304    608
Variable width carry skip     170    350    695
Carry look ahead adder        200    401    808
Carry select                  284    597    1228
Conditional sum               368    857    1938
Adder Summary

Adder                      Delay         Size
Ripple-Carry               n             n
Carry-Lookahead (full)     log n         n^3
Carry-Lookahead (block)    14/5 log n    n + n^3/5
Carry Select               √n            2n
Carry-Bypass               √n            n + √n
Simulation

• Gate level simulation


• Circuit level simulation
Gate level simulation
• The gate level simulator used to measure the
average number of gates that switch during an
addition accepts as its input a linked list of gates,
with each gate pointing to the inputs, and also
the next gate to be evaluated.
• Average number of logic transitions (table)
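
A simulator of this kind can be sketched in a few lines. The code below is a hypothetical, zero-delay illustration, not the simulator used in the study: the `Gate` class, its field names and the half-adder netlist are all assumptions. Each gate points to its inputs and to the next gate to evaluate, and a pass over the list counts how many gate outputs change for a new input vector.

```python
class Gate:
    """One node in the linked list of gates."""

    def __init__(self, func, inputs):
        self.func = func        # boolean function of the input values
        self.inputs = inputs    # Gate objects or primary-input names
        self.next = None        # next gate to be evaluated
        self.value = 0          # current output value

def evaluate(head, primary):
    """Evaluate the list in order; return the number of gate outputs that switched."""
    switched = 0
    gate = head
    while gate is not None:
        args = [g.value if isinstance(g, Gate) else primary[g] for g in gate.inputs]
        new = gate.func(*args)
        if new != gate.value:
            switched += 1
        gate.value = new
        gate = gate.next
    return switched

# Example netlist (assumed): a half adder made of one XOR and one AND gate.
xor_gate = Gate(lambda a, b: a ^ b, ["a", "b"])
and_gate = Gate(lambda a, b: a & b, ["a", "b"])
xor_gate.next = and_gate
```

Starting from all outputs at 0, `evaluate(xor_gate, {"a": 1, "b": 0})` reports one transition, and a following `evaluate(xor_gate, {"a": 1, "b": 1})` reports two. Averaging this count over many input vectors gives the average number of logic transitions; being zero-delay, the sketch does not count glitches.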
Circuit level simulation
• Tables and Graphs:
• Worst case delay of a 16 bit adder estimated with
CAzM (Circuit AnalyZer with Macromodeling)
• Size of 16 bit adders
• Average power consumption of 16 bit adders
calculated with CAzM
Physical Measurement
• Test chip (each adder has a separate power pin)
• Only the pads and the output multiplexers share a power net.
• All six adders are contained in the center of the chip
• The adders run horizontally and are laid out from top to bottom in the same order as in the tables, i.e. the topmost adder is the ripple carry adder and the bottommost adder is the conditional sum adder.
Die photo of test chip
