Activity Factor PDF
Rajaram Sivasubramanian
Associate Professor
ECE Department
Thiagarajar College of Engineering, Madurai-15
• Terminology
– Literal: A variable or a constant, e.g. a, b, 2, 3.14
– Cube: Product of literals, e.g. +3a^2b, -2a^3b^2c
– SOP: Sum of cubes, e.g. +3a^2b - 2a^3b^2c
– Cube-free expression: No literal or cube can divide all the cubes of the expression
– Kernel: A cube-free sub-expression of an expression, e.g. 3 - 2abc
– Co-Kernel: A cube that is used to divide an expression to get a kernel, e.g. a^2b
Kernels and Kernel Intersections
DEFINITION:
An expression is cube-free if no cube divides the expression evenly (i.e. there is no
literal that is common to all the cubes).
ab + c is cube-free
ab + ac and abc are not cube-free
Note: a cube-free expression must have more than one cube.
DEFINITION:
The primary divisors of an expression F are the set of expressions
D(F) = {F/c | c is a cube}.
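The cube-free test and the division F/c used in these definitions can be sketched in a few lines of Python. This is only an illustrative sketch: the representation (a cube as a set of single-letter literals, an expression as a set of cubes) and the helper names `parse`, `divide` and `is_cube_free` are assumptions made here, not part of the slides.

```python
def parse(expr):
    """Parse 'ab + ac' into a set of cubes; each cube is a frozenset of one-letter literals."""
    return {frozenset(term.strip()) for term in expr.split("+")}

def divide(F, c):
    """Algebraic quotient F/c: keep the cubes that contain c and strip c's literals from them."""
    c = frozenset(c)
    return {cube - c for cube in F if c <= cube}

def is_cube_free(F):
    """True if F has more than one cube and no literal is common to all of its cubes."""
    return len(F) > 1 and not frozenset.intersection(*F)

# The three examples from the definition above.
print(is_cube_free(parse("ab + c")))
print(is_cube_free(parse("ab + ac")))
print(is_cube_free(parse("abc")))
```

With this representation, a primary divisor is simply `divide(F, c)` for some cube `c`.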
DEFINITION:
The kernels of an expression F are the set of expressions
K(F) = {G | G ∈ D(F) and G is cube-free}.
DEFINITION:
A cube c used to obtain the kernel K = F/c is called a co-kernel of K.
Example:
x = adf + aef + bdf + bef + cdf + cef + g
  = (a + b + c)(d + e)f + g

kernels              co-kernels
a+b+c                df, ef
d+e                  af, bf, cf
(a+b+c)(d+e)f+g      1
Kernels: Example
F = adf + aef + bdf + bef + cdf + cef + bfg + h
  = (a+b+c)(d+e)f + bfg + h

cube   Prim. Div.          Kernel   Co-kernel   level
a      df+ef               NO       NO          --
b      df+ef+fg            NO       NO          --
bf     d+e+g               YES      YES         0
cf     d+e                 YES      YES         0
df     a+b+c               YES      YES         0
fg     b                   NO       NO          --
f      (a+b+c)(d+e)+bg     YES      YES         1
1      F                   YES      YES         2
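The table can be reproduced mechanically. The sketch below is a brute-force enumeration, not the recursive kerneling procedure used by logic-synthesis tools: it tries every sub-cube of every cube of F as a candidate co-kernel and keeps the cube-free quotients. The data representation and the function names are assumptions made for illustration. The level is computed from the usual definition (level 0 if a kernel contains no kernel other than itself, otherwise one more than its deepest inner kernel).

```python
from itertools import combinations

def parse(expr):
    """'ab + c' -> frozenset of cubes, each cube a frozenset of one-letter literals."""
    return frozenset(frozenset(t.strip()) for t in expr.split("+"))

def divide(F, c):
    """Algebraic quotient F/c."""
    return frozenset(cube - c for cube in F if c <= cube)

def is_cube_free(F):
    return len(F) > 1 and not frozenset.intersection(*F)

def kernels(F):
    """Map each kernel of F to the set of co-kernels that produce it."""
    found = {}
    for cube in F:
        lits = sorted(cube)
        for r in range(len(lits) + 1):
            for sub in combinations(lits, r):
                c = frozenset(sub)          # candidate co-kernel (empty set is the cube 1)
                q = divide(F, c)
                if is_cube_free(q):
                    found.setdefault(q, set()).add(c)
    return found

def level(K):
    """0 if K has no kernel other than itself, else 1 + the highest level among inner kernels."""
    inner = [k for k in kernels(K) if k != K]
    return 0 if not inner else 1 + max(level(k) for k in inner)

def show(F):
    return " + ".join(sorted("".join(sorted(c)) or "1" for c in F))

F = parse("adf + aef + bdf + bef + cdf + cef + bfg + h")
for kernel, cokernels in kernels(F).items():
    names = sorted("".join(sorted(c)) or "1" for c in cokernels)
    print(names, "->", show(kernel), "level", level(kernel))
```

The enumeration is exponential in the cube size, which is acceptable for slide-sized examples only.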
Kerneling Illustrated
co-kernel    kernel
1            a((bc + fg)(d + e) + de(b + cf)) + beg
a            (bc + fg)(d + e) + de(b + cf)
ab           c(d + e) + de
abc          d + e
abd          c + e
abe          c + d
ac           b(d + e) + def
acd          b + ef
Note: F/bc = ad + ae = a(d + e)
Probabilistic State Transition Graphs (STGs)
• Edges showing state transitions not only indicate input values
causing transitions and resulting outputs
• Also have labels pij giving conditional probability of transition
from state Si to Sj
– Given that machine is in state Si
– Directly related to signal probabilities at primary inputs
• Introduce self-loops in STG for don’t care situations to
transform incompletely-specified machine into completely-specified machine
Example
Relationship Between State Assignment and Power
• Hamming distance between states Si and Sj:
– H (Si, Sj) = # bits in which the assignments differ
• Average Power:
– D (i) = signal activity at node i
– Approximate Ci with fanout factor at node i
• Average power proportional to:
Handling Present State Inputs
• Find state transitions (Si, Sj) of highest probability
• Minimize H (Si, Sj) by changing state assignment of Si, Sj
• Requires system simulation of circuit over many clock
periods, noting signal values and transitions
• If one-hot design is used, note that H = 2 for all states
– Impossible to obtain optimum power reduction
– Uses too many flip-flops
• Optimization cost function:
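The Hamming-distance reasoning above can be sketched in Python. The cost used here, the sum of p_ij * H(Si, Sj) over all transitions, is an assumed form chosen to match the bullets (high-probability transitions should get close codes); it is not taken from the slides. The four-state machine and its probabilities are hypothetical.

```python
def hamming(x, y):
    """Number of bit positions in which two state codes differ."""
    return bin(x ^ y).count("1")

def switching_cost(codes, trans_prob):
    """Assumed cost: sum over transitions (Si, Sj) of p_ij * H(Si, Sj)."""
    return sum(p * hamming(codes[i], codes[j]) for (i, j), p in trans_prob.items())

# Hypothetical 4-state ring; the probabilities are made up for illustration.
trans_prob = {("S0", "S1"): 0.4, ("S1", "S2"): 0.3, ("S2", "S3"): 0.2, ("S3", "S0"): 0.1}

binary = {"S0": 0b00, "S1": 0b01, "S2": 0b10, "S3": 0b11}
gray = {"S0": 0b00, "S1": 0b01, "S2": 0b11, "S3": 0b10}
one_hot = {"S0": 0b0001, "S1": 0b0010, "S2": 0b0100, "S3": 0b1000}

for name, codes in [("binary", binary), ("gray", gray), ("one-hot", one_hot)]:
    print(name, switching_cost(codes, trans_prob))
```

Every pair of distinct one-hot codes differs in exactly two bits, so no reassignment can lower the one-hot cost, which is the point made in the bullet on one-hot design.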
Simulated Annealing Optimization Algorithm
• Allowed moves:
– Interchange codes of two states
– Assign an unassigned code to a state that is randomly
picked for an exchange
• Accept move if it decreases g
• If move increases g, accept with probability:
e^(-|δ(g)| / Temp)
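A minimal Python sketch of this annealing loop follows. The two moves and the acceptance rule follow the bullets; everything else is an assumption made for illustration: the cost g (probability-weighted Hamming distance), the geometric cooling schedule, the parameter values, and the hypothetical four-state machine.

```python
import math
import random

def hamming(x, y):
    return bin(x ^ y).count("1")

def cost(codes, trans_prob):
    """Assumed cost g: sum of p_ij * H(Si, Sj) over all transitions."""
    return sum(p * hamming(codes[i], codes[j]) for (i, j), p in trans_prob.items())

def anneal(states, trans_prob, n_bits, temp=2.0, cooling=0.95, steps=50, seed=0):
    rng = random.Random(seed)
    all_codes = list(range(2 ** n_bits))
    codes = dict(zip(states, rng.sample(all_codes, len(states))))  # random start
    g = cost(codes, trans_prob)
    best, best_g = dict(codes), g
    while temp > 1e-3:
        for _ in range(steps):
            trial = dict(codes)
            s = rng.choice(states)
            used = set(trial.values())
            unused = [c for c in all_codes if c not in used]
            if unused and rng.random() < 0.5:
                trial[s] = rng.choice(unused)            # assign an unassigned code
            else:
                t = rng.choice(states)
                trial[s], trial[t] = trial[t], trial[s]  # interchange codes of two states
            new_g = cost(trial, trans_prob)
            # Always accept a decrease; accept an increase with probability e^(-|dg|/Temp).
            if new_g < g or rng.random() < math.exp(-abs(new_g - g) / temp):
                codes, g = trial, new_g
                if g < best_g:
                    best, best_g = dict(codes), g
        temp *= cooling                                  # assumed geometric cooling
    return best, best_g

# Hypothetical 4-state ring with made-up transition probabilities.
states = ["S0", "S1", "S2", "S3"]
trans_prob = {("S0", "S1"): 0.4, ("S1", "S2"): 0.3, ("S2", "S3"): 0.2, ("S3", "S0"): 0.1}
best, best_g = anneal(states, trans_prob, n_bits=2)
print(best, best_g)
```

For this small ring, any Gray-style assignment gives every transition a Hamming distance of 1, which is the lower bound on the assumed cost.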
Example State Machine
State Assignments
1) Kernel Extraction: Consider the Boolean function
F = uvy + vwy + xy + uz + vz.
• Identify all co-kernel/kernel pairs of F. State their levels.
• Overall process:
– From tree leaves to root, compute trade-off curves for
matching gates from library
– From root to leaves:
• Select minimum-cost solution
• Reduces average power by 22% while keeping the
same delay
– Sometimes increases area by as much as 39%
Circuit-Level Optimizations
Algorithm Components
• Deep sub-micron technology:
• Ratio of NAND/NOR delay to INVERTER delay lessens in
deep sub-micron technology
– Vds and Vgs of series-connected transistors are smaller than
those of an inverter transistor
• Encourages wider use of complex CMOS gates
• Important to order series transistors correctly
– Delay varies by 20%
– Power varies by 10%
CMOS Gate Power Consumption
• For series-connected transistors, signal with lower activity should be
on transistor closest to power supply rail
Calculating Transition Probability
– Use serial-parallel graph edge reduction techniques
Transistor Reordering
Zero Slack Algorithm
Transistor Reordering
Logically equivalent CMOS gates may not have
identical energy/delay characteristics
[Figure: four transistor orderings, labelled A to D, of the same CMOS gate with inputs a1, a2, b and output y]
A  B  Out
1  1  0
0  1  0
1  0  0
0  0  1
Then:
POut=0 = 3/4
POut=1 = 1/4
P0→1 = POut=0 * POut=1
     = 3/4 * 1/4 = 3/16
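The same calculation can be automated for any small gate by enumerating its truth table. The sketch below assumes what the worked numbers imply: inputs are independent and each input pattern is equally likely. The function names are chosen here for illustration.

```python
from fractions import Fraction
from itertools import product

def transition_probability(gate, n_inputs):
    """P(0->1) = P(Out=0) * P(Out=1), with all input patterns equally likely."""
    rows = list(product((0, 1), repeat=n_inputs))
    ones = sum(gate(*row) for row in rows)   # rows of the truth table where Out = 1
    p1 = Fraction(ones, len(rows))
    return (1 - p1) * p1

def nor2(a, b):
    return int(not (a or b))

def nand2(a, b):
    return int(not (a and b))

def xor2(a, b):
    return a ^ b

print(transition_probability(nor2, 2))   # the 2-input NOR case worked above
print(transition_probability(nand2, 2))
print(transition_probability(xor2, 2))
```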
Transition Probabilities cont’d
A and B with different input signal probability:
PA and PB : Probability that input is 1
P1 : Probability that output is 1
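For reference, the standard expressions for two-input gates are written out below, assuming A and B are statistically independent (an assumption stated here, since the slide's own table did not survive). Setting PA = PB = 1/2 in the NOR case gives P1 = 1/4 and a transition probability of 3/16.

```latex
\begin{align*}
\text{AND:}\quad & P_1 = P_A P_B \\
\text{OR:}\quad  & P_1 = 1 - (1 - P_A)(1 - P_B) \\
\text{NOR:}\quad & P_1 = (1 - P_A)(1 - P_B) \\
\text{any gate:}\quad & P_{0 \to 1} = P_{Out=0}\, P_{Out=1} = (1 - P_1)\, P_1
\end{align*}
```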
• Large W’s: higher capacitance, but lower voltage while keeping performance
• The first stage drives the gate capacitance of the second stage and
the parasitic capacitance
• Input gate capacitance of both stages is given by NCref, where
Cref represents the gate capacitance of a MOS device with the
smallest allowable (W/L)
Transistor Sizing
• When there is no parasitic capacitance contribution (i.e., α = 0), the
energy increases linearly with respect to N, and using devices with the
smallest (W/L) ratios results in the lowest power.
• At high values of α, when parasitic capacitances begin to dominate over
the gate capacitances, the power decreases temporarily with increasing
device sizes and then starts to increase, resulting in an optimal value for
N.
• The initial decrease in supply voltage achieved from the reduction in
delays more than compensates for the increase in capacitance due to
increasing N.
• After some point the increase in capacitance dominates the achievable
reduction in voltage, since the incremental speed increase with
transistor sizing is very small
• Minimum sized devices should be used when the total load capacitance
is not dominated by the interconnect
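The trend described in these bullets can be reproduced with a simple first-order model. Everything in the sketch below is an assumption made for illustration and does not come from the slides: α is taken as the ratio of parasitic capacitance to Cref, delay is taken as proportional to C_load / (N * Vdd) with the threshold voltage neglected, and the supply is scaled so that the delay matches the N = 1 case.

```python
def energy(n, alpha, c_ref=1.0, v_ref=1.0):
    """Switching energy at the supply voltage that keeps delay equal to the N = 1 design."""
    c_load = (alpha + n) * c_ref                   # parasitic + next-stage gate capacitance
    v_dd = v_ref * (1 + alpha / n) / (1 + alpha)   # assumed: delay ~ C_load / (N * Vdd)
    return c_load * v_dd ** 2

def best_size(alpha, n_max=50):
    """Integer sizing factor N in 1..n_max with the lowest energy under this model."""
    return min(range(1, n_max + 1), key=lambda n: energy(n, alpha))

for alpha in (0, 1, 4):
    print(alpha, best_size(alpha))
```

Under these assumptions the energy is proportional to (N + α)^3 / N^2, which is linear in N for α = 0 and has a minimum near N = 2α for large α.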
Summary
• Logic-level multi-level logic optimization is effective
– State assignment
– Modified MIS algorithm
• Logic-level technology mapping
– Tree-covering algorithm is effective
• Circuit-level operations are effective
– Transistor input reordering
– Transistor resizing
Addition of Binary Numbers
Full Adder. The full adder is the fundamental building block
of most arithmetic circuits.
Carry-Propagate: p_i = a_i ⊕ b_i
Carry-Generate: g_i = a_i b_i
One-bit adder could be implemented as shown.
[Figure: one-bit adder with inputs a_i, b_i, c_in and outputs s_i, c_out]
Oklobdzija 2004 Computer Arithmetic 116
High-Speed Addition
c_{i+1} = g_i + p_i c_i
g_i = a_i b_i,  p_i = a_i ⊕ b_i
[Figure: adder circuit; labelled signals a_i, b_i, c_in, c_out, s, x2 to x4, and p1 to p9]
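The carry recurrence can be checked with a short bit-level sketch in Python. It assumes the XOR form of the propagate signal (the OR form gives the same carries, but the XOR form is also what the sum bit needs); the function name and the word width are chosen here for illustration.

```python
def add_bits(a, b, n_bits, c_in=0):
    """Add two n-bit integers bit by bit using generate/propagate signals."""
    carry, total = c_in, 0
    for i in range(n_bits):
        ai, bi = (a >> i) & 1, (b >> i) & 1
        g = ai & bi                     # carry-generate
        p = ai ^ bi                     # carry-propagate (XOR form, assumed)
        total |= (p ^ carry) << i       # s_i = p_i xor c_i
        carry = g | (p & carry)         # c_{i+1} = g_i + p_i c_i
    return total, carry

s, c_out = add_bits(0b1011, 0b0110, 4)
print(bin(s), c_out)
```

This is the ripple form of the recurrence; high-speed adders evaluate the same g and p signals in parallel rather than one bit at a time.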
Dual-rail design does increase the wiring density, but it offers
the advantage of complete insensitivity to delays.
[Figure: arithmetic circuit with local control, part of an asynchronous chain of computations.]
The Ultimate in Low-Power Design
Some reversible logic gates:
(a) Toffoli gate (TG): P = A, Q = B, R = AB ⊕ C
(b) Fredkin gate (FRG): P = A, Q = A'B + AC, R = A'C + AB
(c) Feynman gate (FG): P = A, Q = A ⊕ B
(d) Peres gate (PG): P = A, Q = A ⊕ B, R = AB ⊕ C
Reversible binary full adder built of 5 Fredkin gates, with a single Feynman
gate used to fan out the input B. The label “G” denotes “garbage.”
[Figure: adder with inputs A, B, C and constants 0 and 1; outputs Cout and s (sum).]
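What makes these gates reversible is that each maps its input patterns one-to-one onto its output patterns. The Python sketch below checks this by enumeration. The gate functions use the standard definitions of the Toffoli, Fredkin, Feynman and Peres gates; the helper `is_reversible` and the counter-example `and_with_copy` are names introduced here for illustration.

```python
from itertools import product

def toffoli(a, b, c):
    return a, b, (a & b) ^ c

def fredkin(a, b, c):
    # Controlled swap: Q = A'B + AC, R = A'C + AB.
    return (a, c, b) if a else (a, b, c)

def feynman(a, b):
    return a, a ^ b

def peres(a, b, c):
    return a, a ^ b, (a & b) ^ c

def and_with_copy(a, b):
    """Ordinary AND with one input copied through; not reversible."""
    return a, a & b

def is_reversible(gate, n):
    """True if the 2**n input patterns map onto 2**n distinct output patterns."""
    inputs = list(product((0, 1), repeat=n))
    return len({gate(*x) for x in inputs}) == len(inputs)

for gate, n in [(toffoli, 3), (fredkin, 3), (feynman, 2), (peres, 3), (and_with_copy, 2)]:
    print(gate.__name__, is_reversible(gate, n))
```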