Delay Estimation
Critical paths are generally the slowest paths in a logic design. They can be identified by timing simulation using a timing analyzer.
The critical paths can be affected at four main levels:
• The architectural/microarchitectural level
• The logic level
• The circuit level
• The layout level
Microarchitectural level: the designer should know the algorithms that implement the function and the technology being targeted, viz., how many gate delays fit in a clock cycle, how fast processing occurs, how fast memories are accessed, how long a signal takes to propagate along a wire, etc.
Tradeoffs include the number of pipeline stages, the number of execution units, and the size of memories.
Logic level: tradeoffs include the types of functional blocks (e.g., ripple-carry vs. carry-lookahead (CLA) adders), the number of stages of gates in the clock cycle, and the fan-in and fan-out of the gates.
Circuit level: delay can be tuned at this level by choosing transistor sizes appropriately or by using other styles of CMOS logic.
Layout level: the floorplan is of great importance because it determines the wire lengths, which can dominate the delay. Tuning particular cells may also reduce parasitic capacitance.
Some definitions
RC Delay model
Elmore delay model
Linear delay model
RC delay model of MOS transistors (of k unit width)
The propagation delay of a logic gate can be estimated from the RC delay model:
$$t_{pd} = \sum_{i} R_{n-i} C_i = \sum_{i=1}^{N} C_i \sum_{j=1}^{i} R_j$$
Here $R_{n-i}$ represents the total resistance from the source to node $i$.
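The double sum in the RC delay estimate can be evaluated with a running total of the resistance seen so far. The following is a minimal Python sketch for the special case of an RC ladder (a single unbranched chain of N segments). The function name `elmore_delay` and the component values are illustrative assumptions, not values from these notes.

```python
def elmore_delay(resistances, capacitances):
    """Elmore delay of an RC ladder.

    resistances[i] is the series resistance of segment i and
    capacitances[i] is the capacitance at the node after that segment.
    Returns sum_i C_i * (R_1 + ... + R_i).
    """
    if len(resistances) != len(capacitances):
        raise ValueError("need one resistance and one capacitance per node")
    delay = 0.0
    path_resistance = 0.0
    for r, c in zip(resistances, capacitances):
        path_resistance += r          # total resistance from the source to this node
        delay += path_resistance * c  # this node's contribution to the delay
    return delay


# Hypothetical 3-segment ladder with unit R and C: 1*1 + 2*1 + 3*1 = 6
t_pd = elmore_delay([1.0, 1.0, 1.0], [1.0, 1.0, 1.0])
print(t_pd)
```

An RC tree with branches needs one extra step: for each capacitor, only the resistance shared between the path to that capacitor and the path to the node of interest is counted.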
Example: Find the Elmore’s delay at the nodes Vout3 and Vout4 in the RC tree.
Logical effort is also defined as “the slope of the gate’s delay vs. fanout curve divided by the
slope of an inverter’s delay vs. fanout curve”.
Introduction to logical effort
Chip designers face a bewildering array of choices–
What is the best circuit topology for a function?
How many stages of logic give least delay?
How wide should the transistors be?
Logical effort is a method to make these decisions
–Uses a simple model of delay
–Allows back-of-the-envelope calculations
–Helps make rapid comparisons between alternatives
–Emphasizes remarkable symmetries
The logical effort of a logic gate is defined as how much worse it is at delivering output current than an inverter with identical input capacitance would be.
The logical effort of a logic gate is defined as the ratio of its input capacitance to that of an
inverter that delivers equal output current.
Calculating logical effort
The logical effort of a logic function depends mainly on the circuit topology and slightly on the electrical
properties of the fabrication process used to build it.
Logical effort of individual stages of logic can be combined to find the logical effort of networks.
The Electrical effort h describes how the electrical environment of the logic gate affects performance and
how the size of the transistors in the gate determines its load driving capability.
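As a rough illustration of how topology sets the logical effort, the sketch below uses the closed-form expressions that are standard in the logical effort literature for static CMOS gates: g = (n + γ)/(1 + γ) for an n-input NAND and g = (1 + nγ)/(1 + γ) for an n-input NOR. These formulas are not stated in these notes, and the function names and the default γ = 2 (pull-up to pull-down width ratio of the reference inverter) are assumptions.

```python
from fractions import Fraction


def logical_effort_nand(n, gamma=2):
    """Logical effort per input of an n-input static CMOS NAND (assumed formula)."""
    return Fraction(n + gamma) / Fraction(1 + gamma)


def logical_effort_nor(n, gamma=2):
    """Logical effort per input of an n-input static CMOS NOR (assumed formula)."""
    return Fraction(1 + n * gamma) / Fraction(1 + gamma)


# With gamma = 2, a 2-input NAND has g = 4/3 and a 2-input NOR has g = 5/3.
print(logical_effort_nand(2), logical_effort_nor(2))
```

With n = 1 both expressions reduce to 1, the logical effort of the inverter.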
Calculations of logical effort of some of the gates and digital circuits
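As a crude approximation, assume that a transistor of width w has diffusion capacitance wCd and gate capacitance wCg.
The parasitic capacitance of an inverter is then calculated as follows:
For the pull-down transistor (of width 1), the diffusion capacitance = Cd
For the pull-up transistor (of width γ), the diffusion capacitance = γCd
The total parasitic capacitance is (1 + γ)Cd and the input capacitance is (1 + γ)Cg, so the parasitic delay of the inverter is pinv = parasitic capacitance / input capacitance = Cd/Cg = 1.0 (approx)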
The parasitic delay of logic gates can be estimated from the inverter parameters as follows:
$$p = \left( \frac{\sum w_d}{1 + \gamma} \right) p_{inv}$$
where $w_d$ is the width of each transistor connected to the logic gate’s output.
Parasitic delay (p) of various logic gates and digital circuits
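This inverter approximation can be applied to an n-input NAND gate, whose output is connected to one pull-down transistor of width n and n pull-up transistors of width γ, so that the widths sum to n(1 + γ). An n-input NOR gate, with n pull-down transistors of width 1 and one pull-up transistor of width nγ connected to the output, gives the same sum.
Hence, p = npinv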
Increasing transistor sizes reduces resistance but increases capacitance correspondingly, so parasitic delay is, to first order, independent of gate size.
This method gives only a crude estimate; more refined results can be obtained using the Elmore delay model.
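The crude parasitic delay estimate can be sketched in Python as the sum of the widths of the transistors touching the output, divided by (1 + γ) and scaled by pinv. The helper names and the default values γ = 2 and pinv = 1 are assumptions made for illustration.

```python
def parasitic_delay(output_widths, gamma=2.0, p_inv=1.0):
    """Crude parasitic delay: (sum of widths on the output) / (1 + gamma) * p_inv."""
    return sum(output_widths) / (1.0 + gamma) * p_inv


def nand_output_widths(n, gamma=2.0):
    """Widths on the output of an n-input NAND: one pull-down of width n
    (top of the series stack) and n parallel pull-ups of width gamma."""
    return [float(n)] + [gamma] * n


def nor_output_widths(n, gamma=2.0):
    """Widths on the output of an n-input NOR: n parallel pull-downs of width 1
    and one pull-up of width n*gamma (end of the series stack)."""
    return [1.0] * n + [n * gamma]


for n in (1, 2, 3):
    # Both gate types come out at n * p_inv under this approximation.
    print(n, parasitic_delay(nand_output_widths(n)), parasitic_delay(nor_output_widths(n)))
```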
Example of applying linear delay model to logic gates
Use the linear delay model to estimate the frequency of an N-stage ring oscillator (RO)
constructed in a 65-nm process with τ= 3 ps.
For an inverter,
Logical Effort: g = 1
Electrical Effort: h = 1
Parasitic Delay: p = 1
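Thus the delay of each stage: d = gh + p = 2 (in units of τ)
An N-stage RO has a period of 2N stage delays, because a transition must travel around the ring twice to restore the original polarity; therefore the period T = 2N*d = 4N (in units of τ)
Frequency of the N-stage RO: fosc = 1/T = 1/(2*N*d*τ) = 1/(4Nτ) = 1/(12N ps)
Equivalently, the number of stages for a target frequency f is N = 1/(2*f*d*τ)
Here τ = 3RC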
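The ring oscillator estimate is easy to check numerically. The sketch below is a minimal Python version of the linear delay model calculation; the function name and the choice of a 31-stage ring are assumptions for illustration, while τ = 3 ps comes from the example.

```python
def ring_oscillator_frequency(n_stages, tau, g=1.0, h=1.0, p=1.0):
    """Frequency (Hz) of an N-stage ring oscillator under the linear delay model.

    n_stages should be odd for the ring to oscillate; tau is in seconds.
    """
    stage_delay = (g * h + p) * tau        # d = gh + p, converted to seconds
    period = 2 * n_stages * stage_delay    # a transition circulates twice per period
    return 1.0 / period


# Hypothetical 31-stage ring with tau = 3 ps: period = 4 * 31 * 3 ps = 372 ps
f_osc = ring_oscillator_frequency(31, 3e-12)
print(f"{f_osc / 1e9:.2f} GHz")
```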
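• Path logical effort: $G = \prod g_i$ (the product of the logical efforts of each stage along the path)
• Path electrical effort: $H = C_{out(path)} / C_{in(path)}$ (the ratio of the load capacitance of the path to its input capacitance)
• Branching effort of a stage: $b = (C_{onpath} + C_{offpath}) / C_{onpath}$ (i.e., the ratio of the total capacitance seen by the stage to the capacitance on the path); the path branching effort is $B = \prod b_i$
Now, we can define the path effort F as the product of the logical, branching and electrical efforts of the path: F = GBH
• Path delay: $D = \sum d_i = D_F + P$, where $D_F = \sum g_i h_i$ is the path effort delay and $P = \sum p_i$ is the path parasitic delay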
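The path quantities combine by simple products, which the following Python sketch makes explicit. The function name `path_effort` and its argument layout (one logical effort and one branching effort per stage, plus the path input and load capacitances) are assumptions chosen for illustration.

```python
from math import prod


def path_effort(logical_efforts, branching_efforts, c_in, c_out):
    """Return (G, B, H, F) for a path, with F = G * B * H."""
    G = prod(logical_efforts)      # path logical effort
    B = prod(branching_efforts)    # path branching effort
    H = c_out / c_in               # path electrical effort
    return G, B, H, G * B * H


# Three 2-input NAND gates (g = 4/3 each), no branching, load equal to input capacitance.
G, B, H, F = path_effort([4 / 3] * 3, [1, 1, 1], c_in=1.0, c_out=1.0)
print(round(F, 2))
```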
Delay in Multistage Logic Networks
Minimum path delay: $D = N F^{1/N} + P$
Designing Fast Circuits
Example 1: Consider the path from A to B involving three two-input NAND gates shown in the figure. The input capacitance of the first gate is C and the load capacitance is also C. What is the least delay of this path, and how should the transistors be sized to achieve it?
To compute the path effort, we must compute the logical, branching, and electrical efforts along the path.
The path logical effort is the product of the logical efforts of the three NAND gates: $G = g_0 g_1 g_2 = (4/3)^3 = 2.37$.
The branching effort is B = 1, because all of the fanouts along the path are one, i.e., there is no branching.
The electrical effort is H = C/C = 1. Hence, the path effort is F = GBH = 2.37.
Now, we find the least delay achievable along the path to be $D = N F^{1/N} + P = 3(2.37)^{1/3} + 3(2p_{inv}) = 10.0$ delay units, taking $p_{inv} = 1$.
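This minimum delay can be realized if the transistor sizes in each logic gate are chosen properly.
To achieve this minimum delay, we must equalize the effort in each stage. Here the stage effort is $(2.37)^{1/3} = 4/3$, so working backward from the load, each gate has input capacitance $C \times (4/3)/(4/3) = C$: all three gates are the same size as the first.
The same procedure handles a path with branching. Consider three two-input NAND stages in which the first stage drives two copies of the second, the second drives three copies of the third, the first gate has input capacitance C and the load is 4.5C. Let y and z denote the input capacitances of the second- and third-stage gates. Then $G = (4/3)^3$, $B = 2 \times 3 = 6$ and $H = 4.5$, so F = GBH = 64.
Since the path effort is 64, the stage effort should be $(64)^{1/3} = 4$.
Starting from the output, $z = 4.5C \times (4/3)/4 = 1.5C$.
The second stage drives three copies of the third stage, so $y = 3z \times (4/3)/4 = z = 1.5C$.
We can check the math by finding the size of the first stage: $2y \times (4/3)/4 = (2/3)y = C$, as given in the design.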
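The backward sizing pass can be sketched in a few lines of Python: compute the path effort, take its N-th root as the stage effort, then walk from the load toward the input using C_in = g * b * C_load / f at each stage. The function name `size_path` and the convention that capacitances are expressed in units of C are assumptions for illustration.

```python
from math import prod


def size_path(logical_efforts, branching_efforts, c_in, c_out):
    """Return (stage_effort, input_caps) for a minimum-delay path.

    branching_efforts[i] is the number of copies of the next stage
    (or of the final load) that stage i drives.
    """
    n = len(logical_efforts)
    F = prod(logical_efforts) * prod(branching_efforts) * c_out / c_in
    f = F ** (1.0 / n)                      # equal effort borne by every stage
    caps = []
    load = c_out
    for g, b in zip(reversed(logical_efforts), reversed(branching_efforts)):
        c = g * b * load / f                # input capacitance of this stage
        caps.append(c)
        load = c                            # this stage is the load of the previous one
    caps.reverse()
    return f, caps


# Branching example: three NAND2 stages, fanouts 2 and 3, load 4.5C, input C.
f, caps = size_path([4 / 3] * 3, [2, 3, 1], c_in=1.0, c_out=4.5)
print(round(f, 3), [round(c, 3) for c in caps])
```

The first entry of `caps` should reproduce the given input capacitance, which serves as a check on the arithmetic.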
Example 5: Consider three alternative circuits for driving a load 25 times the
input capacitance of the circuit. The first design uses one inverter, the
second uses three inverters in series, and the third uses five in series. All
three designs compute the same logic function. Which is best, and what is
the minimum delay?
In all three cases, the path logical effort is one, the branching effort is one, and the electrical effort is 25.
The path delay is $D = N(25)^{1/N} + N p_{inv}$, where N = 1, 3, or 5.
For N = 1, D = 26 delay units; for N = 3, D = 11.8 delay units; and for N = 5, D = 14.5 delay units.
The best choice is N = 3. In this design, each stage will bear an effort of $(25)^{1/3} = 2.9$, so each inverter will be 2.9 times larger than its predecessor.
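The comparison between the three inverter chains can be reproduced with a short Python sketch. The function name `inverter_chain_delay` and the default pinv = 1 are assumptions; the electrical effort of 25 and the candidate stage counts come from the example.

```python
def inverter_chain_delay(n_stages, electrical_effort, p_inv=1.0):
    """Minimum delay (in units of tau) of an N-inverter chain: N * H^(1/N) + N * p_inv."""
    return n_stages * electrical_effort ** (1.0 / n_stages) + n_stages * p_inv


candidates = [1, 3, 5]
delays = {n: inverter_chain_delay(n, 25.0) for n in candidates}
best = min(delays, key=delays.get)   # stage count with the smallest delay

for n in candidates:
    print(n, round(delays[n], 1))
print("best N:", best)
```

Only odd stage counts are compared so that every design computes the same (inverting) function.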
As in the previous example, the logical and branching efforts are both 1, but the electrical effort is H = 20000/7.2 ≈ 2777.
If N = 6, the stage effort will be $f = (2777)^{1/6} = 3.75$.
Thus the input capacitance of each inverter in the string will be 3.75 times that of its predecessor.
The path delay will be $D = 6 \times 3.75 + 6 p_{inv} = 28.5$ delay units.
This corresponds to an absolute delay of 28.5τ = 1.43 ns, assuming τ = 50 ps.