11 Timing Analysis Logic

Timing Analysis

Pingqiang Zhou
ShanghaiTech University
ASIC Timing: Role of CAD Tools
 ASIC timing has deep interactions with logic and layout
synthesis.
[Figure: High-level description + timing specifications → Logic Synthesis (connected cells with delay constraints on signal paths) → Layout Synthesis (placed cells with real locations, real connecting wires)]
ASIC Timing: Role of CAD Tools
 Requirements on timing analysis
 Logic-side tools must estimate delays through
unplaced/unrouted logic.
 Layout tools must estimate delays through placed/routed
logic.

Our Topics for ASIC Timing
 Logic-side: Static Timing Analysis
 How do we estimate the worst-case timing through a logic
network?
 Turns out to be longest paths through a graph, which
properly models the gates and wires.

 Layout-side: Interconnect Delay Analysis


 We place the gates, route the wires. Then, how do we estimate
wire delays?
 The problem builds on an electrical circuit model. We will show the key results.
Timing Analysis at the Logic Level
 Goal: Verify timing behavior of our logic design
 Input:
 A gate-level netlist.
 Timing models of the gates and/or wires.
 Output:
 Signal arrival time at various points in the network.
 Longest delays through gate network.
 Does the netlist satisfy the timing requirement? If not, where
are key problems?
 This is surprisingly complicated in the real world...

Analyzing Design Performance
 Assume design is synchronous.
 All storage is in explicit sequential elements, e.g., flip-flop elements.
 Consequence: we can just focus on delays through combinational
gates.
[Figure: launch flip-flops → combinational logic (no feedback loops) → capture flip-flops, all driven by one clock]
Question: Can’t We Just Simulate Logic?
 What does logic simulation do?
 Determines how a system will behave by simulating the logical
function.
 Gives the most accurate answer with good simulation models.
 … but it is (practically) impossible to give a complete answer –
especially timing.
 Requires examination of an exponential number of cases.
 All possible input vectors …
 With all possible relative timings …
 Under all possible manufacturing variations …
 We need a different, faster solution...

Timing Analysis: Basic Model
 Assume we know the clock cycle.
 E.g., 1 GHz clock, cycle = 1 ns.

 For logic to work correctly, the longest delay through the network must be shorter than the clock cycle.

[Figure: flip-flops → combinational logic → flip-flops, driven by the clock; longest delay < clock cycle]
Timing Analysis: Gate Delay Models
 First: we need a model of delay through each logic gate.

What’s gate delay ∆?

 Delay of a single gate:

[Figure: a single gate with input X and output Y, and the delay ∆ between a change at X and the resulting change at Y]
[Slides 10-13: figures only, courtesy of UC Berkeley]
In Reality: Gate Delay is Very Complex
 Gate type affects delay
 Gate loading affects delay
 Waveform shape affects delay
 Transition direction affects delay

[Figure: four pairs of gate configurations, one pair per effect, each labeled ∆ ≠ ∆]
In Reality: Gate Delay is Very Complex
 Gate input pin affects delay
 Why? At transistor level, inputs are not symmetric.

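[Figure: the same gate driven from two different input pins, ∆ ≠ ∆]

 At nanoscale, delays are even statistical
 Why? Depends on process, voltage, and temperature (PVT) variations.

[Figure: probability density function (PDF) of gate delay ∆, with axis ticks at 200, 240, 280]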
Our Model: Pin-to-Pin Delay
 In our lecture, we keep it simple: Fixed, pin-to-pin delay
model
 No slopes, transition direction, distributions. Loading effects
“pushed” into gate delay itself.
 Per-pin delays are essential, but we will use just 1 value per
gate, for simplicity.
 Turns out this is enough to see all the interesting algorithm
ideas.
[Figure: two example gates, each annotated with a single fixed delay, ∆=3 and ∆=5]
Do We Consider Logical Function?
 Does logic function matter?
 Try an example, where we “erase” gates.
 In this example: PI = Primary Input, PO = Primary Output

[Figure: network from three PIs to one PO with the gates erased, showing only the delays ∆ = 8, 2, 8, 2, 1, 1, 1]

What is the longest delay? 20
Now, Suppose We Know Logic Gates
Can the longest path actually be exercised? No!

[Figure: the same network with its logic gates shown and signal values 0 and 1 marked along the longest path]

 We cannot sensitize this path: we cannot make a logic change at this input propagate down this path to change this output.
Topological vs. Logical Timing Analysis
 When we ignore logic, this is called Topological Analysis.
 We only work with graph and delays, don’t consider logic.
 We can get wrong answers: what we found was called a
False Path.
 Going forward: we ignore logic (Too tough to deal with)
 Assume that all paths are statically sensitizable.
 Means: Can find a constant pattern of inputs to other PIs that
makes some output sensitive to some input.
 Reminder: this is exactly the Boolean Difference concept of
sensitivity.
 This timing analysis has a name: Static Timing Analysis
(STA).
STA Representation: Delay Graph
 From gate-level network, we build a delay graph.
 Vertices: Wires in gate network, one per gate output, also one
for each PI and PO.
 Edges: Input pin to output pin of gate in network (one edge
per input pin). Put gate delays on edges.

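[Figure: gate network (PIs a, b, d; gate c with ∆=4; gate e with ∆=3 driving the PO) next to its delay graph (edges a→c and b→c with weight 4, edges c→e and d→e with weight 3)]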
Delay Graph
 Common convention: Add Source/Sink nodes
 Add one “source” (src) node that has a 0-weight edge to
each PI.
 Add one “sink” (snk) node that has a 0-weight edge from
each PO.
 Why do this?
 Now, the network has exactly 1 “entry” node, and 1 “exit” node.
 All the longest (or shortest) path questions have the same start/end nodes.
[Figure: the same delay graph with src connected to a, b, d and e connected to snk, all by 0-weight edges]
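
The construction rules above (one edge per gate input pin, gate delay on the edge, 0-weight edges from src and to snk) can be sketched in a few lines. This is only an illustration: the function name `build_delay_graph`, the dictionary layout, and the netlist itself (gate c with inputs a and b, gate e with inputs c and d, as far as the flattened figure can be read) are assumptions, not part of the slides.

```python
from collections import defaultdict

def build_delay_graph(gates, primary_inputs, primary_outputs):
    """Return {node: [(successor, delay), ...]} including src and snk.

    gates maps each gate's output wire to (gate delay, list of input wires).
    """
    edges = defaultdict(list)
    for out_wire, (delay, in_wires) in gates.items():
        for in_wire in in_wires:
            # One edge per input pin, weighted with the gate delay.
            edges[in_wire].append((out_wire, delay))
    for pi in primary_inputs:
        edges["src"].append((pi, 0))   # 0-weight edge src -> PI
    for po in primary_outputs:
        edges[po].append(("snk", 0))   # 0-weight edge PO -> snk
    return dict(edges)

# Hypothetical netlist read off the slide's figure.
gates = {"c": (4, ["a", "b"]), "e": (3, ["c", "d"])}
graph = build_delay_graph(gates, ["a", "b", "d"], ["e"])
print(graph)
```

With one entry node and one exit node, every later longest-path question can be asked between `"src"` and `"snk"`.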
Representation: Delay Graph
 What about interconnect delay?
 Can still use delay graph: model each wire as a “special” gate
that just has a delay.
[Figure: the same network with wires modeled as extra delay-only nodes x (∆=1), y (∆=2), z (∆=1), w (∆=2), q (∆=2), and the resulting delay graph from src to snk]
Operations on Delay Graph
 So how do we use delay graph to do timing analysis?
 What we don’t do: Try to enumerate all the source-to-sink
paths.
 Why not? Exponential explosion in number of paths, even for
small graph.

[Figure: chain of nodes 0, 1, 2, …, n]
How many paths from 0 to n? 2^n

 There’s a smarter answer: Node-oriented timing analysis


 Find, for each node in delay graph, worst delay to the node
along any path.
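
A quick way to see the explosion is to count paths without listing them. The sketch below assumes the chain in the figure has two parallel edges between each pair of consecutive nodes, which is what the 2^n answer suggests; the helper names `count_paths` and `chain` are made up for this illustration.

```python
def count_paths(edges, start, goal):
    """Count distinct start-to-goal paths in a DAG given as an edge list."""
    succ = {}
    for u, v in edges:
        succ.setdefault(u, []).append(v)   # parallel edges are kept
    memo = {}

    def paths_from(u):
        if u == goal:
            return 1
        if u not in memo:
            # Node-oriented: one value per node, computed once.
            memo[u] = sum(paths_from(v) for v in succ.get(u, []))
        return memo[u]

    return paths_from(start)

def chain(n):
    """Nodes 0..n with two parallel edges between i and i+1 (assumed shape)."""
    return [(i, i + 1) for i in range(n) for _ in range(2)]

for n in (1, 2, 10, 20):
    print(n, count_paths(chain(n), 0, n))
```

The count doubles with every added stage, while the memoized counter itself visits each node only once. That is the same per-node idea the AT and RAT computations rely on.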

Define Values on Nodes in Delay Graph
 Arrival Time at a node (AT)
 AT(n) = Latest time the signal can become stable at node n
 Think: Longest path from source
 Required Arrival Time at a node (RAT)
 RAT(n) = Latest time the signal is allowed to become stable at node n
 Think: Longest path to sink

[Figure: node n between src and snk; AT is measured along paths from src to n, RAT along paths from n to snk; other paths also run from src to snk]
Define Values on Nodes in Delay Graph
 Slack at node n: Slack(n) = RAT(n) – AT(n)
 Amount of timing “margin” for the signal: positive is good,
negative is bad.
 Determined by longest path through node.
 Amount by which a signal can be delayed at node and
not increase the longest path through the network
 Can increase delay at node (to minimize power, circuit
area) with positive slack and not degrade overall
performance.
[Figure: node n between src and snk, with AT on the src side and RAT on the snk side; Slack(n) = RAT(n) – AT(n)]
Slack is Hugely Important in Timing Analysis
 About slacks
 Defined so negative slack always bad: it indicates a timing
problem.
 Measures “sensitivity” of network to this node’s delay.
 Positive slack
 Good: can change something at this node, and not hurt network’s
overall timing.
 Example: make this node slower, maybe save some power, not hurt
timing.
 Negative slack
 Bad: there is a problem at this node; the more negative the slack, the bigger the problem.
 Looking for a node to “fix” to help timing? These nodes are where to look first. They affect the critical paths the most.
How To Compute ATs? Recursively
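[Figure: node n with predecessor nodes p (edge delay ∆(p,n)) on paths from src, and successor nodes s on paths to snk]

AT(n) = maximum delay to n:

\[
AT(n) =
\begin{cases}
0, & \text{if } n \text{ is the source} \\
\max\limits_{p \in prec(n)} \{\, AT(p) + \Delta(p,n) \,\}, & \text{otherwise}
\end{cases}
\]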
How To Compute ATs?
 Big idea
 If we know the longest path to each predecessor of n, it’s a
simple “Maximum” operation to compute the longest path to
n itself.

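[Figure: node n with three predecessors: x with AT(x)=5 and ∆(x,n)=7, y with AT(y)=10 and ∆(y,n)=1, z with AT(z)=5 and ∆(z,n)=5]

\[
AT(n) = \max_{p \in \{x,y,z\}} \{\, AT(p) + \Delta(p,n) \,\} = \max\{\, 5+7,\ 10+1,\ 5+5 \,\} = 12
\]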
How To Compute RATs?
[Figure: node n with predecessor nodes p on paths from src, and successor nodes s (edge delay ∆(n,s)) on paths to snk]

 RAT(n): Latest time in the cycle where n could change and the signal would still propagate to the sink before the end of the cycle.
 First, what is RAT(snk)? RAT(snk) = Cycle Time
 How about an internal node n?

\[
RAT(n) = \min_{s \in succ(n)} \{\, RAT(s) - \Delta(n,s) \,\}
\]
How To Compute RATs? Recursively
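[Figure: node n with predecessor nodes p on paths from src, and successor nodes s (edge delay ∆(n,s)) on paths to snk]

\[
RAT(n) =
\begin{cases}
\text{Cycle Time}, & \text{if } n \text{ is the sink} \\
\min\limits_{s \in succ(n)} \{\, RAT(s) - \Delta(n,s) \,\}, & \text{otherwise}
\end{cases}
\]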
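
The two recursive definitions translate almost directly into memoized functions. The sketch below is only an illustration: it uses the small six-node graph from the path-reporting slides (source a, sink f, clock cycle 29), and the names `EDGES`, `at`, and `rat` are chosen here, not taken from the slides.

```python
from functools import lru_cache

# Edge delays of the six-node example graph (a is the source, f the sink).
EDGES = {("a", "b"): 3, ("a", "c"): 4, ("b", "d"): 5, ("b", "e"): 11,
         ("c", "e"): 9, ("d", "f"): 6, ("e", "f"): 15}
SRC, SNK, CYCLE_TIME = "a", "f", 29

@lru_cache(maxsize=None)
def at(n):
    """AT(n): 0 at the source, else max over predecessors p of AT(p) + delay(p, n)."""
    if n == SRC:
        return 0
    return max(at(p) + d for (p, s), d in EDGES.items() if s == n)

@lru_cache(maxsize=None)
def rat(n):
    """RAT(n): cycle time at the sink, else min over successors s of RAT(s) - delay(n, s)."""
    if n == SNK:
        return CYCLE_TIME
    return min(rat(s) - d for (p, s), d in EDGES.items() if p == n)

for node in "abcdef":
    print(node, at(node), rat(node), rat(node) - at(node))
```

The cache matters: without it, the recursion would re-walk shared subpaths and end up doing work proportional to the number of paths rather than the number of edges.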
ATs versus RATs: Look at Clock Cycle
 Why the differences between AT and RAT definitions?
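\[
AT(n) =
\begin{cases}
0, & \text{if } n \text{ is the source} \\
\max\limits_{p \in prec(n)} \{\, AT(p) + \Delta(p,n) \,\}, & \text{otherwise}
\end{cases}
\]

\[
RAT(n) =
\begin{cases}
\text{Cycle Time}, & \text{if } n \text{ is the sink} \\
\min\limits_{s \in succ(n)} \{\, RAT(s) - \Delta(n,s) \,\}, & \text{otherwise}
\end{cases}
\]

AT: longest logic delay after the launch edge of the clock.
RAT: longest logic delay to the capture edge of the clock.

[Figure: timeline from the launch edge to the capture edge, one clock cycle time apart, with AT(n) measured forward from launch and RAT(n) measured back from capture]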
Negative Slack is BAD!

Slack = RAT – AT is negative!

Signal arrives too late, and there is too much delay from node to output.
Signal does not arrive at the flip-flop input before the capture edge of the clock.

[Figure: timeline from the launch edge to the capture edge, one clock cycle time apart, with AT(n) falling later than RAT(n)]
Example
 Suppose clock cycle is 12.
 AT=longest path from source TO node.
 RAT=(cycle time 12) – (longest path FROM node to sink).
 Slack = RAT – AT

[Figure: example delay graph with nodes src, a through k, and snk; delays are marked on the edges, with 0-weight edges leaving src and entering snk]
Compute ATs

Compute ATs from src to snk

[Figure: the example graph annotated with the AT at each node]

AT: src=0, a=0, b=0, c=0, d=1, e=2, f=6, g=4, h=10, i=7, j=12, k=15, snk=15
Compute RATs
 Clock cycle is 12.

Compute RATs from snk to src

[Figure: the example graph annotated with the RAT at each node]

RAT: src=-3, a=-3, b=-1, c=2, d=-2, e=4, f=3, g=10, h=7, i=12, j=12, k=12, snk=12
Compute Slack
 Slack = RAT - AT

[Figure: the example graph annotated with AT, RAT, and slack at each node]

Node   AT   RAT   Slack
src     0    -3    -3
a       0    -3    -3
b       0    -1    -1
c       0     2     2
d       1    -2    -3
e       2     4     2
f       6     3    -3
g       4    10     6
h      10     7    -3
i       7    12     5
j      12    12     0
k      15    12    -3
snk    15    12    -3
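
As a small check of Slack = RAT – AT, the sketch below recomputes the slacks from the AT and RAT values annotated on this example and lists the nodes that share the worst slack. The AT and RAT numbers are taken from the slides; the variable names are chosen for this illustration.

```python
# AT and RAT per node, as annotated on the example (clock cycle = 12).
AT = {"src": 0, "a": 0, "b": 0, "c": 0, "d": 1, "e": 2, "f": 6,
      "g": 4, "h": 10, "i": 7, "j": 12, "k": 15, "snk": 15}
RAT = {"src": -3, "a": -3, "b": -1, "c": 2, "d": -2, "e": 4, "f": 3,
       "g": 10, "h": 7, "i": 12, "j": 12, "k": 12, "snk": 12}

slack = {n: RAT[n] - AT[n] for n in AT}
worst = min(slack.values())
# Nodes whose slack equals the worst slack, in the order listed above.
critical = [n for n in AT if slack[n] == worst]

print("worst slack:", worst)
print("nodes at worst slack:", critical)
```

The nodes at the worst slack are src, a, d, f, h, k, snk, which is consistent with the claim that the worst slack shows up along an entire source-to-sink path.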
Analyzing the Example
 Worst (most negative) slack is -3.
 Big results:
 Your timing violation at sink = the worst slack value.
 The worst slack appears along this entire worst path.

[Figure: the same annotated example graph; the nodes with slack -3 are src, a, d, f, h, k, and snk]
Analyzing the Example
 Look at those slacks
 A negative slack at an output (PO) means a failed timing
requirement.
 A negative slack on internal node n means there is a path from n
to some problem PO.

 So, slacks are hugely useful!


 Beyond just knowing what is the worst path, slacks tell us the
problem gates on this path.

The Most Typical STA Problem
 Answer this problem: What are all the too-slow paths that
violate timing?
 Most useful report:
 Report paths in order, from slowest to fastest.
 In other words: Enumerate these paths, in delay order.

[Figure: flip-flops → logic → flip-flops, driven by the clock]
What Do We Need?
 Calculate all the ATs.
 Calculate all the RATs.
 Calculate all the Slacks.
 … do all of this very efficiently: Delay graphs are huge!
 …enumerate the violating paths, in worst delay order.
Computational Strategy
 Topological sorting (“Topsorting”) the delay graph.
 Sort the vertices in the delay graph into one single ordered list.
 Essential property: if there is an edge from 𝑝 to 𝑠, then 𝑝
appears before 𝑠 in sorted order.
 Compute ATs by going forward through the sorted list.
 Compute RATs by going backward through the sorted list.

[Figure: graph with edges a→b (3), a→c (4), b→d (5), b→e (11), c→e (9), d→f (6), e→f (15)]

Legal topsorting orders:
a, b, c, d, e, f
a, b, d, c, e, f
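
The slides assume a topological order is available and do not say how to compute one. One common choice, swapped in here purely as an illustration, is Kahn's algorithm: repeatedly output a node whose predecessors have all been output. The function name `topsort` and the `SUCC` dictionary for the six-node graph are assumptions of this sketch.

```python
from collections import deque

def topsort(succ):
    """Kahn's algorithm: return the nodes so that every edge p -> s has p before s."""
    nodes = set(succ) | {s for targets in succ.values() for s in targets}
    indegree = {n: 0 for n in nodes}
    for targets in succ.values():
        for s in targets:
            indegree[s] += 1
    # Start with the nodes that have no predecessors (sorted for a stable result).
    ready = deque(sorted(n for n in nodes if indegree[n] == 0))
    order = []
    while ready:
        n = ready.popleft()
        order.append(n)
        for s in succ.get(n, []):
            indegree[s] -= 1
            if indegree[s] == 0:
                ready.append(s)
    if len(order) != len(nodes):
        raise ValueError("graph has a cycle, so no topological order exists")
    return order

# Successor lists of the six-node example graph.
SUCC = {"a": ["b", "c"], "b": ["d", "e"], "c": ["e"], "d": ["f"], "e": ["f"]}
order = topsort(SUCC)
print(order)
```

On this graph the sketch produces a, b, c, d, e, f, one of the legal orders. Any other legal order works equally well for the AT and RAT passes.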
Assume Have Topsort: Compute ATs
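computeATs() {
    foreach ( n in topsort order ) {
        // Base case: SRC has no predecessors, so it must not be reset to -∞.
        if ( n == SRC ) { AT(n) = 0; continue; }
        AT(n) = -∞;
        foreach ( node p in pred(n) )
            AT(n) = max( AT(n), AT(p) + ∆(p,n) );  // longest path into n
    }
}

[Figure: node n with predecessors p (edge delay ∆(p,n)) and successors s, on paths from src to snk]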
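
The forward pass can be sketched as runnable code. This is an illustration under stated assumptions: the topological order is passed in ready-made, the predecessor lists describe the six-node example graph, and the names `compute_ats` and `PRED` are made up here.

```python
import math

def compute_ats(order, pred, src):
    """Forward pass over a topological order; pred[n] lists (predecessor, delay) pairs."""
    at = {}
    for n in order:
        if n == src:
            at[n] = 0          # base case: the source arrives at time 0
            continue
        at[n] = -math.inf
        for p, delay in pred.get(n, []):
            # p comes earlier in the order, so at[p] is already final.
            at[n] = max(at[n], at[p] + delay)
    return at

# Predecessor lists of the six-node example graph.
PRED = {"b": [("a", 3)], "c": [("a", 4)], "d": [("b", 5)],
        "e": [("b", 11), ("c", 9)], "f": [("d", 6), ("e", 15)]}
ats = compute_ats(["a", "b", "c", "d", "e", "f"], PRED, "a")
print(ats)
```

Each edge is examined exactly once, so the cost is linear in the size of the delay graph.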
Compute RATs
 Trick: Pretend all edges are reversed, they point from SNK to
SRC, and walk graph backwards.
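computeRATs() {
    foreach ( n in reverse topsort order ) {
        // Base case: SNK has no successors, so it must not be reset to ∞.
        if ( n == SNK ) { RAT(n) = CycleTime; continue; }
        RAT(n) = ∞;
        foreach ( successor s in succ(n) )
            RAT(n) = min( RAT(n), RAT(s) - ∆(n,s) );  // tightest requirement from any successor
    }
}

[Figure: node n with predecessors p and successors s (edge delay ∆(n,s)), on paths from src to snk]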
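
The backward pass mirrors the forward one. The sketch below again uses the six-node example graph with clock cycle 29 and takes the ATs as given, so that the slacks can be formed; the names `compute_rats`, `SUCC`, and `ATS` are assumptions of this illustration.

```python
import math

def compute_rats(order, succ, snk, cycle_time):
    """Backward pass over a topological order; succ[n] lists (successor, delay) pairs."""
    rat = {}
    for n in reversed(order):
        if n == snk:
            rat[n] = cycle_time   # base case: the sink must be stable by the cycle time
            continue
        rat[n] = math.inf
        for s, delay in succ.get(n, []):
            # s comes later in the order, so rat[s] is already final.
            rat[n] = min(rat[n], rat[s] - delay)
    return rat

# Successor lists of the six-node example graph.
SUCC = {"a": [("b", 3), ("c", 4)], "b": [("d", 5), ("e", 11)],
        "c": [("e", 9)], "d": [("f", 6)], "e": [("f", 15)]}
ATS = {"a": 0, "b": 3, "c": 4, "d": 8, "e": 14, "f": 29}  # ATs of the same graph

rats = compute_rats(["a", "b", "c", "d", "e", "f"], SUCC, "f", 29)
slack = {n: rats[n] - ATS[n] for n in ATS}
print(rats)
print(slack)
```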
Using Slack For Path Reporting
 Useful slack property: all nodes on longest path have same worst
slack value.
 Surprising result: slack lets us find the N worst paths, even though we did not trace them all.
Assume clock cycle = 29

[Figure: graph with edges a→b (3), a→c (4), b→d (5), b→e (11), c→e (9), d→f (6), e→f (15), annotated with AT, RAT, and slack at each node]

Node   AT   RAT   Slack
a       0     0     0
b       3     3     0
c       4     5     1
d       8    23    15
e      14    14     0
f      29    29     0
N-Worst Path Reporting
 We evolve partial paths; each partial path stores 3 things:
(Path itself, Delay of this path, Slack of the final node on path)
 We store the partial paths in a min heap, which is indexed on
the Slack value.
 Initially this heap contains only the source node.
 Algorithm is quite simple (and just like maze routing!).
 Expand: Pop partial path off the heap – it has the smallest (most
negative) slack.
 Reach target? If its end node is the sink, print out the path.
 Reach: Else add each successor node to make new partial paths,
push them back onto the heap, each with
(Path, Delay, Slack) labeled.
 Repeat until N paths are reported – go pop next partial path.
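
The expand/reach loop can be sketched with a binary heap. One deliberate difference from the slides: this sketch keys each heap entry on RAT(end node) – Delay, the slack of the partial path itself, rather than on the end node's own slack. That choice is an assumption of this illustration; it guarantees that complete paths pop in delay order on any graph, and on the six-node example it reports the same three paths in the same order. The function name `n_worst_paths` and the data layout are also made up here.

```python
import heapq

def n_worst_paths(succ, rat, src, snk, n):
    """Report up to n source-to-sink paths, longest delay first.

    succ[v] lists (successor, delay) pairs; rat[v] is the required arrival time.
    Heap entries are (path slack, delay, path).
    """
    heap = [(rat[src], 0, (src,))]      # initially only the source node
    reported = []
    while heap and len(reported) < n:
        _, delay, path = heapq.heappop(heap)   # smallest slack first
        end = path[-1]
        if end == snk:
            reported.append((path, delay))     # reached the target: report it
            continue
        for s, d in succ.get(end, []):
            new_delay = delay + d
            # Slack of the extended partial path, measured at its new end node.
            heapq.heappush(heap, (rat[s] - new_delay, new_delay, path + (s,)))
    return reported

SUCC = {"a": [("b", 3), ("c", 4)], "b": [("d", 5), ("e", 11)],
        "c": [("e", 9)], "d": [("f", 6)], "e": [("f", 15)]}
RAT = {"a": 0, "b": 3, "c": 5, "d": 23, "e": 14, "f": 29}  # clock cycle = 29

paths = n_worst_paths(SUCC, RAT, "a", "f", 3)
for path, delay in paths:
    print("-".join(path), delay)
```

Asking for fewer paths stops early: with `n = 1` the loop ends as soon as the single worst path is popped, leaving the remaining partial paths unexplored.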
Worst Case Path Reporting: Example
[Figure: graph from source a to sink f with edges a→b (3), a→c (4), b→d (5), b→e (11), c→e (9), d→f (6), e→f (15) and node slacks a=0, b=0, c=1, d=15, e=0, f=0]

 Min heap entry of the form (Path, Delay, Slack)
 Initially, heap contains only the source node.

Min heap before: (a,0,0)
Expand path a, reach b & c
Min heap after: (a-b,3,0), (a-c,4,1)
Worst Case Path Reporting: Example
[Figure: same graph and node slacks as on the previous slide]

Min heap before: (a-b,3,0), (a-c,4,1)
Expand path a-b, reach d & e
Min heap after: (a-b-e,14,0), (a-c,4,1), (a-b-d,8,15)
Worst Case Path Reporting: Example
[Figure: same graph and node slacks as on the previous slide]

Min heap before: (a-b-e,14,0), (a-c,4,1), (a-b-d,8,15)
Expand path a-b-e, reach f
f is the sink! Report 1st worst path a-b-e-f, with delay=29
Min heap after: (a-c,4,1), (a-b-d,8,15)
Worst Case Path Reporting: Example
[Figure: same graph and node slacks as on the previous slide]

Min heap before: (a-c,4,1), (a-b-d,8,15)
Expand path a-c, reach e
Min heap after: (a-c-e,13,0), (a-b-d,8,15)
Worst Case Path Reporting: Example
[Figure: same graph and node slacks as on the previous slide]

Min heap before: (a-c-e,13,0), (a-b-d,8,15)
Expand path a-c-e, reach f
f is the sink! Report 2nd worst path a-c-e-f, with delay=28
Min heap after: (a-b-d,8,15)
Worst Case Path Reporting: Example
[Figure: same graph and node slacks as on the previous slide]

Min heap before: (a-b-d,8,15)
Expand path a-b-d, reach f
f is the sink! Report 3rd worst path a-b-d-f, with delay=14
Min heap after: (EMPTY). Done!
Worst Case Path Reporting: Example
[Figure: same graph and node slacks as on the previous slide]

 We find three paths:
 a-b-e-f, delay = 29
 a-c-e-f, delay = 28
 a-b-d-f, delay = 14

Note: there are only 3 possible paths from source to sink in this graph, so we found them correctly in delay order!
Static Timing Analysis: Summary
 STA is a very important step in design of complex ASICs.
 It’s a critical “sign off” step, which means: you don’t get to
fabricate unless you pass.
 Several big ideas
 Gate level delay models matter, and can be pretty complex in
real world.
 Logical ≠ Topological path analysis (i.e., STA).
 Build delay graph, calculate ATs, RATs, slacks recursively.
 Concept of slack is big: lets us locate worst paths, and problem
gates on path.
 A similar idea to maze routing lets us find worst paths in delay
order.
Static Timing Analysis: Aside
 STA is a huge topic – several things we did not cover.
 STA for sequential elements
 How do we model flip flops and latches, so we can verify, e.g., that setup and
hold times are met? More tricks with delay graph.
 Early mode versus late mode timing
 Our development was only so-called late mode timing, where we care about
longest path. Early mode focuses on shortest paths, and is critical for more
advanced timing, e.g., with transparent latches.
 Incremental STA
 In practice, if you change 10,000 gates out of 1,000,000 gates, you don’t want to
redo the whole STA analysis. Advanced methods can update incrementally.
