Physical Design Ad Guid
Outline
• ASIC Design Flow
• Physical Design
— Introduction to Physical Design
— Physical Design Inputs
— Physical Design Flow
▫ Import Design & Partitioning
▫ Floorplanning & Power planning
▫ Placement & Placement Optimizations
▫ CTS & CTS Optimizations
▫ Routing & Routing Optimizations
▫ Physical Verification (DRC, LVS, ERC)
▫ DFM Checks
▫ Formal Verification (LEC)
▫ Parasitic Extraction (RC Extraction)
▫ Timing Analysis (STA), Power Analysis & IR Drop Analysis
▫ Tapeout
ASIC Physical Design
ASIC Design Flow
[Figure: ASIC design flow. Stages shown include System Specification, Architectural Design, RTL Design and Verification, Synthesize the Design, Partitioning, Floor Planning, Placement, CTS, Formal Verification, Static Timing Analysis, Signoff, Fabrication, and Packaging and Testing.]
• Possible Issues
— Timing Violations
Physical Design Inputs
• Netlist (.v or .vhd)
— Netlist contains
▪ Std. Cell instance – Name & Drive Strength
▪ Macros & Memories instances
— Netlist also consists of
▪ Ports of Standard Cells and Macros
▪ Interconnection details
• Constraints
— Types of Constraints
▪ Design Rule Constraints
▪ Optimization Constraints
— Design Rules from the Fab.
▪ Max. Cap./ Transition/ Fanout
▪ Clock Uncertainties
— Optimization Constraints from the designer
— Synopsys Design Constraints (SDC)
— Timing Constraints
▪ Clock Definition (Time Period, Duty Cycle)
▪ Timing Exceptions (False Paths, Asynchronous Paths)
— Non-Timing Constraints
Physical Design Inputs
• Technology Related files
— Technology file
▪ Defines Units and Design Rules for Layers and Vias as per the Technology
▪ Name and Number conventions of Layers and Vias
▪ Physical and Electrical parameters of Layers and Vias
▪ E.g. Direction/ Type/ Pitch/ Width/ Offset/ Thickness/ Resistance/ Capacitance/ Max. Metal Density/ Antenna Rule/ Blockages/ Design Rules
▪ Manufacturing Grid definition
▪ Site/ Unit Tile definition
▪ Technology file has to be loaded before other LEF files since it holds the layer information for that particular technology
▪ .tech.lef (Cadence Format)
▪ .tf - technology file (Synopsys Format)
— Interconnect Parasitic file
▪ Used for layer parasitic extraction
▪ Contains Layer/ Via capacitance and resistance values in a Lookup Table (LUT) format
▪ Also used to generate parasitic formats for the extraction tools (e.g. nxtgrd, captbl)
▪ Extraction tool formats are more accurate than interconnect parasitic formats
▪ .ict - Interconnect Technology Format (Cadence Format)
▪ .itf - Interconnect Technology Format (Synopsys Format)
▪ .ptf - Process Technology File (Mentor Graphics Format)
— Map file
▪ Useful if there are 2 different naming conventions in the Technology file, LEF or Interconnect Parasitic file
Physical Design Inputs
• Power Specification File
— Power Modes & Power Domains
— Tie-up supply & Tie-low supply
— Power Nets & GND Nets
• Optimization Directives
— Don’t use
▪ Cells that must not be used during optimization
— Size only/ use only
▪ Upsizing/ Downsizing only with this list of cells
• Design Exchange Formats
— List & locations of Components, Vias, Pins, Nets, Special nets
— Die dimensions, Row definitions, Placement and Bounding Box Data, Routing Grids, Power Grids, Pre-routes
— .def, .fp are the common formats
• Clock Tree Constraints/ Specification
— Root Pin Definition
— Insertion Delay (ID) and Skew Target
— Maximum Capacitance/ Transition/ Fanout (DRVs)
— Transition can be classified into Leaf Transition and Buffer Transition
— No. of Buffer Levels (Tree depth)
— List of Buffers/ Inverters for CTS
— List of Through Pin, Preserved Pin, Exclude Pin
— NDRs can be defined in CTS Spec. for the Clock Tree Routing
• Macro Models
• IO Information File
— Pin/ Pad locations
— Edge and order for IO Placement
— .tdf, .io are common formats
Physical Design Flow
[Figure: physical design flow. Stages shown include Pre-CTS Opt., Clock Tree Synthesis (CTS), Post-CTS Opt., Routing, Power Analysis, IR-Drop Analysis, Static Timing Analysis and Tape-out.]
Import Design
• Import Design
— The following input files are loaded into the PnR tool
• Netlist (.v/ .vhd/ .edif)
• Physical Libraries (.lef)
• Timing Libraries (.lib)
• Technology Files
• Constraints (.sdc)
• IO Info. File (optional)
• Power Spec. File (optional)
• Optimization Directives (optional)
• Clock Tree Spec. File (optional at floorplan stage)
• DEF/ FP (optional if floorplan is not done)
— Core area is approximately calculated by the tool from the Netlist
— While Importing, first we have to load the LEF files and then LIB files
Import Design
• Sanity Checks
— Sanity Checks mainly check the quality of the netlist in terms of timing
— They also check for issues related to Library files,
Timing Constraints, IOs and Optimization Directives
— Some of the Netlist Sanity Checks:
▪ Floating Pins
▪ Unconstrained Pins
▪ Un-driven i/p Ports
▪ Unloaded o/p Ports
▪ Pin direction mismatches
▪ Multiple drivers etc.
— Other possible issues include unconnected or wrongly connected Tie-high/ Tie-low Pins
and Power Pins (since tie-up or tie-down connectivity is always made through Tie Cells)
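The netlist sanity checks listed above can be illustrated with a small sketch. It is not how any particular PnR tool implements them; the connection tuple format and the function name `check_netlist` are assumptions made only for this example.

```python
from collections import defaultdict

def check_netlist(connections):
    """Flag undriven, multiply driven and unloaded nets.

    connections: iterable of (net, pin, direction) tuples, where direction
    is "driver" or "load" (a hypothetical, simplified netlist view).
    """
    drivers = defaultdict(list)
    loads = defaultdict(list)
    for net, pin, direction in connections:
        (drivers if direction == "driver" else loads)[net].append(pin)

    issues = []
    for net in sorted(set(drivers) | set(loads)):
        if not drivers[net]:
            issues.append((net, "undriven"))
        elif len(drivers[net]) > 1:
            issues.append((net, "multiple drivers"))
        if not loads[net]:
            issues.append((net, "unloaded"))
    return issues

example = [
    ("n1", "U1/Z", "driver"), ("n1", "U2/A", "load"),   # clean net
    ("n2", "U3/A", "load"),                              # no driver
    ("n3", "U4/Z", "driver"), ("n3", "U5/Z", "driver"),  # two drivers
    ("n3", "U6/A", "load"),
    ("n4", "U7/Z", "driver"),                            # no load
]
issues = check_netlist(example)
```

A real flow would read the connectivity from the gate-level netlist and would also cover pin direction mismatches and unconstrained pins, which need library and SDC data.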
Partitioning
• Physical Design Netlist
— All Ports must be defined and should be present
— No Assignment Statements (1’b0 or 1’b1 statements): Assignment
statements cause feed-throughs (i/p directly to o/p) and can be
avoided by adding buffers
— No Unmapped Cells
— No Combinational Timing Loops
• Styles of Implementation
— Flat
▪ Small to Medium ASIC
▪ Better Area Usage since no space is reserved around each sub-design for power/ground
— Hierarchical
▪ For very large designs
▪ When sub-systems are designed individually
▪ Possible only if a design hierarchy exists
Partitioning
• The Hierarchical Partitioning is done prior to Floorplan
• Partition can be done based on
— Design Hierarchy
— Timing Criticality
— Functionality
— Clock Domain
— Design Files
— Block Size
• Register the Inputs and Outputs of each Partition
• Minimize Cross-Partition-Boundary IO
• For Sub-block designs, Partitioning is not required
• Partitioning is needed only for Full Chip designs
Floorplanning
• Terminologies and Definitions
— Utilization
▪ Area of the core that is used by placed Standard Cells and Macros expressed
in percentage
— Manufacturing Grid
▪ The smallest geometry that the semiconductor foundry can process, i.e. the
smallest resolution of the technology process (e.g. 0.005)
▪ All drawn geometries during Physical Design must snap to this grid
▪ During masking, the fab uses this grid as reference lines
— Standard Cell Site/ Standard Cell Placement Tile/ Unit Tile
▪ The minimum Width and Height that a Cell can occupy in the design
▪ The Standard Cell Site has the same height as the Standard Cells, but its
width is as small as the smallest Filler Cell
▪ It is one Vertical Routing Track wide and one Standard Cell Height tall
▪ All Standard Cells must be a multiple of the Unit Tile
— Standard Cell Rows
▪ Rows are Standard Cell Sites abutted side by side; Standard
Cells are then placed on these Rows
▪ Cells with an equal no. of Tracks in their definition have the same height
Floorplanning
• Terminologies and Definitions
— Placement Grid
▪ Placement Grid is made up of Standard Cell Site
▪ It’s always a multiple of the Manufacturing Grid
▪ Placement Grid is made up of the Rows which are composed of Sites
— Routing Grid and Routing Track
▪ Horizontal and Vertical lines drawn on the layout area which guide the
making of interconnections
▪ The Routing Grid is made up of the Routing Tracks
▪ Routing Tracks can be Grid-based, Gridless or Subgrid-based
— Flight-line/ Fly-line
▪ Virtual connection between Macro and Macro or between Macros and IOs
— Macro
▪ Any instance other than a Standard Cell that is loaded into the design as a
black box is a Macro
▪ Intellectual Property (IP) e.g. RAM, ROM, PLL, Analog Designs etc.
▪ Hard Macro: IP with Layout implemented
▪ Soft Macro: IP without Layout implemented (HDL)
Floorplanning
• Steps in Floorplan
— Initialize with Chip & Core Aspect Ratio (AR)
— Initialize with Core Utilization
— Initialize Row Configuration & Cell Orientation
— Provide the Core to Pad/ IO spacing (Core to IO clearance)
— Pins/ Pads Placement
— Macro Placement by Fly-line Analysis
— Macro Placement requirements also need to be considered
— Blockage Management (Placement/ Routing)
Floorplanning
• Initialization
— Row Configuration
▪ Slanting lines in the side of the cell rows denote the Cell Orientation
Floorplanning
• Initialization
— Utilization = $\frac{\text{Standard Cell Area} + \text{Macro Area}}{\text{Core Area}} \times 100\,\%$
— Aspect Ratio = $\frac{\text{Height}}{\text{Width}}$
—Aspect Ratio decides the shape
—Full chip Aspect Ratio can have a maximum value of 1.25
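As a rough worked example of the initialization above, the sketch below sizes a core from a target Utilization and an Aspect Ratio taken as Height/Width. The function names and the area figures are assumptions for illustration, not values from any real design.

```python
import math

def utilization(std_cell_area, macro_area, core_area):
    """Utilization in percent: area used by cells and macros over core area."""
    return (std_cell_area + macro_area) / core_area * 100.0

def core_dimensions(std_cell_area, macro_area, target_util_pct, aspect_ratio):
    """Return (width, height) of a core that hits the target utilization.

    aspect_ratio is Height/Width, so height = aspect_ratio * width and
    core_area = aspect_ratio * width**2.
    """
    core_area = (std_cell_area + macro_area) / (target_util_pct / 100.0)
    width = math.sqrt(core_area / aspect_ratio)
    height = aspect_ratio * width
    return width, height

# Hypothetical areas in square microns
width, height = core_dimensions(600_000.0, 200_000.0, 80.0, 1.0)
util = utilization(600_000.0, 200_000.0, width * height)
```

With an Aspect Ratio of 1 the core is square; a larger value makes the core taller than it is wide for the same area.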
Floorplanning
• IO Placement
— At Chip Level these are IO Pads; at Block Level they are IO Pins
— Pin is a logical entity and is a property of a Port
— Port is a physical entity and a Port has only 1 Pin associated with it
— Netlist will have Pins and Layout will have Ports
— Unplaced Port is not represented in the Layout
— Different types of IOs
▪ Signal Pads/Pins
▪ Core Power Pads/Pins
▪ IO Power Pads/Pins
▪ Corner Pads (don’t hold any logic; provide IO Pad Ring connectivity)
▪ Filler Pads (fill the gaps between IO pads to get the Ring connectivity)
— Physical-only pads that are not part of the input Gate level Netlist need
to be inserted prior to reading IO constraints
Floorplanning
• IO Placement
— IO Pads enable the design to operate at different voltages with the help
of Level Shifters, Pre-Drivers (at Core Voltage) and Post-Drivers (at IO
Voltage)
— No of Core Power Pads needed:
[Equation: number of Core Power Pads; expression lost in extraction]
— There will be 1 Core GND Pad along with every Core Power Pad
Floorplanning
• Macro Placement
— Fly-line Analysis (For Connectivity information)
— Macro keep-out (For Uniform Standard Cell Region)
— Channel Calculation (Critical for Congestion and Timing)
— Avoid odd shaped area for Standard Cells
— Funnel shaped Macro Placements are preferred
— Fix the Macro locations, so that the tool won’t alter them during Optimization
— Spacing between Macro:
[Equation: spacing between Macros; expression lost in extraction]
Floorplanning
• Macro Placement Tips
— Place macros around chip periphery, so that core area will be clustered
— Consider connections to fixed cells when placing Macros
— In advanced Technology Nodes Macro Orientation is fixed since the Poly
Orientation can’t vary, so there will be restrictions in Macro Orientation
— Reserve enough room around Macros for IO Routing
— Reduce open fields as much as possible
— Provide necessary Blockages around the Macro
Floorplanning
• Blockages
— Placement Blockage & Routing Blockage
— Both of the Blockages can again be classified as
• Hard, Soft and Partial Blockages
— Hard Blockage
• Complete Standard Cell Blockage
— Soft Blockage
• Non-Buffering Blockage
— Partial Blockage
• Partial Standard Cell Blockage, used to avoid congestion
• Standard Cells can be blocked as per the required percentage value
— Keep-out/ Halo
• Halo is similar to Soft Blockage (Terminology in Cadence EDI)
• It’s basically a keep-out margin around the Macro
• Halo is tied to the Macro while other Blockages are tied to a location,
i.e., even if the Macro is moved the Halo moves along with it
[Figure: Rectilinear Macro without Blockage and with Blockage; Halo around Macro.]
Floorplanning
• Issues arising from a bad Floorplan
— Congestion near Macro Pins/ Corners due to insufficient Placement
Blockage
— Std. Cell placement in narrow channels leads to Congestion
— Macros of the same partition which are placed far apart can cause Timing
Violations
Power planning
• Power Plan: Calculations
▪ Total Dynamic Core Current = [expression lost in extraction]
Courtesy: asic-soc.blogspot.in
Power planning
• Sub-block Configuration
[Figure: sub-block power plan showing Chip Boundary, Core Boundary, Core Area, Ring Width, Ring Spacing, Grid Offset, Grid Steps, Grid Spacing and Rails.]
Power planning
• Full Chip Configuration
• Cell Padding
—Cell Padding is done to reserve space for avoiding Routing Congestion
—Cell Padding adds Hard Constraints to Placement
—The Constraints are honored by Cell Legalization, CTS, and Timing Optimization
Pre-Placement Optimization
• Pre-Placement Optimization Goals
—Routability
—Performance (Timing)
—Power (with Cells)
• Zero-RC Optimization
—Optimizes the netlist without any delay models, thus provides an optimal
starting point for placement
—Timing after Zero-RC Optimization should match the timing seen during Synthesis
—A mismatch indicates problems in the Technology File, Timing Library, Constraint Files, or
overall design
—Logical restructuring and up/down sizing are the optimizations done at the Zero-RC stage
Placement
• Placement Stages
— Global Placement
— Detail Placement
— Placement Legalization
— In-Place Optimizations
• Global/ Coarse Placement
— To get the approximate initial location
— Cells are not legally placed and there can be overlapping
• Detail/ Legal Placement
— To avoid cell overlapping
— Cells have legalized locations
— Legalize placement will place the cells in their legal position with no overlap
[Figure: Global/ Coarse Placement compared with Detail/ Legal Placement.]
Placement
• Placement Legalization
— Placed Macros are legally oriented with Standard Cell Rows
• In-Place Optimizations
— Scan Chain Reordering
• After Placement, report Congestion, Utilization and Timing
• Tie off cell instances provide connectivity between the
Tie-high and Tie-low logical inputs pins of the Netlist
instances to Power and Ground
• Tie off cells are placed after the placement of Standard Cells
• After placement check the Cell Density
• Global Route (GR)
—The whole region is divided into an array of rectangular sub-regions called Global Cells,
each of which may accommodate tens of routing tracks in each dimension
—Global Route is performed to estimate the interconnect parasitics and to produce the
Routing Congestion Map
• VT Swapping
— To optimize for leakage power (HVT, RVT/SVT, LVT)
• Cloning
— To reduce fanout
• Buffering
— Long nets are buffered, or buffers are removed, to gain a timing advantage
• Re-Buffering
— To improve slews, reduce net capacitance and reduce fanout
• Logical Restructuring
— To optimize timing and area without changing the functionality of the design
— Breaking complex cells into simpler cells or vice versa
• Pin Swapping
[Figure: pin swapping example on the input pins a and b of gates A and B.]
Pre CTS Optimization
• Set the Optimization Directives
— don’t_use, size_only
• Perform High Fanout Net Synthesis (HFNS)
— High Fanout Nets are Synthesized before Clock Tree Synthesis
— HFNS is the Buffering of High Fanout Nets
— High Fanout Nets may have a Fanout of more than 1000,
e.g. Reset, Clear etc.
• Set CTS Routing Rules
— Shielding
— Non Default Rules (NDR)
• Set RC Delay Models
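To give a feel for what buffering a high fanout net involves, the sketch below estimates how many buffer levels an idealized, fully balanced buffer tree needs. It is only a back-of-the-envelope model: `buffer_tree_levels` and the per-buffer fanout limit of 32 are assumptions, and real HFNS also accounts for placement, capacitance and transition limits.

```python
def buffer_tree_levels(fanout, max_fanout):
    """Levels of buffers needed so that no driver sees more than max_fanout loads.

    Models an ideal balanced tree: each added level multiplies the number of
    sinks that can be reached by max_fanout.
    """
    if max_fanout < 2:
        raise ValueError("max_fanout must be at least 2")
    levels, reach = 0, 1
    while reach < fanout:
        reach *= max_fanout
        levels += 1
    return levels

# A reset net with 1000 loads and an assumed fanout limit of 32 per driver
reset_levels = buffer_tree_levels(1000, 32)
```

Integer multiplication is used instead of a logarithm so that exact powers of the fanout limit are not affected by floating-point rounding.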
Pre CTS Optimization
• Non-Default Rule (NDR)
— The user-defined Routing rules apart from the default Routing Rule
— Often used to “harden” the sensitive nets like Clock Nets
— NDRs make the Clock Routes less sensitive to CrossTalk or EM effects
— Double/ Triple Width for avoiding Electromigration
— Double/ Triple Spacing for avoiding Crosstalk
— NDRs will improve Insertion Delay
[Figure: Default Routing Rule (Clk routed between Sig1 and Sig2) compared with an NDR Route on the Clock net using Double Width, Double Spacing and Ground Shielding (Gnd).]
Clock Tree Synthesis (CTS)
• The Clock Problem
— Clock skew
— Long clock insertion delay
— Skew across clocks
— Heavy clock net loading
— Clock is power hungry
— Clock to signal coupling effect (CrossTalk)
— Electromigration on clock net
• Clock Tree is a path from the Clock Source (Root) to Clock
Sinks (Leaf)
• Clock Tree Synthesis is the process of creating this Clock
Path from Clock Source to Clock Sinks
• All Clock pins of Flip-Flops are considered Clock Sinks
(Leaf), which is where the Clock Tree Synthesis ends
Clock Tree Synthesis (CTS)
[Figure: before CTS, a single Clock Source and the unbuffered flip-flops (FF).]
Clock Tree Synthesis (CTS)
• Main concerns for Clock Design
— Skew
▪ Most important concern for clock networks
▪ For increased clock frequency, skew may contribute over 10% of the system
cycle time
▪ Due to variations in trace length, metal width and height, coupling caps
▪ It can also be due to variations in local clock load, local power supply, local
gate length and threshold, local temperature
— Power
▪ Very important, as clock is a major power consumer
▪ It switches at every clock cycle
— Noise
▪ Clock is often a very strong aggressor
▪ May need shielding
— Delay
▪ Not really important
▪ But Slew Rate is important (sharp transition)
Clock Tree Synthesis (CTS)
• Clock Skew: Spatial Clock Variation
— Clock Skew: difference in clock arrival time at two spatially distinct points
[Figure: clock waveforms at points A and B showing the Skew and the compressed timing path.]
Clock Tree Synthesis (CTS)
• Clock Jitter: Temporal Clock Variation
— Clock Jitter: difference in clock period over time
[Figure: clock waveform in which Period A ≠ Period B, showing the compressed timing path.]
Clock Tree Synthesis (CTS)
• CTS Pre-requisites
—Legally Placed and Optimized with acceptable Congestion
—Timing should be good
—No Design Rule Violations
—Power/Ground nets are pre-routed
—HFNS done
—Logical/Physical Library should have special Clock Cells
• CTS Objects
—The timer starts from every Clock Source and traces forward over Combinational Arcs
until it reaches the Clock Pin of a flop or another Clock Source
—All Pins/ Timing Arcs in the forward trace before a valid Leaf are considered to be in
the clock network
—Pin or Combinational Timing Arcs that trace to a non-clock pin are not part of Clock
Tree network (e.g. D pin of FF)
— Sequential elements are traced through if it is a source of the Generated Clock
— Clock tracing is done after Case Analysis has been propagated
— Clock tracing should be Mode aware
— Inverters are added in Clock Tree for better Duty Cycle
— Limit the buffer/inverter list to just 3 or 4 buf/inv sizes
Clock Tree Synthesis (CTS)
• CTS Flow
— Check and fix Macro locations
— Read CTS SDC: Clock Tree begins at SDC defined clock pin
and ends at stop pin of the flop
— Generate CTS Specification file
▪ Max. Skew
▪ Max. and Min. Insertion Delay
▪ Max. Transition, Capacitance, Fanout
▪ No. Buffer levels (Tree depth)
▪ Buffer/ Inverter list
▪ Clock Tree Routing Metal Layers
▪ Clock Tree Leaf Pin, Root Pin, Preserve Pin, Through Pin and Exclude Pin
— Compile CTS using CTS Spec. file
— Place Clock Tree Cells
— Route Clock Tree (Optional and can be done during Signal net
routing also)
• Example of CTS spec file
    AutoCTSRootPin SH1/I23/Z
    ExcludePin + XPU/CAM/C
    MaxDelay 5ns
    MinDelay 0ns
    Buffer buf1 buf2 inv1 inv2 del1
    MaxSkew 500ps
    MaxDepth 20
    LeafPin + FPU/CORE/A rising
    END
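The skew and insertion delay targets in a CTS specification can be checked against reported clock arrival times roughly as follows. This is a sketch only: `clock_tree_report`, the pin names and the arrival times are hypothetical, while the limits reuse the MaxSkew 500ps, MinDelay 0ns and MaxDelay 5ns values from the example spec.

```python
def clock_tree_report(arrivals_ns, max_skew_ns, min_delay_ns, max_delay_ns):
    """Summarize a clock tree from per-sink clock arrival times (in ns).

    Insertion delay is taken as the longest root-to-sink arrival and skew
    as the difference between the longest and shortest arrivals.
    """
    longest = max(arrivals_ns.values())
    shortest = min(arrivals_ns.values())
    skew = longest - shortest
    return {
        "insertion_delay_ns": longest,
        "skew_ns": skew,
        "skew_ok": skew <= max_skew_ns,
        "delay_ok": min_delay_ns <= shortest and longest <= max_delay_ns,
    }

# Hypothetical arrival times at three clock sinks
arrivals = {"FF1/CK": 1.20, "FF2/CK": 1.45, "FF3/CK": 1.30}
report = clock_tree_report(arrivals, max_skew_ns=0.5, min_delay_ns=0.0, max_delay_ns=5.0)
```

In practice these numbers come from the tool's clock tree report per clock and per analysis corner; the sketch only shows how the targets relate to the arrivals.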
Clock Tree Synthesis (CTS)
• CTS Algorithms
— RC Tree Based CTS
— H Tree based Algorithm
— X Tree based Algorithm
— Method of Mean and Median (MMM)
— Geometric Matching Algorithm (GMA)
— Pi Configuration
[Figure: RC-Tree, H-Tree, GMA and Pi Configuration clock trees, each driven from a Clock Source.]
Courtesy: usebackend.wordpress.com
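Of the algorithms listed, the Method of Mean and Median is simple enough to sketch: the sinks are split into two equal halves at the median, alternating between the x and y directions, and each tree node is placed at the mean (center of mass) of the sinks below it. The code is a simplified illustration with made-up sink coordinates; it ignores buffering, wire delay and blockages, and the nested-dict tree format is an assumption of this example.

```python
def centroid(points):
    """Mean (center of mass) of a list of (x, y) points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def mmm(sinks, split_on_x=True):
    """Build a clock tree topology as nested dicts {'at': (x, y), 'children': [...]}."""
    node = {"at": centroid(sinks), "children": []}
    if len(sinks) == 1:
        return node  # a single sink is a leaf
    axis = 0 if split_on_x else 1
    ordered = sorted(sinks, key=lambda p: p[axis])
    mid = len(ordered) // 2  # median split into two halves
    node["children"] = [
        mmm(ordered[:mid], not split_on_x),
        mmm(ordered[mid:], not split_on_x),
    ]
    return node

def leaves(node):
    """Collect the leaf locations of the tree, left to right."""
    if not node["children"]:
        return [node["at"]]
    return [leaf for child in node["children"] for leaf in leaves(child)]

sinks = [(0, 0), (0, 10), (10, 0), (10, 10)]
tree = mmm(sinks)
```

For four sinks on the corners of a square the root lands at the center and the two children at the midpoints of the left and right edges, which is the H-like shape expected from a balanced tree.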
Clock Tree Synthesis (CTS)
• Before CTS all Clock Pins are driven by a single Clock Source
[Figure: array of flip-flops (F) all driven directly by a single Clock Source.]
Courtesy: vlsi-basics.com
Clock Tree Synthesis (CTS)
• After CTS the buffer tree is built to balance the loads
and minimize the skew
[Figure: Buffer Tree from the source clock pin (Clock) to the clock sink pins of the flip-flops (F).]
Clock Tree Synthesis (CTS)
• After CTS a “delay line” is added to meet the minimum Insertion
Delay (ID)
[Figure: clock tree with extra buffers added for balancing the Minimum Insertion Delay.]
Clock Tree Synthesis (CTS)
• Analyze the Clock Tree
— Report Timing (both Setup and Hold)
— If timing is not met, check whether the clocks need to be grouped (balanced together)
— Report Insertion Delay & Skew and verify that the targets are achieved
— Report DRV targets (Fanout, Capacitance and Transition)
— Check that the intended Leaf Cells (Clock Sinks) are reached
— Check that the Clock Tree Exceptions are not in the Clock Tree
— Report the pre-existing cells, such as Clock Gating Cells
— Report Quality of Results (QoR)
— Check whether the Clock Tree converges either with itself or with another Clock Tree
— Check whether the Clock Tree has a timing relationship with other Clock Trees for
inter-Clock Skew balancing
— Check Design Rule Constraints
— Check Routing Constraints
— Report Power and Area
Post CTS Optimization
• Post CTS Optimization
— Optimization with Useful Skew
— Optimization with Total Negative Slack (TNS)
— Fine Grid Spacing
— Post CTS Optimization Techniques
▪ Shielding
▪ Sizing
▪ Buffer re-location
▪ Level adjustment
— Optimize the design for Hold Time
▪ Hold Violations should be fixed first in Best Corner and then in Worst Corner
— Area Optimizations
Routing
• Importance of Routing as Technology shrinks
—Device (Gate) delay decreases
—Interconnect resistance increases
—Vertical heights of interconnect
layers increase, in an attempt to
offset increasing interconnect
resistance
—Area component of interconnect
capacitance no longer dominates
—Lateral (sidewall) and fringing
components of capacitance start
to dominate the total capacitance
of the interconnect
—Interconnect capacitance dominates total Gate loading
[Figure: Multi-level Interconnection (MLI) Technology Layer stacks.]
• Routing Objectives
— Skew requirements
— Open/Short circuit clean
— Routed paths must meet setup and hold timing margin
— DRVs (max. Capacitance/ Transition) must be under the limit
— Metal traces must meet foundry physical DRC requirements
— Layout geometries should meet the Current Density specification
Routing
• Routing Stages
— Trial/Global Routing
▪ Identifies a routable path between the driving and driven pins of each net over the
shortest distance
▪ Does not consider DRC rules; gives an overall view of routing and congested nets
▪ Assigns layers to the nets
▪ Identifies and assigns net segments over the specific routable window called
Global Route Cell (GRC)
▪ Avoids congested areas and also long detours
▪ Avoids routing over blockages
▪ Avoids routing for pre-routed nets such as Rings/Stripes/Rails
▪ Uses Steiner Tree and Maze algorithms
— Track Assignment
▪ Takes the Global Routed Layout and assigns each net to specific Tracks
and layer geometry
▪ It does not follow the physical DRC rules
▪ It does timing-aware Track Assignment
▪ It helps in Via Minimization
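The Maze algorithm mentioned under Global Routing can be sketched as a breadth-first wave expansion on a grid (Lee-style maze routing). This is a single-layer, single-net illustration with a made-up grid; production routers add layer assignment, congestion costs and multi-pin nets.

```python
from collections import deque

def maze_route(grid, start, goal):
    """Shortest Manhattan path on a grid; grid[r][c] == 1 marks a blockage.

    Returns the list of (row, col) cells from start to goal, or None if the
    goal cannot be reached.
    """
    rows, cols = len(grid), len(grid[0])
    parent = {start: None}          # also serves as the visited set
    frontier = deque([start])
    while frontier:
        cell = frontier.popleft()
        if cell == goal:
            path = []
            while cell is not None:  # walk back through the parents
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == 0 and (nr, nc) not in parent:
                parent[(nr, nc)] = cell
                frontier.append((nr, nc))
    return None

# A blockage in the middle row forces a detour around its right end
grid = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
path = maze_route(grid, (0, 0), (2, 0))
```

Because the wave expands one step at a time, the first time the goal is reached the path is guaranteed to be a shortest one on the grid, at the cost of visiting many cells.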
Routing
• Routing Stages
— Detail/Nano Routing
▪ Detailed routing follows up on the track-routed net segments and performs the
complete DRC-aware and timing-driven routing
▪ It is the final routing for the design, built after CTS, once the timing is frozen
▪ Filler Cells are added before Detailed Routing
▪ Detail Routing is done after analyzing the cause of congestion in the design and
then adding a density screen, changing the floorplan, etc.
• Grid Based Routing
—Metal traces (routes) are built along, and centered upon, routing tracks on the grid points
— Various types of grids are Manufacturing Grid, Routing Grid (Pitch) and Placement Grid
— Grid dimension should be a multiple of the Manufacturing Grid
[Figure: routing grid showing Trace, Grid Point, Track and Pitch on M1 and M2.]
Routing
• Routing Preferences
—Typically Routing only in “Manhattan” N/S E/W directions,
e.g. Layer 1 – N/S, Layer 2 – E/W
— Spacing checks with the adjacent layers
— Width checks for all layers
— Via dimension rules
— Slotting rules
— A segment cannot cross another segment on the same wiring layer
—Wire segments can cross wires on other layers
—Power and Ground have their own layers, mostly the top layers
[Figure: metal stack Metal1 to Metal5 connected by VIA12, VIA23, VIA34 and VIA45.]
• Layer Routing directions: Each metal layer has its own preferred
routing direction, which is defined in a technology rule file
—M1: Horizontal, M2: Vertical, M3: Horizontal, M4: Vertical and so on
• In some cases, we can avoid following the preferred routing direction for
smart routing (Non-preferred direction)
Post Routing Optimization
• Signal Integrity (SI) Optimization by NDRs and Shielding for
the sensitive nets
• Types of Shielding for sensitive nets
— Same layer shielding
— Adjacent layer/ Coaxial shielding
[Figure: Same Layer shielding and Adjacent Layer/ Coaxial shielding of a Critical net by Ground nets, alongside Non-critical nets, across the Metal 3, Metal 4 and Metal 5 Layers.]
Post Routing Optimization
• Filler Cell insertion
— Filler Cells can be inserted before or after Detailed Routing
— If Fillers contain metal routing other than Pre-Routing then
Fillers should be inserted before Routing
— Width of the smallest Filler Cell is the Placement Grid Width
— Once Fillers are inserted then the placement is fixed and tool
can’t move Cells for further optimization
Post Routing Optimization
• Metal Fill
— Filling up the empty metal tracks with metal shapes to meet metal
density rules
— 2 types of Metal Fill
▪ Floating Metal Fill: Doesn’t completely shield the aggressor nets, so SI impact will
be prominent
▪ Grounded Metal Fill: Completely shields the aggressor nets, so less SI impact
▪ Grounded Metal Fill is complex as compared to Floating Metal Fill
— Metal Density Rule helps to avoid Over Etching/ Metal Erosion
• Spare Cells Tie-up/ Tie-down
— Tie Cells connect the Gates of Cells to VDD/ VSS, which reduces ESD risk
— Tie-up Cells help to avoid Power Bounce
— Tie-down Cells help to avoid Ground Bounce
— Tie Cells are basically MOS in Diode-Connected configuration
Physical Verification (DRC)
• Design Rule Check (DRC) is the process of checking physical layout
data against fabrication-specific rules specified by the foundry to
ensure successful fabrication
• Process specific design rules must be followed when drawing layouts
to avoid any manufacturing defects during the fabrication of an IC
• Process design rules are the minimum allowable drawing dimensions,
which affect the X and Y dimensions of the layout and not the depth/vertical
dimensions
• As Technology Shrinks
—Number of Design Rules is increasing
—Complexity of Routing Rules is increasing
—Number of objects involved is increasing
—More Design Rules depend on Width, Halo, Parallel Length

DRC Rule               130nm     90nm      65nm      45nm
Width-based Spacing    1-2       2-3       3-5       7
Min-Area Rule          1 pitch   2 pitch   3 pitch   5 pitch
Cut Number (Via)       N/A       1-2       4-5       5-6
Dense EoL (OPC)        N/A       N/A       M1/M2     All Layers
Min-step (OPC)         N/A       1         5         5
Physical Verification (LVS)
• Layout Versus Schematic (LVS) verifies the connectivity of a
Verilog Netlist against the Layout Netlist (Extracted Netlist from GDS)
• The tool extracts circuit devices and interconnects from the
layout and saves them as the Layout Netlist (SPICE format)
• Since LVS compares the connectivity of the 2 Netlists, it does
not compare their functionality
• Input Requirements
— LVS Rule deck
— Verilog Netlist
— Physical layout database (GDS)
— Spice Netlist (Extracted by the tool from GDS)
• LVS checks examples
— Short Net Error, Open Net Error, Extract errors, Compare errors
Physical Verification (LVS)
• Open Net Error
— Same net is routed in two different metal layers but not connected
— Same net with different pin names
• Short Net Error
— Two different nets shorting together
Physical Verification (LVS)
• Extract Errors
— Parameter Mismatch
— Device parameters on schematic and layout are compared
— Example: for a transistor, LVS checks the necessary
parameters such as width, length, multiplication factor etc.
Physical Verification (LVS)
• Compare Errors
— Malformed Devices
— Pin Errors
— Device Mismatch
— Net Mismatch
Physical Verification (ERC)
• Electrical Rule Check (ERC) is used to analyze or confirm
the electrical connectivity of an IC design
• ERC checks are run to identify the following errors in layout
— To locate devices connected directly between Power and Ground
— To locate floating Devices, Substrates and Wells
— To locate devices which are shorted
— To locate devices with missing connections
• Well Tap connection error: The Well Taps should bias the
Wells as specified in the schematics
Courtesy: asicpd.blogspot.in
Physical Verification (ERC)
• Well Tap Density Error: If there are not enough Taps for a
given area then this error is flagged
• Taps need to be placed regularly to bias the Well
and prevent Latch-up
e.g., in a typical 90nm process the Well Tap Density Rule
requires Well Taps to be placed every 50 microns
• Tools: Mentor Graphics Calibre, Synopsys Hercules,
Cadence Assura, Magma Quartz
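A one-dimensional version of the Well Tap Density Rule can be sketched as a spacing check along a single cell row. The 50 micron limit comes from the 90nm example above; the function name, the tap coordinates and the simplification to gaps between neighboring taps are assumptions, and a real ERC deck evaluates the rule on the actual well geometry.

```python
def tap_spacing_violations(tap_positions_um, max_spacing_um=50.0):
    """Return (left, right) pairs of neighboring taps that are too far apart.

    tap_positions_um: x coordinates of the Well Taps in one row, in microns.
    Only the gaps between consecutive taps are checked.
    """
    ordered = sorted(tap_positions_um)
    return [
        (left, right)
        for left, right in zip(ordered, ordered[1:])
        if right - left > max_spacing_um
    ]

# Hypothetical tap locations along one row
violations = tap_spacing_violations([10.0, 55.0, 120.0])
```

Here the gap from 55 to 120 microns exceeds the limit, so one more tap would be needed between those two locations.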
DFM Checks
• Antenna Check (Gate-Oxide Integrity check)
— Restricts the maximum length of net connected to a Gate terminal
• Redundant Contacts/ Via
— Multiple Vias improve both Yield and Timing by putting resistances in parallel
• Metal Filling
— A narrow metal line separated from other metal may receive a higher
density of etchant than closely spaced wires and get over-etched
— Filling up empty tracks with metal shapes to meet Metal
Density Rules avoids this
• Metal Slotting
— Wide metal lines (Power Nets) expand significantly due to the high
temperature during fabrication, which can destroy the isolation
and passivation layers that protect the wafer
— To avoid this, put slots or holes in these metal layers at regular intervals
— Slotting also prevents stress damage during wafer dicing and
packaging
Formal Verification
• Formal Verification
— Verifies that two representations of a circuit design exhibit the same behavior
— Checks the behavior of the Combinational Logic by checking the
Compare Points
— Targets implementation errors and not design errors
— Power checks: checks Power Switches/ Retention Cells/ Isolation Cells/
Level Shifters and all power connectivity
— LEC has to be rerun whenever the design is edited manually, at any
point in the flow
• Formal Verification
— Complete coverage
— Effectively exhaustive simulation
— Cover all possible sequences of inputs
— Check all corner cases
— No test vectors are needed
• Informal Verification (Simulation)
— Incomplete coverage
— Limited amount of simulation
— Spot check a limited number of input sequences
— Many corner cases not checked
— Designer provides test vectors
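The idea of complete coverage without test vectors can be illustrated on a toy combinational compare point by enumerating every input combination. Real equivalence checkers use BDD or SAT based engines rather than enumeration; the functions `golden`, `revised` and `buggy` below are invented for this example.

```python
from itertools import product

def equivalent(f, g, num_inputs):
    """Compare two Boolean functions over all 2**num_inputs input combinations.

    Returns (True, None) if they agree everywhere, otherwise
    (False, counterexample) with the first differing input tuple.
    """
    for bits in product((False, True), repeat=num_inputs):
        if f(*bits) != g(*bits):
            return False, bits
    return True, None

def golden(a, b, c):
    # Reference logic, e.g. as described in RTL
    return (a and b) or (a and c)

def revised(a, b, c):
    # Restructured logic after optimization: same function, fewer gates
    return a and (b or c)

def buggy(a, b, c):
    # A wrong restructuring that changes the function
    return a or (b and c)

same, _ = equivalent(golden, revised, 3)
differs, counterexample = equivalent(golden, buggy, 3)
```

A simulation-based spot check could miss the failing input, whereas the exhaustive comparison either proves equivalence or returns a counterexample.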
Formal Verification
• Types of Formal Verification
—Gate-level to Gate-level (Logical Equivalence Check after Routing)
• To ensure that some netlist post-processing did not change the functionality of
the circuit
—RTL to Gate-level (after Synthesis)
• To verify that the netlist correctly implements the original RTL code
—RTL to RTL (before Synthesis)
• To verify that two RTL descriptions are logically identical
• Logical Equivalence Check (LEC) will have two stages
—Constraints setup stage
—Logical Equivalence Check stage
Parasitic Extraction
• Parasitic Extraction: Importance
— Shrinking process geometries
— New device structures
— An increasing number of metal layers at each new process node
— Much closer nets at each new process node
— Increasing wire aspect ratio of height to width
— Increasing operating frequency
• Parasitic Capacitance can be reduced by using higher
metals, provide spacing, shielding, Avoid parallel routing
• At higher clock frequencies, RC interconnect modeling is
no longer adequate and inductance must be included in
interconnect modeling
• Reluctance (Inductance) effect becomes more and more
prominent as the resistance (both device and interconnect)
decreases and the operating frequency increases 74
Parasitic Extraction
• Capacitance
C = εo W H / d
— Transistors
▪ Depends on area of transistor gate, physical properties of materials, thickness of insulator,
diffusion to substrate
— Poly to Substrate
▪ Parallel plate and fringing
[Figure: conductor geometry labelled with W, H, L and d]
— Capacitance between conductors
▪ Coupling Capacitance
▪ Area Capacitance
▪ Fringing Capacitance
▪ Crossover Capacitance
Parasitic Extraction
• Coupling Capacitance/ Lateral Capacitance
— The capacitance between nets on the same Metal layer
— Dominant over interlayer capacitances with every new process
technology
• Fringing Capacitance
— Capacitance between nets of different Metal layers and
other layers due to Sidewall Capacitance
• Parallel/Crossover Capacitance
— Capacitance between nets of 2 different Metal layers
[Figure: cross-section of metal layers above the SUBSTRATE with the area capacitances marked]
• Area Capacitance
— Capacitance between Metal layers and Substrate
• In modern processes, the width of interconnect wires at
lower levels of metal is so small that the Fringing Capacitance
of the wire is larger than the Area Capacitance
Parasitic Extraction
• Resistance
R = ρ L / (H W)
— Wire Resistivity
— Complex 3D geometry around Vias
• Inductance
— Self Inductance
— Mutual Inductance
— At high frequencies the Skin effect may also have to be considered
• Models used for Parasitic Extraction
— Lumped-C, Lumped-RC, Lumped-RLC
— Pi segment
— Pin-to-pin delays are modeled by RC delays
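The resistance formula and the lumped versus Pi models can be compared with a small calculation. The sketch below is only illustrative: the function names are hypothetical, the delay metric is the Elmore delay (an assumption, since the slides do not name one), and all values are in SI units.

```python
def wire_resistance(rho, length, height, width):
    """R = rho * L / (H * W), the wire resistance formula from the slide."""
    return rho * length / (height * width)


def lumped_rc_delay(r_total, c_total):
    """Elmore delay of a Lumped-RC model: the full R drives the full C."""
    return r_total * c_total


def pi_ladder_delay(r_total, c_total, segments=1):
    """Elmore delay to the far end of a chain of identical Pi segments.

    Each segment has resistance R/n with half of its capacitance C/n
    placed at each end (assumed Pi topology).
    """
    r_seg = r_total / segments
    c_seg = c_total / segments
    delay = 0.0
    for k in range(1, segments + 1):
        # Downstream of resistor k: (n - k) interior nodes carrying a full
        # c_seg each, plus the half capacitance at the far end of the wire.
        downstream_cap = (segments - k) * c_seg + c_seg / 2
        delay += r_seg * downstream_cap
    return delay


r = wire_resistance(1.7e-8, 1e-3, 1e-7, 1e-7)  # about 1700 ohm for these example values
print(r, lumped_rc_delay(100.0, 2e-15), pi_ladder_delay(100.0, 2e-15, 4))
```

Under these assumptions the Pi model gives half the delay of the Lumped-RC model regardless of the number of segments, which is why a single lumped capacitor is the more pessimistic choice.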
Parasitic Extraction
• Sub-femto Farad accuracy required for extraction of designs
at advanced technology nodes
• The STA tool uses extraction data at the fast corner while calculating
hold and at the slow corner while calculating setup, to be as pessimistic
as possible, so that the chip does not fail after it comes back
from the fab
• Common Extraction Formats: Standard Parasitic Format
(SPF), Reduced Standard Parasitic Format (RSPF), Detailed
Standard Parasitic Format (DSPF), Standard Parasitic
Exchange Format (SPEF)
• Tools: Synopsys Star-RCXT, Cadence QRC, Mentor
Graphics Calibre xRC
Timing Analysis
• Static Timing Analysis: Methodical analysis of a digital circuit
to determine whether the imposed timing constraints are met, i.e.
whether the design can work at the required speed
• Static Timing Analysis Flow
— Read the inputs required
— Setting up Constraints: IO Delay Constraints, DRVs, Timing Exceptions
(False/ Multi-Cycle paths), Recovery and Removal, Minimum Pulse
Width
— Construct Timing Graph: Partition Clock Domain, Ideal/ Propagated
Clock, Case Analysis
— Propagation
— Timing Report: End points with violations/ Paths enumeration
• Input Requirement
— Routed Netlist (.v)
— Libraries (.lib only)
— Constraints (.sdc)
— Delay Format (.sdf)
— Parasitic Values (.spef)
• Tools: Synopsys PrimeTime, Cadence ETS, Cadence Tempus
Timing Analysis (SI)
• Signal Integrity (SI)
— SI refers to the quality of the signal transportation during the circuit
operation
— Before deep sub-micron, the delays associated with the logic elements far
outweighed delays associated with the interconnect; in deep sub-micron
technologies interconnect delay and coupling can no longer be neglected
— SI effects like Crosstalk (both noise and timing), Voltage (IR) Drop,
Waveform Integrity and Electromigration have complex
interdependencies
— When the technology shrinks, the effect of coupling capacitance also
increases
— Crosstalk is an undesirable phenomenon caused by the cross
coupling capacitance between metal wires in a chip
— Signal Integrity comes as an added feature of Timing Signoff tools
— Crosstalk effects can be analyzed by enabling the SI switch in tools
— If Crosstalk is enabled then the tool will by default do the timing
in On Chip Variation (OCV) mode
— The tool can read a .spef that contains coupling capacitance information
Power Analysis & IR Drop Analysis
• Power Analysis
— Static/ Leakage Power Analysis
— Dynamic Power Analysis
• IR Drop Analysis
— Static IR Drop Analysis
— Dynamic IR Drop Analysis
• Tools for Power and IR Drop Analysis
— Synopsys PrimePower
— Cadence EPS and Voltus
— Apache Redhawk
• Tape-out
— Final GDSII (Graphical Data Stream Information Interchange) or CIF
(Caltech Intermediate Format) to Foundry
— GDS contains Physical Layout information
Analysis in
ASIC Physical Design
Outline
• Timing Analysis
— Dynamic vs. Static Timing Analysis
— Static Timing Analysis (STA)
• Congestion Analysis
• Power Analysis
— Dynamic Power Analysis
— Static Power Analysis
• IR Drop Analysis
— Dynamic IR Drop Analysis
— Static IR Drop Analysis
Timing Analysis
• Dynamic Timing Analysis (DTA)
— Verifies functionality of the design by applying input vectors and checking for correct output vectors
— Quality increases with the increase of input test vectors
— Increased test vectors increase simulation time
— Can be used for synchronous as well as asynchronous designs
— Also best suited for designs having clocks crossing multiple domains
— Computational complexity is involved in finding the input patterns/vectors that produce the maximum delay at the output
• Static Timing Analysis (STA)
— Checks static delay requirements of the circuit without any input or output vectors, so analysis times are relatively short; STA does not check the logical correctness of the design
— All clock-related information has to be fed to the design in the form of constraints, and the correctness of the constraints decides the quality
— Timing can be analyzed for worst case and best case simultaneously, and all timing paths are considered
— Not suitable for asynchronous designs
— Not suitable for designs having clocks crossing multiple domains
— Has more pessimism and thus gives the maximum delay of the design; it works with timing models
Static Timing Analysis (STA)
• Static Timing Analysis
— Effective methodology for verifying the timing characteristics of a
design without the use of test vectors
— Static Timing Analysis applies to synchronous, register-based
(Register-Transfer) designs
— Functionality of the design must be verified before the design is
subjected to STA
— STA approach typically takes a fraction of the time it takes to run
logic simulation
• STA tool analyzes all paths from each and every start point
to each and every end point and compares it against the
constraint that exists for that path
• Main steps of STA
— Break the design into sets of timing paths
— Calculate the delay of each path
— Check all path delays to see if the given timing constraints are met
Static Timing Analysis (STA)
• Clocked Storage Elements
— Transparent Latch, Level Sensitive
▪ Data passes through Latch when clock high, latched when clock is low
Static Timing Analysis (STA)
• Delays
— Time taken by a signal to propagate through a Cell or Net
— Actual Path Delay is sum of net and Cell Delays along the timing path
— Cell Delay is a function of Input Transition Time (Slew), Total
Output Load (Net Cap + Sum of attached pin caps) and operating
conditions (Process, Supply Voltage, Temperature)
—Intrinsic delay
▪ Internal to the Cell from Input pin to
Output pin caused by internal capacitance
—Propagation Delay
▪ Delay taken by a cell for a change at the input signal
to result in a change at the output signal, as a
function of Input Slew and Output load
▪ Propagation Delay can be Low to High
(tPLH) and High to Low (tPHL)
▪ Maximum Propagation Delay (Clock to
Q) is considered for Setup check
—Contamination Delay
▪ Best case delay from valid input to output
▪ Minimum Propagation Delay (Clock to Q) which is called Contamination Delay is
considered for Hold check
—Net Delay
▪ Total time for charging/discharging all the parasitics present in the given net
Static Timing Analysis (STA)
• Pins related to Clock Design
—Start/ Source / Root Pins
▪ Source pin of a Clock
—Stop/ Sink/ Leaf Pins
▪ All Clock Pins of Flip Flops
▪ Clock won't propagate after this Pin
—Through pin
▪ To make a Clock pin of a flop not a CTS Leaf pin
—Preserved Pin
▪ If we need to preserve a pin w.r.t. location etc.
—Exclude/ Ignore Pins
▪ All non-clock pins (D pin of Flip Flops or combo logic inputs)
▪ Not considered for Clock propagation
—Float Pins (Implicit Stop/ Macro Model)
▪ Same as Stop/ Sink Pin but internal Clock
Latency of it is considered for Clock Tree
▪ It is actually the entry pin of the Hard Macro
—Explicit Sync (Stop) Pin
▪ Input of combo logic while considering Clock Tree
▪ Important while considering Clock Gating
—Explicit Exclude (Ignore) Sync Pin
▪ Clock Pin of Flop is not considered as Sync/ Stop pin
▪ This pin is due to Clock Gating concept
▪ In clock gating the signal will be given to AND Gate
Static Timing Analysis (STA)
• Timing Arc
— Timing Arc is internal to the cell
— Combinational Cells have Timing Arcs from each
Input to each Output of the cell
— Flip-flops have Timing Arcs from the Clock Input pin to Data Output Q pin
(Propagation delay/ Delay Arc) and from Clock Input pin to Data Input
D pin (setup, hold checks/ Constraint Arc)
— Latches have 2 timing arcs:
▪ Clock pin to Output Q pin, when D is stable
▪ Data D pin to Output Q pin when D
changes (Latch is transparent)
[Figure: combinational cell with inputs A, B, C and output Y; latch with D, Q and Clk pins]
• Timing Unate
— How Output changes for different types of transitions on Input
— Positive Unate if Output Transition is same as Input Transition
— Negative Unate if Output Transition opposite to Input Transition
— Non-Unate if the Output Transition cannot be determined solely from the
direction of change of an Input. It also depends upon the state of the
other Inputs
Static Timing Analysis (STA)
• Clock definitions in STA
— Synchronous Clocks
▪ 2 clocks are synchronous w.r.t. each other
▪ Timing paths launched by one clock and captured by another
— Asynchronous Clocks
▪ 2 clocks are asynchronous w.r.t. each other
▪ If there is no timing relation, STA can't be applied, so the tool won't check the timing
— Mutually-Exclusive Clocks
▪ Only one clock can be active in the circuit at any given time
— Generated Clocks
▪ Clock derived from a source clock
▪ Its frequency can be a multiple of, or a division of, the source clock frequency
— Virtual Clocks
▪ Exists but not associated with any pin or port of the design
▪ Used as a reference in STA to specify Input and Output Delays relative to
a clock (Needed to fix the Input2Reg and Reg2Output Violations)
▪ By defining a Virtual Clock, IO Constraints can be defined relative to this Virtual
Clock with no specification of the source port or pin
Static Timing Analysis (STA)
• A Timing Path is a point-to-point path in a design which
can propagate data from one flip-flop to another
—Each path has a start point and an end point
—Start point: Input ports or Clock pins of flip-flops
—Endpoints: Output ports or Data input pins of flip-flops
[Figure: Timing Paths]
Static Timing Analysis (STA)
• Timing Path Groups
— Timing paths are grouped into path groups by
the clocks controlling their endpoints
— Input pin/port to Register
▪ Delays off-chip + Combinational logic delays up to
the first sequential device
— Register to Register
▪ Start at a sequential device
▪ Delay and timing constraint (Setup and Hold) times
between sequential devices for synchronous clocks
+ source and destination clock propagation times
— Register to Output pin/port
▪ CLK-to-Q transition delay + the combinational
logic delay + external delay requirements
— Input pin/port to Output pin/port
▪ Delays off-chip + combinational logic delays
+ external delay requirements
Static Timing Analysis (STA)
• Clock Latency
— Total time taken by the clock signal to reach the input of the register
— Source latency is the time from the clock source to the clock definition
port
— Network latency is the time from the clock definition port to the clock
leaf cells in the design
• Insertion Delay (ID)
— ID is the clock latency, but after the Clock Tree is synthesized
• ID is the physical delay and
Clock Latency is the virtual delay
• Latency is a target given to the tool through SDC file or
clock tree attribute file and Insertion Delay is the achieved
delay value after CTS
Static Timing Analysis (STA)
• Source and Network Latency (Original Clock &
Generated Clock)
Static Timing Analysis (STA)
• Clock Uncertainty
— Clock Uncertainty is the time difference between the arrivals of clock
signals at registers in one clock domain or between domains
— Uncertainties include Clock Skew, Clock Jitter and Clock Margin
• Clock Skew
— Clock Skew refers to the time
difference in clock signal arrival between
two points in the clock network
T_SKEW = T_CAPTURE_CLOCK - T_LAUNCH_CLOCK
— Positive Skew occurs when the Capture Clock is late w.r.t. Launch Clock
— Negative Skew occurs when the Capture Clock is early w.r.t. Launch Clock
— Local Skew is the Skew between the clock phase delays of two flip-
flops which are the Source and Target flop of a path (Source and
Destination flop)
— Global Skew is the difference between the longest and shortest branch of a
Clock Tree (Maximum Insertion Delay – Minimum Insertion Delay)
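The skew definitions can be checked numerically. This is a minimal sketch with hypothetical function names; it takes skew as capture arrival minus launch arrival, so that a late Capture Clock gives Positive Skew as stated in the bullets.

```python
def local_skew(launch_latency, capture_latency):
    """Skew between the launch and capture flop of one path.

    Positive when the capture clock arrives later than the launch clock.
    """
    return capture_latency - launch_latency


def global_skew(insertion_delays):
    """Maximum Insertion Delay minus Minimum Insertion Delay of a clock tree."""
    return max(insertion_delays) - min(insertion_delays)


# Example insertion delays (ns) of three leaf pins; values are made up.
delays = [1.0, 1.5, 1.25]
print(local_skew(delays[0], delays[1]))  # capture later than launch: positive skew
print(global_skew(delays))
```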
Static Timing Analysis (STA)
• Clock Jitter
— Jitter is the short-term variation of a signal with respect to its ideal
position in time
— The two major components of Jitter are random Jitter and
deterministic Jitter
— Factors causing Jitter include imperfections in the Clock oscillator, supply
voltage variations, Temperature variations, Crosstalk
[Figure: original clock and jitter-affected clock]
• Glitch
— Unexpected switching of any waveform
— Caused by signals arriving at a Gate at different times; it lasts for a short period of time
— Causes extra delay and can also consume extra power through false
transitions
Static Timing Analysis (STA)
[Figure: reference clock waveform with edges at 0, 15 and 30, shown with uncertainty, with latency (edges at 5.5, 20.5 and 35.5) and with transition]
Static Timing Analysis (STA)
• Pulse Width
— Pulse Width is the time between the active and inactive states of the
same signal
— Minimum high pulse width is the amount of time after the rising edge of a
clock, that the clock signal of a clocked device must remain stable
— Minimum low pulse width is the amount of time after the falling edge of
a clock, that the clock signal of a clocked device must remain stable
• Duty Cycle
— Percentage of clock period having high pulse
— Typically clock waveforms are of 50% Duty Cycle
• Transition/ Slew
— Time taken by a signal to change state; the corresponding slew rate is expressed in Volts/Second
— Rise Slew (tR) is called Rise Time and Fall Slew (tF) is called Fall Time
— Minimum/ Maximum Transition is the Minimum/ Maximum slope
allowed at leaf pins
— Transition affects Power Dissipation, Latency and Pulse width
Static Timing Analysis (STA)
• Asynchronous Path
— A path from an input port to an asynchronous set or clear pin of a
sequential element
• Critical Path
— The path which has the longest delay
— Also called worst path/ late path/ max. path
— On timing sensitive functional paths, no additional gates are allowed to be
added to the path
• Shortest Path
— One that takes the shortest time; this is also called the best path or
early path or a min path
Static Timing Analysis (STA)
• Clock Gating Path
— Path passed through a “gated element” to achieve additional
advantages
— Clock Gating transformation does not change the state of the flops
and register
Static Timing Analysis (STA)
• Launch Path
— Launch path is the launch clock path which is responsible for launching the
data at launch flip flop
• Capture Path
— Capture path is capture clock path which is responsible for capturing the
data at capture flip flop
• Arrival Time
— Launch path and data path together constitute arrival time of data at the
input of capture flip-flop
• Required Time
— Capture clock period and its path delay together constitute required time
of data at the input of capture register
Static Timing Analysis (STA)
• Common Path Pessimism
— Same Clock Path may be a Launch Path for one Data Path and can be a
Capture Path for another Data Path
— While doing OCV derating, the same path may get both Min. and Max. delay
— But a path can have either a Maximum delay or a Minimum delay
(or anything in between), never both delays at the same time
— STA tools will have techniques to remove artificially introduced pessimism
between the Launch Clock Path and the Capture Clock Path
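How much pessimism is introduced on the shared clock segment can be estimated as below. This is a simplified sketch under stated assumptions: the function name and the flat late/early derate factors are hypothetical, and real STA tools use more detailed derating models.

```python
def common_path_pessimism(common_delay, late_derate, early_derate):
    """Artificial pessimism on the clock segment shared by launch and capture.

    With OCV derating the shared segment is timed late on one path and
    early on the other, although physically it has a single delay.
    """
    return common_delay * late_derate - common_delay * early_derate


# Example: 2.0 ns of common clock path, derated by +10% (late) and -10% (early).
credit = common_path_pessimism(2.0, 1.1, 0.9)
print(credit)  # amount that pessimism removal could give back to the slack
```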
Static Timing Analysis (STA)
• Slack
—Difference between Required Time (RT) and Arrival Time (AT)
—Positive Slack at a node implies that the arrival time at that node may be
increased without affecting the overall delay of the circuit
—Negative Slack implies that a path is too slow, and the path must speed up if
the whole circuit is to work at the desired speed
• Setup Time
—Setup time is the minimum amount of time the data signal should be held
steady before the clock event so that the data are reliably sampled by the
clock
T_LAUNCH_CLOCK + T_CLK-Q_MAX + T_COMB_MAX ≤ T_CAPTURE_CLOCK - T_SETUP
• Hold Time
—Hold time is the minimum amount of time the data signal should be held
steady after the clock event so that the data are reliably sampled
T_LAUNCH_CLOCK + T_CLK-Q_MIN + T_COMB_MIN ≥ T_CAPTURE_CLOCK + T_HOLD
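The setup and hold inequalities translate directly into slack calculations (Slack = Required Time minus Arrival Time for setup, and the reverse for hold). The sketch below uses hypothetical function names and assumes that the capture clock term is the arrival time of the capturing edge, so for a setup check it already includes one clock period.

```python
def setup_slack(t_launch_clock, t_clk_q_max, t_comb_max, t_capture_clock, t_setup):
    """Required time minus arrival time for the setup check."""
    arrival = t_launch_clock + t_clk_q_max + t_comb_max
    required = t_capture_clock - t_setup
    return required - arrival


def hold_slack(t_launch_clock, t_clk_q_min, t_comb_min, t_capture_clock, t_hold):
    """Arrival time minus required time for the hold check."""
    arrival = t_launch_clock + t_clk_q_min + t_comb_min
    required = t_capture_clock + t_hold
    return arrival - required


# Example (ns): 10 ns period, launch latency 1.0, capture latency 1.2.
# Setup is checked against the next capture edge (10 + 1.2), hold against
# the same edge that launched the data (1.2).
print(setup_slack(1.0, 0.3, 8.0, 10.0 + 1.2, 0.2))
print(hold_slack(1.0, 0.2, 0.5, 1.2, 0.1))
```

A negative result from either function corresponds to a violation of the matching inequality.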
Static Timing Analysis (STA)
• Setup Time and Hold Time Violations
— If the data is not stable for at least the setup time TSETUP before the
active edge of the clock, there is a Setup Violation at that flip-flop
— If the data is not stable for at least the hold time THOLD after the
active edge of the clock, there is a Hold Violation at that flip-flop
— For a single cycle circuit the signal has to propagate through the Data
path in one clock cycle
[Figure: D input must remain stable during the SETUP window before and the HOLD window after the CLK edge]
Static Timing Analysis (STA)
• Recovery Time
—Recovery time is the minimum time that an asynchronous control input pin must be
stable after being de-asserted and before the next clock transition (active edge)
• Removal Time
—Removal time is the minimum time after the previous clock transition (active
edge) that an asynchronous control input pin must remain asserted before being
de-asserted
• Recovery Time and Removal Time Violations
—This check ensures that the rise/ fall edge of the asynchronous signal does not occur at
the clock edge; it should be some time before or after the clock edge
—If this is violated, there is a Recovery Time or Removal Time Violation
—Although a flip-flop is asynchronously SET or CLEAR, the negation from its RESET
state is synchronous
Static Timing Analysis (STA)
• Single Cycle Path
— Timing path that is designed to take only one clock cycle for the data to
propagate from the start point to the endpoint
— Start point and endpoint are flops clocked by the same clock
— By default the tool will consider all timing paths as single cycle paths
[Figure: Setup Check and Hold Check edges on clock waveforms with edges at 0, 15 and 30]
Static Timing Analysis (STA)
• Multi-Cycle Path
— Timing path that is designed to take more than one clock cycle for the
data to propagate from the start point to the endpoint
— Start point and endpoint are flops clocked by the same clock
— Need to specify the Launch edge and Capturing edge in SDC
[Figure: Setup Check and Hold Check edges for a multi-cycle path]
Static Timing Analysis (STA)
• Half Cycle Path
— Timing path that is designed to take half a clock cycle (using both clock
edges) for the data to propagate from the start point to the endpoint
— Start point and endpoint are flops clocked by the same clock
— No need to specify the Launch edge and Capturing edge in SDC, since
the tool can identify it from the netlist
[Figure: 1/2 clock period delay between launch and capture edges on clock waveforms with edges at 0, 15 and 30]
Static Timing Analysis (STA)
• False Path
— Paths that physically exist in the design but are Logically/ Functionally
inactive
— Means no data is transferred from Start Point to End Point
— The goal in STA is to do timing analysis on all “true” timing paths,
so these paths are excluded from timing analysis
— Similarly, timing can be disabled for a pin, port or cell; the delay will
still be computed but will not be reported
[Figure: CDC signal from a transmitting flop (Clock 1) to a receiving flop (Clock 2), leaving the Clock 1 domain; risk of Metastability]
Static Timing Analysis (STA)
• Clock Domain Synchronization Scheme
— Pulse Width check
▪ The control signal is stable for longer than one receive clock period
▪ Ensures that data will not be lost due to inadequate width of the control signal
— Data Stability check
▪ The data updated by the transmit domain cannot be captured by
the immediately following receive clock edge
▪ Ensures that the captured data will not be metastable in the receive domain
Static Timing Analysis (STA)
• Bottleneck Analysis
— Lists the cells causing the timing violations on multiple paths
— By identifying and fixing the violation caused by a Bottleneck Cell
improved timing can be achieved
Static Timing Analysis (STA)
• Multi-VT Cells
— Different threshold voltages are achieved by implanting dopants in different
concentrations
— Need Multi-VT Library
— Sub-threshold leakage varies exponentially with VT compared to the weaker
dependency of delay over VT
— If the optimization target is power performance, first use the HVT cells library and
then try LVT cells
— If the optimization target is to meet timing then first use LVT cells and then HVT cells
— If you swap the capture flop from SVT to LVT or HVT, there will be very minimal
setup/hold impact; for most flops the hold impact is zero
— If you swap the launch flop from SVT to LVT or HVT, Setup will be affected and hold
will be impacted correspondingly
— High Threshold Voltage (HVT)
▪ Use in non-timing critical paths
▪ Use in power critical paths
▪ Has low leakage and low speed
— Low Threshold Voltage (LVT)
▪ Use in timing critical paths
▪ Use in non-power critical paths
▪ Has high leakage and high speed
— Standard Threshold Voltage/ Regular Threshold Voltage (SVT/ RVT)
▪ Medium delay and medium power requirement
Static Timing Analysis (STA)
• Time Borrowing
— Time Borrowing applies to Latch-based Timing Analysis
— Edge-triggered flip-flops change states at the clock edges, whereas
latches change states as long as the clock pin is enabled
— In latch based design longer combinational path can be compensated
by shorter path delays in the subsequent logic stages
— The technique of Borrowing Time from the shorter paths of the
subsequent logic stages to the longer path is called Time Borrowing or
Cycle Stealing
Static Timing Analysis (STA)
• Time Borrowing
— Time Borrowing typically only affects setup slack calculation since time
borrowing slows data arrival times
— When the clocks of the Launching and Capturing Latches are out of
phase, time borrowing does not happen
— Time borrowing can be multistage
— Maximum Borrow Time:
Clock Pulse Width minus the library Setup Time of the Latch
— Negative Borrow Time:
If Arrival Time minus the clock edge is a negative number, the amount
of time borrowing is negative (no borrowing)
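The borrow-time rules can be written as a short calculation. This is a sketch with hypothetical function names; it assumes "the clock edge" means the edge that opens the capturing latch.

```python
def max_borrow_time(clock_pulse_width, latch_setup):
    """Maximum Borrow Time = Clock Pulse Width minus library Setup Time."""
    return clock_pulse_width - latch_setup


def borrowed_time(arrival_time, opening_edge):
    """Time borrowed from the next stage.

    Arrival before the opening edge gives a negative difference,
    which means no borrowing, so the result is clamped at zero.
    """
    return max(0.0, arrival_time - opening_edge)


def borrow_is_legal(arrival_time, opening_edge, clock_pulse_width, latch_setup):
    """True when the borrowed time fits within the maximum borrow time."""
    return borrowed_time(arrival_time, opening_edge) <= max_borrow_time(
        clock_pulse_width, latch_setup
    )


# Example (ns): latch opens at 5.0 with a 5.0 pulse width and 0.2 setup time.
print(borrowed_time(6.0, 5.0), max_borrow_time(5.0, 0.2))
print(borrow_is_legal(6.0, 5.0, 5.0, 0.2))
```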
Static Timing Analysis (STA)
• Time Borrowing: Scenarios
— Scenario 1: When data is launched
from a positive edge triggered flip flop
and captured by a negative level
sensitive latch
— Scenario 2: When launch is from a
negative level sensitive latch and
capture is to a positive edge triggered
flip flop
— Scenario 3: When launch and capture
are both positive level sensitive latches
[Figure: waveforms for Scenario 1, Scenario 2 and Scenario 3]
Static Timing Analysis (STA)
• Types of Static Timing Analysis
— Path Based STA (PBA)
▪ First, extract all possible topological paths
▪ Next, for each path calculate its delay and compare it with the
endpoint (required) value
▪ Calculate the Arrival Time (AT) by adding cell delays in timing paths
▪ Check all path delays to see if the given Required Arrival Time (RAT) is met
— Graph Based STA (GBA)
▪ Two types of timing data:
▪ Arrival Times, AT (propagated forward from inputs)
▪ Required Arrival Times, RAT (propagated backward from outputs)
▪ Slack is calculated on every design element: Slack = RAT – AT
Static Timing Analysis (STA)
• Path Based STA (PBA)
— Path specific STA
— Won't use worst slew
— Intensive computation required
— Less Pessimistic
— More accurate
— Timing constraints will be checked at end points of the timing paths
— Not favorable for large no. of paths
— PBA selects either max. path or min. path
— Timing information associated with topological paths (collections of design elements)
— Traces every possible timing path
— Always done after GBA
• Graph Based STA (GBA)
— Parameter based STA
— Uses worst slew
— Not so intensive computations
— More Pessimistic
— Less accurate compared to PBA
— Timing constraints will be checked at each node of the timing paths
— Not favorable for large no. of Corners
— In GBA the max. path alone is selected
— Timing information associated with discrete design elements (ports, pins, gates)
— It is incremental; breadth based
Static Timing Analysis (STA)
• Block-based STA vs. Path-based STA (example)
— Path-based: every path delay is summed and compared with RAT = 10
▪ 2+2+3 = 7 (OK)
▪ 2+3+1+3 = 9 (OK)
▪ 2+3+3+2 = 10 (OK)
▪ 5+1+1+3 = 10 (OK)
▪ 5+1+3+2 = 11 (Problem!)
▪ 5+1+2 = 8 (OK)
— Block-based: Critical path is determined as collection of gates with
the same, negative slack: Slack = RT – AT
▪ In our case, we see one Critical path with slack = -1
[Figure: example circuit with gate delays, annotated with AT and RAT at each node]
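The block-based procedure (propagate AT forward, RAT backward, then take the slack at every node) can be sketched as follows. The circuit here is a small hypothetical graph, not the one in the slide figure, and the function name is illustrative only.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def graph_based_sta(edges, input_at, output_rat):
    """Return {node: (AT, RAT, slack)} for a DAG of delay-annotated edges.

    edges: {(src, dst): delay}; input_at: AT at primary inputs;
    output_rat: RAT at primary outputs.
    """
    preds, succs = {}, {}
    for (src, dst), delay in edges.items():
        preds.setdefault(dst, []).append((src, delay))
        succs.setdefault(src, []).append((dst, delay))
    # Topological order: every node appears after all of its predecessors.
    graph = {node: [s for s, _ in p] for node, p in preds.items()}
    order = list(TopologicalSorter(graph).static_order())
    at = dict(input_at)
    for node in order:  # forward pass: latest arrival wins
        if node in preds:
            at[node] = max(at[src] + d for src, d in preds[node])
    rat = dict(output_rat)
    for node in reversed(order):  # backward pass: earliest requirement wins
        if node in succs:
            rat[node] = min(rat[dst] - d for dst, d in succs[node])
    return {node: (at[node], rat[node], rat[node] - at[node]) for node in order}


# Hypothetical circuit: inputs a (AT=2) and b (AT=5) feed gate g1, which drives out.
result = graph_based_sta(
    {("a", "g1"): 2, ("b", "g1"): 1, ("g1", "out"): 5},
    input_at={"a": 2, "b": 5},
    output_rat={"out": 10},
)
for node, (at, rat, slack) in result.items():
    print(node, at, rat, slack)
```

In this made-up example the nodes b, g1 and out share the same negative slack of -1 and therefore form the critical path, while a has positive slack.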
Static Timing Analysis (STA)
• STA Summary Report
------------------------------------------------------------
timeDesign Summary
------------------------------------------------------------
+--------------------+---------+---------+---------+---------+---------+---------+
|     Setup mode     |   all   | reg2reg | in2reg  | reg2out | in2out  | clkgate |
+--------------------+---------+---------+---------+---------+---------+---------+
|           WNS (ns):| -7.815  | -5.368  | -7.815  | -0.582  | -7.110  |   N/A   |
|           TNS (ns):| -2113.3 | -1239.7 | -1969.2 | -1.269  | -38.582 |   N/A   |
|    Violating Paths:|   757   |   708   |   375   |    8    |    6    |   N/A   |
|          All Paths:|  1811   |  1344   |   819   |   18    |    6    |   N/A   |
+--------------------+---------+---------+---------+---------+---------+---------+
+----------------+-------------------------------+----------------+
|                |             Real              |     Total      |
|      DRVs      +----------------+--------------+----------------|
|                | Nr nets(terms) |  Worst Vio   | Nr nets(terms) |
+----------------+----------------+--------------+----------------+
|    max_cap     |   135 (135)    |    -3.518    |   136 (136)    |
|    max_tran    |  370 (14467)   |    -7.767    |  388 (14485)   |
|   max_fanout   |     0 (0)      |      0       |     0 (0)      |
+----------------+----------------+--------------+----------------+
Density: 78.864%
Routing Overflow: 0.00% H and 0.23% V
------------------------------------------------------------
Congestion Analysis
• As the Technology advances, millions of transistors can
be packed onto the surface of a chip
• Thus the increased circuit density introduces
additional Congestion
• Intuitively speaking, Congestion in a layout means too
many nets are routed in local regions
• This causes detoured nets and un-routable nets in
Detailed Routing
• Congestion Analysis
— Routing Congestion Analysis
▪ Congestion in general refers to Routing Congestion
▪ Routing congestion is the difference between demanded and supplied tracks
▪ A track is nothing but a routing resource which fills the entire Core
— Placement Congestion Analysis
▪ Placement Congestion is due to overlap of Standard Cells; it is called
Overlapping rather than Congestion
▪ Overlapping issue can be fixed by aligning cells to the Placement Grid by Legalization
Congestion Analysis
• In recent years, several congestion estimation and
removal methods have been proposed
• They fall into two categories: Congestion estimation and
removal during global routing stage, and Congestion
estimation and removal during Placement stage
• To estimate Congestion, tool does Initial/ Global Routing
• Congestion reports are generated after each Routing
stages which shows the difference between supplied and
demanded Tracks or G-cells
• Overflow = Routing Demand - Routing Supply when demand exceeds supply (0% otherwise)
• The initial Target Utilization usually starts at 65% to 70%
• 7/3 in a 2D congestion map : There are 7 routes that are
passing through a particular edge of a Global Route Cell (GRC),
but there are only 3 routing tracks available. There is an
overflow of 4.
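The overflow rule and the 7/3 example can be reproduced with a few lines. The function names are hypothetical, and the per-edge (demand, supply) pairs are made-up sample data apart from the 7/3 case.

```python
def routing_overflow(demand, supply):
    """Overflow on one GRC edge: demand minus supply, or 0 if supply suffices."""
    return max(0, demand - supply)


def total_overflow(edges):
    """Sum of overflow over a list of (demand, supply) pairs."""
    return sum(routing_overflow(d, s) for d, s in edges)


# The 7/3 example: 7 routes need to cross an edge with 3 tracks available.
print(routing_overflow(7, 3))  # overflow of 4
# A few sample edges; only edges with demand above supply contribute.
print(total_overflow([(7, 3), (3, 1), (2, 5)]))
```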
Congestion Analysis
• Causes for Routing Congestion
— Missing Placement Blockages
— Inefficient floorplan
— Improper macro placement and macro channels
(Placing macros in the middle of floorplan etc.)
— Floorplanning the macros without giving routing
space for interconnection between macros
— High Cell Density (High local utilization)
— A large number of AOI/OAI cells in the design
— Placement of standard cells near macros
— High pin density on one edge of block
— Too many buffers added for optimization
— No proper logic optimization
— Very Robust Power network
— High via density due to dense power mesh
— Crisscross IO pin alignment is also a problem
— Module splitting
[Figure: Global Bin and Global Bin Edge; routing demand = 3, and assuming routing supply is 1, overflow = 3 - 1 = 2]
Congestion Analysis
• Congestion Fixes
[Figure: congestion map showing nets crossing the global routing cell (GRC) against the horizontal tracks, e.g. 28/28, and how to reduce congestion]
Congestion Analysis
• Routing congestion, results when too many routes need to go
through an area with insufficient “routing tracks” to
accommodate them
Power Analysis
• Power Analysis
— Power Density of Integrated Circuits increases exponentially
with every Technology generation
P_TOTAL = P_DYNAMIC + P_LEAKAGE
P_DYNAMIC = P_SWITCHING + P_SHORT_CIRCUIT
Symbols used in the sub-threshold leakage current expression:
μ - Carrier mobility
COX - Gate capacitance
VT - Threshold voltage
VGS - Gate-Source voltage
W and L - Dimensions of the transistor
VTH - Thermal voltage, kT/q = 25.9mV at room temperature
n - function of device fabrication process (ranges 1.0 - 2.5)
Power Analysis
• Static Power Dissipation
— Leakage Power, is consumed when the transistors are not switching
— Dependent on the voltage, temperature and state of the transistors
— Leakage Power = V * Ileak
• Types of Static Leakages
— Reverse biased diode leakage from the diffusion layers and the substrate
— Gate Induced Drain Leakage
— Gate Oxide Tunnelling
— Sub-threshold Leakage caused by reduced threshold voltages which
prevent the Gate from completely turning OFF
• Static Power Reduction Techniques
— Using Multi VT cells in the design and optimizing for leakage by using high VT
cells on non-timing-critical paths
— Power Gating
▪ Power Shut-off groups of logic which are not used
— Voltage Scaling
— Multi VDD and Voltage Island
— Multi-threshold CMOS (Back Biasing)
Power Analysis
• Dynamic/ Switching Power
— Dynamic power is the power consumed when the device is active,
when signals are changing values (by switching logic states)
— Primary source of dynamic power consumption is switching power
P_DYN = A C V^2 F
where,
A is activity factor, i.e., the fraction of the circuit that is switching
C is Load capacitance
V is supply voltage
F is clock frequency
• Dynamic Power Calculation depends on
— Switching frequency
— Transition
— Output load
— Cell internal power
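The switching power formula is easy to evaluate directly. The sketch below uses a hypothetical function name and made-up example values in SI units; it mainly shows the quadratic dependence on supply voltage.

```python
def switching_power(activity, load_cap, vdd, freq):
    """P_DYN = A * C * V^2 * F (activity factor, load capacitance, supply, clock)."""
    return activity * load_cap * vdd ** 2 * freq


# Example: 10% activity, 1 nF total switched capacitance, 1.0 V, 1 GHz.
p_nominal = switching_power(0.1, 1e-9, 1.0, 1e9)
# Lowering the supply to 0.9 V scales the power by 0.9^2 = 0.81.
p_scaled = switching_power(0.1, 1e-9, 0.9, 1e9)
print(p_nominal, p_scaled, p_scaled / p_nominal)
```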
Power Analysis
• Dynamic Power Dissipation
— Dynamic power is dissipated any time the voltage on a net changes
due to some stimulus
• Types of Dynamic Power
— Net Switching Power = (Cload * V*V *f)
— Internal Power = (Cint * V*V *f) + (V * Isc)
• Short Circuit Power = (V * Isc): during switching both PMOS and NMOS
are ON for a short time, which results in a short circuit current
• Internal Capacitance Loading Power = (Cint * V*V *f) is the power
consumed while charging/discharging internal nets
Power Analysis
• Dynamic Power Reduction Techniques
— Clock Gating
▪ Architectural Technique to reduce
Dynamic Power along the Clock Path
▪ Clock gates should be placed at
the Root of the Clock
▪ Results in small delay, more area and
makes the design complex
▪ Clock Gating logic is generally in the
form of "Integrated clock gating" (ICG) cells
[Figure: ICG Cell on the CLK path of a flip-flop (D, Q)]
▪ Sequential clock gating is the process of
extracting/propagating the enable
conditions to the upstream/downstream
sequential elements, so that additional
registers can be clock gated
▪ As the granularity on which you gate the clock of a synchronous circuit
approaches zero, the power consumption of that circuit approaches that
of an asynchronous circuit: the circuit only generates logic transitions
when it is actively computing
IR Drop Analysis
• IR Drop
— The voltage that gets to the internal circuitry is less than that applied to
the chip, since every metal layer offers resistance to the flow of current
— When a current I passes through a conductor of resistance R, it produces
a voltage drop V equal to the resistance times the current
(Ohm's law, V = IR)
— IR Drop is defined as the average of the peak
currents in the power network multiplied by
the effective resistance from the power
supply pads to the center of the chip
— IR Drop is a reduction in voltage that occurs
on both Power and Ground networks
— IR Drop Analysis ensures that Power Delivery Network (PDN) is
robust, and that your system will function to specification
— IR Drop is determined by the current flow and the resistance of the supply network
— As distance between supply voltage and the component increases the IR
Drop also increases
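The dependence of IR Drop on distance from the supply can be sketched with a one-dimensional rail fed from a single pad. This is a simplified model under stated assumptions: the function name, the 50 mΩ segment resistance and the 100 mA tap currents are hypothetical, and a real power grid is a two-dimensional mesh.

```python
def rail_voltages(vdd, seg_res_ohm, tap_currents_a):
    """Voltage at each tap of a rail fed from one end.

    Segment i carries the current of tap i and of every tap after it,
    so the drop accumulates with distance from the pad (V = I * R).
    """
    voltages = []
    v = vdd
    for i, r in enumerate(seg_res_ohm):
        downstream_current = sum(tap_currents_a[i:])
        v -= downstream_current * r
        voltages.append(v)
    return voltages


# Four taps, each drawing 0.1 A, separated by 0.05 ohm rail segments.
volts = rail_voltages(1.0, [0.05] * 4, [0.1] * 4)
for index, v in enumerate(volts):
    print(f"tap {index}: {v:.3f} V ({(1.0 - v) * 100:.1f}% drop)")
```

The farthest tap sees the lowest voltage, which matches the statement that IR Drop grows with the distance between the supply and the component.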
• IR Drop Analysis
— IR Drop Analysis computes the actual IDD and ISS currents, because
these values are time-dependent
— IR Drop Analysis computes the Global IR Drop, which is important and more
accurate but cannot be computed separately (in parallel) for smaller blocks,
which may lead to longer run time
— Local IR Drop
▪ IR Drop becomes a local phenomenon
when a number of gates in close
proximity switch at once
▪ Local IR Drop can also be caused by a
higher resistance to a specific portion
of the Grid
— Global IR Drop
▪ IR Drop is a global phenomenon when activity in one region of a chip causes an IR Drop in other
regions
— In a well-meshed power grid with equally distributed currents, the power
grid typically has a set of equipotential IR Drop surfaces that form
concentric circles centered in the middle of the chip
— So the center of the chip usually has the largest IR Drop or the lowest supply
voltage
— Peak IR Drop is much larger than the Average IR Drop
— Peak IR Drop happens in the worst-case switch patterns of the gates
• Types of IR Drop
▪ Static IR Drop
— Static IR drop is average voltage drop for the design
— The average current depends totally on the time period
— Static IR drop was good for signoff analysis in older technology nodes where sufficient
natural decoupling capacitance from the power network and non-switching logic
were available
— Localized switching is only considered
— Is typically only a few % of the supply voltage
— Can be reduced by lowering the
resistance of Supply and Signal Paths
— Missing Vias
— Insufficient number of Power Pads
— Open circuits
[Figure: IR drop map with 0%, 2.5% and 5% drop contours; nodes n2, n3, n4, n8 below Vdd]
• IR Drop: Impacts
— IR Drop Analysis confirms that the worst case voltage drop (which is
considered for the worst corner for timing) on a chip meets IR Drop
targets
— Impacts in Timing
▪ If this Voltage Drop is too severe, the circuit will not get enough
voltage, resulting in the malfunction or timing failure
▪ If IR Drop increases Clock Skew then it will result in Hold Time Violations
▪ If IR Drop increases Signal Skew then it will result in Setup Time Violations
• IR Drop Plot
— Power grid has a set of equipotential surfaces that form concentric circles
centered in the middle of a block
[Figure: IR drop plot; annotation: Insufficient Power Pads]
$V_{drop} = I R + L \, \frac{di}{dt}$
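The two terms of the voltage-drop expression can be compared with a short calculation. The values below (1 A through 10 mΩ, 0.1 nH of supply inductance, a current ramp of 1 A/ns) are assumptions picked for illustration, as is the function name.

```python
def supply_drop(current_a, resistance_ohm, inductance_h, di_dt_a_per_s):
    """V_drop = I*R + L*di/dt, in volts."""
    return current_a * resistance_ohm + inductance_h * di_dt_a_per_s


# Resistive term only: steady 1 A through 10 milliohm.
static_drop = supply_drop(1.0, 0.010, 0.0, 0.0)

# Same current plus an inductive term: 0.1 nH with a 1 A/ns ramp.
dynamic_drop = supply_drop(1.0, 0.010, 0.1e-9, 1e9)

print(f"resistive only:        {static_drop * 1e3:.0f} mV")
print(f"resistive + inductive: {dynamic_drop * 1e3:.0f} mV")
```

With these numbers the L·di/dt term dominates, which illustrates why fast current transients matter in addition to the resistive drop.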
Physical Design
Essentials
Outline
• Issues in ASIC Physical Design
— Design Parasitics, Latch-up, Electro-Static Discharge,
Electromigration, Antenna Effect, Cross Talk, Soft
Errors, Self-Heating
• Cells in ASIC Physical Design
— Standard Cells, ICG Cells, Well taps, End caps, Filler Cells, Decap
Cells, ESD Clamps, Spare Cells, Tie Cells, Delay Cells, Metrology
Cells
• IO Design
• Delay Models
— Interconnect Delay Models
— Cell Delay Models
• Engineering Change Order (ECO)
• Types of Standard Cell Libraries
Issues in ASIC Physical Design
ASIC Design Parasitics
• Parasitic Resistance
– If resistance increases, delay also increases (Delay = R·C)
– As technology shrinks, interconnects also shrink and thus wire resistance increases
– To counter this, the height (thickness) of the interconnects is increased
• Parasitic Capacitance
– As technology shrinks, the height of nets increases relative to their width, so sidewall capacitance increases
– As technology shrinks, the dielectric becomes thinner and the capacitance increases
– To reduce the capacitance, minimize the overlapping surface area between conductors
– So adjacent metal layers are routed in alternating vertical and horizontal directions
• Parasitic Inductance
– Mutual inductance affects: High frequency bus
– Self-inductance affects: Clock nets
– To limit inductance, we provide current return paths for high frequency signals
– Separation and Shielding are the possible remedies
– As a rule of thumb, once the length of the signal path becomes a significant
fraction of the signal wavelength, the line itself starts to become a
concern for signal integrity
– Prominent above 500MHz & below 130nm for long wire nets & Power/Clock lines
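The effect of wire dimensions on parasitic resistance can be sketched with the standard relation R = ρ·L / (W·H). The resistivity used is an approximate bulk value for copper and the wire dimensions are hypothetical; real interconnect resistivity is higher because of barrier layers and surface scattering.

```python
RHO_COPPER_OHM_M = 1.7e-8  # approximate bulk resistivity of copper


def wire_resistance(length_m, width_m, height_m, rho=RHO_COPPER_OHM_M):
    """R = rho * L / (W * H) for a rectangular wire cross-section."""
    return rho * length_m / (width_m * height_m)


# Hypothetical 1 mm wire, 0.1 um wide and 0.2 um tall.
r_base = wire_resistance(1e-3, 0.1e-6, 0.2e-6)
# Halving the width (technology shrink) doubles the resistance.
r_narrow = wire_resistance(1e-3, 0.05e-6, 0.2e-6)
# Doubling the height of the narrow wire restores the original resistance.
r_tall = wire_resistance(1e-3, 0.05e-6, 0.4e-6)

print(f"base:   {r_base:.0f} ohm")
print(f"narrow: {r_narrow:.0f} ohm")
print(f"tall:   {r_tall:.0f} ohm")
```

The taller wire recovers the resistance but has more sidewall area, which is the capacitance trade-off described in the Parasitic Capacitance bullets.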
Latch-up
• What is Latch-up?
— Phenomenon that occurs in CMOS/ BiCMOS circuits
— Generation of a low-impedance path between the VDD supply
and the Ground
• Reason for Latch-up
— Due to regenerative feedback between the parasitic PNP and the
NPN Transistors
• Impact in the design
— PN Junctions can produce Parasitic Thyristor
▪ Forms by PNP/ NPN structures
▪ Considerable input current is necessary to activate
— Thyristor formed from parasitic transistors is triggered and generates
short-circuit between VDD & GND
— Results in self destruction/ system failure due to the direct
connection between VDD & GND
• NPN Transistor
– Emitter – drain /source of the
N-channel MOSFET
– Base – P Substrate
– Collector – N Well in which the
complementary P- channel
MOSFET is located
• PNP Transistor
– Emitter – drain /source of the
P-channel MOSFET
– Base – N Well in which the
P-channel MOSFET is located
– Collector – P Substrate
• Thyristor/SCR/PNPN
diode
– Anode – drain /source of the
P-channel MOSFET
– Cathode – drain /source of the
N-channel MOSFET
– Gate – P Substrate
Courtesy: vlsi.itu.edu.tr
• Remedies for Latch-up
– Latch-up resistant CMOS process
▪ Reduce the gain of the parasitic transistors (use of Si starting material with a thin
epitaxial layer on top of a highly doped substrate)
▪ Increase the holding voltage above the VDD supply
▪ Increase the dopant concentration of substrate & well (but this leads to higher VT)
▪ Retrograde well structure (highly doped at the bottom and lightly doped at the top)
– Layout techniques
▪ Sufficient spacing between NMOS & PMOS
This reduces the current gain of the parasitic transistors, with
limited success because the spacing can be increased only up to a certain limit
▪ Reduce RS and RW by keeping Substrate & Well contacts as close as possible
Place substrate contacts as close as possible to the source connection of
transistors connected to the supply rails (VSS n-devices, VDD p-devices)
This reduces the value of RSUBSTRATE and RWELL
A very conservative rule would place one substrate contact for every supply
(VSS or VDD) connection
In Std. Cell based designs, a common Well Tap is taken out as needed
▪ Guard Rings
Gain of transistors is reduced (in analog designs)
ESD
• Electrostatic Discharge (ESD)
– When two non-conducting materials are rubbed together and then separated,
opposite electrostatic charges remain on both, which attempt to
equalize each other
– A transient discharge of static charge that arises from either human
handling or a machine contact
• Reasons for Electrostatic Discharge
— Thin & vulnerable Gate Oxide of the CMOS makes ESD protection
essential for CMOS
— Can be due to inductive or capacitive coupling
— ESD can occur during the removal of extra metal by rubbing in
metallization process
— ESD occurs so rapidly that normal GND wires exhibit too much
inductance to drain the charge before it can do damage
• Impact on the design
— ESD can also burn-out device/ interconnect if thermally initiated
— PMOS is stronger than NMOS in ESD protection, because snap back
holding voltage is lower for NMOS
• Human Body Model (HBM)
– The actual capacitance of the human body is between 150 pF and 500 pF
& the internal resistance of the human body ranges from a few kilo-ohms
to a few hundred kilo-ohms
– Peak current ≈ 1.3A, rise time ≈ 10-30ns
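The quoted HBM peak current can be cross-checked with a first-order RC discharge estimate. The values below are assumptions not stated on the slide: the commonly used HBM test network of 100 pF discharged through 1.5 kΩ, and a 2 kV stress level. Parasitic inductance, which sets the rise time, is ignored.

```python
def hbm_peak_current(stress_volts, series_res_ohm):
    """First-order peak current of an RC discharge: I_peak = V / R."""
    return stress_volts / series_res_ohm


def discharge_time_constant(series_res_ohm, cap_farads):
    """Decay time constant of the discharge: tau = R * C."""
    return series_res_ohm * cap_farads


# Assumed HBM test network: 100 pF through 1.5 kilo-ohm at 2 kV.
peak_a = hbm_peak_current(2000.0, 1500.0)
tau_s = discharge_time_constant(1500.0, 100e-12)

print(f"peak current:  {peak_a:.2f} A")
print(f"time constant: {tau_s * 1e9:.0f} ns")
```

Under these assumptions the estimate lands close to the ≈ 1.3 A figure given for HBM.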
Courtesy: ami.ac.uk
• Machine Model (MM)
– MM models the ESD of manufacturing / testing equipment
– Peak current ≈ 3.7A, rise time ≈ 15-30ns, bandwidth ≈ 12 MHz
– ESD stress caused by charged machines is severe because of zero body
resistance
– MM ESD withstand voltage is typically one tenth of HBM
– Most ESD protection circuits can only protect against HBM and MM
Courtesy: ami.ac.uk
• Charged Device Model (CDM)
– CDM models the ESD of charged integrated circuits
– As more and more circuits and functions are integrated, the Die size grows;
a large Die provides a large body capacitance, which in turn
stores charge for CDM in the body of the IC
– Inductance in the model is mainly due to the inductance of bond wires
– Gate oxide breakdown is the signature failure of CDM stress, in
contrast to the thermal failure signature of HBM and MM stress
– CDM stress is the most difficult ESD stress to protect against since it has
the fastest transient and the max. peak current
– Peak current ≈ 10A, rise time ≈1ns
Courtesy: ami.ac.uk
• ESD Protection
– The integration of Clamping Diodes
▪ Limits the dangerous voltages and conducts excess currents into regions
of the circuit that are safe
▪ Safe regions consist primarily of the supply-voltage connections
– The Protection Diodes
▪ Oriented to be blocking in normal operation
▪ Situated between the connection to the component to be protected and
the supply voltage lines
Electromigration
• Electromigration (EM)
— A failure mechanism caused by
high energy electrons impacting
the atoms in a material and
causing them to shift position
— Enhanced and directional mobility
of atoms under the influence of
an electric field
• Reason for Electromigration
— Transport of material caused by the gradual movement of ions in a
conductor due to the momentum transfer between conducting electrons &
diffusing metal atoms
— Forms a positive feedback path
where EM will cause an atom to
move down a wire, slightly narrowing
the wire width at that location and increasing the current density
— This increased current density then further increases electromigration,
causing more atoms to be displaced
— It is most problematic in areas of high current density
— Significant as size decreases & is most significant for unidirectional
(DC) current
• Impact in the design
— Excessive EM leads to open (voids) & short circuits (Hillocks) and thus
decreases the reliability of the chip
— Shortens the lifetime of the device
— Increased power consumption
— Higher on-chip temperatures
— High Voltage operation
— High frequency switching
voltage levels
— EM resistance can be increased by
alloying with Copper
— Controlling temperature by using a
thermal-aware IC design methodology
— DFM techniques that reduce variability
— Besides, need to be aware of “dishing” effect (CMP)
[Figure: interconnect layers METAL5 and METAL6]
• Types of EM checks
— Related to Currents
1. Average EM checks
2. RMS EM checks
3. Peak EM checks
— Related to Nets
1. Signal EM checks
2. Power EM checks
— Limits for all these EM checks will be specified in technology file as a
function of minimum life of the device, depending on the application
— All three Current related EM checks need to be satisfied for Signal EM
unless otherwise specified
— For Power nets, satisfying Average EM numbers would suffice
• EM failure mechanisms
– Timing Failure: Narrowing of the wire will increase wire resistance, which
may cause a timing failure if a signal can no longer propagate within the
clock period
– Functional Failure: Electromigration will continue until the wire
completely breaks, allowing no further current flow and resulting
in functional failure
• EM Rule Types
— Metal Layer based (This was the only rule used in older technologies)
— Metal length or width dependent EM Rules
— Length and width of upper and bottom Metal and also depends on Via width
— Complex rules with polynomials
• Black’s Equation
Mean Time To Failure (MTTF): $t_{50} = C \, J^{-n} \, e^{E_a / (kT)}$
— t50 = the median lifetime of the population of metal lines subjected to EM
— C = a constant based on metal line properties (depends on cross sectional
area)
— J = the current density (Jdc < 1 – 2 mA/µm²)
— n = integer constant from 1 to 7; many experts believe that n = 2
— T = temperature in kelvin
— k = the Boltzmann constant
— Ea (Activation Energy) = 0.5 - 0.7 eV for pure Al
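Black's equation is most useful for comparing lifetimes, because the constant C cancels in a ratio. The sketch below evaluates the equation with C = 1 (an arbitrary placeholder), n = 2 and Ea = 0.6 eV taken from the ranges above; the operating temperatures of 85 °C and 105 °C are assumptions chosen for illustration.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def black_mttf(c, current_density, n, ea_ev, temp_k):
    """Black's equation: t50 = C * J^(-n) * exp(Ea / (k * T))."""
    return c * current_density ** (-n) * math.exp(ea_ev / (BOLTZMANN_EV_PER_K * temp_k))


# C is set to 1.0 because only lifetime ratios are examined here.
base = black_mttf(1.0, 1.0, 2, 0.6, 358.15)        # J = 1 (arbitrary units), 85 C
double_j = black_mttf(1.0, 2.0, 2, 0.6, 358.15)    # current density doubled
hotter = black_mttf(1.0, 1.0, 2, 0.6, 378.15)      # same J at 105 C

print(f"lifetime ratio when J doubles (n = 2): {base / double_j:.1f}x shorter")
print(f"lifetime ratio for +20 K:              {base / hotter:.2f}x shorter")
```

Both current density and temperature reduce the median lifetime, which is why EM limits in the technology file are tied to the required device life.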
Antenna Effect
• Antenna Effect
— A phenomenon of charge accumulation in metal segments that are connected to
an isolated Gate (Poly) during the metallization process
— This phenomenon occurs during processing, so it is also known as Process Antenna Effect
(PAE)
— It occurs when a conducting net acts as an antenna, amplifying the charge effect
— The conductive layers receive the charge, hence the term Antenna Effect
• Impact in the design
— If the area of the layer connected directly to
the Gate is large, the accumulated static charges are discharged
through the Gate; the discharge can damage
the oxide that insulates the gate and cause
the chip to fail
— Fowler-Nordheim (F-N) tunneling current will
discharge through the thin oxide and cause
damage to it
[Figure: Charge accumulation & discharging on Poly]
Courtesy: Semiwiki.com
• Remedies for PAE
— Assigning higher metal layers for routing
▪ Higher metal layers will not be connected
directly to the Gate
▪ Connect various metals
through Via connections
— Inserting Jumpers
▪ If PAE is in lower layers then PAE can be reduced
by connecting it to higher layers through Jumpers
▪ Jumpers will reduce the peripheral metal length,
which is attached to the Gate
— Connecting Antenna diode
▪ If it is in higher layers, a Jumper won't be a solution,
hence diodes are needed
▪ As soon as extra charge is induced onto metal/ poly
the diode diverts the extra charges to the substrate
▪ But for buffer insertion higher metal layers have to
come down to a lower metal layer (M1 or M2) to connect
to the pins of the buffer and go back, and there may
not be enough space for buffer insertion
▪ Antenna checks are run only after routing,
so Buffer insertion may lead to congestion and DRC violations
[Figure: Move the Via to reduce area of Metal 1]
• Antenna Ratio (AR)
– A design rule to prevent charge accumulation during Metal/ Poly-Si layer
etching which limits the area of metal segment connected to the Gate
oxide
– Foundries set a maximum allowable AR for the chips they fabricate
– The AR is defined as the ratio of the plasma-exposed area As,metal to the
gate oxide area Apoly:
$AR = A_{s,metal} / A_{poly}$
Courtesy: eetindia.co.in
• Antenna Effect possibilities example
— Assume a foundry sets a maximum allowable antenna ratio of 500
— If a net has two input gates that each have an area of 1 square
micron, any metal layers that connect to the gates and have an area larger than
1,000 square microns have process antenna violations because they would
cause the antenna ratio to be higher than 500
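The example above can be reproduced with a small check. The function names are hypothetical and the sketch uses the simple area ratio only; foundry rule decks typically use cumulative or per-layer variants.

```python
def antenna_ratio(metal_area_um2, gate_areas_um2):
    """Exposed metal area divided by the total gate area on the net."""
    return metal_area_um2 / sum(gate_areas_um2)


def has_antenna_violation(metal_area_um2, gate_areas_um2, max_ratio=500.0):
    """True when the ratio exceeds the foundry limit (500 in the example)."""
    return antenna_ratio(metal_area_um2, gate_areas_um2) > max_ratio


gates = [1.0, 1.0]  # two input gates of 1 square micron each

for metal_area in (800.0, 1000.0, 1200.0):
    ratio = antenna_ratio(metal_area, gates)
    status = "VIOLATION" if has_antenna_violation(metal_area, gates) else "ok"
    print(f"metal {metal_area:6.0f} um^2 -> AR = {ratio:5.0f} ({status})")
```

With 2 square microns of total gate area, 1,000 square microns of metal sits exactly at the limit and anything larger violates it.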
• Antenna (ANT) Rules
— The Antenna Ratio
— For Aluminium at Etching stage (metal deposition)
The top of the metal is protected by a resist during this step, so the
antenna rules for this process should be based on the metal sidewall area
— For Copper at Chemical-Mechanical Polishing (CMP) stage
Charge accumulation occurs during CMP
In this process, the sides of the metal are protected, so the antenna
rules need to be based on the metal's top surface area
— Metal used in the process depends on Technology
— From 28nm onwards Aluminium is replacing Copper
Courtesy: vlsi-asic-soc.blogspot.in
• PAE as a side effect of the manufacturing process
– Plasma etchers/ ion implanters induce charge into various structures
connected to Gate Oxide
– These induced charges destroy the Oxide layer, causing permanent damage
— Conductor layer pattern etching processes
— Amount of accumulated charge is proportional to perimeter length
— Ashing processes
Amount of accumulated charge is proportional to area
Ashing processes remove remaining photo resist layers after etching processes of
a conductor layer
In the late stage of the processes, the area of a conductor layer pattern is
directly exposed to plasma
— Contact etching processes
The amount of accumulated charge is proportional to the total area of the contacts
Contact etching processes dig holes between two conductor layers
In the late stage of the processes, the area of all the contacts on the
lower conductor layer pattern is directly exposed to plasma
Crosstalk
• What is Crosstalk?
— Refers to a signal affecting another signal
being transmitted in vicinity caused by
capacitive/ inductive coupling
— Crosstalk is the unwanted coupling of
energy between two or more adjacent
lines which can change the required
signal and is also termed as Xtalk
— Occurs on long adjacent wires
— Can be interpreted as the coupling of
energy from 1 line to another via:
Mutual Capacitance,
Cm(due to Electric Field)
Mutual Inductance,
Lm (due to Magnetic Field)
Courtesy: synopsys.com
• Impact of Crosstalk in the design
– Functional Failures
▪ Noise induced glitches
▪ If the Glitch duration is comparable to the clock period,
it can have the effect of an extra clock cycle
– Timing violations
▪ If aggressor switches in opposite direction
to the victim : Setup time Violation
▪ If aggressor switches in same direction
to the victim : Hold time Violation
– If the victim line is not terminated at both
ends in its characteristic impedance the
induced spurious signals can reflect at the
ends of the line and travel in the opposite
direction down the line
– Thus a reflected near-end crosstalk can
end up appearing at the far end and vice
versa
[Figure: waveforms illustrating Setup time Violation and Hold time Violation]
• Types of Crosstalk
– Energy that is coupled from the active signal line, the aggressor, onto a
quiet passive victim line so that the transferred energy "travels back" to
the start of the victim line. This is known as backward or near-end
crosstalk
– Energy that is coupled from the active signal line, the aggressor, onto a
quiet passive victim line so that the transferred energy "travels
forward" to the end of the victim line. This is known as forward or far-end
crosstalk
– Inductive Coupling: current is induced in the opposite direction only
– Capacitive Coupling: coupled current flows in both directions
Courtesy: basebandhub.com
• Remedies to avoid Xtalk
— It is a 3-dimensional problem, so height,
width and length matter
— Noise/Bump violations can be fixed by
changing the spacing between critical nets
— Shield the clock nets (critical nets) from
other nets by ground lines
— Net Re-ordering
▪ Avoid routing the critical nets in parallel
for long distances
— Modify the clock net (critical nets) minimum
width from normal value to a larger one
▪ This makes the router skip a grid near
the clock net to prevent spacing violation
▪ This technique not only reduces crosstalk,
but also gives a lower resistance due to larger line width &
less side wall capacitance
— Can be fixed either by upsizing (increasing the drive
strength of) the victim driver, or by downsizing
(decreasing the drive strength of) the aggressor driver
Soft Errors
• Soft Error (Random Particle Error)
— A soft error is an erroneous
change in the logical state of a circuit element, and
can be caused by several effects, including
fluctuations in signal voltage, noise in the
power supply, inductive coupling effects etc.,
but the majority of soft errors are caused by
cosmic particle strikes on the chip
— With technology scaling, even low-energy
particles can cause Soft Errors
— Soft errors are radiation induced faults which happen due to a particle hit, either
by an alpha particle from impurities in packaging material or a neutron from
cosmic rays
— When particles strike the silicon substrate they create hole-electron pairs which
are then collected by PN-Junctions via drift and diffusion mechanisms
— This collected charge creates a transient current pulse and if it is large enough, it
can flip the value stored in the state saving element (bit cell, latch etc.)
— These upsets are called Single Event Upsets (SEU)
• Impact in the design
— A soft error can result in incorrect results, segmentation faults, application or
system crash, or even the system entering an infinite loop
— When particle strike happens in combinational circuit, the result is a glitch
which can then propagate to a latch where it could be clocked in and
incorrect data can be latched
• Precautions to avoid Soft Errors
— Radiation Hardening: Technique to reduce the Soft Error rate in digital circuits
— Radiation hardening is often accomplished by increasing the size of transistors
that share a Drain/ Source region at the node
Self-Heating
• If current flows through a wire, heat is generated due to the
resistance of the wire
• Oxide surrounding wires is a thermal insulator, so
heat tends to build up in wires
• Hotter wires are more resistive & become slower
• Wire self-heating is only a negligible effect in the
supply lines on bulk-CMOS ICs
• Self-heating design rules limit AC
current densities for reliability
— Typical limit: JRMS < 1.5 MA/cm² (for Aluminum nets)
— It limits the unavoidable degradation of Electromigration lifetime due
to temperature increase in the current carrying or in any nearby
interconnect
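A self-heating check of this kind can be sketched as an RMS current density calculation compared against the 1.5 MA/cm² limit quoted above. The wire cross-section (0.2 µm by 0.4 µm) and the square-wave current samples are hypothetical values used only to exercise the formula.

```python
import math

J_RMS_LIMIT_MA_PER_CM2 = 1.5  # typical limit quoted for Aluminum nets


def rms(samples):
    """Root-mean-square value of a list of current samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def j_rms_ma_per_cm2(current_samples_a, width_um, thickness_um):
    """RMS current density in MA/cm^2 for a rectangular wire cross-section."""
    area_cm2 = (width_um * 1e-4) * (thickness_um * 1e-4)  # 1 um = 1e-4 cm
    return rms(current_samples_a) / area_cm2 / 1e6


# Bidirectional (AC) signal current modelled as a +/-1 mA square wave.
signal_1ma = [1e-3, -1e-3, 1e-3, -1e-3]
signal_2ma = [2e-3, -2e-3, 2e-3, -2e-3]

for name, samples in (("1 mA", signal_1ma), ("2 mA", signal_2ma)):
    j = j_rms_ma_per_cm2(samples, 0.2, 0.4)
    verdict = "ok" if j < J_RMS_LIMIT_MA_PER_CM2 else "exceeds limit"
    print(f"{name}: J_RMS = {j:.2f} MA/cm^2 ({verdict})")
```

Note that the average of these bidirectional currents is zero, so an average-current EM check alone would not flag them; the RMS value is what captures the heating.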
Cells in ASIC Physical Design
• Special cells are required in IC design to minimize
the possible CMOS issues
• They add more transistors than are necessary for
basic functioning, e.g.,
— To limit the Overshoots and Undershoots
— To protect the components from destruction
— To isolate 2 components by a PN Junction
• Common Special Cells used in CMOS IC Design:
▪ Standard Cells
▪ ICG Cells
▪ Well taps (Tap Cells)
▪ End caps
▪ Filler Cells
▪ Decap Cells
▪ ESD Clamps
▪ Spare Cells
▪ Tie Cells
▪ Delay Cells
▪ Metrology Cells
Standard Cells
• A Standard Cell is a group of transistors and their interconnect
structures that provides a Boolean logic function (e.g., AND, OR,
XOR, XNOR, Inverters) or a storage function (Flip-flop or Latch)
• Std. Cell methodology
has helped designers to
scale ASICs from
comparatively simple
single-function ICs, to
complex multi-million
gate SoCs
• Cell-based methodology
lets the designer focus on
the implementation
(physical) aspects
[Figure: A Standard Cell Layout]
• The cell's Boolean logic function is called its logical
view: functional behavior is captured in the form of a truth
table or Boolean algebra equation (for combinational logic), or
a state transition table (for sequential logic)
• AOIs (AND-OR-INVERTER) provide a way at the gate
level to use fewer transistors than separate ANDs and a NOR
• ASIC design logic builds upon a standard logic cell
library; therefore, designers optimize logic gates, not individual transistors
• Types of Standard Cells
— Buffers (Inverting and Non-inverting )
— Combinational (AND, OR, NAND, NOR, AOI, OAI, OA, AO, MUX)
— Arithmetic (XOR, full-adder, half-adder), Sequential (latches, clock-
gates, D-type flip/flops with any optional combination of scan input,
set and reset)
— Miscellaneous (ICG Cells, Well Taps, Tie Cells, End Caps, Decaps, Filler
Cells, Spare Cells, Delay Cells, Antenna Diode, ESD diodes)
ICG Cells
• Integrated Clock Gating Cells (ICG Cells)
— During idle modes, the clocks can be gated-off to save dynamic power dissipation on flip-flops
— Proper circuit is essential to achieve a gated clock state to prevent false glitches on clock path
— Use a combination of AND and a Latch to avoid any glitches on the clocks. A glitch can propagate a false
edge on to the design
• Insertion of ICG
— Manual insertion of ICG
The clock gating can be implemented through logic circuits and ICG’s
Most of Clock Gating Cells from vendor libraries have a RTL code
— Automated Insertion of ICG –
Some power aware tools insert the ICG’s
through automated software algorithms
• Types of Clock Gating Cells
— Latch Based Clock Gating Buffer for Neg-edge
▪ The circuit employs a latch and an OR gate with one input inverted
▪ The output clock is always gated HIGH when Enable is low
— Latch Based Clock Gating Buffer for Pos-edge
▪ The circuit employs a latch with inverted clock input and an AND gate
▪ The output clock is always gated LOW when Enable is low
• ICG module IO’s
— 3 input ports – clock, clock enable and test
— 1 output port – gated clock
[Figures: Latch Based Clock Gating Buffer Negedge; Latch Based Clock Gating Buffer Posedge]
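The behaviour of the latch-plus-AND clock gate for positive-edge flops can be sketched with a small behavioural model. This is an illustrative Python model, not library RTL; the class and function names are hypothetical, and the test input is simply OR-ed with the enable.

```python
class PosedgeICG:
    """Latch-based clock gate: active-low latch on the enable, then an AND gate."""

    def __init__(self):
        self.latched_enable = 0

    def step(self, clk, enable, test=0):
        # The latch is transparent only while the clock is low, so the
        # enable seen by the AND gate cannot change during the high phase.
        if clk == 0:
            self.latched_enable = enable | test
        return clk & self.latched_enable


def run(waveform):
    """Apply (clk, enable) samples in order and collect the gated clock."""
    icg = PosedgeICG()
    return [icg.step(clk, enable) for clk, enable in waveform]


# Enable falls during a high phase (sample 3) and rises during a
# later high phase (sample 6); neither change reaches the output mid-pulse.
waveform = [(0, 1), (1, 1), (1, 0), (0, 0), (1, 0), (1, 1), (0, 1), (1, 1)]
gated = run(waveform)
print(gated)  # [0, 1, 1, 0, 0, 0, 0, 1]
```

Because the enable is sampled only while the clock is low, the gated clock never shows a truncated or partial pulse, which is the glitch protection the latch provides over a bare AND gate.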
Well Taps
• Physical-only cells which tie the N-Wells to VDD and the Substrate
to GND, and thus avoid latch-up possibilities
• Switching circuits dump current into Well/ Substrate and if there is a high
resistance between Well/ Substrate and the VDD/ GND grids the Substrate
can be at different potential than VDD/ GND which causes latch-up
• Well Tap Cells reduce the resistance between
the VDD/ GND rails and the Wells/ Substrate
• Tap Cells are usually placed on the
Power Rails of the Standard Cells
• Standard Cells do not have an internal
tap to the N-well (P substrate process), to
reduce the design complexity of Standard Cells
• These library cells do not have any
signal connectivity
• Hence tapping of the Wells is done by external
cells called "Tap cells" which are sprinkled
all over the Core Area at a regular distance as decided by the foundry
• More Taps reduce resistance but also increase core area, so a
trade-off is needed, for which the foundry provides guidance
• The tool places well taps at regular intervals throughout the design at the
specified distances and snaps them to legal positions
Courtesy: design-reuse.com
End Caps
• End-cap cells are preplaced physical-only cells required to meet
certain design rules; they are placed at the ends of the site rows to satisfy well
tie-off requirements for the core rows
• These library cells do not have any signal connectivity
• They connect only to the power and ground rails once power
rails are created in the design
• They also ensure that gaps do not occur between the well and
implant layers i.e. well proximity effect
• This prevents DRC violations by satisfying well tie-off requirements
for the core rows
• Each end of the core row, left and right, can have only one end
cap cell specified
• However, you can specify a list of different end caps for inserting
horizontal end cap lines, which terminate the top and bottom boundaries of
objects such as macros
• End caps have a fixed attribute and cannot be moved by
optimization steps
• A core row can be fragmented (contains gaps), since rows do not
intersect objects such as power domains. For this, the tool places end cap cells
on both ends of the un-fragmented segment
Filler Cells
• Physical only cells which provide N-Well
continuity and avoid N-Well spacing DRC
• Filler cells are inserted for density rules, to meet
Core Utilization targets and to avoid sagging of layers
• Filler cells are inserted at the last stage
of Placement and Routing
• Some of the small cells also don’t have the
Bulk/Substrate connection because of their
small size (thin cells)
— In those cases, the abutment of cells through inserting
Filler Cells can connect those Substrates of small cells to VDD/ GND
nets
— i.e. those thin cells can use the bulk connection of the other cells
• Filler cells are used to make up the Poly density (if the filler
cell has any poly structure inside), but certainly not for metal
density
• Filler cells are also useful for ECO
[Figure: Filler Cell Layout]
Decap Cells
• Decaps are on-chip decoupling capacitors (Extrinsic Capacitances) that are
attached to the power mesh to decrease noise effects (dynamic IR Drop)
• Supply voltage variations caused by Instantaneous Voltage Drop
(IVD) lead to problems related to spurious transitions and delay variations
• Decap cells are typically poly gate transistors where source and drain are
connected to the ground rail, and the gate is connected to the power rail
• Decaps help to smooth out Glitches and Ground bounce
• 3% to 8% of the core physical area is required for Decaps, referred to as decap density
• It is important to place only the necessary amount of decaps since they
come with a serious downside: they are leaky devices
• Another drawback, which many designers ignore, is the interaction of the
decap cells with the package RLC network
— Since the die is essentially a capacitor with very small R and L, and the package is a
huge RL network, the more decap cells placed the more chance of tuning the circuit into its
resonance frequency. That would be trouble, since both VDD and GND will be oscillating
• NMOS Decaps are superior to PMOS decaps because of the high frequency
operation and large REFF and CEFF for the same area
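The interaction between decap capacitance and package inductance can be sketched with the ideal LC resonance formula f = 1 / (2π√(LC)). This ignores resistance (damping) entirely, and the values of 1 nH for the package and 10 nF or 40 nF for the die are assumptions chosen only to show the trend.

```python
import math


def resonance_hz(inductance_h, capacitance_f):
    """Ideal LC resonance frequency: f = 1 / (2 * pi * sqrt(L * C))."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance_h * capacitance_f))


PACKAGE_L_H = 1e-9  # assumed package inductance

f_before = resonance_hz(PACKAGE_L_H, 10e-9)  # die capacitance without extra decaps
f_after = resonance_hz(PACKAGE_L_H, 40e-9)   # die capacitance after adding decaps

print(f"10 nF: {f_before / 1e6:.1f} MHz")
print(f"40 nF: {f_after / 1e6:.1f} MHz")
```

Quadrupling the capacitance halves the resonance frequency in this model, so added decap shifts the package and die resonance and may move it toward a frequency at which the design draws significant current.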
[Figure: CMOS Decap]
ESD Clamps
• ESD Clamp/ ESD Diode is the primary protection device that protects
against ESD surges at the I/O pad by clamping the voltage and allowing the
high ESD current to be discharged safely to the ground terminal
• The main function of ESD Clamp is to protect the Gate oxide
• A Snapback device (Diode implementation between the grounds)
shunts the ESD voltage (Snapback voltage) to ground, so that the ESD current
does not reach the Gate
• The design of ESD Clamp must ensure that Electrical Overstress
(EOS) events do not cause failure
• The ESD Clamp is essential for HBM, MM, and CDM
Courtesy: renesas.eu
Spare Cells
• Pre-placed inactive (with inputs tied off) gates in the empty areas
of a design (or even in the crowded areas) before tape-out (Mostly NAND Gates)
• ECO Cells/ Spare Cells are a collection of Gates of different
sizes used for small functional ECOs; connecting them requires minimal mask
changes, called a metal-only ECO
• Provide new functions on a design which exhibits
post-production problems
• No change is made to the diffusion
layer; only M1 and a contact layer
need to change
• Disadvantages:
— They are connected to VSS and VDD and
despite having their inputs tied off, they
are still drawing Static Current
— The designer may not have the right cell
in the right place at the time of the ECO
43
Courtesy: design-reuse.com
Cells in ASIC Physical Design
Tie Cells
• Tie-high and Tie-low cells are used to connect the
Gate of a transistor to either Power or Ground
• In deep sub-micron processes, if the Gate is connected directly
to Power/ Ground, the transistor might be turned
ON/ OFF by Power or Ground Bounce
• The foundry therefore suggests using Tie Cells for
this purpose
• Cells which require VDD connect to
Tie-high (so Tie-high is a Power Supply Cell), while
cells which require VSS connect to Tie-low
• Without Tie Cells, unused inputs are tied to logic-high
or logic-low, and these connections are made by routing
the input pin directly to the Power/ Ground grid
• With Tie Cells, unused inputs in the original netlist
are tied to logic-high or logic-low, and somewhere during
the physical design process Tie Cells are inserted
— The unused inputs are then connected to a
Tie-high or Tie-low Cell
Figure: Tie-up Cell and Tie-down Cell (Input, Output)
Cells in ASIC Physical Design
Delay Cells
• Delay cells
— Are buffer cells with slower transition time
— Can drive high currents
— Are helpful in reducing Slew Rate (0-1 or 1-0 Transition Time)
— Are of wider channel
— Have delays ranging from 20 ps to a few nanoseconds
— Will have constant delay
• Delay cell insertion is the conventional way to fix hold time
violations, but it tends to be penalized by an increase in area
• Fewer delay cells than buffers are required for hold time fixing,
but their area is much greater than that of normal
buffers
• Increasing gate width increases drive current and hence
reduces delay, but it also increases gate capacitance and leakage
• A delay cell has an inverter at the input and an inverter at the output, and in between
these two inverters it has a combination of an inverter and pass
transistors. The pair of inverter and pass transistor provides the large delay
• Depending on the delay of the cell, the pair of inverter and
pass transistor can be repeated multiple times
Cells in ASIC Physical Design
Metrology Cells
• Enable the reliable reproducibility of micro-scale
devices manufactured in high volume and at low cost
• Measure and monitor the process parameters
during manufacturing
• Allow the effect of process variations during fabrication to
be identified and measured
IO Design
IO Pads
• Input Output Pads
— Input/ Output circuits (I/O Pads) are
intermediate structures connecting
internal signals from the core of the
integrated circuit to the external
pins of the chip package
— Typically I/O pads are organized into
a rectangular Pad Frame
— The input/output pads are spaced
with a Pad Pitch
— Pads will have pins on all metal layers
used in design for easy access while
routing the design
— Number of layers depends on
technology
— Multiple Power Pads are often used to reduce the power
— Pads contain logic cells such as level shifters and buffers, which
control the voltages of input and output signals and increase/
decrease drive strength
IO Design
IO Pads
• Structure of Pads
— Bonding Pad
Area to which the bond wire is soldered
The wire goes from the bonding pad to a chip pin
— ESD (Electrostatic Discharge) protection circuitry consisting of a pair of
large PMOS and NMOS transistors in a reverse-biased diode structure
— Driving and Logic Circuitry, for which part of the pad area is designated
Courtesy: ece.ucdavis.edu
IO Design
IO Pad Design
• Implementation Guidelines
— Isolate sensitive asynchronous inputs such as Clock or Bidirectional Pins from
other switching pads with Power/Ground Pads
— Group Bidirectional Pads together such that all are in the same (input or output) mode
— Avoid placing simultaneously switching pads next to each other
— 2 extra pins = 1 extra pad on 2 sides and 4 extra pins = 1 extra pad on each
side
— Power supply pads must be evenly distributed
— The number of Power Pads required is calculated based on the IO Signal
Pads power requirement and the Core Power requirement (IR drop limit)
— No. of IO Power Pads required in a design,
Thumb Rule: One Pair of Power Pads for every 4 or 6 Signal Pads
— No. of Core Power Pads required in a design,
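The IO power pad rule of thumb above can be sketched as a small calculation. This is only an illustration: the function name is hypothetical, and rounding up to a whole pad pair is an assumption not stated in the slides.

```python
import math

def io_power_pad_pairs(num_signal_pads, signals_per_pair=4):
    """Rule-of-thumb number of IO power/ground pad pairs for the given signal pads."""
    if signals_per_pair <= 0:
        raise ValueError("signals_per_pair must be positive")
    # Assumption: a partially filled group of signal pads still gets its own pair.
    return math.ceil(num_signal_pads / signals_per_pair)

pairs_conservative = io_power_pad_pairs(100, signals_per_pair=4)  # one pair per 4 signal pads
pairs_relaxed = io_power_pad_pairs(100, signals_per_pair=6)       # one pair per 6 signal pads
print(pairs_conservative, pairs_relaxed)
```

The actual count should still be checked against the IO power requirement and the IR drop limit mentioned above.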
IO Design
IO Pad Design
• Pad Limited design
— The area of Pad limits the size of Die
— The number of IO pads is larger, or the pads are larger in size
(technology dependent)
— Pad limited designs pose several challenges
for design implementation and for the
backend designers if Die area is a constraint
— The solution would be to use Flip Chip or
Staggered IO placement techniques
• Core Limited Design
— The area of Core limits the size of Die
— The number of IO Pads is smaller
— In these designs Inline IOs will be used
— This can be due either to a large number of Macros in the design or to a large amount of logic
• Types of Pads according to Logic directions
— Input Pad
— Output Pad
— Bidirectional Pad
IO Design
Types of IO Pad
• Types of Pads according to Logic Styles
— Signal Pads
— Power Pads (Core Power and IO Power)
— Corner Pads
Corner pads contain only connections
in all metal layers defined in the technology
These pads are used only for IO Ring continuity,
for chip metal density at the corners, and to maintain yield
— Filler Pads
IO Filler Cells contain only the geometrical information of the Power Rings
in all metal layers
They maintain the continuity of the Power Rings, which is responsible for uniform distribution
of power
Electrostatic Discharge protection
IO Design
Types of IO Pad
• According to the Pad locations
— Peripheral IO Pads
— Area IO Pads
• Inline IO Pads
— Pads are placed next to each other,
with the corresponding bond
pads lined up against each other
having a small gap in between
— Minimum Pitch is determined by foundry/vendor and is technology dependent
Figure: Inline IO Pads
Courtesy: edaboard.com
IO Design
Types of IO Pad
• Staggered IO Pads
— CUP (Circuit-Under-Pad)
Bonding Pad is placed over the IO body itself
Bonding Pad has to be connected to
the PAD Pin of the IO
Pad pin is located close to the center
Figure: Staggered IO Pads (Inner PAD, Outer PAD, BUMPS)
Courtesy: eetimes.com
Delay Models
Delay Models
• Delay Calculation
— The delay calculation is needed because of complex Input Capacitance,
Voltage Drop, Voltage Islands, High Impedance nets etc.
— Delay calculation parameter data is stored in Lookup-Table (LUT) format
• Delay Models
— Interconnect Delay Models
Lumped RCL Delay Models
Wire Load Delay (WLD) Model
Elmore Delay Model
Arnoldi Delay Model
— Cell Delay Models
Non-Linear Delay Model (NLDM)
Scalable Polynomial Delay Model (SPDM)
Effective Current Source Model (ECSM)
Composite Current Source (CCS) Delay Model
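As an illustration of one of the interconnect models listed above, the Elmore delay of an RC ladder sums, for each series resistance, that resistance times all the capacitance downstream of it. The sketch below assumes a simple unbranched wire split into segments; the function name and the segment values are hypothetical.

```python
def elmore_delay(resistances, capacitances):
    """Elmore delay (seconds) to the far end of an unbranched RC ladder.

    resistances[i]  : series resistance of segment i, in ohms
    capacitances[i] : capacitance to ground at the node after segment i, in farads
    """
    if len(resistances) != len(capacitances):
        raise ValueError("need one capacitance per resistance")
    delay = 0.0
    downstream_cap = sum(capacitances)
    for r, c in zip(resistances, capacitances):
        delay += r * downstream_cap   # each resistor charges everything after it
        downstream_cap -= c           # the next resistor no longer sees this node
    return delay

# Hypothetical wire: 300 ohm and 30 fF in total.
distributed = elmore_delay([100.0] * 3, [10e-15] * 3)  # three equal segments
lumped = elmore_delay([300.0], [30e-15])               # single lumped RC
print(f"distributed: {distributed * 1e12:.1f} ps, lumped: {lumped * 1e12:.1f} ps")
```

For the same total R and C, the single lumped RC gives a larger (more pessimistic) delay than the segmented ladder.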
Delay Models
Interconnect Delay Models
Delay Models
Cell Delay Models
• Non-Linear Delay Model (NLDM)
— Modeled as a linear voltage ramp in series with a resistor
— Less accurate
— Less run time
— Intermediate values are interpolated
— Assumes load is purely capacitive
— Variation may range anywhere from 5-10%
— Linear k-factors required for handling of IR-drop, Delay
— Transition times are functions of Input slew and Output load
• Effective Current Source Model (ECSM)
— Models a unique dataset for each Voltage-Temperature combination
— Improved accuracy
— Easy to characterize
— Increased characterization
— Smooth non-linear interpolation/extrapolation
— Can't use for memory or complex cell characterization
— Models IR Drop non-linearly
— Data characterized for three voltage corners
• Scalable Polynomial Delay Model (SPDM)
— Models the delay and slew values as a function of voltage and temperature
— SPDM is a polynomial abstract
— Less accurate
— More runtime
— Extra characterization setup required
— Increased characterization time
— Extrapolation is unreliable
— SPDM requires elaborate curve fitting techniques for an accurate curve fit
• Composite Current Source (CCS) Delay Model
— Modeled as current waveform from a time varying current source
— More accurate
— More run time
— Extra setup required for characterization
— CCS libraries are huge in size
— Addresses the effects of deep submicron processes
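The NLDM points about lookup tables, interpolation of intermediate values, and dependence on input slew and output load can be sketched with a bilinear interpolation. This is a minimal illustration, not the algorithm of any particular tool; the function name, axis values and table entries are hypothetical.

```python
from bisect import bisect_right

def nldm_lookup(slews, loads, table, slew, load):
    """Bilinear interpolation in a 2-D table indexed by input slew (rows) and output load (columns)."""
    def bracket(axis, x):
        # Index i with axis[i] <= x <= axis[i + 1], clamped to the table range.
        # Points outside the range are therefore extrapolated linearly.
        i = bisect_right(axis, x) - 1
        return max(0, min(i, len(axis) - 2))

    i = bracket(slews, slew)
    j = bracket(loads, load)
    ts = (slew - slews[i]) / (slews[i + 1] - slews[i])
    tl = (load - loads[j]) / (loads[j + 1] - loads[j])
    low_slew = table[i][j] * (1 - tl) + table[i][j + 1] * tl
    high_slew = table[i + 1][j] * (1 - tl) + table[i + 1][j + 1] * tl
    return low_slew * (1 - ts) + high_slew * ts

# Hypothetical 2x2 cell-delay table (ns): rows = input slew (ns), columns = output load (pF).
SLEWS = [0.1, 0.5]
LOADS = [0.01, 0.05]
DELAY = [[0.10, 0.20],
         [0.15, 0.30]]

mid_delay = nldm_lookup(SLEWS, LOADS, DELAY, slew=0.3, load=0.03)
print(f"{mid_delay:.4f} ns")
```

A query that falls on a table point returns the stored value, and a query midway between the four points returns their average.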
ECO
ECO
• Engineering Change Order (ECO)
— Technique to add/ remove the logic with minimum modifications in
the design
— To deliver the product to market as fast as possible with minimum
Risk-to-Correctness and Schedule
— For fixing post Synthesis/ Route/ Silicon issues
— Fixing both timing and functionality issues
— Spare Cells placed in the design are used for ECO
— A Logic Gate/ Flip-flop can be realized using these Spare Cells
— Gates of different flavors and of the required drive strength can be realized
anywhere in the design
— Only Metal/ Contact changes are needed to fix the defective
design
ECO
Types of ECO
• Post Synthesis ECO (ECO after Synthesis)
— ECO with Synthesized Netlist
• Post Route ECO (ECO after P&R)
— Applies to a minute change in the design after full Tape-out is over
— Uses Spare cells and metal layers only
— Metal Layer ECO
Applies to a minute change in RTL after Active/Base Layer Tape-out is over
Metal Layer changes only
Cleans up routing for Signal Integrity (SI)
— Active Layer ECO (Base Layer ECO)
Applies to a minute change in RTL just after Routing is over
Uses Spare Cells
A NAND Gate (Universal Logic Gate) based Spare Cell can be used to realize the
new ECO logic
• Post Silicon ECO (ECO after Fabrication)
— To recover from minute manufacturing issues
— Uses Spare cells and metal layers
ECO
Types of ECO
• Metal Layer ECO (example)
Types of Standard Cell Libraries
Semiconductor
World-wide Sales
Common Infrastructure
Outline
• The Discontinuity and its classification
• DFM
— Issues, Need & Rules
— Resolution Enhancement Techniques
— Optical Proximity Correction and Scattering Bars
— Multiple Patterning
— Phase Shift Masking and Off-Axis Illumination
• MC/MM/OCV
— Corner Analysis
— PVT/RC Corners
— Temperature Inversion & Cross Corner Analysis
— Modes of Analysis
— Multi-corners/ Multi-modes of Analysis
Discontinuity: Classification
Figure: Discontinuity classification (labels: Cell Variation, Interconnect Variation, Pinching & Bridging, Non-preferred direction routing, Unnecessary Jogs, Wire Spreading, No. of Gates, Die size, No. of Applications, Library Modes, Design Constraints)
Courtesy: mentor.com
Discontinuity: Classification
Figure: Process and Operational Variations (labels: Process Variations, Operational Variations, Multiple Design Modes, Constraint Merging, Multi VT Cell Variations, Crosstalk, Guardbanding, Overdesign, IR Drop, Library Modes, Interconnect Variations, BTI, PVT Corners, Multiple .libs, Multiple Voltage Operation, Metal Thickness, Width (SPB), Dishing & Erosion, Multiple RC Tech files, Voltage Islands with OCV)
Courtesy: eetimes.com
Why DFM?
• Need for DFM
— Current lithographic techniques
(193 nm laser) cannot print
deep-submicron technology
patterns without distortion
— Higher design complexity and
shrinking device geometries
— More devices per unit area on
a chip (device density)
• Importance of DFM
— The impact of variations, if not
addressed in the design, will
cause manufacturing issues such
as poor yields, long yield ramp-up
times and poor reliability
— The chips may completely miss the market window, or may hit the market
window but not be economically viable
— The chips may still function, but not at the required/expected speed
— The chips may appear to be reliable after volume production, but may suffer
catastrophic failures in the field earlier than their expected life-cycle
Solutions
• DFM: Recommendations
— Wire Spreading
Wire spreading distributes wires that
are on the same metal layer as well as across different metal
layers
The benefits of lower routing
density are improved manufacturing yield and reduced
crosstalk noise, crosstalk delay and random particle defects
— Metal Fill
Dummy metal fill
Timing aware metal fill
Unbalanced metal density across a chip may cause yield loss, so fill the empty
spaces in the design with metal wires to meet the metal density rules required by most fabrication
processes
Improved surface planarity helps decrease manufacturing variations that
contribute to timing variability
Courtesy: si2.org
Solutions
• DFM: Recommendations
— Hot Spots and Critical Area Analysis (CAA)
▪ Hot Spot/ Critical Area is the region in which the center of a random defect must fall
to cause circuit failure (yield loss)
▪ By analyzing the critical areas, defect-limited yield can be estimated based on
the probability of the failures of vias and point defects on routing
▪ The larger the defect size, the larger the Critical Area
▪ Critical area reduction improves yield
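The link between critical area and yield can be illustrated with the Poisson yield model, Y = exp(-A_crit · D0). This model is one common choice and is not named in the slides; the function name, the areas and the defect density below are hypothetical.

```python
import math

def poisson_yield(critical_area_cm2, defect_density_per_cm2):
    """Defect-limited yield under the Poisson model: Y = exp(-A_crit * D0)."""
    return math.exp(-critical_area_cm2 * defect_density_per_cm2)

D0 = 0.2                              # hypothetical defect density, defects per cm^2
yield_before = poisson_yield(0.50, D0)  # critical area before DFM fixes
yield_after = poisson_yield(0.40, D0)   # critical area after wire spreading / redundant vias
print(f"before: {yield_before:.3f}, after: {yield_after:.3f}")
```

Under this model, any reduction in critical area at a fixed defect density raises the estimated yield.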
Solutions
• DFM: Recommendations
— Chemical-Mechanical Polishing (CMP) is a surface-smoothing and
material-removal process used to obtain a globally planar wafer surface
— Simultaneous polishing of copper, dielectric and barrier
— Combination of chemical and mechanical interactions
The chemical effect by pH regulators, oxidizers or stabilizers
The mechanical action by submicron sized abrasive particles contained in the slurry flow
between the polishing pad and the wafer surface
— Dishing
▪ Difference between the height of the copper in the trench and the height of the
dielectric surrounding the copper trench
▪ Copper dishing is greater for wider copper lines or spacing
▪ It can thin the wire or pad, causing higher-resistance wires or lower reliability bond pads
— Erosion
▪ Difference between the dielectric thickness before CMP and after
▪ CMP Dielectric erosion is higher for higher density
▪ Erosion can result in a sub-planar dip on the wafer surface, causing short-circuits between adjacent
wires on next layer
— On-Chip Variation (OCV) from the interconnect thickness variation due to CMP
becomes relatively larger and needs to be taken into consideration in the
post-layout RC extraction and timing flow
— The solution to CMP issues is CMP hotspot detection and fixing
Solutions
• DFM: Recommendations
— CMP aware-design
Various degrees of Copper Dishing and Dielectric Erosion occur at
different densities and metal line widths
In advanced nodes minimal material removal with atomically flat and
clean surface finish has to be achieved
CMP is influenced by line width and pattern density
The dishing and erosion increase slowly as a function of increasing density
and go into saturation when the density is more than 0.7
Oxide erosion and copper dishing can be controlled by area filling and
metal slotting
Courtesy: embedded.com
Solutions
• DFM: Recommendations
— Redundant Via
Redundant Vias use two or more
Vias to connect the upper and lower routing layers
together
Replacing single Vias with
redundant (or double) Vias on signal nets improves
reliability and reduces yield loss due to via failures
Critical Area Analysis (CAA) identifies
the requirement of Redundant Vias
— Resolution Enhancement Techniques (RET)
RET are methods used to modify photo-masks to compensate for limitations in the
lithographic processes used to manufacture the chips
RET have significantly increased the cost and complexity of
nanometer photomasks
The photomask layout is no longer an exact replica of the design layout
As a result, reliably verifying RET synthesis accuracy, structural integrity,
and conformance to mask fabrication rules is crucial for the manufacture of nanometer
regime VLSI designs
Solutions
• DFM: Recommendations
— Litho Process Check (LPC)
▪ Problem: Some DRC clean layouts do not print on silicon
▪ Solution: Must-have litho hotspot detection and fixing of design
— Layout Dependent Effects
▪ Well Proximity Effect (WPE)
▪ Poly Spacing Effect (PSE)
▪ Length of Diffusion (LOD)
▪ OD to OD Spacing Effect (OSE)
▪ Layout Patterning Check (LPC)
▪ OD/Poly Density
Resolution Enhancement Techniques
• Types of RET
— Optical Proximity Correction (OPC)
— Scattering Bars (SB)
— Double Patterning (DP) or Multiple Patterning
— Phase Shift Masking (PSM)
— Off-axis Illumination (OAI)
Courtesy: eetimes.com
Optical Proximity Correction
• Optical Proximity Correction (OPC)
— OPC is a Photo-lithography Enhancement technique commonly used to
compensate the mask pattern for image errors due to diffraction or
process effects (by reducing the value of the k1 factor in CD equation)
— OPC is an effective way to deal with geometry distortion from design to chip;
however, it does come at a price
— First, there is the cost of the EDA tools you need to implement the OPC
corrections
— Second, you have an exponential increase in volume of the data representing
the chip's layout, along with a huge increase in the time it takes to process
this data and prepare it for photo-mask generation
Scattering Bars
• Scattering Bars (SB)
— Sub-resolution assist features that improve the depth of focus of isolated
features
— Scattering Bars are added only for the outermost line of the dense pattern
Courtesy: tf.uni-kiel.de
Multiple Patterning
• Multiple Patterning
— Involves decomposing the design across
multiple masks to allow the printing of
tighter pitches
— 38 nm features are the limit of the current
lithographic process, which uses 193 nm light
with water-immersion lithography
— Multiple Patterning is a technique used in
the lithographic process that can create
features smaller than 38 nm at advanced
process nodes
— Multiple patterning basically changes the
value of k1 in the Critical Dimension equation
— Double Patterning
⬧ Double patterning counters the effects of diffraction in optical lithography
⬧ Diffraction effects make it difficult to produce accurately defined deep sub-micron
patterns using existing light sources and conventional masks
⬧ Diffraction effects blur sharp corners and edges, and some small
features on the mask won’t appear on the wafer at all
⬧ Double patterning is expensive because it uses two masks to define a layer that was
defined with one at previous process nodes
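The Critical Dimension equation referred to above is the Rayleigh criterion, CD = k1 · λ / NA. The sketch below is illustrative only: the numerical aperture of 1.35 and the k1 values are assumptions chosen for the example, and treating double patterning as halving the effective k1 is a simplification.

```python
def critical_dimension_nm(k1, wavelength_nm, numerical_aperture):
    """Rayleigh criterion: CD = k1 * wavelength / NA (result in nm)."""
    return k1 * wavelength_nm / numerical_aperture

WAVELENGTH_NM = 193.0  # ArF laser wavelength, as in the slides
NA_IMMERSION = 1.35    # assumed numerical aperture for water immersion

single_exposure = critical_dimension_nm(0.27, WAVELENGTH_NM, NA_IMMERSION)
double_patterning = critical_dimension_nm(0.27 / 2, WAVELENGTH_NM, NA_IMMERSION)
print(f"single: {single_exposure:.1f} nm, double patterning: {double_patterning:.1f} nm")
```

With these assumed values the single-exposure result lands near the 38 nm limit quoted above, and splitting the pattern across two masks roughly halves the printable feature size.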
Courtesy: spectrum.ieee.org
Phase Shift Masking
• Phase Shift Masking (PSM) (not considered in PD)
— Phase-shift masks are photo-masks that take advantage of the interference generated by
phase differences to improve image resolution in photolithography
— Controlling the phase enables constructive or destructive interference at desired locations
in the image plane, thus sharpening or dulling the contrast as desired
— These are photo-masks with structures that manipulate not only the amplitude of the
transmitted waves but also their phase
— CD control and resolution are improved by etching quartz from certain areas of the mask
(alt-PSM) or by replacing Chrome with a phase-shifting Molybdenum Silicide layer
(attenuated embedded PSM)
— There exist alternating and attenuated phase shift masks
— Types of masks
⬧ Conventional (binary) mask, Alternating phase-shift mask, Attenuated phase-shift mask
Off-Axis Illumination
• Off-Axis Illumination (OAI) (not considered in PD)
— Off-axis illumination is one of the practical techniques to enhance the resolution of a given
optical system, with the added advantage of improved depth of focus
— The specific illumination geometry is designed to enhance the contrast in the wafer plane of
the photo-mask features whose dimensions are most critical
— With OAI, the resolution of a given system can be improved without going to a shorter
wavelength or higher numerical aperture (NA)
— This technique basically has no on-axis illumination component, as opposed to partial
coherence
— The shape and size of the source plays an important role when different conditions of mask
features such as density and orientation are considered
— To obtain the highest resolution, illumination of the photo-mask is not performed by a disc-
shaped source
— The angular distribution of the illumination beam may have a complex structure, such as an
annulus, a set of off-axis circles, or even a continuously varying profile
Corner
• Corner
— Characterizes the physical environment for Timing Analysis
— An extreme point in the PVT/ RC space where cell and net delays
have extreme values
— One particular cell library and RC model specified for an STA run
— Corners are meant to capture variations in the manufacturing
Process, along with expected variations in the Voltage and
Temperature of the environment in which the chip will operate
— Corners are independent of functional settings
— As technology shrinks, variations increase, since smaller
geometries have higher variability
— As a result the number of Corners and Derates also grows
Corner
• Corner
— It is important to find the minimum number of Corners, because run-time
and Turn Around Time increase with the number of Corners
— E.g. run only slow metal at SS for Maximum Frequency
— Also, each Corner needs its own OCV timing margins
— The more Corners are used, the more pessimistic the timing signoff
Corner
• Corner
— At each global Corner the Die experiences
—External Voltage (like Minimum, Maximum, Typical)
—Temperature (like Minimum, Typical, Maximum)
—Process Shifts in (independent)
▪ Transistors (Slow: SS, Typical: TT, Fast: FF or mixed SF & FS)
▪ Interconnects (4 RC-extremes and RC-typical and Via Minimum, Maximum,
Typical - Capacitance/ Resistance)
— Vias are independent and not practically correlated with RC-wire models
— Possible Vias models: VRCBEST, VCBEST, VRCWORST, VCWORST, VRCTYP
— Total number of Corners =
$\{P: SS, FF, TT\} \times \{V: Min., Max., Typ.\} \times \{T: Min., Max., Typ.\} \times \{RC: RC_{BEST}, C_{BEST}, RC_{WORST}, C_{WORST}, RC_{TYP}\}$
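The corner count implied by the product above can be enumerated directly. This is a small sketch; the list names and the label strings are illustrative, and via corners and mixed SF/FS process corners are left out.

```python
from itertools import product

PROCESS = ["SS", "FF", "TT"]
VOLTAGE = ["Vmin", "Vmax", "Vtyp"]
TEMPERATURE = ["Tmin", "Tmax", "Ttyp"]
RC = ["RCBEST", "CBEST", "RCWORST", "CWORST", "RCTYP"]

# Full cross product of process, voltage, temperature and RC extraction corners.
corners = list(product(PROCESS, VOLTAGE, TEMPERATURE, RC))
print(len(corners))   # 3 * 3 * 3 * 5
print(corners[0])
```

The size of this full cross product is the reason a reduced set of dominant corners is normally chosen for signoff.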
Figure: Parametric Variations in the Wafer (Intra-die, Inter-die)
Need for Corner Analysis
Impact in a Wafer
Courtesy: abelite-da.com
Corner Analysis
• RC Corners
—CBEST
It has minimum capacitance, so it is also known as the CMIN corner
Interconnect resistance is larger than at the typical corner
This corner results in the smallest delay for paths with short nets and can be used for min-path-analysis
—CWORST
Refers to the corner which results in maximum capacitance, so it is also known as the CMAX corner
Interconnect resistance is smaller than at the typical corner
This corner results in the largest delay for paths with short nets and can be used for max-path-analysis
— RC-BEST
Refers to the corners which minimize interconnect RC product. So also known as RC-MIN corner
Typically corresponds to smaller etch which increases the trace width. This results in
smallest resistance but corresponds to larger than typical capacitance
Corner has smallest path delay for paths with long interconnects and can be used for min-
path-analysis
—RC-WORST
Refers to the corners which maximize interconnect RC product. So also known as RC-MAX corner
Typically corresponds to larger etch which reduces the trace width. This results in largest
resistance but corresponds to smaller than typical capacitance
Corner has largest path delay for paths with long interconnects and can be used for max-
path-analysis
— Typical
This refers to nominal value of interconnect Resistance and Capacitance
Courtesy: abelite-da.com
Temperature Inversion
• Temperature Inversion Dependence
— A problem first described by Vassilios Gerousis of Infineon Technologies
in 2003
— Current, $I = K \cdot \mu \cdot (V_{GS} - V_{TH})^2$; where mobility ($\mu$) and Threshold Voltage
($V_{TH}$) are functions of Temperature
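The competing temperature effects in the current equation can be sketched numerically. This is a toy model only: the mobility exponent of -1.5, the threshold-voltage temperature coefficient of 1 mV/K, the 0.45 V threshold at 300 K and the function name are all illustrative assumptions, not characterized values.

```python
def drive_current(vdd, temp_k, vth_300k=0.45, vth_tempco=1.0e-3, k=1.0):
    """Square-law current I = K * mu(T) * (VGS - VTH(T))**2 with VGS = VDD (arbitrary units)."""
    mobility = (temp_k / 300.0) ** -1.5               # mobility falls as temperature rises
    vth = vth_300k - vth_tempco * (temp_k - 300.0)    # threshold voltage falls as temperature rises
    overdrive = max(vdd - vth, 0.0)
    return k * mobility * overdrive ** 2

COLD_K, HOT_K = 233.15, 398.15   # -40 C and 125 C

for vdd in (1.8, 0.7):
    cold = drive_current(vdd, COLD_K)
    hot = drive_current(vdd, HOT_K)
    faster = "cold" if cold > hot else "hot"
    print(f"VDD={vdd} V: I(cold)={cold:.3f}, I(hot)={hot:.3f} -> faster when {faster}")
```

With these assumed parameters the high-voltage case is faster when cold (mobility dominates), while the low-voltage case is faster when hot (the falling threshold voltage dominates), which is the inversion described above.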
Courtesy: infineon.com
Cross Corner Analysis
• Cross Corners
— The consequence of Temperature Inversion is that the actual worst case for delay
can occur at a temperature different from the highest temperature
— E.g., as high-VT, low-leakage cells get colder they do not speed up in the way that
circuits built around faster low-VT transistors do
— The reason is that, unlike in older technologies, the Process, Voltage,
Temperature (PVT) condition with the highest temperature can no longer be assumed to be the
worst condition for synthesis and P&R timing closure
— As a result the worst corner is not always easy to predict thus we need Cross
Corners to identify the worst corner
— The designers have to take into account the libraries corresponding to the lowest
temperature PVT due to the temperature inversion effects
• The Two Corner Analysis
—Late (setup) analysis at weak, minimum voltage, high temperature conditions
—Early (hold) analysis at strong, maximum voltage, low temperature conditions
Modes of Analysis
• Modes
— A Mode is defined as an operational setting of the chip
— Mode is linked to a unique set of timing constraints
— Mode can be associated with a set of corners to include only real
combinations
— Mode data is found in .sdc
MC/MM Analysis
• Scenarios
— A severely limited set of Corner/Mode views that combine the worst-case
parameters, used to run multiple extraction/timing analyses
— A Mode, a Corner, or a combination of both that is analyzed and optimized
— E.g. Functional Mode - Slow Corner (func_setup_ss_0.9v_125c)
— E.g. Logic BIST Mode - Fast Corner (lbist_hold_ff_1.1v_m40c)
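The way scenarios are built from modes and corners, keeping only the real combinations, can be sketched as follows. The naming pattern mirrors the two examples above; the data structures, the mapping of slow corner to setup and fast corner to hold, and the set of required combinations are hypothetical.

```python
from itertools import product

MODES = ["func", "lbist"]
CORNERS = [("ss", "0.9v", "125c"), ("ff", "1.1v", "m40c")]
CHECK_FOR_PROCESS = {"ss": "setup", "ff": "hold"}  # assumption: slow corner for setup, fast for hold

# Hypothetical list of (mode, process) pairs that must actually be signed off.
REQUIRED = {("func", "ss"), ("lbist", "ff")}

def scenario_name(mode, corner):
    """Build a name of the form <mode>_<check>_<process>_<voltage>_<temperature>."""
    process, voltage, temperature = corner
    return f"{mode}_{CHECK_FOR_PROCESS[process]}_{process}_{voltage}_{temperature}"

all_scenarios = [scenario_name(m, c) for m, c in product(MODES, CORNERS)]
scenarios = [scenario_name(m, c) for m, c in product(MODES, CORNERS) if (m, c[0]) in REQUIRED]
print(scenarios)
```

Filtering the cross product down to the required pairs is what keeps the scenario count, and therefore run time, manageable.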
MC/MM Analysis
• Multi Corner (MC)/ Multi Mode (MM) Analysis (Multi-Scenario)
— A technique intended to provide high confidence results for timing and other
metrics without performing exhaustive simulation of all possible IC
conditions
— MCMM is needed because of multiple dominant corners
— MCMM eliminates the situation where a Hold fix in one mode can break the
Setup in the other Modes
— MCMM helps to avoid switching between different Corners/Modes to fix
Setup/Hold violation
— Avoids over fixing/ under fixing a Hold
violation in a particular Corner
— Reduces Hold buffer count
— Reduce number of manual timing ECOs
— Faster design closure
— Helps in reducing the pessimistic margins and
so is also called Design-for-Variability (DFV)
— Performed as concurrent analysis & optimization
— Multi-corner analysis examines the effects of process and environmental
variations, as well as changes caused by shifts into different operating modes
— MCMM is the terminology used by Synopsys & MMMC is the terminology used by
Cadence
OCV
• On-Chip Variation (OCV)
—On-chip variation (OCV) is a recognition of the intrinsic variability of
semiconductor processes and their impact on factors such as logic timing
—The number of contributors to timing variability has increased and led to
significant variations not just between wafers but across individual wafers
and increasingly intra-die
—One visible effect is ICs from one batch of wafers being ‘slow’ or ‘fast’ relative to nominal
estimates
—Initially, timing analysis accounting for OCV was handled by telling the STA tool to
apply a global margin (derate) across the entire chip using a percentage or delay
estimate that the designer or the foundry considered safe
—Timing variation was primarily a consequence of subtle shifts in manufacturing
conditions that would lead to ICs from one batch of wafers being ‘slow’
or ‘fast’
—OCV provides a single derating factor for all instances, so the results can be
grossly optimistic or pessimistic
—So OCV may lead to performance degradation while closing the timing
—OCV handles global variations with Corners (best-case, nominal, and
worst-case combinations)
—The biggest challenge in OCV is handling the local uncorrelated
variables
OCV Derating
• Derating
— Derating is a way to model slow and fast signals in On-Chip-Variation
(OCV)
— It is extra pessimism added in Static
Timing Analysis in order to account for
On-Chip Variation effects
— 10% derate in simple terms means,
over designing the timing by 10%
— So that chip will work at the desired
frequency, even if there is a variation
effect across the die
— Scaling factors can be set independently
for data paths, clock paths, cell delays,
net delays, and cell timing checks
— Early and late derates applied to launch
paths and capture paths depending upon Setup/Hold Analysis
— Maximum and minimum derating means multiplying the original timing
library delay values by the derate value
— Derating decreases as the process matures
E.g. For 65nm designs, 15% derates were added in the early days, but nowadays
only 5% derates need to be added
Courtesy: eetimes.com
OCV Timing Checks
• Scaling factors can be set independently for data paths,
clock paths, cell delays, net delays and cell timing checks
• Early and late derates applied to Launch Paths and
Capture Paths depending upon Setup/Hold Analysis
• Setup Check with OCV
— Maximum possible data arrival is determined by taking the maximum
delays along the clock path to the start-point register and the maximum
delays along the slowest data path from the start-point register to the
endpoint register
— The earliest possible clock arrival at the end-point register is determined
by taking the minimum delays along the clock path to the end-
point register
• Hold Check with OCV
— For hold check, we use min delays for the clock path to the start-point
register, min delays through the shortest data path, and max delays for
the clock path to the end-point register

Setup               Hold
Late data rise      Early data rise
Late data fall      Early data fall
Early clock rise    Late clock rise
Early clock fall    Late clock fall
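The setup and hold checks described above can be sketched with a single late and a single early derate. This is a simplified illustration: real tools allow separate derates for clock, data, cell and net delays, and the function names and delay values (in ns) are hypothetical.

```python
def setup_slack(launch_clk, data_max, capture_clk, period, t_setup, late=1.1, early=0.9):
    """Setup: late launch clock + late (slowest) data against early capture clock."""
    arrival = (launch_clk + data_max) * late
    required = period + capture_clk * early - t_setup
    return required - arrival

def hold_slack(launch_clk, data_min, capture_clk, t_hold, late=1.1, early=0.9):
    """Hold: early launch clock + early (fastest) data against late capture clock."""
    arrival = (launch_clk + data_min) * early
    required = capture_clk * late + t_hold
    return arrival - required

# Hypothetical path: 1.0 ns clock insertion delay on both launch and capture sides.
s = setup_slack(launch_clk=1.0, data_max=3.0, capture_clk=1.0, period=5.0, t_setup=0.2)
h = hold_slack(launch_clk=1.0, data_min=0.5, capture_clk=1.0, t_hold=0.1)
print(f"setup slack = {s:.2f} ns, hold slack = {h:.2f} ns")
```

Setting both derates to 1.0 recovers the slack without OCV margin, which makes the cost of the derates easy to see.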
OCV Enhancements
• Statistical OCV (SSTA modeling)
—Statistical OCV (SOCV) is a simplified approach to SSTA that uses a single local
variable as Derate
— It is also referred to as Parametric OCV (POCV)
— It takes elements of SSTA and implements them in a way that is
less compute-intensive
—It solves the major limitations of AOCV, including variation dependency on
slew and load and the assumption that the same cell, or load, is in the path
— It combines delay variations in Cells, Wires and Vias
— It promises near SSTA accuracy for a small additional cost of runtime
and memory compared to AOCV
— It can include signoff-accurate signal integrity (SI) analysis
— Handles DPT and some other dynamic effects in a conservative static way
— It ignores correlations and number of timing paths
— SOCV is much more accurate than AOCV, especially for graph-based analysis
— SOCV can be validated with SPICE Monte Carlo Analysis
CRPR/ CPPR
• Common Path Pessimism (CPP)
—Applying different derating for the Launch and Capture Clock is overly
pessimistic
—The Clock Tree will be at only one PVT condition, either as a maximum path or as a
minimum path (or anything in between) but never both at the same time
—CPP is the delay difference along the common portion of the Clock Tree due to
different deratings for Launch and Capture Clock Paths
—Pessimism caused by different derating factors applied on the common part of
the Clock Tree is called Common Path Pessimism (CPP)/ Clock Re-convergence
Pessimism (CRP) which should be removed during the analysis
CRP or CPP = (maximum clock delay or skew) - (minimum clock delay or skew)
• Common Path Pessimism Removal (CPPR) or Clock
Re-convergence Pessimism Removal (CRPR)
—Both CPPR and CRPR are removal of artificially introduced pessimism between
the Launch Clock Path and the Capture Clock Path in timing analysis
—CPPR - terminology by Cadence
—CRPR - terminology by Synopsys
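The pessimism on the common clock segment and its removal can be sketched as below. The sketch assumes single late and early derates applied to the shared portion of the clock tree; the function names and the numbers (in ns) are hypothetical.

```python
def crpr_credit(common_path_delay, late=1.1, early=0.9):
    """Pessimism on the shared clock segment: late-derated delay minus early-derated delay."""
    return common_path_delay * late - common_path_delay * early

def slack_with_crpr(raw_slack, common_path_delay, late=1.1, early=0.9):
    """Add the common-path credit back to a slack computed with launch/capture derates."""
    return raw_slack + crpr_credit(common_path_delay, late, early)

# Hypothetical path: 0.6 ns of clock tree shared by the launch and capture clocks.
credit = crpr_credit(0.6)
corrected = slack_with_crpr(raw_slack=-0.05, common_path_delay=0.6)
print(f"credit = {credit:.2f} ns, corrected slack = {corrected:.2f} ns")
```

In this example a path that appears to violate timing before pessimism removal meets timing once the credit for the shared segment is restored.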
Thank You