ECE260B - CSE241A Winter 2017 Floorplanning and Partitioning
Winter 2017
Website: http://vlsicad.ucsd.edu/courses/ece260b-w17/
ECE 260B – CSE 241A Floorplanning and Partitioning 1 Andrew B. Kahng, UCSD
Physical Design Flow Overview
Physical Design Flow – Pictures!
[Figures: floorplanning, power planning, placement, routing]
Step 0 – Architecture analysis
Understand architecture of the target design
Core-limited floorplan
Pad-limited floorplan
From H. Kaeslin textbook
Step 3 – Macrocell placement
Macro cell placement
[Figure: example macro placement of block1 through block7]
Floorplanning Input
Design netlist
Not necessarily final netlist, but including macros, IOs
Area requirements
Die size, package style, BEOL layer stackup
Power requirements
Peak power per block, voltage islands, level shifters, power gating /
voltage scaling strategy, block placement guidance (e.g., close to power
supply pins)
Timing constraints (budgeted across top-level blocks)
Logical and/or physical hierarchy information
Datapath, control, memory, …
Structured-custom vs. ASIC; hierarchical vs. flat
IP integration requirements
Supply, isolation,
Pinout (= I/O placement for wirebond; redistribution layer in flip-chip)
SSO, ESD constraints
Sense of whether design is pad-limited or core-limited
Pad-limited: low utilization, muxing of IOs, use of pads for signals vs.
power-ground distribution, …
Core-limited: high utilization, routability, macro placement
Floorplanning Output
Per-block areas
In a hierarchical flow
IOs placed
Macros placed
Power domains created
Level shifters and power switches
Power grid designed and pre-routed
Standard cell placement regions defined
Placement guidance
Placement groups; assignment of placement groups to regions
Pre-placements
Soft blockages (forbidden to some types of cells)
Hard blockages (keep-outs)
- E.g., to preserve routability near a hard macro
Floorplan Picture
Modern SoC: many memories, heavy power network
Blocks
Blocks are inside a pad frame
Hard = defined outline = fixed (H,W)
Soft = defined area, but (H,W) flexible
Semi-soft = discrete set of (H,W) pairs
Shapes: rectangular, L, T, rectilinear
Pin locations defined
Can rotate, mirror
Routing inside, between blocks
Sometimes, over blocks
Floorplanning of different-sized blocks is harder than place and route of
standard cells
Block placement is done by hand
Issues:
- access to power supply (power-hungry blocks)
- alignment of power grid to supply pins
- soft blockage / “halo” to ensure routability
- leave contiguous region for std-cell P&R
- buffer sites for nets that want to get around a macro
- data flow
[Figure: pad frame with I/O pads enclosing blocks, a RAM, a data path, and std cell rows. Courtesy K. Yang, UCLA]
Size Estimation in Standard-Cell Blocks
Why we care about size …
If area is too small: P&R will not finish or meet timing, will run too long
Schedule and size are inversely related (size will win out for high-volume
production – and everyone hopes that their chip will be high-volume…)
Performance and size have a complex relationship
[Figure: physical design trade-off among performance, schedule (design time), and size]
Automated Floorplanning
No automated floorplanning tool has ever made it to “prime
time”, but such a tool remains…
… one of several “holy grails” for the implementation flow
… the subject of MANY academic research papers
Issues
How should a floorplan be represented ?
- Completeness: should be able to represent all possible floorplans
- Efficiency: conversion between representation and actual realization
- Redundancy: not good to evaluate two floorplans, only to find they are same
- Nonoverlapping “packing” of rectilinear shapes, or ? (Kahng, ISPD-2000)
How do we search over the space of feasible floorplan representations?
- Often, simulated annealing (= a “metaheuristic”) is used
- Need a ‘perturbation’ or ‘neighborhood operator’ that induces a smooth cost
landscape
What is the optimization objective ?
- Area (whitespace minimization), wirelength, …
- Dataflow aware?
What are the optimization constraints ?
- Pre-placements, timing, routability, …
Simulated Annealing (SA)
Kirkpatrick, Gelatt, Vecchi, Science (1983): One of the most cited
scientific papers ever
SA is one of many “metaheuristics” that are used to deal with instances of
intractable (NP-hard) combinatorial problems
Genetic algorithms (Holland, U. Michigan)
Tabu search (Glover, U. Colorado)
Etc.
Combinatorial optimization has a physical analogy to the annealing (slow
cooling) of metals to produce a perfectly-ordered, minimum-energy state:
a “state” is a “solution”, “energy” is “cost”, etc.
Basic idea
Step 1: Initialize – Start with a random initial solution. Initialize high “temperature”.
Step 2: “Move” – Perturb current solution to obtain a ‘neighbor’ solution
Step 3: Calculate cost change – calculate the change in solution cost due to
the move (minimization: negative change is better, positive change is worse)
Step 4: Accept/Reject – Depending on the cost change, accept or reject the
move. Probability of acceptance depends on current “temperature”.
Step 5: Update – Update temperature, current solution. Go to Step 2.
Continue until termination condition (‘freezing’ or ‘quenching’) is satisfied
SA Pseudocode
http://www.ecs.umass.edu/ece/labs/vlsicad/ece665/slides/SimulatedAnnealing.ppt
Algorithm SIMULATED-ANNEALING
Begin
  temp = INIT-TEMP;
  currentSol = INIT-SOLUTION;
  for i = 1 to M
    candidateSol = NEIGHBOR(currentSol);
    ΔC = COST(candidateSol) - COST(currentSol);
    if (ΔC < 0) then
      currentSol = candidateSol;
    else with Pr = e^(-ΔC/temp)
      currentSol = candidateSol;
    temp = SCHEDULE(temp);
End
What happens when temp = +∞ ?
What happens when temp = 0 ?
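The pseudocode above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the course's reference implementation: the name `simulated_annealing`, the geometric cooling schedule (factor `alpha`), the best-so-far bookkeeping, and the toy cost function are all illustrative choices that do not appear on the slide.

```python
import math
import random


def simulated_annealing(init_sol, cost, neighbor, init_temp=10.0,
                        alpha=0.95, num_iters=2000, seed=0):
    """Minimize cost(sol), following the SIMULATED-ANNEALING pseudocode.

    Assumptions (not from the slide): SCHEDULE(temp) = alpha * temp, and the
    best solution seen so far is remembered and returned.
    """
    rng = random.Random(seed)
    current, current_cost = init_sol, cost(init_sol)
    best, best_cost = current, current_cost
    temp = init_temp
    for _ in range(num_iters):
        candidate = neighbor(current, rng)
        candidate_cost = cost(candidate)
        delta = candidate_cost - current_cost
        if delta < 0:
            accept = True                      # improving move: always accept
        elif temp > 0:
            # Worsening (or neutral) move: accept with Pr = exp(-delta / temp).
            accept = rng.random() < math.exp(-delta / temp)
        else:
            accept = False                     # temp == 0: greedy descent only
        if accept:
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
        temp = alpha * temp                    # SCHEDULE(temp)
    return best, best_cost


# Toy problem (hypothetical): minimize (x - 3)^2 over the integers,
# where a "move" steps x by +1 or -1.
best, best_cost = simulated_annealing(
    50,
    cost=lambda x: (x - 3) ** 2,
    neighbor=lambda x, rng: x + rng.choice((-1, 1)))
```

In this sketch, `init_temp=0` accepts only improving moves (greedy descent), while a very large temperature drives `exp(-delta / temp)` toward 1, so nearly every move is accepted.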
Simulated Annealing Facts
Is cooling the best strategy with finite
time? See Boese/Kahng, 1993
Slicing Floorplan Representation (Otten, 1982)
A slicing floorplan can be recursively cut in two without cutting any blocks
A “wheel” is an example of a non-slicing floorplan
[Figure: example slicing floorplan with Polish expression 16H35V2HV74HV; label: Chains]
Realization of Slicing Floorplans
Floorplanning (classically) is difficult for at least two reasons
Blocks have bounded or discrete aspect ratio (AR) = max (H/W, W/H)
Non-overlapping constraint: minimum area = minimum “dead space”
Discrete sizing
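The aspect-ratio bound and the dead-space objective above can be made concrete with a short sketch. The function names and the numbers in the usage lines are hypothetical. The width range follows from H = area / W: requiring H/W <= max_ar and W/H <= max_ar gives sqrt(area / max_ar) <= W <= sqrt(area * max_ar).

```python
import math


def aspect_ratio(w, h):
    """AR = max(H/W, W/H), as defined on the slide; always >= 1."""
    return max(h / w, w / h)


def soft_block_width_range(area, max_ar):
    """Return (w_min, w_max) for a soft block of fixed area with AR <= max_ar."""
    if area <= 0 or max_ar < 1:
        raise ValueError("need area > 0 and max_ar >= 1")
    return math.sqrt(area / max_ar), math.sqrt(area * max_ar)


def dead_space(bbox_w, bbox_h, block_areas):
    """Bounding-box area not covered by any (non-overlapping) block."""
    return bbox_w * bbox_h - sum(block_areas)


# Hypothetical numbers: a soft block of area 16 with AR bounded by 4.
w_min, w_max = soft_block_width_range(16, 4)
```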
Realization of Slicing Floorplans
What is the implied area
of a slicing tree?
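For hard blocks, one way to answer this question is to evaluate the slicing tree bottom-up from its postfix (Polish) expression. The sketch below makes two assumptions that the slide does not state: H denotes a horizontal cut (children stacked vertically) and V a vertical cut (children side by side), and the block dimensions in `dims` are made-up values. Only the expression 16H35V2HV74HV comes from the slides.

```python
def slicing_bbox(polish, dims):
    """Return (width, height) of the floorplan encoded by a postfix expression.

    polish: iterable of tokens; 'H' and 'V' are cut operators, anything else
            is a block name looked up in dims.
    dims:   dict mapping block name -> (width, height) of a hard block.
    """
    stack = []
    for tok in polish:
        if tok in ("H", "V"):
            if len(stack) < 2:
                raise ValueError("malformed Polish expression")
            w2, h2 = stack.pop()
            w1, h1 = stack.pop()
            if tok == "H":   # assumed: horizontal cut, children stacked
                stack.append((max(w1, w2), h1 + h2))
            else:            # assumed: vertical cut, children side by side
                stack.append((w1 + w2, max(h1, h2)))
        else:
            stack.append(dims[tok])
    if len(stack) != 1:
        raise ValueError("malformed Polish expression")
    return stack[0]


# Hypothetical block dimensions (width, height) for the seven blocks.
dims = {"1": (2, 1), "6": (2, 1), "3": (1, 2), "5": (1, 2),
        "2": (2, 1), "7": (1, 2), "4": (1, 1)}
width, height = slicing_bbox("16H35V2HV74HV", dims)
implied_area = width * height
block_area = sum(w * h for w, h in dims.values())
```

The difference `implied_area - block_area` is the dead space of this realization. With soft or semi-soft blocks, each stack entry would become a list of candidate (width, height) shapes instead of a single pair.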
Comments on Scalability, Time Constants
Chip implementation requires substantial resources
Time, hardware, people
Human engineer’s time constants
Interactive (real-time in layout editor, or < 2 minutes to get cursor back)
Cup of coffee (< 15 minutes)
Lunch (< 1 hour)
Overnight (< 12 hours)
Other costs and bounds
Until recently: 4GB addressable memory limit on 32-bit machines
Tool costs: SOC Encounter ($700K), PrimeTime ($100K), Design Compiler
($150K), etc. for 1-year time-based license
Engineer costs: ~$1K per workday
Instance sizes double with each technology node
But processors are not twice as fast per node
Parallel processing, more shortcuts
- E.g., #moves SA must consider per second constrains move set, cost function…
Reminder: internal course document “How To Start Using Tools Efficiently”
(read this!)
Sequence Pair Floorplan Representation
Hypergraphs in VLSI CAD
Circuit netlist represented by hypergraph
[Figure: example netlist hypergraph; #vertices = 48, etc.]
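As a small illustration of the hypergraph view, cells can be treated as vertices and each net as a hyperedge over the cells it connects. The sketch below counts the nets cut by a bipartition. The net names, cell names, and the partition are hypothetical.

```python
def cut_size(nets, side):
    """Count nets (hyperedges) that have pins on both sides of a bipartition.

    nets: dict mapping net name -> list of cells (vertices) on that net.
    side: dict mapping cell -> 0 or 1.
    """
    return sum(1 for pins in nets.values()
               if len({side[v] for v in pins}) > 1)


# Hypothetical 5-cell, 4-net netlist and a bipartition of its cells.
nets = {"n1": ["a", "b", "c"], "n2": ["c", "d"],
        "n3": ["a", "d", "e"], "n4": ["b", "c"]}
side = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1}
cut = cut_size(nets, side)
```

Note that a multi-pin net counts once toward the cut no matter how its pins are split, which is what distinguishes hypergraph cut from graph cut on a clique expansion.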
Fiduccia-Mattheyses (FM) Approach
Pass:
start with all vertices free to move (unlocked)
label each possible move with immediate change in cost that it
causes (gain)
iteratively select and execute a move with highest gain, lock the
moving vertex (i.e., cannot move again during the pass), and
update affected gains
best solution seen during the pass is adopted as starting solution
for next pass
FM:
start with some initial solution
perform passes until a pass fails to improve solution quality
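The pass structure above can be sketched as follows. This is a simplified illustration, not the linear-time algorithm: gains are recomputed from scratch at every step instead of being kept in gain buckets, and the balance rule (`max_imbalance`, the allowed difference in side sizes) is an assumption added here so that the pass does not move every vertex to one side. All names and the example netlist are hypothetical.

```python
def cut_size(nets, side):
    return sum(1 for pins in nets if len({side[v] for v in pins}) > 1)


def gain(v, nets_of, nets, side):
    """Immediate reduction in cut size if v alone switches sides."""
    g = 0
    for n in nets_of[v]:
        same = sum(1 for u in nets[n] if side[u] == side[v])
        other = len(nets[n]) - same
        if same == 1 and other > 0:
            g += 1          # v is alone on its side: the net becomes uncut
        elif other == 0 and len(nets[n]) > 1:
            g -= 1          # net is entirely on v's side: it becomes cut
    return g


def fm_pass(nets, side, max_imbalance=2):
    """One pass: move each vertex at most once, return best solution seen."""
    side = dict(side)
    nets_of = {v: [] for v in side}
    for n, pins in enumerate(nets):
        for v in pins:
            nets_of[v].append(n)
    size = [sum(1 for s in side.values() if s == k) for k in (0, 1)]
    locked = set()
    cur_cut = cut_size(nets, side)
    best_cut, best_side = cur_cut, dict(side)
    while True:
        # Legal moves: unlocked vertices whose move keeps the sides balanced.
        legal = [v for v in side if v not in locked and
                 abs((size[side[v]] - 1) - (size[1 - side[v]] + 1))
                 <= max_imbalance]
        if not legal:
            break
        v = max(legal, key=lambda u: gain(u, nets_of, nets, side))
        cur_cut -= gain(v, nets_of, nets, side)
        size[side[v]] -= 1
        side[v] = 1 - side[v]
        size[side[v]] += 1
        locked.add(v)       # v cannot move again during this pass
        if cur_cut < best_cut:
            best_cut, best_side = cur_cut, dict(side)
    return best_side, best_cut


def fm(nets, side, max_imbalance=2):
    """Repeat passes until a pass fails to improve the cut."""
    best_cut = cut_size(nets, side)
    while True:
        new_side, new_cut = fm_pass(nets, side, max_imbalance)
        if new_cut >= best_cut:
            return side, best_cut
        side, best_cut = new_side, new_cut


# Hypothetical netlist: two triangles {a,b,c} and {d,e,f} joined by net (c,d).
nets = [["a", "b"], ["b", "c"], ["a", "c"],
        ["d", "e"], ["e", "f"], ["d", "f"], ["c", "d"]]
start = {"a": 0, "b": 0, "c": 1, "d": 0, "e": 1, "f": 1}
final_side, final_cut = fm(nets, start)
```

Because moves with negative gain are still executed within a pass, the search can climb out of some local minima. Only the best prefix of moves is kept as the starting solution for the next pass.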
Gain Bucket Data Structure
[Figure: gain bucket array indexed from -pmax to +pmax, with cells 1, 2, …, n]
- each object is assigned a gain
- objects are put into a sorted gain list
- the object with the highest gain from the larger of the two sides is selected and moved
- the moved object is "locked"
- gains of "touched" objects are recomputed
- gain lists are resorted
[Figure: step-by-step animation of an FM pass showing vertex gains after each move, ending with a plot of cut versus moves]
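A gain bucket structure along these lines can be sketched as below. This is a simplified illustration: the class and method names are hypothetical, and each bucket is a Python set instead of the linked list usually used for constant-time removal. `pmax` is taken to be the maximum number of nets on any cell, so every gain lies in [-pmax, +pmax].

```python
class GainBuckets:
    """Array of buckets indexed by gain, with a pointer to the max gain."""

    def __init__(self, pmax):
        self.pmax = pmax
        self.buckets = [set() for _ in range(2 * pmax + 1)]
        self.gain_of = {}
        self.max_gain = -pmax - 1          # sentinel: no cell inserted yet

    def insert(self, cell, gain):
        if not -self.pmax <= gain <= self.pmax:
            raise ValueError("gain out of range")
        self.buckets[gain + self.pmax].add(cell)
        self.gain_of[cell] = gain
        self.max_gain = max(self.max_gain, gain)

    def remove(self, cell):
        gain = self.gain_of.pop(cell)
        self.buckets[gain + self.pmax].discard(cell)

    def update(self, cell, delta):
        """Move a 'touched' cell to the bucket for its new gain."""
        new_gain = self.gain_of[cell] + delta
        self.remove(cell)
        self.insert(cell, new_gain)

    def pop_max(self):
        """Remove and return (cell, gain) with highest gain, or None if empty."""
        # Walk the max-gain pointer down past empty buckets.
        while (self.max_gain >= -self.pmax
               and not self.buckets[self.max_gain + self.pmax]):
            self.max_gain -= 1
        if self.max_gain < -self.pmax:
            return None
        cell = self.buckets[self.max_gain + self.pmax].pop()
        del self.gain_of[cell]
        return cell, self.max_gain


# Hypothetical usage: three cells, then one gain update after a move.
gb = GainBuckets(pmax=2)
gb.insert("a", -1)
gb.insert("b", 2)
gb.insert("c", 0)
first = gb.pop_max()      # highest-gain cell is selected (and thus locked)
gb.update("a", +2)        # a "touched" cell gets its gain recomputed
second = gb.pop_max()
```

In a full FM implementation there would be one such structure per side of the partition, and a popped cell is simply never reinserted during the pass, which is how locking falls out of the data structure.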
Time Complexity of FM
Multilevel Partitioning (since ~1995)
Clustering / Coarsening
Refinement / Uncoarsening
Near-linear scalability
Homework (due Monday Jan 30 in Gradescope)
Q1. (a) What is “fixed-outline floorplanning”? Give a
definition and at least one citation. (b) How does “fixed-
outline floorplanning” differ from the “packing with
minimum whitespace” formulations seen in the academic
literature? (c) Why is “fixed-outline floorplanning” sensible
in the context of modern design of a (large) logic block? (d)
Look at the documentation of Cadence EDI / Innovus. List,
and give explanations of, at least three commands that
define “floorplan” regions that will affect the subsequent
gate-level placement.
Q2. Download and skim U.S. Patent #6,223,329. Think
about Figures 4, 5 and 8. (a) Explain in your own words
how the invention claimed in this patent will define “ports”
of blocks. (b) Why does the patent call the blocks in the
figures “soft blocks”?
Readings linked from class webpage
C. M. Fiduccia and R. M. Mattheyses, "A linear time heuristic for improving
network partitions", Proc. ACM/IEEE Design Automation Conference, 1982,
pp. 175-181.
A. E. Caldwell, A. B. Kahng and I. L. Markov, "Design and Implementation of
the Fiduccia-Mattheyses Heuristic for VLSI Netlist Partitioning", Proc.
Workshop on Algorithm Engineering and Experimentation (ALENEX),
January 1999.
C. J. Alpert and A. B. Kahng, "Recent Directions in Netlist Partitioning: A
Survey", Integration: The VLSI Journal 19 (1995), pp. 1-81.