Simplify To Survive Prescriptive Layouts
ABSTRACT
The time-to-market driven need to maintain concurrent process-design co-development, even in spite of discontinuous
patterning, process, and device innovation is reiterated. The escalating design rule complexity resulting from increasing
layout sensitivities in physical and electrical yield and the resulting risk to profitable technology scaling is reviewed.
Shortcomings in traditional Design for Manufacturability (DfM) solutions are identified and contrasted to the highly
successful integrated design-technology co-optimization used for SRAM and other memory arrays. The feasibility of
extending memory-style design-technology co-optimization, based on a highly simplified layout environment, to logic
chips is demonstrated. Layout density benefits, modeled patterning and electrical yield improvements, as well as
substantially improved layout simplicity are quantified in a conventional versus template-based design comparison on a
65nm IBM PowerPC 405 microprocessor core. The adaptability of this highly regularized template-based design
solution to different yield concerns and design styles is shown in the extension of this work to 32nm with an increased
focus on interconnect redundancy. In closing, the work not covered in this paper, focused on the process side of the
integrated process-design co-optimization, is introduced.
Keywords: Design for Manufacturability (DfM), template-based design, design-technology co-optimization (DTCO),
predictably composable logic, pdBrix
1. INTRODUCTION
It is well known that profitability in the microelectronics industry is driven by a two year cycle in which transistor
density doubles, performance noticeably increases, power consumption drops, and wafer manufacturing cost remains
largely constant. With wavelength scaling reaching its limit at 193nm in the 90nm node, patterning resolution
improvement has been achieved through a series of discontinuous innovations rather than predictable evolutionary
enhancements like wavelength reduction (Fig.1, top). Both computational resolution enhancements, such as off-axis
illumination with sub-resolution assist features and optimized model-based optical proximity correction, as well as
physical resolution enhancements, such as ultra-high numerical aperture lithography enabled through the use of water
immersion, share the common trait of introducing more severe and more complex layout sensitivities in physical and
electrical yield detractors.
Figure 1 CMOS scaling is being challenged by radical innovations in patterning, process, device, and interconnect
technology, yet the pressure to stay on a 2 year node-to-node cycle remains.
Beyond the aforementioned escalation in patterning complexity, the profitability of the microelectronics industry is
further challenged by the fact that dimensional scaling alone is no longer sufficient to achieve the electrical performance
targets of the next technology node. Additional device, interconnect, and process innovations, introduced with every new
node, further add to the unpredictability of layout sensitive detractors to yield ramp (Fig.1, bottom). Yet, even as process
optimization becomes more difficult and layout-specific, the two year node-to-node timetable requires concurrent
technology and design co-optimization. It is simply not possible to develop a new process solution and thoroughly
characterize all its layout sensitivities before starting the respective node’s design work. To have product designs
available when the process is scheduled to yield functional chips, design work has to proceed in parallel to the process
and device optimization work.
Figure 3a Example of lithography variability bands of a first-metal layout as they would be presented to a designer in
interactive model-based DfM.
Figure 3b Illustration of an RDR description of the width-dependent-spacing phenomenon captured in Fig 2. Dots
represent specific allowed line placements.
In contrast to model-based DfM, Restricted Design Rule (RDR) based DfM focuses on preserving, even enhancing,
design efficiency while ensuring yield and performance by eliminating unknown layout sensitivities. Best described as a
design approach based on ‘prescriptive’ design rules rather than the traditional ‘prohibitive’ design rules, RDR-based
design seeks to provide clarity in the design-process handoff by comprehensively defining all allowed feature
placements (4). As shown in Fig.3b, RDRs minimize the complexity of width-dependent spacing rules (Fig.2) by
defining discrete placement options for narrow lines, followed by a more traditional continuous design space with a
single conservative design rule for intermediate width lines, and complete elimination of all lines (with limited design
value) of extremely large dimensions. In a process environment where yield and performance no longer improve
monotonically above the minimum design rule, RDRs provide clear targets for aggressive process optimization. Key to
successful implementation of RDR-based DfM is a close collaboration between the design and process teams in the
initial definition of the RDRs to ensure process development for a limited set of design rules that is actually useful to the
designers (5). While currently the only feasible design-process co-optimization solution, RDR-based DfM still leaves
room for improvement in eliminating design conservatism while ensuring competitive performance and yield.
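The prescriptive character of RDRs can be illustrated with a minimal sketch. It assumes a rule table with three regimes, as described above: a few discrete allowed narrow widths, one continuous intermediate range with a single conservative spacing, and no very wide lines at all. The table name, the function name, and every dimension are hypothetical values chosen for illustration; none are taken from an actual design manual.

```python
# Illustrative prescriptive (RDR-style) width/spacing lookup.
# All dimensions are made-up nanometre values, not real design rules.
HYPOTHETICAL_RDR = {
    "narrow": {90: 90, 110: 100},      # discrete allowed widths -> prescribed spacing
    "intermediate_range": (150, 400),  # continuous range of allowed widths
    "intermediate_spacing": 140,       # single conservative spacing for that range
}


def prescribed_spacing(width_nm, rules=HYPOTHETICAL_RDR):
    """Return the prescribed spacing for a line width, or None if the width is not allowed."""
    if width_nm in rules["narrow"]:
        return rules["narrow"][width_nm]
    low, high = rules["intermediate_range"]
    if low <= width_nm <= high:
        return rules["intermediate_spacing"]
    # Everything else (off-list narrow widths, very wide lines) is outside the design space.
    return None


if __name__ == "__main__":
    for width in (90, 100, 200, 1000):
        print(width, prescribed_spacing(width))
```

A prohibitive rule deck would instead list what is forbidden and leave everything else open. Here any width that is not explicitly listed is rejected, which is what gives the process team a closed set of constructs to optimize.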
The common element of current DfM solutions is that, similar to the original design rules, they focus primarily on the
‘layout space’ after the actual design and before actual process optimization. Therefore, these approaches can only
provide sub-optimal solutions.
Figure 7a Diffusion, poly, and contact lithography variability-bands for pdBrix layout (top) and conventional layout
(bottom)
Figure 7b Range of electrical channel length extracted for each transistor in the two layouts of Fig 7a. Increased
electrical variability as a response to lithography variation due to dose, focus, and mask size is seen in the standard
layout.
2.5. Area Improvement
While the yield and performance benefits of regularized layouts may be well accepted, the biggest barrier to broader
implementation of regularized layout styles is the perceived impact on layout density. The pdBrix design of IBM’s
PowerPC microprocessor showed that it was possible to contain the extremely regularized layout in the same footprint as
the original layout. Further, the total area occupied by sequential logic was identical in the two layout styles and the area
occupied by combinatorial logic actually decreased by 25%. Further analysis of the specific contributors to this 25% area
reduction indicated that:
- 25% of the reduction was achieved through Template and Fabric co-optimization; i.e. elimination of layout
conservatism by implementing construct-specific design rules.
- 5% of the reduction was achieved through the use of design specific complex gates, referred to as Brix,
synthesized from the primitive logic functions rendered in Templates.
- 70% of the reduction was attributed to optimal construction of application-specific logic cell functions; i.e.
eliminating conservatism in layout variants like power levels by synthesizing an application specific library
from a technology node specific set of Templates rather than using a generic set of standard cells designed to
cover all possible power/performance needs.
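As a worked check of the breakdown above, the following sketch converts each contributor's share of the 25% reduction into percentage points of combinatorial-logic area. The dictionary keys and the function name are labels introduced here for illustration only.

```python
# Shares of the total combinatorial-logic area reduction, as listed in the text (percent).
TOTAL_REDUCTION_PCT = 25
SHARES_PCT = {
    "template_fabric_cooptimization": 25,
    "design_specific_complex_gates": 5,
    "application_specific_library": 70,
}


def contribution_points(total_pct, shares_pct):
    """Convert each share of the total reduction into percentage points of area."""
    return {name: share * total_pct / 100 for name, share in shares_pct.items()}


if __name__ == "__main__":
    for name, points in contribution_points(TOTAL_REDUCTION_PCT, SHARES_PCT).items():
        print(f"{name}: {points} percentage points")
```

Under this reading, Template and Fabric co-optimization accounts for 6.25 points of the 25% reduction, the complex gates for 1.25 points, and the application-specific library for 17.5 points.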
3. FABRIC ADAPTABILITY
The 65nm PowerPC design exercise:
a) confirmed the inherent patterning benefits of regularized layouts;
b) demonstrated that an optimally integrated design flow can achieve competitive layout density with a highly
regularized layout style; and
c) showed that it is possible to generate a competitive design from a substantially simplified set of logic
elements.
Based on these positive results, it was decided to continue this work in the 32nm node, driving the project closer to the
leading edge of the technology node to be better aligned with new product design starts. In porting the 65nm fabric to
32nm, three opportunities to align the Fabric more with product specific internal design objectives were identified:
- Running first metal perpendicular to the poly gates and forcing it to be completely uni-directional eliminates all
possibility of multiple diffusion contacts, as shown in Fig.9A. It is well known that redundant diffusion contacts are
preferred in some designs for device performance and yield benefits. Further, the strictly unidirectional metal adds
additional vias for all ‘wrong’ way connections, which is not ideal for processes with via yield challenges.
- Forcing all metal to be equal width (Fig.9B) eliminates the possibility of using wide wires for power distribution,
preferred in some designs to improve reliability on these high current carrying constructs.
- Linking the contacted device pitch (i.e. minimum poly pitch) to the minimum contacted tip-to-tip spacing of first
metal (Fig.9C) creates a scaling problem. While pitch scaling is driven to 30% reduction per node, tip-to-tip spacing
has been scaling at roughly 20% per node, forcing a decoupling of these two constructs in the overall density scaling.
Figure 9 Illustration of 3 specific layout considerations in the original pattern-count optimized Fabric: contact
redundancy, narrow power wires, tight tip-to-tip space.
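The scaling mismatch between pitch and tip-to-tip spacing compounds from node to node. The sketch below assumes the two dimensions start out matched (ratio 1.0) and then shrink at the quoted rates of 30% and roughly 20% per node. The function names and the number of nodes shown are illustrative assumptions.

```python
def relative_size(reduction_per_node, nodes):
    """Size relative to the starting node after shrinking by a fixed fraction per node."""
    return (1.0 - reduction_per_node) ** nodes


def tip_to_pitch_ratio(nodes, pitch_reduction=0.30, tip_reduction=0.20):
    """Ratio of tip-to-tip spacing to pitch, normalized to 1.0 at the starting node."""
    return relative_size(tip_reduction, nodes) / relative_size(pitch_reduction, nodes)


if __name__ == "__main__":
    for n in range(4):
        print(f"after {n} node(s): tip-to-tip / pitch = {tip_to_pitch_ratio(n):.3f}")
```

With these rates the ratio grows by a factor of 0.8/0.7 per node, about 14%, so a fabric that ties the two dimensions together gives up more pitch scaling with every node it is ported to.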
In a demonstration of the adaptability of the pdBrix design approach, a new fabric with different optimization priorities
was defined, as shown in Fig.10. The new fabric continues to enforce: limited diffusion corners, fixed-pitch
unidirectional poly (vertical), preferred orientation metal (M1 now vertical, M2 now horizontal), contacts and vias on
grid, and relaxed pitch for M1 (25%) and M2 (5%), but also accommodates limited wrong-way metal (on-grid) and
wider power rails. The improvement in via and contact redundancy for different layout options is shown in Fig 11. The
initial improvement is achieved by exploiting opportunities to insert redundant contacts and vias into the original image
(labeled ‘plus redundancy’ in Fig.11), followed by allowing limited use of wrong-way metal in the original image, and
finally by switching to a new cell image. Fig.11 compares the ‘number of non-redundant connections’ for all four cases.
Reporting interconnect improvement in this fashion highlights the fact that both ‘eliminating connections’ and ‘adding
redundancy to connections’ fulfill the stated design intent.
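The metric can be sketched as a simple count. The example assumes each connection is described by the number of parallel contacts or vias that implement it; the net names and counts are invented for illustration and are not data from the PowerPC design.

```python
def count_non_redundant(connections):
    """Count connections that are implemented with exactly one contact or via."""
    return sum(1 for cuts in connections.values() if cuts == 1)


# Hypothetical connections: name -> number of parallel contacts/vias.
original = {"n1": 1, "n2": 1, "n3": 1, "n4": 2}

# Option 1: add a redundant via to an existing connection.
plus_redundancy = dict(original, n1=2)

# Option 2: additionally remove a connection altogether (e.g. via a layout change).
fewer_connections = {k: v for k, v in plus_redundancy.items() if k != "n2"}

if __name__ == "__main__":
    for label, layout in (("original", original),
                          ("plus redundancy", plus_redundancy),
                          ("fewer connections", fewer_connections)):
        print(label, count_non_redundant(layout))
```

Both options lower the count. This is why a single 'non-redundant connections' number can compare fabrics that pursue the same intent by different means.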
Figure 10 Fabric chosen to optimize redundancy, reduce necessary connections, and allow wide power wire.
Figure 11 Reduction of non redundant connections, relative to the original PowerPC design, based on different fabric
trade-offs.
Figure 12 A nand2 rendered in the ‘low pattern count’ (left) and ‘high redundancy’ (right) Fabric
Figure 13 Long range parametric layout sensitivities on diffusion and poly are minimized through macro regularity
(left), local yield hotspots on first metal are prevented through control of boundary conditions.
The differences in the two fabric styles, one optimized to reduce overall pattern count, the other optimized for
redundancy and wide power rails, can be seen in Fig.12. Shown are two templates representing the same logic gate (a
nand2) mapped onto the two fabrics. Defining the fabric with selective wrong-way metal does not affect the layout
density or logic simplicity as demonstrated above, but stresses the need to actively manage predictable composability of
the logic elements. To ensure predictable circuit performance regardless of placement, layout sensitivities in the active
device parameters have to be minimized. In addition to local proximity effects like poly line-width variation and
diffusion corner rounding, device performance is affected by long range layout sensitivities in processes such as etch,
stress, or rapid thermal anneal. These long range effects are best controlled through macro-regularity; i.e., global pattern
uniformity on diffusion and poly shapes as shown in Fig.13(left). The first metal level, being used for local interconnect
only, is not a major contributor to parametric variability, making local patterning hotspots the primary yield concern.
The small number of logic constructs allows the safe use of optimally complex layout configurations as long as proper
boundary conditions are enforced to ensure that no new complex layout configurations can be formed at cell boundaries.
As illustrated in the right layout of Fig.13, in this fabric the more complicated metal patterns (i.e. staircases or dense T-
shaped line-ends) are constrained to the center of the cell, as are ‘belt buckle’ constructs that require tight dimensional
control since they can be used for connectivity to other layers. Layout constructs within the boundary region of the cell
(i.e. outside of blue outlines in Fig.13) are simpler and more conservative (e.g. line-end connectivity in boundary region
belt buckles is not allowed). Predictable composability is thereby maintained for two layout sensitivities of different
length scales through two different mechanisms: macro regularity to address long range effects and boundary conditions
to address local effects.
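The boundary-condition mechanism can be sketched as a geometric check. This is a simplified illustration, not the checker used in this work. It assumes each first-metal construct is tagged with a kind and a bounding box, and that a fixed margin defines the boundary region of the cell. The kind names, the `Construct` type, and all dimensions are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical labels for constructs that are only allowed in the cell interior.
COMPLEX_KINDS = {"staircase", "dense_t_line_end", "belt_buckle_line_end"}


@dataclass(frozen=True)
class Construct:
    kind: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def boundary_violations(constructs, cell_width, cell_height, margin):
    """Return complex constructs that are not fully inside the interior region."""
    violations = []
    for c in constructs:
        if c.kind not in COMPLEX_KINDS:
            continue  # simple constructs may sit anywhere in the cell
        inside = (c.x_min >= margin and c.y_min >= margin
                  and c.x_max <= cell_width - margin
                  and c.y_max <= cell_height - margin)
        if not inside:
            violations.append(c)
    return violations


if __name__ == "__main__":
    cell = [
        Construct("staircase", 40, 40, 80, 90),       # interior: allowed
        Construct("straight_line", 0, 10, 200, 20),   # simple: allowed in the boundary region
        Construct("dense_t_line_end", 5, 50, 30, 70), # reaches into the boundary region
    ]
    for v in boundary_violations(cell, cell_width=200, cell_height=120, margin=20):
        print("violation:", v.kind)
```

Because complex shapes never touch the boundary region, abutting any two cells can only bring simple, conservative constructs next to each other. That is the property that keeps new hotspots from forming at cell boundaries.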
5. CONCLUSION
It was shown that profitable CMOS scaling under the time pressures of the established two year node-to-node cycle in an
environment of continuous disruptive technology innovation requires deep design-technology co-optimization. To
overcome the shortcomings of DfM solutions that operate primarily in the physical layout space and do not address
fundamental design optimization or design-aware process optimization, the pdBrix design methodology was evaluated.
The goal of this collaborative work is to establish a comprehensive design to silicon solution that facilitates rigorous
design-technology co-optimization.
The key requirements for an optimal design to silicon solution have been established as:
- late binding of logic to physical layout to preserve design-creativity and -performance
- optimally simplified layout to allow construct-driven process development
- targeted characterization and qualification test vehicles to drive yield ramp
The work reported here demonstrated feasibility of design with a simplified set of logic, preserving layout density and
improving patterning yield and variability while achieving the stated power/performance targets with significantly fewer
layout constructs.
ACKNOWLEDGMENTS
The joint IBM and PDF team of authors thanks their many colleagues who were directly and indirectly involved in this
work and acknowledges the specific contributions of Henning Haffner (Infineon) for his work on 32nm design rule
optimization and Dureseti Chidambarrao (IBM) for his work on contour-based extraction.
REFERENCES
1. DFM lessons learned from altPSM design, L. Liebmann, Z. Baum, I. Graur, D. Samuels, Proc. SPIE 6925, 69250C (2008)
2. Convergent automated chip-level lithography checking and fixing at 45 nm, Valerio Perez, et al., Proc. SPIE 7275, (2009)
3. Hotspot detection and design recommendation using silicon-calibrated CMP model, Colin Hui, et al., Proc. SPIE 7275, (2009)
4. Layout impact of resolution enhancement techniques: impediment or opportunity?, L. Liebmann, ISPD’03, 2003, Monterey
5. Intel design for manufacturing and evolution of design rules, Clair Webb, Proc. SPIE 6925, 692503 (2008)
6. Regular Fabrics for Nano-Scaled CMOS Technologies, L. Pileggi and A.J. Strojwas, ISSCC, (2006)
7. Maximization of Layout Printability/Manufacturability by Extreme Layout Regularity, Tejas Jhaveri, et al., J. Micro/Nanolithography, MEMS, and MOEMS, Vol 6 (03), 2007