Chapter 2
SOC Design Today
To understand where SOC design must go, we first look at the current state of design needs and methodology. Different design teams have different goals and use somewhat different design approaches, but a number of common themes and serious problems stand out. This chapter looks at the basic issues and exposes a set of key unsolved problems.
2.1 HARDWARE SYSTEM STRUCTURE
Traditionally, embedded systems have made a simple distinction between subsystems primarily used for processing application data and subsystems used for housekeeping functions. A simple view of this structure is shown in Fig 2-1.

Figure 2-1 Simple system structure.

This partitioning into control and data flows works well when the data processing is simple (but very demanding in bandwidth or efficiency) and the control functions are complex (but not performance demanding). In that case, the control functions can be implemented as software tasks running on a general-purpose embedded processor, while the data processing can be implemented as hardwired logic for speed. The control and data-processing subsystems communicate through memory, perhaps in the form of commands passed from the control processor to the data-processing logic, and status or results passed back from the data-processing logic to the control processor.

When the data-processing tasks are simple, they may be implemented entirely with a processor. For moderate-performance embedded applications, 8- and 16-bit microcontrollers and 32-bit RISC processors deliver adequate performance and efficiency. Similarly, digital signal processors (DSPs) broaden the range of tasks that can be implemented as software running on a standard processor. When performance demands are high, however, the potential parallelism, bandwidth, and power efficiency of hardwired logic make optimized custom logic the data-processing method of choice.

The likelihood of design changes is a key issue when partitioning functions between control and data processing. These changes may be driven by new specifications, evolving market requirements, or the need to fix bugs that reflect a mismatch between specification and implementation. Changes in the control subsystem (associated, for example, with new user interface functions) can be easily accommodated by changing the software loaded when the system powers up. Changes in a hardware-based data-processing function are harder to achieve, because the physical structure of the system must change. If the data-processing functions are implemented in an integrated circuit, the chip design must go back through much of the design and prototyping flow.

The system architect's task therefore includes partitioning all system functions into two buckets: complex functions with a high probability of change are mapped to software; simple, stable functions with high computation requirements are mapped to hardware. The tools and techniques of current SOC design are largely built around the assumptions that these two buckets are adequate, that complex functions are relatively undemanding computationally, and that high-throughput functions are relatively simple.
2.1.1 How Is RTL Used Today?
In the past 10 years, the wide availability of logic synthesis and ASIC design tools has made RTL design the standard for hardware developers. Reasonably efficient compared to custom transistor-level circuit design, RTL-based design effectively exploits the intrinsic concurrency of many data-intensive problems, and RTL design methods can often achieve tens or hundreds of times the performance of a general-purpose processor. This performance advantage arises from two essential characteristics of dedicated logic:

1. Custom-designed logic can precisely implement the sequence of operations required by the desired function. Operands flow from the output of one primitive operation (e.g., add, multiply, compare) directly to the inputs of the next. Some operations, such as bit-field selection or bit reordering, are trivial because they can be implemented with simple wires. Moreover, operations need only be as wide as the algorithm requires. This type of design is efficient because there is no waste: the hardware is only as large as the algorithm needs. Arbitrary expansion to the next power-of-two bit width, as required in the design of most fixed-instruction-set processors, is unnecessary.

2. Natural concurrency is expressed directly. If one operation is not dependent on another, the two can be executed in separate logic blocks at the same time. Moreover, designers can often restructure algorithms to expose intrinsic parallelism more clearly. For applications with very high intrinsic parallelism, performance is limited only by the throughput of the longest path through the logic and by the hardware budget.

On the other hand, the power of RTL-based design depends heavily on the ability of the hardware developer to comprehend and implement the entire functional specification. For a function implemented entirely in hardware, all setup, error handling, and little-used cases must be implemented in logic gates.
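The bit-width point in item 1 can be made concrete with a toy calculation. The ripple-carry area model and the operand widths below are illustrative assumptions made for this sketch, not figures from the text:

```python
# Illustrative back-of-envelope model: approximate the area of a
# ripple-carry adder as one full-adder cell per bit, and compare an
# exact-width datapath against the power-of-two widths a fixed
# instruction set would force.

def adder_cells(width_bits: int) -> int:
    """Full-adder cells needed for a ripple-carry adder of this width."""
    return width_bits

def next_pow2(n: int) -> int:
    """Smallest power of two >= n (the width a fixed ISA would impose)."""
    p = 1
    while p < n:
        p *= 2
    return p

for algo_width in (5, 12, 17, 24):
    exact = adder_cells(algo_width)
    padded = adder_cells(next_pow2(algo_width))
    waste = 100.0 * (padded - exact) / padded
    print(f"{algo_width:2d}-bit operand: exact={exact:2d} cells, "
          f"padded={padded:2d} cells, waste={waste:.0f}%")
```

A 17-bit operand is the worst case here: rounding up to 32 bits nearly doubles the adder, which is exactly the kind of overhead custom logic avoids.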
The numerous interactions within the function block and with other function blocks can overwhelm the hardware designer with complexity. Even if the designer off-loads functions without critical performance requirements to a processor, the hardware interface to the processor and the corresponding software driver must still be designed and verified. This design partitioning may not closely match the algorithm's intrinsic partitioning, and such artificial interfaces are frequent sources of bugs.

For all or most functions, the hardware designer implements the function using some combination of finite state machines, data-path logic, and local memory blocks, as shown in Fig 2-2. It is useful to look at the characteristics of these subcomponents of a hardware design to understand the real partitioning issues. In some cases, a single finite state machine controls an entire data path. In other cases, the control is implemented as a set of interacting state machines, either interacting as peers or with a master state machine controlling slave state machines. In any of these cases, the aggregate state-machine complexity makes the design intrinsically fragile. Getting the design right in the first place, and updating it as design requirements evolve, becomes increasingly difficult.
2.1.2 Control, Data Path, and Memory
In most RTL designs, the data path consumes the vast majority of the gates in the logic block. A typical data path may be as narrow as 8, 16, or 32 bits, or it may be hundreds of bits wide. The data path typically contains many data registers, representing intermediate computational states, and often has significant blocks of RAM or interfaces to RAM blocks that are shared with other RTL blocks. These basic data-path structures reflect the nature of the application data and are largely independent of the finer details of the specific algorithm operating on that data.

Figure 2-2 Hardwired RTL function: data path + finite state machine.

By contrast, the RTL logic block's finite state machine contains nothing but control details. All the nuances of sequencing data through the data path, all the handling of exception and error conditions, and all the handshakes with other blocks are captured in the state machine. The state machine may consume only a few percent of the block's gate count, but because of its complexity it embodies most of the design and verification risk. If a late design change is made in an RTL block, the change is more likely to affect the state machine than the structure of the data path. This situation heightens the design risk. Any design method that reduces the risk of state-machine design also reduces the overall design risk for an SOC that contains a significant number of RTL-based blocks.

Moreover, the design risk of a hardwired block is not limited to interactions between the finite state machine and the corresponding data path. The complexity of the interface between blocks tends to grow with the complexity of the blocks themselves. Design changes in one block commonly cause changes in the interface, which then trigger mandatory changes in other blocks. The discovery of a single state-machine bug late in the design cycle can propagate changes throughout much of a complex chip.
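As a toy illustration of this split, consider a small accumulate-and-flush block written in the Fig 2-2 style, with the control sequencing isolated from the arithmetic. The block, its states, and its threshold rule are all invented for this sketch:

```python
# Hypothetical sketch (all names invented here): a tiny block that
# accumulates valid samples and flushes the sum when it would exceed a
# threshold, modeled as the Figure 2-2 split -- a "data path" (the
# arithmetic) driven by a separate finite state machine (the control).
# Note how few lines the FSM takes, yet every corner case lives there.

IDLE, ACCUM, FLUSH = "IDLE", "ACCUM", "FLUSH"

def run_block(samples, threshold=100):
    """samples: sequence of (valid, data) pairs, one per 'cycle'."""
    state, acc, out = IDLE, 0, []
    for valid, data in samples:
        # --- finite state machine: all sequencing and special cases ---
        if state == IDLE and valid:
            state = ACCUM
        elif state == ACCUM and acc + data > threshold:
            state = FLUSH
        elif state == FLUSH:
            state = IDLE
        # --- data path: the regular, wide arithmetic ---
        if state == ACCUM and valid:
            acc += data
        elif state == FLUSH:
            out.append(acc)
            acc = 0
    return out
```

Even this tiny control machine hides a subtlety: a sample that arrives in the same cycle as a flush is silently dropped. Bugs of exactly this kind, buried in a few lines of state-machine code, are what make late control changes so risky.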
2.1.3 Hardware Trends
As semiconductor device density increases, five trends in hardware design are increasing the difficulty and risk of this traditional development model:

1. Higher complexity: Competitive pressure in end-product markets and the growing supply of cheaper transistors conspire to push average chip complexity up exponentially. Even with a forecasted moderation in Moore's-law scaling, the industry expects silicon integrated circuit density to increase by almost 30% annually. This increase in transistor capacity will result in larger logic blocks and more blocks per SOC. Moreover, design and verification effort increases more than linearly with the number and size of the blocks. Without significant changes in SOC design methodology, the cost of logic design on a chip will also increase more than linearly with transistor or gate count.

2. Greater concern over power dissipation: Aggressive scaling of transistor technology has enabled faster, denser circuits, but power dissipation is becoming a critical issue. High power dissipation pushes the limits of heat dissipation in integrated-circuit and system packaging, degrades battery life, and compromises overall system energy efficiency. Even in line-powered applications such as server farms, the difficulty of pulling heat out of the electronics may limit the total capacity of the facility. Active power (dissipation due to circuit switching) is already a major concern in SOCs but, with 90nm technology, static or standby power also poses major problems.

3. New deep-submicron effects: Transistors and wires scale differently with line width. Even with improvements in the number of routing layers and in interconnect materials, interconnect wiring will consume a growing fraction of the clock period. In addition, a growing fraction of the total capacitive wire load will couple to adjacent wires, increasing the risk of degraded signal integrity and data-dependent propagation delays. Local variations in fabrication processing also cause increasingly important variations in the electrical characteristics of transistors and wires. These issues necessitate more accurate 3D circuit analysis, more pervasive statistical modeling, and more sophisticated delay-estimation tools.

4. Heavier simulation load: Changes in the basic function of electronic systems force changes in the design process. More systems are continuously connected to networks, especially wireless networks, and these networks typically have complex access protocols. Validating correct system operation when it is attached to a network may require large numbers of long test sequences. This expanded testing requirement may force the number of simulation cycles to increase by many orders of magnitude to achieve adequate confidence in test coverage and design correctness.

5. More fabrication choices: The structure of the electronics industry is changing, with more manufacturing outsourcing by semiconductor vendors and more silicon foundries appearing around the world. This trend promises to reduce raw silicon cost, but it puts a new priority on design portability. The SOC designer can make fewer assumptions about the underlying silicon performance and must design for easy migration across foundry suppliers and scalability across process generations.

Together, these trends will force chip designers to reexamine their basic methodologies and design styles.
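The compounding effect of the roughly 30% annual density growth cited above is worth a moment of arithmetic (this is an illustrative reading of the trend, not quoted industry data):

```python
# Compound growth at ~30%/year: how quickly density accumulates, and
# the implied doubling time.
import math

rate = 0.30
density = 1.0
for year in range(1, 6):
    density *= 1 + rate
    print(f"year {year}: {density:.2f}x starting density")

# Doubling time: log(2) / log(1.3) years.
doubling_years = math.log(2) / math.log(1 + rate)
print(f"density doubles about every {doubling_years:.1f} years")
```

At this rate density nearly quadruples in five years, which is why design effort that grows "more than linearly" with gate count is so alarming.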
2.2 SOFTWARE STRUCTURE
While a discussion of SOC design often focuses on hardware architecture and VLSI design, software structure and flow often have just as much impact on schedule, cost, and performance. Fig 2-3 sketches a typical software environment, including the software components on the actual target system and the development tools running on a host development system. The picture includes analog circuit blocks, which implement important physical interfaces used for network communication and user input and output, even though these blocks may not be directly visible to software. The picture also includes a number of hardware function blocks. These may play an important role in overall system behavior (including erroneous behavior due to hardware bugs), but they are not directly visible to software or controllable by the programmer.

In many embedded systems, the path of data flow through the hardware and software determines system efficiency and performance. Fig 2-3 shows a typical data flow through the software layers, highlighted by lines with arrows. Data bits flow into the system through physical input interfaces into hardware input device controllers. Software device drivers typically copy the data into operating system data structures, and then into the memory space of an application. Communicating applications exchange the data directly through shared global memory or pass the data as messages via the operating system. On output, the application typically passes the data through the operating system to a device driver, which copies it to a hardware output controller, which then moves the data through an analog circuit interface to the outside world. Generally, the more layers of software that touch the data, the higher the latency and the lower the bandwidth.

The exact software structure of complex systems is highly variable and application-dependent, but typical target-system software components include:

• Low-level device reset and exception-handling code.
• Standard operating system services for resource management, task initialization, scheduling, and communication.
Figure 2-3 Typical software runtime and development structure.
• Networking and other protocol stacks.

• A number of application tasks that implement the major externally visible system functions.

All of these software components generally couple to the underlying hardware architecture in two ways. First, all the software is generally written or compiled specifically for the instruction-set architecture of the processors used in the hardware platform. Second, many software components may explicitly depend on the particular I/O interfaces, memory map, system peripherals, and processor-control functions of the hardware. This is especially true for lower-level software such as exception handlers and real-time operating systems; in fact, abstracting or hiding these implementation details from the application tasks is a key role of these layers. In some systems, certain software layers are more truly hardware-independent: intermediate, architecture-neutral software formats such as Java bytecodes or scripting languages run on top of processor-specific interpreters to provide additional platform-independent functions.

In most cases, the binary machine code of these software components will ultimately be loaded into flash memory on the SOC or elsewhere in the system, or made available over a network. The relatively low cost of changing boot-time software stored in flash memory or on a network makes software the natural vehicle for any function that is complex or likely to change rapidly, and thus more susceptible to design errors. Such functions are best implemented in software as long as the software-based design running on the available processor meets performance and efficiency requirements.

Given the complexity of real-time system software, the development environment for software creation, debugging, performance tuning, and system verification becomes an important engineering consideration.
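The layered data flow described above can be sketched as a toy latency model. The per-byte copy costs below are invented purely for illustration; the point is only that every software layer that touches the data adds latency and lowers the achievable bandwidth:

```python
# Hedged sketch with invented numbers: total copy cost for a payload
# crossing every software layer on the input and output paths.

# Hypothetical per-byte copy cost (microseconds) for each hop.
LAYERS = [
    ("device driver -> OS buffer", 0.010),
    ("OS buffer -> application",   0.010),
    ("application -> peer task",   0.005),   # shared memory: cheaper
    ("application -> OS buffer",   0.010),
    ("OS buffer -> device driver", 0.010),
]

def transfer_latency_us(payload_bytes: int) -> float:
    """Total copy latency for one payload crossing every layer."""
    return sum(cost * payload_bytes for _, cost in LAYERS)

for size in (64, 1500, 65536):
    us = transfer_latency_us(size)
    # bytes per microsecond is numerically MB/s (decimal megabytes)
    mb_s = size / us if us else float("inf")
    print(f"{size:6d}-byte payload: {us:9.1f} us in copies "
          f"(~{mb_s:.0f} MB/s ceiling)")
```

Eliminating even one copy (for example, by DMA directly into application buffers) raises the bandwidth ceiling proportionally, which is why data-path placement across layers matters so much.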
The development system includes the tools needed to build and test individual software components and to run and debug the combination of components on target-system simulation models. The nature of individual SOC models varies widely, but common forms include:

• Physical breadboards built with discrete off-the-shelf integrated circuits (processors, memories, and FPGAs) corresponding to the intended subsystems of the target SOC.

• FPGA prototypes implementing the SOC's logic and memories, albeit in a slower and more expensive form. Typically, only a subset of the full SOC can be implemented in one FPGA due to large differences in capacity between FPGA and ASIC implementations. Hardware emulators, which use arrays of FPGAs or special logic-emulation chips, serve a similar role for SOC designs with higher gate counts.

• Simulated SOC hardware built from a mix of RTL logic models and higher-level models for processors (instruction-set simulators) and memory subsystems. Hardware simulation accelerators have similar characteristics.

• Fast, purpose-built simulators that model subsystems at a high abstraction level.

Analog interface functions are not typically modeled as separate design elements. Instead, analog functions are modeled as part of the on-chip I/O controller logic, modeled as part of the external environment that produces or consumes simulated incoming and outgoing data streams, or ignored altogether at this level of modeling.

Without good modeling, many of the most critical flaws in specification, architecture, and hardware or software implementation will not be discovered until final product integration. The cost in time and money of fixing these errors can be so great as to put the entire product-development project at risk. The potential benefits of good hardware modeling and early software testing are widely accepted, but the actual adoption of robust methods still lags behind recognition of the problem.
These problems are magnified by the rapid increases in both hardware and software complexity. Increasing hardware complexity makes models more expensive to develop and expands the number of hardware/software interfaces to consider. Increasing software complexity drives the need for sophisticated layering of operating systems and system services, and mandates more sophisticated debugging tools that can provide more insight into interactions within and among software subsystems. Ironically, the growth in system complexity tends both to increase the number of execution cycles needed to cover all the end system's interesting operating modes and to decrease the performance of the platform on which the models are tested.

2.2.1 Software Trends

SOC complexity is growing, and software generally handles complexity more gracefully than hardware. Nevertheless, two software trends that have long been "just on the horizon" bear watching:

1. Virtual prototyping: Higher integration and tighter coupling between hardware and software subsystems are forcing a fundamental change in prototyping techniques. While some isolated software components can be developed on standalone development boards, SOC prototypes are increasingly implemented as simulations. Simulated prototypes are slower than hardware prototypes, but they are much more convenient and reliable. As workstation and PC processors get faster, so do prototype simulators. In addition, prototype simulators can model more of the system and more easily allow gradual refinement of the SOC during design. The tools for debugging simulated prototypes, for interfacing them to real-time peripherals, and for analyzing simulated-prototype performance are consequently undergoing rapid improvement.

2. Multiple processors: Even without the advent of application-specific processors, the drive for high integration is putting multiple processors, often of diverse architectures, together on a single chip. Managing software development on a platform with, perhaps, both a RISC core and a DSP core presents new problems for SOC designers. Developers of software for multiprocessor SOCs must generally work with multiple compilers, multiple debugging interfaces, incompatible simulators, and tricky software migration across architectures.

The trend toward greater software content is inexorable, but the broadening role and complexity of software systems stretch the limits of current methodology.
2.3 CURRENT SOC DESIGN FLOW
The typical SOC design flow, shown in Fig 2-4, reflects a historical separation between hardware and software development. Many of the key architectural decisions are made quite early in the design process, long before any detailed performance or implementation feedback is available. The long delay between key algorithm selection and system-performance measurement on real hardware sharply increases both the risk of surprises and the cost of fixing issues. Moreover, the likelihood of many iterations and delays between hardware partitioning and VLSI timing closure leads either to substantial overdesign of blocks (higher cost and power) or to unexpected performance and yield problems. Three dimensions of this flow are worth noting:

• Architecture design: The process of refining high-level product requirements into detailed technical requirements for all hardware and software. One key phase of architecture design is partitioning the design into a collection of hardware blocks and software tasks, including the specification of interfaces between components. As development proceeds, the architecture must almost always evolve to overcome limitations in the initial design. The evolving architecture must also accommodate changing end-product requirements and exploit new system-design and architectural insights to improve performance, cost, reliability, and functionality.

• Software design: The process of implementing, testing, tuning, and integrating functions implemented as programs running on processors. Often the software team's early participation in the design is focused on developing and validating only the most important algorithms or new software components. Limitations in model capacity and fidelity typically delay full hardware/software system integration until hardware prototypes are available.
• Hardware design: The hardware design process really consists of two interacting flows: the design, verification, and integration of the various hardware blocks, and the integration of all hardware components into one final, physical VLSI design. The pace of growth in both system complexity and silicon capacity demands a substantial increase in the reuse of existing hardware blocks, even though many hardware blocks are too rigid and specialized to allow wide reuse. The necessity of getting hardware blocks and the VLSI implementation absolutely correct makes design and integration a long, iterative process. Designers want an SOC that is as small, fast, and efficient as possible. Thus, hardware engineers are motivated to include every necessary feature, but nothing beyond what is absolutely necessary. Designers must evolve block designs to allow the VLSI implementation to meet cycle-time targets; to fit gate-count and area goals; and to satisfy placement, routing, signal-integrity, power-distribution, and total power-dissipation requirements. Preliminary estimates from block diagrams, floor plans, and initial RTL code are typically crude and optimistic, so block designs and block interfaces must change in response to simulation results from actual block layouts. Changes in the interface, size, timing, and power characteristics of one block propagate to other blocks, triggering still more design changes.

The inevitable growth in the complexity of system functions and silicon has particularly dire consequences for the hardware design flow. Increased block complexity means that the SOC designer faces a greater challenge in understanding and implementing each function. Block-level tools such as block-level simulation, logic synthesis, placement, and routing tools run slower; typically, the speed degradation is worse than linear as a function of block size.
Figure 2-4 Today’s typical SOC design flow.
Growing chip complexity also mandates more blocks, with more interfaces to document and test. When different designers are responsible for different blocks, the increased communication requirements impede problem resolution and increase the risk of misunderstandings among members of the design team.

Increasing complexity also has a cascade effect on VLSI design characteristics. More complex blocks require more gates and wires; higher gate and wire counts increase average wire length; longer wires mean that more of the total clock cycle is spent on wire delay; and longer wire delay means that placement and routing optimization increasingly influence overall circuit size, performance, and power dissipation. The dominance of physical design issues for megagate SOCs mandates a new set of placement-aware design methods and a new generation of more complex and expensive design-automation tools, because more effort is required to meet clock-frequency goals. Longer on-chip wires also increase capacitive coupling between wires, which produces more crosstalk. New signal-integrity tools and design checks push out VLSI design schedules and increase design budgets still further.

All these factors (growing block complexity, slower simulation, new interactions between logic design and physical layout) create a vicious cycle of development delay. Each complication in the design phase pushes back the start date for VLSI implementation. Each complication in the VLSI flow delays feedback on possible enhancements or bug fixes in the logic. Increasing overall complexity multiplies the number of iterations required between logical and physical design before a chip design can be confidently released for prototyping.

These basic design trends are well understood by SOC design teams and their technology suppliers. A number of useful tools and techniques have emerged to smooth this flow, though without changing the basic process and its liabilities.
Important incremental tool enhancements include the following:

• Hardware/software cosimulation: Simulation languages such as SystemC and tools such as CoWare's ConvergenSC and Mentor's Seamless provide an environment for running fast, high-level simulation models (especially for standard processor cores and memories) in conjunction with more detailed models of logic implemented in RTL or C. These tools typically allow fast simulation of code running on the processor in support of software development. They also provide basic verification of the hardware/software interface between the processor and other system logic.

• Floorplanning and physical synthesis: The growth in hardware design size and the growing role of physical effects (especially wire delay, crosstalk, and other deep-submicron silicon characteristics) mandate a new generation of physical design tools. Synopsys' Physical Compiler and Astro place-and-route system, Cadence's SOC Encounter flow, and Magma's Blast products all represent important incremental product improvements, especially for 90nm designs.

• Integrated software development environments: As the software content of embedded systems grows, software development productivity becomes an important factor in overall system-development cost and timeliness. Integrated code development, project management, source-level debug, and execution visualization help software teams develop, tune, and maintain complex real-time software. Wind River's Tornado environment, for example, smoothly integrates software development with operating-system-aware debug. Multicore debug interfaces based on JTAG (Joint Test Action Group) hardware interfaces, such as ARM's MultiICE, permit software debug of chips with more than one embedded processor core.

Despite the value of these improved tools, they rely heavily on the same basic building blocks: general-purpose processor cores and hardwired logic blocks developed as RTL.
2.4 THE IMPACT OF SEMICONDUCTOR ECONOMICS

The business context directly influences countless technical decisions in SOC design. The electronics industry is characterized by fierce global competition and rapidly changing market requirements across a diverse set of applications and product types. In addition, the semiconductor segment of the electronics business is highly cyclical due to the heavy investment in fabrication equipment and the resulting waves of overcapacity and product shortage. Between 1998 and 2003, the semiconductor industry endured the most dramatic boom and bust in its history. The material results and painful memories of that experience form the context for current SOC business thinking.

The decision to design an SOC is an investment decision. The design team hopes that the future profit from the sale of chips, or the sale of systems that require the chip, will well exceed the cost of designing the chip. The cost of SOC design is dominated by the cost of deploying the right engineering manpower with the right design tools. The good news for design teams is that the global market for semiconductors is currently expected to grow by an average of 10% per year over the next few years, reflecting substantial new opportunities for volume chip shipments. The bad news is that the costs of design are fated to grow sharply as well.

The research firm International Business Strategies (IBS) has examined hardware and software engineering effort for major SOC design projects. IBS recognizes that the growing capacity of silicon enables much more complex hardware platforms and encourages development of much richer software for these new platforms. In the absence of a significant change in design approach, this combination drives total design cost up by more than a factor of two with each major technology generation, as shown in Fig 2-5.
If this trend continues, the profit per design—volume times profit per chip—must grow at the same rate to maintain the viability of design. Clearly, not all conceivable designs can meet this standard of return on investment. Teams are choosing not to design new SOCs for functions that do not meet this economic standard. Such functions are either implemented with off-the-shelf devices or are not implemented at all.
Figure 2-5 Total SOC design cost growth.
Already, the availability of off-the-shelf programmable chips has affected the number of low-complexity, application-specific chip designs. New designs for low-end ASICs have fallen sharply in the past five years. FPGAs and microprocessors can be good substitutes for ASIC designs when logic complexity is not high (a few hundred thousand logic gates) and volume is modest (such that chip cost is swamped by design engineering cost). Programmable general-purpose chips are typically larger and more expensive per unit than chips focused on a narrower application segment, but building a system by programming an off-the-shelf device has a lower development cost than developing a chip design from scratch.

For example, suppose a $10 ASIC is able to perform the same function as a $200 FPGA, but requires $10 million to design and prototype, compared to just $2 million for the FPGA-based design. In high volume, the unit-cost advantage is compelling: design cost plus chip cost for two million units is $30 million for the ASIC compared to $402 million for the FPGA. In low volume, however, the situation is reversed. For 20,000 units, the total ASIC cost is $10.2 million, but the total FPGA cost is just $6 million.

These economic pressures are vivid for a system design team facing a "make/buy" decision. If the team takes on an SOC design, it may invest tens of millions of dollars in engineering time, tools, fees, and prototyping costs to get a silicon and software platform ideally suited to market needs. The target chip's cost and specification may be compelling, but the team faces the risk that sales will be too low to adequately recoup expenses. On the other hand, if the team abandons the new SOC design and instead employs off-the-shelf components, high component costs, inadequate performance, and high power dissipation may leave the resulting system merely undifferentiated or completely uncompetitive. A great deal is at stake for the SOC design team today.
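The make/buy arithmetic above can be restated as a small model using the chapter's example figures ($10M ASIC design cost versus $2M for the FPGA-based design, $10 versus $200 unit cost); the break-even calculation at the end is an extension of the example, not a figure from the text:

```python
# Total program cost = one-time design (NRE) cost + unit cost * volume.

def total_cost(nre, unit_cost, volume):
    """Design cost plus per-unit cost over the production run."""
    return nre + unit_cost * volume

ASIC = dict(nre=10_000_000, unit_cost=10)
FPGA = dict(nre=2_000_000, unit_cost=200)

for volume in (20_000, 2_000_000):
    asic = total_cost(volume=volume, **ASIC)
    fpga = total_cost(volume=volume, **FPGA)
    winner = "ASIC" if asic < fpga else "FPGA"
    print(f"{volume:9,d} units: ASIC ${asic/1e6:.1f}M vs "
          f"FPGA ${fpga/1e6:.1f}M -> {winner}")

# Break-even: 10M + 10v = 2M + 200v  =>  v = 8M / 190 ~= 42,105 units.
breakeven = (ASIC["nre"] - FPGA["nre"]) / (FPGA["unit_cost"] - ASIC["unit_cost"])
print(f"break-even volume ~= {breakeven:,.0f} units")
```

With these numbers, the crossover sits at roughly 42,000 units: below that, the FPGA's low design cost wins; above it, the ASIC's low unit cost dominates.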
2.5 SIX MAJOR ISSUES IN SOC DESIGN
The challenges of current SOC design methods are diverse and interlocking, but it is possible to distill the hard problems of SOC design today into just six essential issues.
2.5.1 Changing Market Needs
Translating market needs into a manufacturable electronic product takes a long time. Existing customers and new prospects contribute requirements. Market experts offer data on trends. Strategic visionaries throw out novel product ideas. Product teams distill these inputs into product requirements. Engineering teams design hardware, develop software, build prototypes, and verify the solution. As a result, few product-development projects take less than a year. The time between concept and available product might stretch to two or three years, or more for complex, high-volume products. Three years is a very long time in the life of an electronic product’s market these days. Assumptions about features, pricing, performance, and form factor can all change significantly during that time. Entire product segments can virtually die out, and new categories rise in their places. Even modest changes in the supported data formats and communication interfaces can obsolete a product before it’s ready for sale. If the market does change, how can product developers recover? Almost all electronic system products contain some sort of control processor, so certain features are changeable via software. Any features implemented in hardware, especially those features cast in silicon inside an SOC, present a bigger design risk and may cause a long delay in bringing the reworked product to market. Ironically, many painstaking hardware optimizations made to improve performance and power efficiency, intended to make the product more attractive in the market, ultimately increase the design risk and delay product introduction if those optimized features must change. Product definition is fragile.
2.5.2 Inadequate Product Volume and Longevity
The real cost of building a system includes both the direct manufacturing costs—parts plus manufacturing labor and services—and the indirect costs of defining, designing, and marketing the product. To achieve sustained profitability, a product developer must choose a design and implementation strategy that balances the design and manufacturing costs appropriately for the projected manufacturing volume, manufacturing processes, and design capabilities of the enterprise. High-integration, nanometer CMOS technology offers spectacular performance, power, and density opportunities for digital electronics and can achieve remarkably low unit manufacturing costs. The up-front design costs for large SOCs, on the other hand, can also be spectacular. Much has been made of the rising cost of photolithographic masks for SOC design—often exceeding $1 million for 90nm technology—and the capital costs of semiconductor fabs—exceeding $2 billion for a high-capacity facility. These numbers tell only part of the story. The emergence of silicon foundries and standard design representation formats has allowed fab costs to be spread across a large number of different designs from different design teams sharing common design rules. The high mask costs associated with a single design are real, but represent just the tip of the design-cost iceberg. The sheer logic complexity of leading-edge chips and the rising demands of chip-level verification and physical design require design efforts that routinely consume many tens of engineer-years and often top 100 engineer-years. When combined with the cost of design tools, prototyping charges, and productization, design costs for a single SOC development project typically exceed $10 million. High silicon-design cost seems to go hand-in-hand with spectacular silicon functionality. If an SOC is used in just one system, then the full SOC design cost must be amortized across one system's manufacturing volume.
Some products have the cost margin and volume to comfortably accommodate $10 million of additional design costs, but many do not. SOC development may become economically untenable unless the design costs can be reduced or the manufacturing volume can be increased. Increasing the manufacturing volume is especially difficult for an inflexible chip, because the SOC may not be sufficiently adaptable to serve the needs of multiple products in a product line or multiple customers in an industry. Sometimes the cost of optimizing an SOC design creates serious economic problems because of inadequate volume.
2.5.3 Inflexibility in the Semiconductor Supply Chain
The wild success of electronic products over the past quarter century has been driven, in part, by the uniformity and ubiquity of digital MOS semiconductor technology. Commonality of processing equipment, engineering methods, and design tools has driven substantial commonality in design representation, circuit techniques, and physical-design rules. This design standardization gives the silicon designer commodity fabrication pricing and increases the security of supply by enabling multiple sources for any chip. The trend toward design commoditization flies in the face of the economic interests of semiconductor suppliers. Suppliers seek to differentiate products and lock in customers by offering unique capabilities such as higher performance, extra design support, and proprietary building blocks. Their customers must delicately balance the technical advantages of these proprietary offerings against the loss of business leverage that comes from adopting a sole-source technology. Historically, microprocessors have been recognized as the building block with the most potent lock-in value. This characteristic stems both from the difficulty and effort required to create state-of-the-art processors and software tools and from the rising cost of switching architectures as the customer develops a growing software library wedded to that processor architecture. Multiple-source processors do not fully address this inflexibility. Common techniques, such as using a synthesizable logic implementation, make the processor design independent of the specific semiconductor supplier. However, when compared to hardened processor cores, synthesizable processor designs typically run at lower operating frequencies with greater power dissipation and silicon area. Moreover, availability of processor implementations is generally restricted to a small number of lowest-common-denominator architectural configurations.
Some system designers respond to the lock-in risk of sole-source processors from semiconductor suppliers by developing their own processors. While this approach returns control to the SOC designer, the difficulty and distraction of from-scratch processor development creates a raft of other problems such as the need for development tools and development-tool maintenance, documentation, and application support. Moving more functions into synthesizable RTL blocks can also be seen as a move to reduce lock-in risk—though at the cost of increased risk of design fragility. Ultimately, the SOC designer wants seemingly incompatible powers—the freedom to use commodity silicon fabrication for minimum cost and maximum supply flexibility and the leverage to create highly differentiated performance, silicon efficiency, functional richness, and flexible reprogrammability in optimized system designs.
2.5.4 Inadequate Performance, Efficiency, and Cost
The pace of technology improvement creates an opportunity for electronic product companies, but rampant competition transforms opportunity into necessity—build faster, cheaper, lower-power products… or die. The current standard partitioning between processors and hardwired logic blocks impedes success by forcing significant compromises in SOC throughput, efficiency, and cost. Any task too complex to implement exclusively in hardware must run on a control processor. These processors rarely exceed a few hundred MIPS of performance in leading-edge 130nm or 90nm process technology. The resulting performance of these processors may be adequate for user interfaces and simple system-control tasks, but essential features for network, security, signal, and multimedia processing often require much higher throughput, both individually and in aggregate, for the whole set of product features. Processors embedded in an SOC may have better power dissipation specifications on a per-MHz basis than standalone processor chips, but often they must run at high clock frequencies to satisfy system-performance requirements. High clock frequency affects power dissipation in two ways. First, the active power in any given CMOS VLSI circuit is proportional to the operating clock frequency. Second, designers must often adopt aggressive circuit-design methods and a higher operating voltage to increase the maximum clock rate. Aggressive design methods that boost clock rate also increase power dissipation. Consequently, a performance increase typically requires a more-than-proportional increase in power dissipation. When operating-frequency and operating-voltage effects are combined, the power often scales rapidly—roughly with the cube of operating frequency. When product requirements demand very high performance, or even just headroom in available performance, the design also requires more silicon.
High-end processors capable of simultaneous speculative execution of many possible instructions increase area out of proportion to the increase in useful throughput. Higher clock rates often require larger transistors and longer wires. The push for performance often drives a transition to more advanced process technology, increasing manufacturing costs. Together, the demand for higher processor performance typically means larger die size, higher silicon costs, and higher power dissipation.
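The cubic power-scaling argument above can be sketched numerically. This assumes the standard CMOS active-power relation P ∝ V²·f together with the rough rule of thumb, stated in the text, that supply voltage must rise about linearly with target frequency; the function and its defaults are illustrative, not a measured model:

```python
def relative_power(freq_scale, voltage_scale=None):
    """Relative CMOS active power, P proportional to V^2 * f, for a given scale-up."""
    if voltage_scale is None:
        # Rough rule of thumb: voltage must rise in step with frequency.
        voltage_scale = freq_scale
    return voltage_scale**2 * freq_scale

# Doubling the clock at fixed voltage only doubles active power...
print(relative_power(2.0, voltage_scale=1.0))  # -> 2.0
# ...but if voltage must also double to reach that clock rate,
# power rises by a factor of eight: the cubic effect.
print(relative_power(2.0))  # -> 8.0
```

This is why a modest performance increase obtained purely through higher clock frequency is so expensive in power, and why architectural parallelism at a lower clock rate is often the more efficient route.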
2.5.5 Risk, Cost, and Delay in Design and Verification
SOC design is full of uncertainties. The complex interactions of subsystems, the variety and complexity of design representations, and the long layout and fabrication process all combine to increase the risk and the cost of errors. Typical SOC design teams consist of architects, software developers, hardware designers, VLSI engineers, design-tool experts, and other technical support engineers. Verification alone often consumes 70% or more of the entire effort. Typical SOC designs occupy teams of tens of people working for two to three years from design concept to manufacturing release. Total development costs, including design tools, prototyping charges, and outside services, routinely top $10 million today, with estimates that complex SOC designs using 90nm technology may cost $30 million. The most ambitious SOC efforts may reach even higher. SOC hardware-design errors are particularly dangerous. The discovery of a hardware bug in prototype silicon may require six months to fix—three months to identify the proper modification and finalize the design change plus three months to respin the silicon. The risk of error grows disproportionately with the complexity of the design because the number of interactions between design elements increases far faster than the number of elements. Moreover, as system functionality increases, the number of test cases needed to verify correct behavior explodes. As a result, SOC design is caught in a bind. The size of the blocks increases, the number of blocks increases, the number of test cases per block increases, simulation time for each test case increases, and the cost of overlooking a bug increases. The consequences of bug-related delays and costs for the hardware-design and verification teams are daunting. Finally, product developers increasingly understand that the impact of bug fixes and silicon respins is more than just program delay and direct engineering costs.
Late market entry also reduces the time over which the product generates revenue and reduces profitability due to increased competition. The two critical questions are:
• How can developers reduce the risk of "killer bugs"?
• How can the entire design and prototyping strategy reduce the time to incorporate fixes?
2.5.6 Inadequate Coordination Between Hardware and Software Teams
The software content of embedded systems is increasing rapidly because many system functions are too complex for direct hardware implementation. As the total software content increases, the breadth and importance of interactions between the hardware and software teams becomes critical. Historically, systems could be rigidly partitioned between relatively simple control functions, implemented with standard microprocessor chips, and relatively simple hardware functions, implemented with data-path ASICs or programmable, off-the-shelf ICs. This traditional design approach has allowed embedded software programmers either to work independently of the hardware development or to start software development only after the hardware team delivered prototype systems. Increased system complexity, tighter schedules, narrower end-market windows, and growing use of SOCs all put pressure on this loosely coupled model. Too often, the hardware architecture and software architecture efforts are separated in either space or time. Hardware and software departments often occupy different buildings; sometimes, different sites; occasionally, different continents. Even with good electronic communications, project coordination can be the weak link. Complex, ad hoc hardware/software interfaces often confuse design team members, and confusion leads to expensive rework. Often, design issues caused by ad hoc interfaces are only resolved during final system integration, when it may be too late for effective or economical redesign. Misunderstanding of performance issues can create processing bottlenecks that can only be corrected by substantial hardware redesign. Hardware architectures designed without sufficient software consideration often lack adequate visibility and controllability. When hardware features are insufficiently controllable by software, workarounds become difficult or impossible.
Even when hardware and software teams work together in close physical proximity, dependence on hardware prototypes for early software development forces a pipelined development model in which the software team is still finishing the last project while key hardware architecture decisions are made on a new project. Pipelined development can create a vicious cycle of delay, where the late discovery of inadequate interfaces and hardware shortcomings forces lengthy and elaborate software workarounds, ensuring that the software team has even less opportunity to apply its experience to the next project's architecture. Without some means to realign and improve the hardware/software interfaces, project schedules and product efficiency will continue to suffer.
2.5.7 Solving the Six Problems
These six issues represent a critical hurdle for more universal adoption of SOC designs. If adequate solutions to these problems are not found, SOC design costs will climb along with SOC transistor densities until only a handful of large SOCs can be built each year. With only a few basic platforms to choose from, end-product designs will become less differentiated, compelling, and efficient. If, on the other hand, new design methods can make these chips easier to develop and more easily reused, then new electronic products will continue to proliferate.
2.6 FURTHER READING
• Source data for the SOC design costs discussed in Section 2.4 can be found in Dr. Handel Jones, IBS, Inc., "Economics of Time-to-Market in Chip Design," IBM Engineering & Technology Services, June 2003; and Analysis of the Relationship Between EDA Expenditures and Competitive Positioning of IC Vendors, a custom study for the EDA Consortium by International Business Strategies, Inc., 2002.
• The ARM family of conventional RISC cores serves as the most common control processor for SOCs today. A good reference text is Steve Furber's ARM System Architecture, Addison-Wesley, 1996.
• The Semiconductor Industry Association tracks historical trends in semiconductor markets and makes annual forecasts looking out three to four years. See http://www.semichips.org.
• A number of the basic ideas of SOC design were developed in Felice Balarin, editor, Hardware-Software Co-Design of Embedded Systems: The POLIS Approach, Kluwer, 1997.
• A very good introduction to the current SOC design approach based on fairly rigid hardware-software partitioning is found in Peter Marwedel, Embedded System Design, Kluwer Academic Publishers, Dordrecht, 2003, especially Chapter 7 ("Hardware/Software Co-Design" by Dirk Jansen).
• Pragmatic examples of applying current SOC design methodology can also be found in Grant Martin and Henry Chang, editors, Winning the SOC Revolution: Experiences in Real Design, Kluwer Academic Publishers, Boston, 2003.
Vendor Web sites give good overviews of recent improvements in hardware and software design tools:
• Synopsys: http://www.synopsys.com
• Cadence: http://www.cadence.com
• Magma: http://www.magma-da.com
• ARM: http://www.arm.com
• Wind River: http://www.windriver.com
• CoWare: http://www.coware.com
• Mentor: http://www.mentor.com