
Tredennick Microprocessor Logic Design

Microprocessor Logic Design
The Flowchart Method

Nick Tredennick
IBM Research Staff

Digital Press

Copyright © 1987 by Digital Equipment Corporation. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without written permission of the publisher.

9 8 7 6 5 4 3 2 1
Printed in the United States of America
Order number EY-6707E-DP.

Book and jacket design by Sandra Calet. Diagrams designed by Carol Keller and produced by ANCO Boston. Composed in Mergenthaler Univers by DEKR Corporation. Printed and bound by Murray Printing Company.

Chapter 3 was adapted from Nick Tredennick, “How to Flowchart for Hardware,” Computer, vol. 14 (December 1981), pp. 87ff., all rights reserved, © 1981 by the Institute of Electrical and Electronics Engineers. Appendix A appeared as Nick Tredennick, “The Impact of VLSI on Microprogramming,” in MICRO 19: The 19th Annual Workshop on Microprogramming, 1986, pp. 2-8, all rights reserved, © 1986 by the Institute of Electrical and Electronics Engineers. Both appear herein by permission of the IEEE. Appendix E is adapted from a report prepared by the International Business Machines Corporation and is printed by permission of IBM.

Burroughs B1700 is a trademark of Unisys; Cray is a trademark of Cray Research, Inc.; IBM System/360 and IBM System/370 are trademarks of International Business Machines Corporation; iAPX286, 8048, 8086, 8088, and 80386 are trademarks of Intel Corp.; MC68000, MC68008, MC68010, MC68020, MC68030, M6800, M6801, and M6BOB are trademarks of Motorola, Inc.; VAX is a trademark of Digital Equipment Corporation; Z8000 is a trademark of Zilog, Inc.

Library of Congress Cataloging-in-Publication Data
Tredennick, Nick.
Microprocessor logic design.
Bibliography: p.
1. Microprocessors--Design and construction. 2. Logic design. I. Title.
TK7895.M5T74 1987 621.295 86-29982

Preface

My objective in this book is to present a computer design method called the flowchart method. I do not survey computer designs or computer design methods. I do describe the hardware in a single-chip microprocessor, but I believe that the concepts apply to computers in general. The examples are from my design experiences with the IBM Micro/370 and the Motorola MC68000 design teams.

This book is for graduate-level electrical engineering or computer engineering students or for practicing computer designers. I assume the reader knows basic logic design, Karnaugh maps, and Boolean algebra.

The book is organized so that a semester-long design project can be undertaken in parallel. I introduce the design method using a simplified processor example, then add details, in the order a computer designer must deal with them, to design a single-chip microprocessor.
Nick Tredennick
January 5, 1987

Contents

Chapter 1  Here’s the Deal
    Why I Think This Description Is Important
    Using Computers
    The Right Structures

Chapter 2  Defining a Microprocessor
    Microprocessor Operations Overview
    Instruction

Chapter 3  Hardware Flowcharts
    Prerequisites
    Illustrated Flowchart Method Overview
    Flowchart Objectives
    Making a Flowchart
    Level 1 Flowcharts
    Level 2 Flowcharts
    Doing Level 1 Flowcharts
    Doing Level 2 Flowcharts

Chapter 4  Implementing from Flowcharts
    Relationship between Flowcharts and Hardware
    Sample Design Chronology
    Sample Implementation Procedure
    State Sequencer
    Summary

Chapter 5  How a Microprocessor Works
    Internal Clocking
    Timing between the Execution Unit and the External Bus
    Exceptions
    Control Store Address Selection
    Control Store
    Nanoword Decoder
    Communication between Execution Unit and State Sequencer
    Communication between Bus Controller and State Sequencer
    Microcode
    Bus Interface
    Mode Control

Chapter 6  The IBM Micro/370 Microprocessor
    Execution Unit
    Control Store Organization
    Decoders
    Next Address Control
    Interface to Bus Controller
    Execution Overlap
    Prefetching
    Clocking and Timing
    Bus Sense Amp Control
    Shifter and Shifter Control
    References

Chapter 7  Hardware Flowcharts for Micro/370
    Practice Level 2 Flowcharts
    Level 1 Flowcharts
    Official Level 2 Flowcharts
    Sample Instructions

Chapter 8  Implementing Micro/370 from Flowcharts
    Implementation

Chapter 9  VLSI Design Method(ologie)s
    Method A: The Machine Partition Method
    Method B: The Commercial Microprocessor Method
    Method C: The Logic Replacement Method
    Summary of Methods
    Contrasting Folklore with Reality
    Conclusion
    References

Epilogue  A Final Word by T. A. Welch
    What Did He Say?
    What Did He Miss?
    Alternative Processor Styles
    Design Process
    Conclusions

Appendix A  The Cultures of Microprogramming
Appendix B  Design Chronology
Appendix C  IBM/370 Architecture Notes
Appendix D  The Micro/370 Flowcharts
Appendix E  IBM Micro/370 Microprocessor General Information
Appendix F  A One-Semester Design Project
Appendix G  Glossary
Index

Here’s the Deal

This book is for graduate level electrical engineering or computer engineering students. “Graduate level” means I assume you are proficient at, not just knowledgeable of, basic logic design. You can reduce a five-variable Karnaugh map to minimized logic in a few minutes. You can “read” multilevel NAND-NAND logic diagrams for output functions. And you can look at the logic diagram of a simple machine and see how it works.

I would like you to read this book and somehow go beyond a procedure. Steal ideas. I describe the Micro/370 microprocessor in detail, but I am really talking about design ideas, using Micro/370 as an example. I want you to add these ideas to your design repertoire. Make them the stepping-off point to your own design experiences.

I present an industrial logic design method for single-chip microprocessors, called the flowchart method. I do this using a real example. The case study is Micro/370, a single-chip System/370 microprocessor designed using the flowchart method. Micro/370 consists of about two hundred thousand transistors (sites). I wrote this text as I did the logic for Micro/370. I also used the flowchart method when designing the logic for the Motorola MC68000.

Books describe methods as if they are step by step in practice. But methods are not step by step. There are always problems. Students lose confidence when they are unable to apply a method as cleanly as it is described. I present the flowchart method both ways: the tutorial and the dirty reality. I discuss mistakes in this book because mistakes are an integral part of what you do.
I intentionally repeat things from chapter to chapter. I do this for emphasis and to gradually introduce detail.

Why I Think This Description Is Important

This description tells how an engineer actually works. If you are a new logic designer, think of this as a way to get started, an organized way to develop your own style. This is a documented industrial logic design method, which means I wrote what I think you should do, in detail, to design the logic of a microprocessor.

The problem with many texts is that we lie about details. We are sloppy in areas that are not our primary concern. Academics are method fanatics. Practitioners are solution fanatics. In school, we glorify the methods and lie about the sophisticated problems we solved. In industry, we glorify the problems and lie about the sophisticated methods we used. Each side loses credibility the minute one side reads the other’s literature. The academic knows that the practitioner’s “method” is (ugh!) arbitrary, just as the practitioner knows the academic’s “solution” is (ugh!) not applicable. Because we oversell method and solution, it takes too long to figure out what really works. This book puts what I think really works for microprocessor logic design in one place.

Using Computers

Eventually, you will enter your design into computer files. Lots of people have tried to make this part “easier.” These people are called design automation (DA) experts. Designers work on designs, and DA people work on automating design. As a designer, my view is that DA should support design, not be design. After I have designed something, I think, “Boy, it would be nice if this part (of the way I design) were automated in this particular way.” But sometimes I think DA people automate things and want designers to design in terms of inputs and outputs to their design tools.
You may feel frustrated when you ask a DA person, “Why doesn’t your program let me do this?” and the response is, “Why are you doing it that way?”

Imagine this: You start a new job at a company, and on day 2 they say, “Here’s where you enter your logic, in our Humongous Design System (HDS).”

Surprised, you say, “But I’ve only had eight courses in digital design principles. How do you actually design a microprocessor? I mean... come up with the logic for something that complicated in an organized way?”

“Well, you partition the problem into manageable pieces,” they reply.

Stubbornly you ask, “But how do you know what the right pieces are?” Perspiration is forming on your brow. They’ve found you out. You were supposed to have learned this in one of those courses.

But your host merely replies, “Oh, that’s easy, you just piece the logic together from the structures that are good for this technology. They’re right here. See, here’s a sixteen-way NOR, and here’s a three-input NAND and ...”

You don’t hear much of the speech. You have no choice: You must use HDS because it automatically verifies your logic; selects, places, and wires the circuits; and generates test patterns using a fault model that has been accepted companywide. Besides, output from HDS is the only output manufacturing will accept. Period.

In a case like this, designers start thinking of solving problems in terms of how to express the solution in a particular notation. They structure the solution out of only the conceptual constructs supported by that notation. (If the tool does not support pass gates, guess what: no pass gates in the design.) In this way, DA becomes design.

This book is partly in response to the growing presumption that computers are an essential part of logic design. They are not. I want to describe here the essence of a logic design method. Computers are not an essential part of this flowchart method.
I think of computers as an expensive and awkward alternative to pencil and paper. Even so, I don’t think all DA is bad. When DA’s good, it’s very, very good, but when it’s bad, it’s horrid.

The Right Structures

The key to performance in microprocessor design is finding the right conceptual structures. How do you best represent the concepts present in the architecture document? If you are constrained by notation, by the contents of a circuit library, or by the fact that wiring must be on a grid, then you lose a lot of performance. (I make this flat statement with no proof.) I believe that basing your design on the right conceptual structures, ones that make the design “flow,” is the key to high performance. I do not believe it matters (in performance) whether these structures are implemented in programmable logic arrays (PLAs), read-only memory (ROM), or random logic. ROM- or PLA-based logic is certainly easier to change physically, but neither is inherently faster or slower than random logic.

You will frequently encounter statements such as, “Microcode leads to slower control paths and adds to interpretive overhead,” and “Hardwired control provides for the fastest possible operation.” These statements are not true. What is true is that microcoded solutions tend to be used for interpretive structures, and interpretive structures are slower. For those who believe “microcoded” means “interpretive,” think of what I am discussing as microcoded implementations that are not interpretive (a seeming contradiction).

Speed comes from using the right logical structures for the job. Structured garbage logic is still garbage logic. How do you find the right logical structures? Use the flowchart method. It gives you a framework (notation and procedure) that organizes design details so that you can see logical patterns. After that, everything depends on how good you are at logic design.
Chapter 8 shows you where the flowchart method fits in relation to other methods of logic design. I prefer microcoded designs. (See Appendix A for my definition of microcoded. To me, the word “microcode” is interchangeable with “microprogram.”) The examples I use, real and contrived, are mainly microcoded ones.

Defining a Microprocessor

A microprocessor is a computer’s central processing unit (CPU) implemented on a few (say, fewer than four) silicon chips. The processor has two parts: a control part and a data part. The control part says what to do, and the data part does it. The control part decodes instructions and guides the processor through its internal states. The data part (or execution unit) contains the registers, arithmetic units, shifter, and other pieces that directly store or manipulate data. The control part directs operations in the execution unit. It consists of the clock-phase generators, bus controller, and processor controller. The processor controller consists of a control store (with all the microcode), state sequencer, instruction decoders, and control word decoder. See figure 2.1.

A single-chip microprocessor is a silicon chip containing all (and only) the parts of a CPU. The chip must include all the parts mentioned above (clock-phase generators, bus controller, processor controller, and execution unit) to be a single-chip microprocessor. Otherwise, it is one chip of a multichip microprocessor. Figure 2.1 is a block diagram of a microcoded implementation of a single-chip microprocessor. Figure 2.2 is a block diagram of a PLA implementation of a single-chip microprocessor. Figure 2.3 is a block diagram of a random logic implementation of a single-chip microprocessor.

[Figure 2.1. Microprocessor (microcoded implementation)]

From here on, when I use the term “microprocessor,” I mean single-chip microprocessor unless I say otherwise.
Since I prefer microcoded implementations, I will use those as examples from now on.

“But,” you say, “I want to design a high-performance microprocessor, so I want to know how to do a random logic implementation. Your book will do me no good.” Our technical folklore says that random logic implementation is faster. That is not necessarily so. If the random logic implementation is faster, it is not because it is done in random logic. Figures 2.1, 2.2, and 2.3, for example, have the same execution unit (exactly what I expect if they implement the same architecture). Where is the critical path? Suppose it’s in the execution unit. (A common critical path in an execution unit is the path from a register, through the arithmetic and logic unit [ALU], and into an ALU condition code register.) If so, all three implementations perform equally.

There are many microprocessors commercially available today. You can’t tell which are microcoded, PLA, and random logic implementations based on their performance. Differences in architecture swamp differences in implementation (and make comparisons unsound). The Motorola MC68000 family, which are microcoded designs, are among the fastest microprocessors available.

[Figure 2.2. Microprocessor (PLA implementation)]

[Figure 2.3. Microprocessor (random logic implementation)]

Why am I harping on this? Because I think that what the implementation looks like (microcoded, PLA, or random logic) when it is done is more nearly related to design method than to performance (or any other input constraint). What you do is start with a specification and design a computer to some goal: size, cost, performance. Don’t worry about the form of the implementation (yet).
I’ll show you how to develop your own design method. Begin by building your method from the things that matter: the specification and the goals. I’ll show you how to build a fast microprocessor (or a cheap one or something in between), and I’ll do it with microcoded examples.

Microprocessor Operations Overview

This is an operational overview of the microcoded microprocessor in figure 2.4. Instruction decoders look at the instruction bit pattern to decide which control word sequence in the control store is appropriate. The instruction decoders send the address of the control word sequence to the control store. The control store contains the control word sequences for all the instructions. The state sequencer steps the control store through each control word in the sequence for the instruction. The control word decoder transforms each of the control words into specific control signals for each execution unit element.

The execution unit contains the resources for holding and manipulating data. Execution unit pieces (elements) are connected by one or more common internal buses. Transfers between execution unit elements are controlled by the control words. Transfers between the microprocessor internals and the external world (the world beyond the pads) are controlled by the bus controller. (There is a simple connection from the Data In/Out register to bus transceivers connected directly to the pads. Similarly, the Address Out [AO] buffer in figure 2.4 goes to drivers connected directly to the pads.) The bus controller responds to commands imbedded in the control words.
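The relationship among the instruction decoder, control store, and state sequencer described in this overview can be illustrated with a small software model. This is only a sketch under stated assumptions, not the MC68000 or Micro/370 design: the control store addresses, the signal names, and the second op code are invented for illustration (only the 5A op code for ADD comes from the text), and the model ignores the bus controller, clock phases, and prefetching.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class ControlWord:
    """One state of a control word sequence (all field names are hypothetical)."""
    signals: Tuple[str, ...]      # stand-ins for decoded execution unit control signals
    next_address: Optional[int]   # next control store address; None ends the sequence


# Hypothetical control store: address -> control word.
CONTROL_STORE: Dict[int, ControlWord] = {
    0x10: ControlWord(("fetch_operand",), 0x11),
    0x11: ControlWord(("alu_add",), 0x12),
    0x12: ControlWord(("store_result",), None),
    0x20: ControlWord(("alu_and", "store_result"), None),
}

# Hypothetical instruction decoder: op code -> starting control store address.
# 0x5A is the ADD op code used in the text; 0x01 is made up.
INSTRUCTION_DECODER: Dict[int, int] = {0x5A: 0x10, 0x01: 0x20}


def run_instruction(op_code: int) -> List[Tuple[str, ...]]:
    """Step through the control word sequence for one instruction.

    Plays the role of the state sequencer: the decoder supplies the first
    address, and each control word supplies the next one.
    """
    address: Optional[int] = INSTRUCTION_DECODER[op_code]
    trace: List[Tuple[str, ...]] = []
    while address is not None:
        word = CONTROL_STORE[address]
        trace.append(word.signals)    # the "control word decoder" output for this state
        address = word.next_address
    return trace


if __name__ == "__main__":
    for state, signals in enumerate(run_instruction(0x5A), start=1):
        print(state, signals)
```

In this model each control word carries its own next address, which is one of several ways a state sequencer could find the next state; a real controller would also choose among other address sources.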
The bus controller runs the external bus protocols that result in instruction fetches (for the instruction decoders) and in operand loads and stores (for the execution unit).

[Figure 2.4. Microprocessor chip, with more detail (microcoded implementation). Labels include: pads for bus control, clock, interrupts, reset, testing, and power; encoded control word fields; address pads; data pads.]

Instruction

Figure 2.5 is an example of an instruction in a typical instruction reference manual. In this example, 5A (hex) is the operation (op) code, the R1 and B2 fields designate registers, and the D2 field holds a displacement. I describe how the microprocessor executes the instruction. (I assume a microcoded processor controller.) Instructions can be either one or two halfwords long (depending on whether they include a displacement). A single halfword instruction would have a different op code from the instruction example in figure 2.5 and would have no D2 halfword.

The microprocessor fetches the instruction. The first byte of the instruction gives it away as an ADD instruction. The first instruction byte drives an instruction decoder, whose output is the control store address of the control word sequence for the ADD instruction. The control word sequence for the ADD instruction knows the ADD instruction format. The control word sequence directs the execution unit to fetch the operands, add them, and store the result. First, the control word sequence fetches the rest of the instruction (in this case, three more bytes). Then it finds the operands. It finds the first operand in the register designated by the R1 field.
It finds the second operand by adding the contents of the register designated by the B2 field to the displacement of the D2 field. The control word sequence calculates the address, puts the address on the pads (the external bus), and captures the data returning from memory. It adds the operands and stores the result in the register designated by the R1 field.

Here are the steps for the ADD instruction:

1. Fetch the first instruction halfword.
2. Find the ADD control word sequence.
3. Fetch the remaining instruction halfword.
4. Calculate the operand address.
5. Fetch the operand.
6. Add.
7. Store the answer.

It isn’t quite that simple. This works for one instruction, but you must be able to execute a program (sequence of instructions). How do you get to the next instruction? How did you get here from the last one? The processor controller does this. One way to execute a sequence of instructions is to have the current instruction fetch and decode the next instruction. In a microprogrammed controller, this is necessary to find the location of the control word sequence to execute the next instruction.

[Figure 2.5. The ADD instruction]

The second operand is added to the first operand, and the sum is placed in the first-operand location. The operands and the sum are treated as 16-bit signed binary integers. The first operand is in the register specified by the R1 field. The second operand is in memory. The address of the second operand is formed by adding the displacement specified by the D2 field to the contents of the base register specified by the B2 field. An overflow causes a program interruption when the fixed-point overflow mask bit is 1.

Resulting Condition Code:
0  Sum is zero
1  Sum is less than zero
2  Sum is greater than zero
3  Overflow

Program Exceptions:
Access (fetch)
Fixed-point overflow

Assume that you have just begun execution of the ADD instruction. Here are the steps for the instruction:

1. Fetch the remaining instruction halfword.
2. Calculate the operand address.
3. Fetch the operand.
4. Add.
5. Store the answer.
6. Update the program counter (PC).
7. Fetch the first halfword of the next instruction.
8. Find the address of the next instruction’s control word sequence.
9. Branch to the next instruction’s control word sequence.

The steps in this sequence have been renumbered from the previous list of steps to reflect a change in instruction execution strategy. The first two steps of the initial sequence became the last four steps of the current sequence. Instead of each instruction being an independent sequence, as it is in the first set of steps, each instruction connects to the next instruction by doing its fetch and decode. These steps can execute a stream of ADD instructions. If you have a series of ADD instructions, you would execute the above steps multiple times. The first five steps do the ADD instruction, and the last four steps connect it to the instruction stream.

A “step” is not a control word (this isn’t really the control word sequence); it’s only what the control word sequence must do. The control word sequence defines a series of states, and it may take several states to do each of the steps for the ADD instruction. (In a microcoded implementation, a state corresponds to one control word.)

Assume that the microprocessor in figure 2.4 has a 16-bit external data bus and 16-bit internal data buses (along with a 16-bit ALU). The following steps execute the ADD instruction:

1. Fetch the remaining instruction halfwords.
   One state to fetch the second halfword of the ADD instruction.
2. Calculate the operand address.
   One state to add the D2 displacement and the contents of the B2 register.
3. Fetch the operand.
   One state to fetch the data halfword (put the address on the pads and wait for the operand halfword).
4. Add.
   One state to add the operands.
5. Store the answer.
   One state to store the result in the R1 register.
6. Update the PC.
   One state to increment the PC.
   One state to save the incremented value.
7. Fetch the first halfword of the next instruction.
   One state to put the PC value on the pads and wait for the first half of the next instruction.
8. Find the address of the next instruction’s control word sequence.
   One state to put the next instruction into the instruction decoder.
9. Branch to the next instruction’s control word sequence.
   Zero states: this step is accomplished as a part of the previous step.

For a halfword (16-bit) external data bus and halfword (16-bit) internal buses, the sequence is nine states. How does this compare with the execution time (in states) for a commercial microprocessor? The Motorola MC68000 has an internal structure similar to figure 2.4 with 16-bit internal buses (and a 16-bit ALU). It has a halfword external data bus. Its control word sequence for the memory-to-register ADD instruction is five states. “What happened? Didn’t those guys have to do everything?” They did everything. My description is simplified, as it doesn’t account for concurrent actions.

In this chapter, I have begun to tell you how a microprocessor works by defining one. I described how the microprocessor runs an instruction, that is, what’s going on inside the chip. In chapter 5, I will continue in the same fashion. I repeat the steps in the execution of an instruction but add more detail each time. More detailed explanations of what is happening require more details about how the microprocessor works. As I add detail to the explanation, problems keep popping up. Solving each problem requires more information about how a particular part of the microprocessor works. Here are the parts in the order they are explained in chapter 5.

The upper left corner of the block diagram in figure 2.6 designates the clock-phase generators. They use the externally supplied clock signal to generate clock phases required by the rest of the chip.
Both Micro/370 and the MC68000 use a four-phase clocking scheme.

Power-on reset and interrupts are next (top center to right in figure 2.6). Any microprocessor has to have power-on reset circuitry so it will do something predictable when you turn on the power. Interrupts provide a way for devices outside the microprocessor to get the microprocessor’s attention. One type of interrupt tells the microprocessor when a device needs service (for example, keyboard service, display buffer update, or more lines to print). Another type of interrupt informs the (on-chip) bus controller that something is wrong on the external bus and the current bus access will not complete (for instance, a bus error or page fault).

Interrupts lead naturally to the next state control (a part of the processor controller’s state sequencer). Normally, the microprocessor is just running a user’s program. An interrupt comes in and changes what the processor controller does (usually at an instruction boundary). If there is no interrupt, the next state control selects control store addresses from the output of the control store, the branch control unit, or the instruction decoder.

[Figure 2.6. Microprocessor block diagram. Blocks include: clock-phase generators, reset and power-on logic, bus controller, next state control, control store, branch control unit, instruction prefetch registers, encoded and decoded execution unit control, execution unit.]

The branch control unit provides the means for decision-making in the microcode. Both Micro/370 and the MC68000 provide four-way branches using a partial address from the control store and altering some of the control store next address bits based on conditions in the execution unit.

The control store holds the control words.
Part of each control word is decoded to control the execution unit elements, and part helps run the state sequencer (by saying where to get the next control store address or even by providing it).

The control word decoder translates the (compact) control word into the exact lines needed to control each execution unit element. The control word decoder also mixes the information from the control store with information from the instruction register and with timing information. In the MC68000, the part of the control word that helps run the execution unit is 66 bits wide. It is decoded into about 180 bits to run the execution unit control points. In Micro/370, the part of the control word that helps run the execution unit is 71 bits wide. It is decoded into about 300 bits to run the execution unit control points.

An execution unit control point is a single control line leading to a macro in the execution unit. For example, a single control line might gate the value of a single register onto a bus. The load signal for a register would be another single control line. The op code control for the ALU might be four control lines. Each of these lines becomes a control point entering the execution unit from the processor controller.

The instruction prefetch registers allow the microprocessor to overlap the execution of the current instruction with decode of the next instruction and with the prefetch of the halfword after the next instruction. There are three registers. One holds the currently executing instruction and is used by the control word decoder. Another holds the next instruction and drives the instruction decoder. The last register receives the halfword following the next instruction, when the halfword arrives from the external bus.

The bus controller runs the electrical protocol to communicate with the outside world.
The bus controller detects and synchronizes external interrupts, runs memory access cycles, and arbitrates control of the external bus.

This is a preview of the explanation of how a microprocessor works, presented in chapter 5.

Hardware Flowcharts

The Flowchart Method is the procedure and notation I use to design the CPU of a computer. The method works for general-purpose and for special-purpose CPUs.

A CPU has a “controller” and an “execution unit.” The execution unit is a collection of fast but latent capabilities (registers, ALUs, shifters, and data paths). The controller controls the execution unit by telling the execution unit what to do when. The controller determines the CPU’s “personality.”

Designs often begin with an appeal: “We need a CPU that’s twice as good as any rival’s.” Computer architects turn the appeal into an English description of the machine (in IBM’s System/370, this is the Principles of Operation manual, form no. GA22-7000). Engineers implement from the English description, using logic design and circuit design methods.

We have lots of books to help us with logic design and circuit design, but nobody says how to transform the English description into the kind of formal description circuit designers need. It’s much like a mathematical word problem. The hard part is getting the equations from the written description of the problem. Once you have the equations, you can apply documented methods to find the solution. The English description of a chip is like a book-length mathematical word problem. Hardware flowcharts are a bridge between English and the logic designer; they are a compact formal description of what the CPU does.

The method I describe was used to design the controller for the Motorola MC68000 and IBM Micro/370 microprocessors.

The flowchart method is both procedure and notation. The designer follows the procedure to express the design in the particular form I call flowcharts.
Unlike most procedures, this one does not start out by presuming a block diagram for the controller. (Doing this imposes a structure on the English specification; the problem is to find an efficient structure.) The block diagram is one of the procedure's outputs.

Flowcharts show the design as the flow of simple actions. An example is RX→A→ALU, which means "put the contents of register RX on the A bus to the ALU." (That also exemplifies the notation; it doesn't get more complicated than that.) One of these statements is called a task; states can be one (really zero) or more tasks. I depict the flow of states by boxes (one for each state); I draw these in a specific format, and it is important that you draw the states precisely the way I say. With the flowchart method, you see major flow (a complicated microprocessor can fit on six 8½-by-11-inch pages) without losing important detail. RX→A→ALU is uncluttered by the usual hardware details that hide significant controller structure issues. The hardware is debugged using the flowcharts; they are the authoritative reference for the design.

The procedure is carried out with a particular technology in mind (flavors of bipolar, NMOS, CMOS). Decisions in the procedure are based on the capabilities of the particular technology. The procedure does not depend on the implementation method. This means that the same flowcharts are used to implement the chip with combinational logic, PLAs, or microcode.

In chapter 4, I show how to implement a simple microprocessor using flowcharts. I tell how to flowchart hardware using just pencil and paper. I describe flowcharting using such simple tools because:

1. The method is useful whether the designer has just a desk and wastebasket or several million dollars' worth of computers and fancy equipment.
2. Design automation should be subservient to the design method. It should support the design procedure, not be the design procedure. (Often,
engineers' methods are solely the result of available design automation tools; I think that's bad.)

Prerequisites

Flowcharts tell how to get from the architecture to the implementation. They link the programmer's (external) model and the hardware (internal) implementation. Flowcharts specify exactly how commands from the instruction set are carried out using execution unit hardware. You must have the instruction set summary and an execution unit specification before you begin flowcharting.

Instruction Set Summary

The instruction set summary is published as a necessary part of the user's manual. (See, for example, the MC68000 User's Manual for the Motorola MC68000 or the iAPX Book for the Intel 8088.) The instruction set summary describes:

1. Instruction formats
2. Operations (ADD, AND, SUB, and so on)
3. Addressing modes (Base Plus Displacement, Register Indirect, Indexed, and so on)
4. Registers (as seen by the programmer)

Execution Unit

A microprocessor's execution unit (or data path) details are not usually published for several reasons: Users do not want to know, users should not know, or manufacturers want competitive advantages kept secret. You need a block diagram of the execution unit that shows the following:

1. Programmer's register set
2. Additional registers (such as the instruction register, program counter, and temporary registers)
3. ALU and any special function units (such as a shifter)
4. Internal data paths
5. Rules of operation

All this information (except maybe some rules of operation) should be in the execution unit block diagram. The rules of operation tell what can and cannot be done with the execution unit pieces (registers, buses, arithmetic units, and so on). The rules of operation also tell clock phases, timing, and electrical load constraints for the pieces. These rules are imposed by circuit design limits.

If you are responsible for the flowcharts, you should do the execution unit first.
To design the execution unit, I recommend doing trial flowcharts for ten frequently used instructions to determine an initial execution unit structure. I think a simple bus-oriented structure is best, so I start with that. In a current (1987) very large scale integration (VLSI) implementation, some limits on your interconnect scheme will come from the circuit designers. For example, having no more than three buses allows bus wiring to pass right over the registers and arithmetic units without using extra chip area.

The execution unit will evolve. I proposed the initial execution unit for Micro/370 in January 1981. It went through about twenty-three major revisions before I completed the flowcharts. These changes are expected, and they are supposed to happen. For example, in writing flowcharts for the instructions, you find an instruction you cannot implement efficiently. You can't do a Booth's algorithm multiply efficiently because you can't "see" the low-order bits in the multiplier. Since the multiplier normally resides in the shifter, you just wire the low-order shifter bits to the branch control unit. Perhaps you need a special direct path from the ALU to the Data Temporary register (DT). You can just move the DT next to the ALU and wire the direct path. If you need something, add it. The circuit designers will tell you when you're not being reasonable.

Illustrated Flowchart Method Overview

Figure 3.1 shows the development of the implementation using the flowchart method. To avoid confusing details, I illustrate the method with a simple microprocessor, called MIN. Figure 3.2 shows the instruction format and register set; figure 3.3 shows part of the instruction set summary. This subset is adequate to demonstrate flowchart construction. Figure 3.4 shows a sufficiently detailed block diagram of the execution unit. It also includes some rules of operation; others will be added as I progress.
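Returning briefly to the Booth's multiply remark above: the reason the controller must "see" the low-order multiplier bits is that every step of the algorithm branches on them. The following is only a minimal software sketch of radix-2 Booth recoding, not the Micro/370 or MC68000 sequence; the function name `booth_multiply` and the `bits` parameter are illustrative assumptions, and a Python integer stands in for the shifter.

```python
def booth_multiply(multiplicand, multiplier, bits=8):
    """Radix-2 Booth multiply (illustrative sketch, not the book's hardware).

    `multiplier` must fit in a signed `bits`-bit two's complement field.
    Each step inspects the current low-order multiplier bit and the bit
    shifted out on the previous step; that pair selects add, subtract,
    or do nothing. In hardware, this is the decision the branch control
    unit makes from the low-order shifter bits.
    """
    q = multiplier & ((1 << bits) - 1)  # multiplier as an unsigned bit field
    prev = 0                            # bit shifted out on the previous step
    product = 0
    for i in range(bits):
        low = q & 1
        if (low, prev) == (1, 0):       # start of a run of ones: subtract
            product -= multiplicand << i
        elif (low, prev) == (0, 1):     # end of a run of ones: add
            product += multiplicand << i
        prev = low
        q >>= 1                         # the next bit becomes the low-order bit
    return product


print(booth_multiply(7, -3))    # -21
print(booth_multiply(-6, 5))    # -30
```

The point of the sketch is the `(low, prev)` test: without a path from those two bits to whatever chooses the next state, each step would need extra states just to extract them.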
[Figure 3.1: Development of implementation using the flowchart method. A sequence of block diagrams in which the clock-phase generators, control store, execution unit, control word decoders, instruction decoders, and bus controller appear step by step. The annotations read as follows.]

- The architecture specification is the only input.
- Begin with a guess for the execution unit.
- Do flowcharts for the instructions. This modifies and refines the execution unit and develops the control store and control strategy.
- When the flowcharts are complete, so is the execution unit. The final execution unit is derived output.
- Once the flowcharts are fairly complete, derive the control word format using the flowchart states. Control word format is derived output.
- After defining the control word format, you assign bit patterns to the control fields in a way that minimizes control word decoders between the control store and the execution unit.
- Instruction decoders are defined by the flowcharts and the architecture specification.
- Completed flowcharts, control word format, and the initial bus specification define the bus controller.
- Last is the logic of the state sequencer, the part of the chip that says what to do next. ("Where's the next control word?") Once everything around it is defined, you build exactly what you need! (The state sequencer is derived output.)
[Figure 3.2: MIN instruction format and register set. First word: operation code (op), first operand register, second operand register, second operand address mode. Second word: displacement (optional, depending on second operand address mode). Programmer's register set: R0 through Rn.]

[Figure 3.3: MIN instruction set summary, listing the operations and the second operand address modes (including AB, base RY plus displacement, and AR, register direct). The result is stored in RY. For two-operand instructions, RY also is an operand.]

Figures 3.2, 3.3, and 3.4 do not include the usual details about word length, instruction length, address length, bus width, ALU size, and register size. Although you know this information, it doesn't change the sequence of operations for the execution unit. The sequence of operations depends on relative values of these parameters and not on their absolute values. You implement the design from the flowcharts with a particular word length, instruction length, address length, and so on. Don't clutter your flowcharts (or your notation) with details you don't need.

Flowchart Objectives

Now you have ample information to construct flowcharts, but you face some difficult questions: What are the design objectives? Which objective is most important? Next? Least? Here are some reasonable design objectives:

- Limit controller size to some fraction of a single chip. Since profit goes up as die size goes down, there will be pressure to make the controller smaller even when it fits.
- Make the CPU as fast as possible (certainly faster than its contemporaries).
- Complete the project early to give the product an early start in the market.
- Make the flowcharts easy to translate into hardware.

This illustrates the value of a good project manager: He or she ranks the objectives.
[Figure 3.4: MIN execution unit block diagram. An internal A bus and an internal B bus connect the execution unit elements; AO drives the External Address Bus (EAB), and the External Data Bus (EDB) connects to the data buffers. Legend: ALU, Arithmetic and Logic Unit; AO, Address Out buffer; DI, Data Input register; DO, Data Out buffer; IRE, Instruction Register for Execution; IRF, Instruction Register for Fetch; K, Constant generator; PC, Program Counter; R0-Rn, Programmer's registers; T1, T2, Temporary registers. The figure also lists the five example rules of operation given in the text.]

I have chosen an execution unit with a simple two-bus structure for the MIN CPU example. You talk to the circuit designers about what structures are reasonable for the technology. You arrive at the proposed execution unit by doing some trial flowcharts. Figure 3.4 is the proposed execution unit. Here are some rules of operation:

1. A transfer from source to bus to destination takes one state time. (It takes one flowchart state to execute the task RX→A→ALU, for example.)
2. A source can drive up to three destination loads. (For example, the task T1→B→ALU,AO,PC has three destination loads: ALU, AO, and PC.) The circuit designer will tell you how many destination loads each source can drive.
3. Inputs to the ALU are from the A (internal) bus and either K (values 0, +1, -1) or the B (internal) bus. One side of the ALU has one input source (A) and the other has two input sources (K and B).
4. When the ALU is a destination, T1 is automatically loaded from the ALU output at the end of the state time.
5. A transfer to the AO buffer activates the on-chip external bus controller.
This bus controller postpones the next state until the external transfer is complete.

Picking the initial execution unit requires some knowledge of implementation cost for the technology you use. The circuit designers should help you with this. It is much better to start with too little than too much. (It's easier to add things than to figure out whether you can throw them away.) Start with a simple execution unit and add resources as you need them. If you begin with an extravagant guess, you may build something fancier (bigger and slower) than you need. The flowchart method can help you identify features that improve performance. It will not tell you what you don't need. And it will not tell you when you have an overkill (too much hardware for the problem you are solving). The circuit designers should warn you when you are asking for more than they can do. If they trust you, however, they will try to build what you want, even if it is too much. It all comes down to this: You are an engineer (the logic designer). You have to use restraint, common sense, and judgment. I can't find a procedural substitute for you. I can only tell you what helps me.

Making a Flowchart

When it is time to begin the flowcharts, you will be plagued with all sorts of questions. How do I begin? What do I write? How do I write it? I suggest methods that work for me. Use a register transfer notation to describe the operations of the execution unit. Each statement in this notation is called a task in the flowcharts. Each state comprises one (really zero) or more tasks. Use rectangles for states. (In a microcoded implementation, each state becomes a control word.) A control word sequence is a succession of states. Work on large sheets of high-quality graph paper (preferably 17-by-22-inch vellum with ten lines per inch). Large sheets make it simpler to see and to plan large segments of the control flow, and high-quality paper lasts through many changes.
It can take years to complete the flowcharts for a complicated CPU. (It took me a year to complete flowcharts for the Motorola MC68000 and about two and a half years to complete flowcharts for Micro/370.) To avoid copying several generations of flowcharts, observe these rules:

- Work in pencil. (Use a 0.5 mm Pentel with F lead.)
- Work on the back of the vellum so you won't erase the grid.
- Use an erasing shield and an electric eraser.
- Always use a cover sheet to prevent smearing.
- Plan changes on scratch paper and transcribe them to the vellum.
- Always use reproductions for work and reference. (I reduce the copies to 8½ by 11 inches for easier use.)
- Accumulate changes (in red ink) on a reproduction.
- Do trial level 2 flowcharts (level 2 flowcharts are explained later) on 8½-by-11-inch scratch paper with 1½-inch-high, 2-inch-wide penciled-in rectangles as guides. (I load the copier with junk memos and copy the grid on the back.)

Figure 3.5 shows flowchart sequences for the register-to-register ADD instruction, the register-to-memory ADD instruction using the MIN execution unit (figure 3.4), and a simple register transfer notation. Each box is a state. Each line entry in a state is a task. Tasks are expressed in the register transfer notation; the notation has a source-bus-destination format. Alphabetize tasks in each state by source (if there are multiple destinations on a single line, alphabetize them, too); you will use this to compact the flowcharts later on.

[Figure 3.5: Execution of register-to-register ADD (ADD RX AR RY) and register-to-memory ADD (ADD RX AI (RY)) instructions (partial description, operation tasks only). The first state of the register-to-register sequence holds the tasks rx→a→alu and ry→b→alu; the first state of the register-to-memory sequence holds edb→di and ry→b→ao. Each task is labeled as source, bus, destination.]

In figure 3.5, time advances from the top of the page to the bottom of the page, except within a state.
Within a state, tasks appear to be concurrent but are governed by rules-of-operation timing. In a microcoded implementation, each state is one microcycle (and may have phases such as source, transfer, destination, and precharge).

In the register-to-register example (the left flowchart in figure 3.5), I transfer both operands to the ALU in the first state. The output of the ALU is saved in T1 any time there is an ALU operation. In the second state, the result is sent from T1 to RY.

Look at the register-to-memory ADD example (the right flowchart in figure 3.5). The first state fetches the memory operand; the second state adds the operands; the third state sends the result to memory. Something doesn't look right. In the first state, DI is loaded from the external data bus (EDB), but RY is sent to AO after this happens. Wrong! I consider these tasks concurrent (with some implicit timing). They are in alphabetical order. Sending RY to AO initiates the external bus activity that results in the DI transfer from EDB. The tasks are listed in the same state because as far as the state sequencer is concerned, they happen at the same time.

In a microcoded controller, the control word specifies the tasks in a single state. The tasks are commands to the external bus controller, the execution unit, and the state sequencer. Whatever timing is added later, the commands all come out of the control store at the same time. In the case of a read, the transfer to AO initiates the external bus cycle. If the external bus is synchronous, then DI must be valid by the end of the state time. If the bus is asynchronous, the state sequencer "hangs" in the current state until the transfer to DI is completed. In the case of a write, the address and data are transferred to the external address bus (EAB) and EDB, respectively. The state sequencer "hangs" until the external bus controller signals the state sequencer that the external transfer is completed.
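The idea that a state is a set of concurrent source-bus-destination tasks can be mimicked in a few lines of software. What follows is a rough sketch under stated assumptions, not a model of the real MIN hardware: the function `run_state`, the tuple layout, the register values, and the use of the B bus for the T1-to-RY transfer are all illustrative choices. The ALU is assumed to add, and the external bus is not modeled.

```python
def run_state(regs, tasks, k=0):
    """Execute one flowchart state on a dict of register values.

    Each task is (source, bus, destinations). All sources are read
    before any destination is written, so tasks in a state behave as
    if they were concurrent. Illustrative sketch only.
    """
    driven = set()   # buses already driven in this state
    alu_in = {}      # ALU inputs, keyed by the bus that delivers them
    pending = {}     # register writes applied at the end of the state
    for source, bus, dests in tasks:
        if bus in driven:
            raise ValueError(f"bus {bus!r} driven twice in one state")
        if len(dests) > 3:
            raise ValueError("rule 2: at most three destination loads")
        driven.add(bus)
        value = regs[source]
        for dest in dests:
            if dest == "alu":
                alu_in[bus] = value
            else:
                pending[dest] = value
    if alu_in:
        # Rule 3: one ALU input is the A bus, the other is B or the constant K.
        # Rule 4: T1 is loaded from the ALU output at the end of the state.
        pending["t1"] = alu_in["a"] + alu_in.get("b", k)
    regs.update(pending)


# Register-to-register ADD as two states (register values are made up).
regs = {"rx": 5, "ry": 7, "t1": 0}
run_state(regs, [("rx", "a", ["alu"]), ("ry", "b", ["alu"])])
run_state(regs, [("t1", "b", ["ry"])])
print(regs["ry"])   # 12
```

Deferring the writes until every source has been read is what makes the order of tasks inside a state irrelevant, which is why the flowcharts are free to list them alphabetically.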
There is no explicit notation for transfer from AO to EAB or from the Data Output register (DO) to EDB (or for memory to EDB). I have elected to let them be implied by the context. I view AO and DO as amplifiers (not registers). Because AO is not a register, it does not remember the address between state 1 and state 3 of the register-to-memory sequence (the right flowchart in figure 3.5). Transfers from EDB to the execution unit are not implicit because they can be to the instruction register for fetch (IRF), DI, or both.

Notation

Keep the register transfer notation simple. It must capture the essence of what the CPU is doing without all the details. You may think this is a simple notation invented for just this one case. Well, that's somewhat true. I modify the notation to fit the problem. I want the notation to be a simple, natural, readable way to express what the CPU is doing. That is why the notation is not formally defined. In a formal notation, constructs might prevent natural expression of tasks and hinder the design.

Flowcharts are graphic notations that depict the CPU in two ways:

1. Flowcharts visually emphasize changes in sequence and concurrency for whatever the controller is doing. You see branching and merging in the flow of control. You see how the address calculation sequences and operation sequences are shared. You see all the instructions sharing one common set of address calculation sequences. You see ten instructions sharing the standard dual operand execution sequence (as the MC68000 register-to-register operand execution sequence does, for example). You see which instructions have an execution sequence all to themselves (multiply or divide in Micro/370, for example).
2. Flowcharts visually communicate the relationship of sequence to concurrency for whatever the controller is doing.
You see exactly what is concurrent (tasks) and what is sequential (states), and you see how they are related. Flowcharts show sequential state flows made up of concurrent tasks. Each task is a sequential source-bus-destination flow.

Flowcharts are a flow-intensive notation showing you the concurrent and sequential nature of operations.

Execution Speed

The flowchart sequences in figure 3.5 are incomplete. They do not include the instruction fetch and the PC increment. The PC increment and instruction fetch could be added to the beginning or end of both sequences (with different consequences). Which leads to the fastest controller? Just what is the fastest controller? How about this definition:

The most efficient controller executes a given instruction with the least number of states.

"That's kind of a truism. Give me something I can use, something that tells me what to do." You are designing something (a microprocessor) that will be part of a larger system (a board, a personal computer, an instrument). What limits system performance? Is it always your part? Sometimes your part? If your part is the system bottleneck, you did not design it very well. If your part is never the bottleneck, perhaps you spent too much on hardware. The best engineering design achieves the effect of infinite resources (never the bottleneck) at minimum cost. Microprocessor design is a good example. I believe that useful external bus activity in every state is evidence of sufficient controller efficiency. Therefore, I use the following definition for controller efficiency:

The controller is efficient if execution never delays external bus cycles. (If some other part of the system is the bottleneck, the controller design is good enough.)

Measuring the microprocessor performance at the pads will not reveal whether you implemented a Cray supercomputer or a controller barely sufficient to make external bus transactions the bottleneck.
This is not a measure of bus efficiency or system efficiency; it is a measure of how well you do the controller design.

I can't give a useful general definition for the fastest controller because it depends on what the controller does. I have given a definition that works for a microprocessor, but an applications engineer would not use this definition because he would not want the external bus tied up by the CPU all the time.

Figure 3.6 improves the examples in figure 3.5 with the PC increment and instruction fetch. I removed the lines connecting boxes because they are unnecessary and doing so saves space. The more states you fit on a page, the more of the design you take in at a glance. (I still use lines to show the next states of sequences with internal branches.) To make a quick measure of efficiency possible, I put a shaded box in the upper right-hand corner of states with external bus activity. Assuming states of equal duration, the overall efficiency of the execution unit is 20 percent for the register-to-register instruction and 50 percent for the register-to-memory instruction. Our competitors will be pleased. What can I do about it? In some states of each flowchart sequence, the major internal buses (A and B) are not both occupied. That's not good. It should be possible to merge tasks for greater efficiency. We must find a way to squeeze more performance out of the execution unit.

[Figure 3.6: Revised execution of ADD instruction examples. The register-to-register (ADD RX AR RY) and register-to-memory (ADD RX AI (RY)) sequences of figure 3.5 with the instruction fetch (edb→irf, pc→b→ao) and PC increment (pc→a→alu, +1→alu) added. A shaded box indicates external bus activity.]

Level 1 Flowcharts

Separate an instruction's execution into operation tasks and housekeeping tasks and treat each differently. Operation tasks are transfers required to perform the instruction.
These tasks (such as accessing operands, storing results, and moving data to and from the ALU) must occur in a specific order and may be unique to a particular instruction. Figure 3.5 shows the operation tasks for two types of ADD instructions. Housekeeping tasks, such as PC increment and next instruction fetch, are common to all instructions. You have some leeway in deciding when these tasks are accomplished. The tasks are essentially independent for all instructions, so you should treat them separately (initially). Separate kinds of tasks so you can optimize the execution of the operation tasks.

Figure 3.7 shows the flowcharts in a format designed to aid later merging of operation tasks and housekeeping tasks for maximum execution efficiency. This is the level 1 flowchart format. For each instruction, operation tasks are in the left sequence, and housekeeping tasks are in the right sequence. Do level 1 flowcharts for most of the instructions and then begin level 2 flowcharts. You do not have to do level 1 flowcharts for all instructions. If you have instructions for which housekeeping tasks are an insignificant portion of the execution time, it is a waste of time to do level 1 flowcharts. For example, the System/370 MVCL (Move Character Long) instruction may take several thousand states to execute and has only a couple of states of housekeeping tasks. It isn't worth doing twice (once in level 1 flowcharts and once in level 2 flowcharts).

Level 1 flowcharts find the best execution sequence for the operation tasks and identify the housekeeping tasks. Level 2 flowcharts merge the housekeeping tasks with the operation tasks. The direction of the merge is into the operation tasks. (You want to make the housekeeping tasks "disappear" into the operation task sequence.) The state order of each column must be preserved in the final sequence (called the execution sequence), but housekeeping tasks can be merged with operation tasks wherever reasonable.
(We shall see consequences of this merging later.) You would achieve the most efficient execution (for this execution unit) if you merged the housekeeping tasks with the operation tasks without increasing the number of states in the operation task sequence. Usually, it is adequate to have the number of states in the final execution sequence be significantly less than the total states in housekeeping task and operation task sequences. Since the microprocessor allows only one external bus access per state, you have done enough when you merge housekeeping tasks into an operation task sequence in a way that produces useful external bus activity (including housekeeping accesses) in every state.

[Figure 3.7: Level 1 flowcharts for two types of ADD instruction, register-to-register (ADD RX AR RY) and register-to-memory (ADD RX AI (RY)). For each instruction, the operation tasks are in the left column and the housekeeping tasks are in the right column. A shaded box indicates external bus activity.]

Increased speed may not be the only objective of the merge. You also should merge the tasks to create as many identical states (across instruction types) as possible. I assume that a controller with fewer unique states is smaller.

Be careful merging housekeeping tasks into operation task sequences. You would not want to increment the PC before you computed a PC-relative branch address. Merging housekeeping tasks into operation task sequences is challenging and fun because it requires skill and care. You may reorder tasks, change the execution unit, and try dozens of combinations and sequences to get the most efficient execution sequence. This is design. You are working to find the best execution unit for the instruction set and the best controller for the execution unit. If you like puzzles, it won't even seem like work. This is how you are creating the controller.
I will show (later) how assumptions you make in the flowcharts translate to hardware in the controller. You get the controller you need, which is better than choosing a controller and trying to make it do what you need.

I added one more thing in figure 3.7. IRE is the instruction register for execution (see figure 3.4). It allows a rudimentary prefetch. IRE holds the current instruction and drives the register selection decoders (for RX and RY). IRE is loaded at the beginning of a state, and decoding will be stable within one state time. It must not be changed until after the last RX or RY reference in the flowchart sequence for the current instruction. Each instruction (sequence of operation and housekeeping tasks) is associated with a particular register pair (RX, RY) established by IRE. The instruction register for fetch (IRF) can be used to hold the next instruction until the current instruction is done. It can be loaded anytime during the current instruction; this is the simple prefetch. More accurately, IRF gets the word following the current instruction. (It may not be the next instruction if the current instruction is a branch or a two-word instruction.)

Level 2 Flowcharts

Figure 3.8 shows the housekeeping tasks merged with the operation tasks to form what I call level 2 flowcharts. The efficiency of the register-to-register sequence is 33 percent, and the efficiency of the register-to-memory sequence is 75 percent. (You could do better with a more complicated execution unit and controller.) Register T2 saves the operand address in the register-to-memory ADD example.

[Figure 3.8: Experimental reduction of the level 1 flowcharts, showing the merged register-to-register (ADD RX AR RY) and register-to-memory (ADD RX AI (RY)) ADD sequences. A shaded box indicates external bus activity.]
Because T2 contains the memory address (for the second operand) and the static decoders (which are driven by the IRE) are available, there are no more RX or RY references. The last state can change IRE and store the result.

Feedback on Execution Unit Design

Do a level 2 flowchart of the fastest instruction. This will point to inadequacies in the execution unit design. In general, you will discover inefficiencies in the structure of the execution unit as you merge the housekeeping tasks with the operation tasks. In the register-to-register ADD example, if the AO buffer had not been accessible from the A bus (see figure 3.4), I would not have been able to do the instruction in fewer than four states. Less than full use of the A and B buses in the resulting sequence would signal the need to improve the execution unit. Figure 3.9 shows a register-to-register ADD sequence for an execution unit with no path from the A bus to the AO buffer.

[Figure 3.9: ADD sequence (register-to-register, ADD RX AR RY) for an execution unit in which AO is connected only to the B (internal) bus.]

Beware! The increased complexity of the execution unit can increase the number of unique states and result in a larger controller. Increasing the complexity of the execution unit implies more execution unit hardware, too. Only after carefully studying the flowcharts and the execution unit would I suggest execution unit changes to improve the efficiency of the overall design.

The flowchart method helps design hardware. I believe that it is a good, workable method. It is not a rote recipe for good design: you still have to know something about what you are doing. For example, if you start with a fancy execution unit, there is nothing that tells you to throw away expensive hardware. A little arrow drawn on your execution unit may imply a 32-bit data path (lines, space, power) with pass transistors and control signals.
Be sure you need it. You should know the cost of what you ask for. Start with a simple execution unit and add what you need. The circuit designers should tell you when you ask for too much.

Feedback on Controller Design

Use the format in figure 3.7 to create level 1 flowcharts for the entire instruction set. How many sequences is that? The upper bound is 2^w, where w is the instruction length in bits. That is too many, however, because I write only one sequence for each instruction, independently of which registers are specified. (This is an advantage of static decoders for the register fields.) I need only decode the op code and the mode bits in the effective address field (see figure 3.3). Suppose the simple MIN CPU has k operations (ADD, AND, OR, SUB, and so on) and a address modes (Register Indirect, Base Plus Displacement, Indexed, and so on). If any address mode is valid for any operation, I would need k*a instruction sequences.

Clearly, this number can be large. For example, the Motorola MC68000 has about 14 address modes and more than 50 instruction types. If an average instruction has 8 states, then I must implement more than 5,600 (50*14*8) states in the controller. Such a chip would make a good office partition.

Note that I have segmented the flowchart sequences for executing instructions. The sequence of flowchart states for an address mode calculation (which may or may not include operand fetch) is called an address mode sequence. The sequence of flowchart states that completes instruction execution (once the address mode sequence is finished) is called an execution sequence. The combination of an address mode sequence and an execution sequence forms a control word sequence. If an instruction (such as a register-to-register ADD) does not need an address mode sequence, then the execution sequence and the control word sequence are the same. I will use this terminology throughout the book.
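The counting argument above is simple arithmetic, and it can help to see the three quantities side by side. The snippet below is a small sketch; the function names are invented for illustration, and the 16-bit instruction length used for the upper bound is an assumed value, not one given in the text. The figures 50, 14, and 8 are the MC68000 estimates quoted above.

```python
def sequence_upper_bound(instruction_bits):
    """Upper bound 2^w: one sequence per distinct instruction bit pattern."""
    return 2 ** instruction_bits


def sequences_needed(operations, address_modes):
    """k*a: one sequence per (operation, address mode) pair."""
    return operations * address_modes


def states_needed(operations, address_modes, avg_states):
    """Total controller states if every pair gets its own sequence."""
    return sequences_needed(operations, address_modes) * avg_states


print(sequence_upper_bound(16))   # 65536 (w = 16 is an assumed value)
print(sequences_needed(50, 14))   # 700
print(states_needed(50, 14, 8))   # 5600
```

Decoding only the op code and mode bits already cuts the bound from 65,536 patterns to 700 sequences in this illustration, yet 5,600 states is still far too many to build.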
If most address modes can be used with most operations, why not share address mode sequences? (Address mode sequences calculate the operand address, fetch the operand, and place it in DIN.) ADD Register Indirect and OR Register Indirect, for example, would share a common Register Indirect address mode sequence. Also, the Register Indirect address mode sequence and the Base Plus Displacement address mode sequence could branch to the same execution sequence for the OR (or any other) instruction. The operand will be in DIN; the execution sequence doesn't care how the address was calculated to put it there. If you share the address mode sequences among the execution sequences, you need only k + a sequences, and that is in keeping with the goal to reduce controller size.

This is a good idea, but what will it cost? It's not free. Suppose you enter the execution sequence, jump to an address mode sequence (subroutine), then return to the execution sequence to complete execution of the instruction. Such a subroutine call costs time (branching to and returning from the address mode sequence), but it lets the controller be much smaller (since the address mode sequences are shared by the execution sequences). The size and speed goals conflict, so a trade-off is in the offing.

How important is the time lost in these subroutine calls? To find out, have the instruction set designer rank the instructions in order of importance. The designer could base the ranking on static or dynamic frequencies of occurrence. However the designer does it, if she designed the instruction set, she must take the stand on what is important. A ranking for the sample MIN instructions is shown in figure 3.10.

Figure 3.10 Ranking of MIN instructions
  (Most important)
  LOAD
  BZ, ... (other branches)
  STORE
  ADD, AND, SUB, Test
  PUSH
  POP
  (Least important)

Sharing sequences reduces controller size. From the ranking you see that slow subroutine calls are costly because at least three of the four most important instruction types can use any address mode (hence, would have to branch to and return from an address mode sequence). You will not use subroutine calls. You assume that address mode sequences can be shared by initially entering the address mode sequence and branching directly to the appropriate execution sequence. One way to do this in a microcoded controller is to have the instruction decoder provide more than one control store address: one for the address mode sequence and one for the execution sequence.

Flowcharting has led us to a functional requirement for the controller. (The instruction decoder is to provide more than a single output.) This shows how controller requirements come from the procedure. You have not, however, constrained the implementation of the controller to be combinational or microcoded; that choice lies in the future. You do not even have a block diagram of a controller, and you do not want one yet because you want the procedure to give you the requirements for the controller independently of what you think a controller should look like. The flowchart method finds requirements for the controller that best fits what the CPU wants to do (the architecture specification).

Doing Level 1 Flowcharts

The level 1 flowcharts for a subset of MIN instructions are shown in figure 3.11. In a real CPU, the flowcharts have many more address mode and execution sequences.

[Figure 3.11: Typical level 1 flowcharts for the MIN CPU. Panels: Address Mode Sequences (Base Plus Displacement, Register Indirect); Branch Instruction (BZ); Execution Sequences with a Memory Operand Reference (LOAD, STORE, ADD, AND, Test); Execution Sequences for Register-to-Register and Special Instructions (LOAD, STORE, ADD, SUB, POP, PUSH)]

Note the following things in figure 3.11:

1. At the beginning of instruction execution, IRE is assumed to contain the current instruction. It must be loaded by the previous instruction. Each instruction's control word sequence will, therefore, have to fetch the next instruction and load it into IRE.

2. Instruction execution begins with the address mode sequence (if the instruction has one) and implicitly branches to the appropriate execution sequence for completion. (We will figure out how to build the hardware to support this branching later.)

3. The execution sequences for register-to-register instructions cannot be shared with execution sequences for memory reference instructions. This reduces the savings from sequence sharing.

4. The execution sequences for standard dual operand instructions (ADD, AND, SUB, and so on) are identical except for the (implied) ALU function. They can use a common execution sequence if the op code directly specifies the ALU operation (the same way register fields select the registers).

5. Unfortunately, the Store instruction reads the word at the store destination location because it shares the address mode sequences with other instructions. This slows down operation, but I decided to sacrifice speed to make the controller smaller. (There are other reasons besides size why you may not want to do the Store instruction with a read first. Some systems want locations that are read protected. Other systems have memory-mapped peripherals that change states upon a read.)

6. The Branch on Zero (BZ) instruction is a special case. (The Z bit is set to one when a result operand is zero.) Since the condition code (Z) may be set as late as the last state of the previous instruction (in the Test instruction, for example), it may not be available in time to be used at the onset of the next instruction, in this case BZ. (Because of the simple prefetch, the instruction decoder is operating concurrently with the execution unit. Information that can change in the execution unit cannot, therefore, be used by the instruction decoder.) As a result, use of the condition code must be deferred at least one state time in the instruction sequence. The branch appears between the first and second states of the task sequence for the branch instruction. Because I need a delay state for the condition code to settle, I must decide how to use the state. The example in figure 3.11 shows an anticipated branch prefetch (it is discarded if the branch is not taken). An alternative would be to fetch the next sequential instruction and discard it if the branch is taken. The instruction set designer should be able to tell you which alternative to use.

Note that there are no conditional tasks. If you get to a flowchart state, you always execute all the tasks in the state. There are two types of conditional branches: visible and invisible. One conditional branch is explicitly used in the instruction. Examples are BZ (Branch on Zero), BN (Branch if Negative), and BP (Branch if Positive). Another conditional branch is available to the microcode but not visible to someone using the instruction set. If, for example, you have to implement a Multiply instruction but you do not have a hardware multiply unit, you will do the multiplication with a shift and add or subtract algorithm. You will need conditional branches in the microcode to test the multiplier bits and to detect the end of the algorithm.

Each instruction's flowchart further shapes the design. The functions of the controller eventually will be completely defined by the flowcharts. I have still not constrained the design to be either combinational or microcoded. Once I combine the standard dual operand instructions (ADD, AND, SUB, and so on) into one execution sequence for register-to-register and another for register-to-memory, the level 1 flowcharts are complete. Then I can work with them, merging housekeeping tasks with operation tasks to produce the level 2 flowcharts.

Doing Level 2 Flowcharts

Figure 3.12 shows the housekeeping tasks merged into the operation task sequences for the instructions in figure 3.11. I also integrated the standard dual operand instructions (ADD, AND, SUB, ...) into a single sequence. (This assumes IRE will select the ALU operation, in the same way it selects the registers.)
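The idea of one shared sequence whose ALU operation is picked by IRE, just as the registers are, can be sketched in Python. This is an illustration under stated assumptions, not the MIN design itself: the dictionary layout of `ire`, the 16-bit word mask, the operand order for SUB, and the choice of RY as the destination are all assumptions made for the example.

```python
import operator

WORD_MASK = 0xFFFF  # assumed 16-bit data path; the width is not given here

# The op code field selects the ALU function; the sequence itself never changes.
ALU_FUNCTIONS = {
    "ADD": operator.add,
    "AND": operator.and_,
    "SUB": operator.sub,  # assumed order: RX minus RY
}


def run_dual_operand_sequence(ire, registers):
    """One shared register-to-register sequence: RX OP RY, result into RY."""
    alu = ALU_FUNCTIONS[ire["op"]]   # static selection of the ALU function
    a = registers[ire["rx"]]         # static selection of the source register
    b = registers[ire["ry"]]         # static selection of the second operand
    registers[ire["ry"]] = alu(a, b) & WORD_MASK
    return registers


regs = run_dual_operand_sequence({"op": "ADD", "rx": 1, "ry": 2}, {1: 6, 2: 3})
print(regs[2])  # 9
```

The same function serves ADD, AND, and SUB; only the fields held in `ire` differ, which is the property that lets the three instructions share one execution sequence.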
[Figure 3.12: Merged level 1 flowcharts for some MIN instructions. Panels: Address Mode Sequences (Base Plus Displacement, Register Indirect); Branch Instruction (BZ); Execution Sequences with a Memory Operand Reference (LOAD; STORE; ADD, AND, SUB; Test)]

I tried to merge the housekeeping tasks into the operation sequences without increasing the number of states in the operation task sequence. I was not always able to do this (it's a goal, not a requirement). I did reduce the number of states from a potential fifty-four (the number of states in figure 3.11) to an actual thirty-five. (I merged twenty-eight housekeeping states with twenty-six operation states.) Normally, if I cannot reduce the number of states significantly (a matter of judgment), I try to improve the execution unit.

Be careful in merging because operation tasks can use the same resources (such as buses, registers, and arithmetic units) as the housekeeping tasks. Arbitrary interleaving is not possible. For example, if there are PC relative address modes, the PC update (a housekeeping task) must consistently precede (or follow) the address calculation.
If a problem during an instruction execution (such as an arithmetic overflow or divide fault) causes an interrupt that stores the old PC value, that value must be consistent for all instructions. (The instruction set designer should tell you if you have to store the old PC for interrupts and what the PC should point to.)

[Figure 3.12 (continued): Execution Sequences for Register-to-Register and Special Instructions (LOAD; ADD, AND, SUB; POP; PUSH; STORE)]

Before you can transform the flowchart description to a hardware implementation, you must identify the states. In figure 3.12, I put state identifiers in the lower right-hand corner of each state.

If you add descriptive information, you can ease the transition from flowcharts to hardware, and you make the flowcharts easier to use. But what descriptive information will help? What information do you need? Listed below are some useful kinds of descriptive information for translating flowcharts into hardware. I listed information useful for implementing the MIN controller; a more complicated controller requires more information (for register decoder substitutions or operand sizes, for example). Refer to figures 3.13 and 3.14.

Figure 3.13 Format for a level 2 flowchart state
  Sequence Label A: operation format of the instruction
  Sequence Label B: operation sequence or address mode sequence (examples: ADD, AND, SUB; Register Indirect)
  ALU and CC: ALU function and condition code setting (examples: ADD, AND, or OP)
  Access Type: DR (Data Read), DW (Data Write), IR (Instruction Read), NA (No Access)
  Next State: BC (Branch Conditionally), IB (Instruction Branch), SB (Sequence Branch), State ID (direct transfer)
  State ID: state identification
  Duplicates, Page and Location, Access Width, Synonym: not used in MIN
  (A marker on the state indicates external bus activity.)

[Figure 3.14: Format for final version of level 2 flowcharts. Panels: Address Mode Sequences; Branch Instruction; Execution Sequences with a Memory Operand Reference; Execution Sequences for Register-to-Register and Special Instructions]

1. Sequence labels. These labels, associated with each execution sequence or address mode sequence, identify the instructions or address modes using that sequence. They also describe the transfer path. You will use this information later to build the instruction decoders. You can relate an instruction op code bit pattern to the first state ID in the address mode and execution sequence. For example, if IRE contains the bit pattern for register-to-register ADD, you want to begin instruction execution with state oprr1. In a microcoded controller, the instruction decoder will translate the bit pattern of the register-to-register ADD into the control store address of the control word oprr1. In a PLA decoder, the instruction bit patterns will form the AND array, and the control word state ID addresses in the control store will form the OR array. The PLA definition is derived from the labels in the flowcharts. The label abbreviations are:

   @     associated preceding quantity is an address
   d     displacement
   ADD   (for example) instruction using the sequence
   MEM   MEMory
   OP    OPeration
   RX    source operand register
   RY    address or operand register (see figure 3.3)

2. Access type. This description says whether the controller is using the external bus for an instruction fetch or for a data read or write.

3. ALU function and condition code setting. The ALU function determines the operation for the ALU for a particular state. ADD, SUB, OR, and AND mean just what they say; OP means that the value in IRE determines what the ALU operation will be. The condition code setting tells whether a condition code is to be set.

4. Duplicates. This box is not used for the MIN CPU example. It is used by a flowchart drawing program to indicate how many other states contain exactly the same set of tasks. (We will use it later to help reduce control store size.)

5. Page and location. This box is not used for the MIN CPU example. It is used by a flowchart drawing program to place the associated flowchart box on a printer page. (For example, the Micro/370 flowcharts are twenty-five pages containing about a thousand states. Each state has its own page number and location coordinates assigned by the designer.)

6. Next state transition. The next state transition tells how the controller determines the next state. In a microcoded controller, the next state might be reached by a conditional branch, a sequence branch (a new address from an instruction decoder), or a direct branch (address from the current control word). For the MIN example:

   ■ BC (branch conditionally) denotes that the next control store address depends on the value of a condition code (generated by the ALU). A base address is supplied by the microword and altered (or augmented) by the branch condition.
   ■ SB (sequence branch) denotes a transition from an address mode sequence to the corresponding execution sequence for the current instruction.
   ■ IB (instruction branch) denotes a transition to the first state of the next instruction sequence. (In a microcoded controller, IB would tell the controller to access the control word at the control store address specified by the instruction decoder.)
   ■ State ID denotes a direct branch in the control store. The address of the next control word is in the current control word.

7. Access width. This box is not used for the MIN CPU example, but I would use it to indicate the size of the external bus transaction. Micro/370 uses w, h, and b in this box to indicate external word, halfword, and byte accesses, respectively.

8. Synonym. This box is not used for the MIN CPU example. If several states have exactly the same tasks, then one is considered the original and the rest have the state ID of the original in the synonym box. (The duplicates box in the original state tells how many times the original state ID appeared in a synonym box.)

9. State identification (state ID). Each state has its own identifier. I use descriptive identifiers. For example, STRM1 is the state ID for the first state in the store-register-to-memory execution sequence. In a microcoded controller, the state ID will be a mnemonic representation of the control store address. Once you assign the control store addresses to the control words, you can use a program to translate the flowcharts' state IDs into control store address bit patterns. For the Micro/370 project, I used one program to assign the control store addresses and another to translate the flowcharts into the control word bit patterns.

Figure 3.14 shows some sample level 2 flowcharts with the above information. I used one method to reduce the number of states: sharing address mode sequences. A second method is to eliminate duplicate states at the ends of sequences by specifying a direct branch to a common sequence. This merges the ends of flowchart sequences. Do this by comparing the ending states of each sequence in figure 3.14 with all ending states below and to the right of the current sequence. Alphabetic organization of tasks, corner shadings, and access indicators make it easier to compare states. The result is figure 3.15, which has one-third fewer states than the flowcharts in figure 3.14. This is the most direct method of reducing controller size using flowcharts. When I did this for a CPU with hundreds of states (the MC68000), I wrote each state on a separate IBM card and alphabetized the deck. I then compared each card with the cards below it to find duplicate or similar states. Similar states have some tasks that differ.
If you can make similar states identical without adversely affecting the associated sequences, you may eliminate some states.

When you merge the level 1 flowcharts to make level 2 flowcharts, consider moving operands into temporary locations early so later states in the sequence are more independent of the instruction parameters. (I did this in the ADD instruction example for MIN.) Similar states occurring at other than the ends of the sequences cannot be merged. States adrm1, brzz1, and popr1 in figure 3.15 could be replaced by a universal state, but this state would have to exist for each sequence. It is possible to share common states at other than the ends of the sequences if some sort of microcode subroutine mechanism is provided.

[Figure 3.15: Merged level 2 flowchart examples. Panels: Address Mode Sequences; Branch Instruction; Execution Sequences with a Memory Operand Reference; Execution Sequences for Register-to-Register and Special Instructions]

Although a real CPU is much more complicated than the MIN CPU, the flowcharts would look just like those in figure 3.15.
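The merging of duplicate ending states can be sketched as a small Python routine. This is a minimal illustration, not the card-deck procedure described for the MC68000: the function name `merge_tail_states`, the state IDs, and the task strings are all hypothetical. Two states count as duplicates only when they have the same tasks and the same next state, so merging works backward from the ends of sequences and never touches look-alike states in the middle.

```python
def merge_tail_states(states):
    """Merge duplicate states at the ends of sequences.

    states: state ID -> (tasks, next state).
    Returns (remaining states, synonyms), where synonyms maps each removed
    duplicate to the state ID of the original that replaces it.
    """
    synonyms = {}

    def resolve(state_id):
        # Follow synonym links to the surviving original state.
        while state_id in synonyms:
            state_id = synonyms[state_id]
        return state_id

    remaining = {sid: (frozenset(tasks), nxt) for sid, (tasks, nxt) in states.items()}
    changed = True
    while changed:
        changed = False
        first_seen = {}
        for sid in sorted(remaining):  # fixed order: first ID seen is the original
            key = remaining[sid]
            if key in first_seen:
                synonyms[sid] = first_seen[key]
                changed = True
            else:
                first_seen[key] = sid
        # Drop the duplicates and redirect branches to the originals.
        remaining = {
            sid: (tasks, resolve(nxt))
            for sid, (tasks, nxt) in remaining.items()
            if sid not in synonyms
        }
    return remaining, synonyms


# Three hypothetical two-state sequences ending with an instruction branch (IB).
example = {
    "a1": (["x"], "a2"),
    "a2": (["irf->ire", "t1->pc"], "IB"),
    "b1": (["y"], "b2"),
    "b2": (["t1->pc", "irf->ire"], "IB"),
    "c1": (["x"], "c2"),
    "c2": (["irf->ire", "t1->pc"], "IB"),
}
merged, synonyms = merge_tail_states(example)
print(sorted(merged))  # ['a1', 'a2', 'b1']
```

In the example, b2 and c2 collapse into a2 on the first pass; once c1 points at a2 it becomes identical to a1 and is removed on the second pass. State b1 survives because its tasks differ, but it now branches directly to a2.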
I implement my design from flowcharts such as those in figure 3.15.

Implementing from Flowcharts

In this chapter I will explain how to implement a microprocessor from the flowcharts using the MIN processor example. Although I do not explain how to make the choice between (for example) a microcoded and a combinational design (see chapter 9 for the discussion of implementation methods), I will tell you how to implement the flowcharts for a microcoded, a PLA, and a combinational design.

Figure 4.1 is a block diagram of a simple microcoded controller, consistent with the function implied by the flowcharts. I show only enough detail for you to see the relationship between the controller and the flowcharts.

This is how the controller operates. An instruction is fetched (we'll worry about how some other time) and eventually placed in IRE. Translation of part of IRE's contents provides the control store address of the first word in the control word sequence for the instruction. Figure 4.2 shows the control store word format. Each flowchart state (see figure 3.15) corresponds to a control word. The control word can specify register transfers (data and control registers), the ALU function and condition code setting, the source of the next control store address, and the next control store address.

[Figure 4.1: Microprocessor block diagram (microcoded controller). Labels: instruction decoders, next state control, control store, control word register, branch control unit, control fields (dynamic), control fields (static), control word decoders, control lines (register select, ALU function, bus select), condition codes, execution unit]

The control word contains the address of the next control word for direct branches. For conditional branches, the next address would be modified by information from the execution unit (through the control store address modifier).
For branches from address mode sequences to execution sequences or between whole instructions, the next address is a decoded IRE value (possibly modified by the control store next address [NA] field from the control word).

Each control word has fields that are decoded to drive the control lines in the execution unit and controller. The control word decoders (see figure 4.1) drive the control lines by mixing static information and timing signals with the control word fields. Static information does not change during the instruction execution and can go directly to the control word decoders. The register fields in a register-to-register ADD instruction, for example, do not change during instruction execution. The control word tells when to move values to and from the registers, but IRE fields tell which registers will be used. That means ADD register 3 to register 5 shares the same execution sequence with ADD register 1 to register 7, for example. The fields in IRE that select the registers are the static information. (Remember that static information does not change during execution of an instruction. The register designators are static information. How they are used in each cycle is dynamic information. You might send the register contents to the ALU in the first state and store a result in the same register in state 3. When to do what is dynamic information; it changes with each state.)

Figure 4.2 Control store control word format
  Word layout: OP (control fields; execution unit control), TY (control store address select), NA (next address). TY and NA are the state sequencer control.
  OP: Control Fields. Small fields of bits are decoded (by the control word logic decoders) to drive control lines in the execution unit.
  TY: Control Store Address Select. Next address (type) select:
    IB: Instruction Branch. Next control store address is from the control word decoders using IRE (for the next instruction).
    SB: Sequence Branch. Next control store address is from the control word decoders using IRE (for the next sequence to help execute the current instruction).
    BC: Branch Conditionally. Next control store address is NA modified by a condition code from the execution unit.
    DB: Direct Branch. Next control store address is NA.
  NA: Next Address. Next state (direct) address.

In a simple microcoded controller implementation, each state in the level 2 flowcharts corresponds to one control word. (In a more complex controller, there may be more than one control word for each state. In the Motorola MC68000, for example, there is one control word for the execution unit and another for the state sequencer.) Since each state in the level 2 flowcharts maps to one word in the control store, the fewer states you have, the smaller the control store will be.

To personalize the control store, you must transform the flowcharts into control store bit patterns. Here are the transformations you need:

■ The tasks become bits in the control fields (OP).
■ The next state becomes the control store address select (TY) and next address (NA).
■ The state ID becomes the location of the control word in the control store.

These transformations can be done on a computer. You translate all the states into control word bit patterns. Each of the control word bit patterns (representing a flowchart state) occupies a unique location in the control store. It is possible to have the computer assign control words to control store locations. For the Motorola MC68000 project, I assigned control store addresses, but for Micro/370, the computer assigned them. You may want to assign control store locations manually to make the control store address decoder smaller. Some control store addresses must be reserved for reset, interrupt, and other special sequences.

Figure 4.3 is a block diagram of a simple PLA controller. Note the strong similarity between the PLA controller in figure 4.3 and the microcoded controller in figure 4.1.
I consider PLA controllers to be a variation of the microcoded controller. If the control store address logic and the control store are implemented as an AND-OR PLA, the address logic would be the AND array and the control store would be the OR array.

[Figure 4.3: Microprocessor block diagram (PLA controller). Labels: instruction decoders, PLA AND array, PLA OR array, control word register, control fields (dynamic), control fields (static), control word decoders, control lines (register select, ALU function, bus select), condition codes, execution unit]

Another way to see the similarity between a microcoded implementation and a PLA implementation is to consider the control store to be an orderly decode of an input address into a control word. If the control store address logic (address decoder, branch control unit, and multiplexer) of figure 4.1 produced the control word directly (instead of the address of the word in the control store), it would behave exactly like a PLA.

The flowcharts are used the same way except you may now be able to combine like states at other than the ends of sequences (provided the states can be made to lie logically next to each other for the AND decoder). Program the PLA OR array using the same flowchart transformations used for the microcoded controller. The PLA OR array contains the same bit patterns as the control store for the microcoded controller. Unused control store locations will be left out of the PLA.

An apparent reduction in controller size may be possible using methods for splitting or folding a PLA. Although I will not discuss these methods here (they do not directly relate to using flowcharts), I do cover PLA folding briefly in chapter 8. I do not think it is a good idea to split or fold PLAs.
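The three flowchart-to-control-word transformations (tasks into OP, next state into TY and NA, state ID into a control store location) can be sketched in Python. Treat this as an illustration under loud assumptions rather than the MIN or MC68000 encoding: the 2-bit TY codes, the 8-bit NA field, one OP bit per task, the way a condition bit is ORed into NA, and the state and task names are all invented for the example. The optional NA modification of decoder addresses is left out.

```python
# Assumed 2-bit encoding of the TY field; no encoding is specified in the text.
TY_CODES = {"DB": 0, "BC": 1, "SB": 2, "IB": 3}
NA_WIDTH = 8  # assumed width of the NA field


def assemble_control_store(flowchart, task_bit, address_of):
    """Translate flowchart states into control words.

    flowchart:  state ID -> (tasks, TY mnemonic, next state ID or None)
    task_bit:   task name -> bit position inside OP (one bit per task, assumed)
    address_of: state ID -> control store address
    """
    store = {}
    for state_id, (tasks, ty, next_state) in flowchart.items():
        op = 0
        for task in tasks:                       # tasks become OP bits
            op |= 1 << task_bit[task]
        na = address_of[next_state] if next_state is not None else 0
        word = (op << (2 + NA_WIDTH)) | (TY_CODES[ty] << NA_WIDTH) | na
        store[address_of[state_id]] = word       # state ID becomes the location
    return store


def next_address(word, decoder_address, condition_bit):
    """Choose the next control store address from the TY and NA fields."""
    ty = (word >> NA_WIDTH) & 0b11
    na = word & ((1 << NA_WIDTH) - 1)
    if ty == TY_CODES["DB"]:
        return na                    # direct branch: NA as is
    if ty == TY_CODES["BC"]:
        return na | condition_bit    # NA modified by a condition code (assumed OR)
    return decoder_address           # IB and SB: address from the decoders


# Two hypothetical states: s1 branches directly to s2, s2 ends the instruction.
flowchart = {
    "s1": (["edb->irf"], "DB", "s2"),
    "s2": (["irf->ire"], "IB", None),
}
store = assemble_control_store(
    flowchart, {"edb->irf": 0, "irf->ire": 1}, {"s1": 4, "s2": 5}
)
print(next_address(store[4], 0x20, 0))  # 5
```

The same bit patterns would serve as the OR array contents in the PLA version; only the way an address (or product term) selects a word changes.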
There are many ways to design a combinational controller (also called a combinatorial, random logic, or hardwired controller), but I will describe only one. First, design a state sequencer to duplicate the state transitions in the flowcharts. The flowcharts contain a complete state diagram. Techniques for converting state diagrams to state sequencers are known (see references at the end of this chapter), so I will not discuss them here. Next, make as many copies of the flowcharts as there are different tasks in the flowchart sequences (each line in a state box is one task). Each copy will be assigned to a different task. On the copy, mark all occurrences of the assigned task, then write an equation for the task using state IDs.

As an example, take the transfer of EDB to DI in figure 3.15. Assign this task to one copy of the flowcharts by highlighting all occurrences of the EDB-to-DI transfer. Write the equation for the EDB-to-DI transfer. If you chose to implement the controller from the level 2 flowcharts of figure 3.15, the equation for the EDB-to-DI transfer would be
