Data Path
ARCHITECTURE
Datapath
Introduction
The chapter begins by outlining three primary factors that determine computer performance:
01 Instruction Count: The number of instructions executed by the program.
This chapter focuses on how to build the datapath and control unit for two MIPS processor implementations.
It starts with a high-level abstract overview and then progresses to building a simple processor capable of executing
a subset of MIPS instructions.
The core of the chapter is dedicated to constructing a pipelined MIPS implementation,
and later sections address the design considerations needed for implementing more complex ISAs like x86.
Basic Implementation
The simplified MIPS processor in this chapter includes the following instructions:
This subset omits some operations like shift, multiply, divide, and all floating-point instructions,
but is sufficient to illustrate the fundamental principles of datapath construction and control design.
The discussion emphasizes how the instruction set architecture heavily influences implementation decisions,
and how implementation strategies in turn impact clock rate and CPI.
Key design principles from Chapter 1—such as “Simplicity favors regularity”—are revisited here
in a practical context. Importantly, the design techniques introduced for this MIPS subset are applicable
across a wide range of processors—from high-performance servers to embedded systems.
Despite their differences, all instructions follow a similar initial sequence of execution:
Fetch: The process of reading the next instruction from memory, using the address stored in the Program Counter (PC).
Figure 4.1
Arithmetic-logical instructions: Operands are processed by the ALU, and the result is written back to the register file.
Memory-reference instructions (load/store): The ALU computes a memory address. For loads,
data is read from data memory and written to the register file; for stores, data from registers is written to memory.
Branch instructions: The ALU compares operands to determine the next instruction address, which is either:
* The branch destination (PC + branch offset, computed by a separate adder).
* The next instruction (PC + 4, computed by an adder).
Multiplexors and Control
Figure 4.2: Refines Figure 4.1 by adding multiplexors and control lines to manage data source selection
and unit operations for the MIPS subset.
01 Top Mux: PC Source
02 Middle Mux: Register Write Source
03 Bottom Mux: ALU Input
04 Control
Figure 4.2 (elements shown: PC, instruction memory, registers, ALU, data memory, two adders, three multiplexors, and the control unit; control lines: Branch, ALU operation, MemWrite, MemRead, RegWrite; ALU output: Zero)
Multiplexors and Control (Cont.)
Key Insight
Top Mux (PC Source) The control unit and multiplexors
Clocking Methodology
Combinational logic operates between state elements within a single clock cycle:
• Inputs come from a set of state elements (values written in a previous clock cycle).
• Outputs are written to another set of state elements (for use in the next clock cycle).
Signals must propagate from one state element (state element 1), through combinational logic,
to another state element (state element 2) within one clock cycle.
The clock cycle length is determined by the time required for signals to travel from state element 1 to state element 2.
Clocking Methodology (Cont.)
Timing Requirements
Inputs to a state element must reach a stable value (unchanging until
after the clock edge) before the active (rising) clock edge triggers the state update.
This ensures reliable operation in a synchronous digital system,
where the clock dictates when state elements write values to internal storage.
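As a rough illustration of the timing requirement above, the minimum clock period can be estimated from the slowest path between two state elements. This is only a sketch: the delay values, the `clk_to_q_ps` and `setup_ps` overheads, and the function name `min_clock_period` are assumptions made for the example, not figures from the chapter.

```python
def min_clock_period(path_delays_ps, clk_to_q_ps=0, setup_ps=0):
    """Return the shortest clock period (in ps) that lets every signal settle.

    path_delays_ps: combinational delays between pairs of state elements.
    clk_to_q_ps, setup_ps: hypothetical overheads of the state elements.
    """
    if not path_delays_ps:
        raise ValueError("at least one path delay is required")
    # The slowest path sets the cycle time for the whole design.
    return clk_to_q_ps + max(path_delays_ps) + setup_ps


# Hypothetical delays for three paths between state elements.
delays = [200, 350, 600]
period = min_clock_period(delays, clk_to_q_ps=30, setup_ps=20)
print(period)  # 650
```

Any path shorter than the slowest one simply finishes early and waits for the next rising edge.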
Diagram: State element 1 -> Combinational logic -> State element 2
Depicts combinational logic sandwiched between two state elements:
• State element 1 provides inputs to combinational logic.
• Combinational logic processes the inputs and sends outputs to state element 2.
• The clock synchronizes the write operation to state element 2 on the rising edge.
Diagram: State element -> Combinational logic -> back to the same state element
• Illustrates how edge-triggered clocking enables safe read-process-write operations within one clock cycle.
• Shows a state element feeding combinational logic, with the output written back to the state element on
the rising clock edge, avoiding feedback issues.
• Emphasizes that the clock cycle must allow enough time for stable inputs.
Building a Datapath
Begins by identifying the major components required for each MIPS instruction class
(e.g., R-format, memory-reference, branch).
Datapath Element: A processor unit that operates on or holds data, including instruction/data memories,
register file, ALU, and adders.
Figure 4.6 (elements shown: PC, instruction memory with a Read address input and an Instruction output, and an adder with a constant 4)
Forms the initial datapath for fetching instructions and updating the PC.
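The fetch portion can be sketched in a few lines of Python. The class name `InstructionFetch`, the dictionary used as instruction memory, and the two placeholder instruction words are assumptions for illustration only.

```python
class InstructionFetch:
    """Toy model of the fetch datapath: PC, instruction memory, and an adder."""

    def __init__(self, instruction_memory, pc=0):
        # instruction_memory maps byte addresses to 32-bit instruction words.
        self.instruction_memory = instruction_memory
        self.pc = pc

    def step(self):
        """Read the instruction at PC, then advance PC by 4 (one word)."""
        instruction = self.instruction_memory[self.pc]
        self.pc = (self.pc + 4) & 0xFFFFFFFF  # the adder computes PC + 4
        return instruction


# Two arbitrary placeholder words standing in for encoded instructions.
memory = {0: 0x012A4020, 4: 0x8D490008}
fetch = InstructionFetch(memory)
first = fetch.step()
second = fetch.step()
print(hex(first), hex(second), fetch.pc)
```

Each call to `step` plays the role of one clock cycle of the fetch hardware: read at the current PC, then write PC + 4 back into the PC.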
R-Format Instructions
Include arithmetic-logical instructions (e.g., add, sub, AND, OR, slt), also called R-type instructions.
Execution steps (e.g., for add $t1, $t2, $t3):
• Read two registers ($t2 and $t3) from the register file.
• Perform an ALU operation (e.g., addition) on the register contents.
• Write the result to a register ($t1).
• These instructions share a common pattern, requiring the register file and ALU.
Register File: A state element containing the processor’s 32 general-purpose registers,
allowing any register to be read or written by specifying its register number.
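The read, operate, write pattern above can be sketched as follows. The names `RegisterFile`, `alu`, and `execute_r_format` are hypothetical, and the register values are made up; the register numbers 9, 10, and 11 are the standard MIPS numbers for $t1, $t2, and $t3.

```python
def to_signed(value):
    """Interpret a 32-bit pattern as a two's-complement integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


class RegisterFile:
    """32 general-purpose registers, addressed by register number."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, number):
        return self.regs[number]

    def write(self, number, value):
        if number != 0:  # in MIPS, register 0 always reads as zero
            self.regs[number] = value & 0xFFFFFFFF


def alu(operation, a, b):
    """Small ALU covering the R-format operations named in the text."""
    if operation == "add":
        return (a + b) & 0xFFFFFFFF
    if operation == "sub":
        return (a - b) & 0xFFFFFFFF
    if operation == "and":
        return a & b
    if operation == "or":
        return a | b
    if operation == "slt":
        return int(to_signed(a) < to_signed(b))
    raise ValueError(f"unsupported operation: {operation}")


def execute_r_format(regs, operation, rd, rs, rt):
    """Read two registers, run the ALU, write the result register."""
    result = alu(operation, regs.read(rs), regs.read(rt))
    regs.write(rd, result)


T1, T2, T3 = 9, 10, 11
regs = RegisterFile()
regs.write(T2, 5)
regs.write(T3, 7)
execute_r_format(regs, "add", T1, T2, T3)  # add $t1, $t2, $t3
print(regs.read(T1))  # 12
```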
Memory-Reference Instructions (Load/Store)
Operation
• Compute a memory address by adding the base register ($t2) to a 16-bit signed offset from the instruction.
• Load (lw): Reads data from memory and writes it to the register file ($t1).
• Store (sw): Reads data from the register file ($t1) and writes it to memory.
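The address calculation and the two memory operations can be sketched as below. The helper names, the dictionary standing in for data memory, and all addresses and values are assumptions for illustration; unwritten memory locations are assumed to read as 0.

```python
def sign_extend_16(value):
    """Sign-extend a 16-bit field to a Python integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def effective_address(base, offset16):
    """ALU step: base register plus sign-extended 16-bit offset."""
    return (base + sign_extend_16(offset16)) & 0xFFFFFFFF


def load_word(regs, data_memory, rt, base_reg, offset16):
    """lw: read data memory and write the value to register rt."""
    address = effective_address(regs[base_reg], offset16)
    regs[rt] = data_memory.get(address, 0)


def store_word(regs, data_memory, rt, base_reg, offset16):
    """sw: read register rt and write its value to data memory."""
    address = effective_address(regs[base_reg], offset16)
    data_memory[address] = regs[rt]


T1, T2 = 9, 10  # standard MIPS register numbers for $t1 and $t2
regs = [0] * 32
regs[T2] = 0x1000
data_memory = {0x0FFC: 42}

load_word(regs, data_memory, T1, T2, 0xFFFC)  # offset -4, address 0x0FFC
regs[T1] += 1
store_word(regs, data_memory, T1, T2, 8)      # offset +8, address 0x1008
print(regs[T1], data_memory[0x1008])  # 43 43
```

The negative offset shows why the 16-bit field must be sign-extended rather than zero-extended before the addition.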
Data Memory Unit (Figure 4.8a)
• Has read (for lw) and write (for sw) control signals.
• Inputs: Address (from ALU) and write data (from register file for sw).
• Output: Read data (to register file for lw).
• Control: Data memory requires explicit read/write control signals,
unlike instruction memory.
Figure 4.8a (labels shown: Address, Write data, Read data, MemWrite, MemRead)
• Register File (from Figure 4.7a): Reads $t2 (base address) for both instructions;
reads $t1 (data to store) for sw, writes $t1 (loaded data) for lw.
• ALU (from Figure 4.7b): Adds the base register and
sign-extended offset to compute the memory address.
Figure 4.8b (sign-extension unit)
Branch Instruction
(e.g., beq $t1, $t2, offset):
Figure 4.9 (labels shown: PC + 4 from instruction datapath; Shift left 2; Add producing Sum as the branch target; Sign-extend, 16 bits to 32 bits; RegWrite)
Instruction fields: opcode (bits 31:26), address (bits 25:0)
Branch Instruction (Cont.)
(e.g., beq $t1, $t2, offset):
Operation
• Compares two registers ($t1, $t2) for equality.
• Computes the branch target address by adding the sign-extended 16-bit offset
(shifted left by 2 bits for word alignment) to the PC + 4 (address of the instruction following the branch).
• If equal (branch taken), the PC is set to the branch target address;
if not equal (branch not taken), the PC is set to PC + 4.
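The branch steps above can be sketched as follows. The function names and the example addresses are hypothetical; the equality test is modeled as a subtraction whose result is checked for zero, matching the Zero output of the ALU.

```python
def sign_extend_16(value):
    """Sign-extend a 16-bit field to a Python integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def branch_target(pc, offset16):
    """Add the sign-extended offset, shifted left by 2, to PC + 4."""
    pc_plus_4 = (pc + 4) & 0xFFFFFFFF
    return (pc_plus_4 + (sign_extend_16(offset16) << 2)) & 0xFFFFFFFF


def next_pc_beq(pc, reg_a, reg_b, offset16):
    """Select the next PC for beq: branch target if equal, else PC + 4."""
    zero = ((reg_a - reg_b) & 0xFFFFFFFF) == 0  # ALU subtract, Zero output
    if zero:
        return branch_target(pc, offset16)
    return (pc + 4) & 0xFFFFFFFF


print(hex(next_pc_beq(0x400000, 5, 5, 3)))  # taken: 0x400010
print(hex(next_pc_beq(0x400000, 5, 6, 3)))  # not taken: 0x400004
```

An offset of 3 means three instructions (12 bytes) past PC + 4, which is why the shift by 2 appears before the addition.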
Jump Instruction
• Operation: Replaces the lower 28 bits of the PC with the instruction’s
26-bit offset shifted left by 2 bits (effectively appending 00).
• Datapath: Requires concatenating the upper 4 bits of PC + 4 with the 28-bit shifted offset to form the 32-bit jump address.
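A minimal sketch of the jump address calculation, assuming the standard MIPS rule that the upper 4 bits come from PC + 4. The function name and the example instruction word are hypothetical.

```python
def jump_target(pc, instruction):
    """Build the jump address from PC + 4 and the 26-bit address field."""
    address26 = instruction & 0x03FFFFFF      # instruction bits 25:0
    upper4 = ((pc + 4) & 0xFFFFFFFF) & 0xF0000000  # top 4 bits of PC + 4
    return upper4 | (address26 << 2)          # shifting left appends 00


# Example word: opcode 000010 (bits 31:26) with address field 0x0100004.
instruction = 0x08100004
print(hex(jump_target(0x00400020, instruction)))  # 0x400010
```

Because the low two bits are always 00 and the top four bits are inherited from the PC, a jump can only reach targets inside the current 256 MB region.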
Creating a Single Datapath
Single-Cycle Datapath Design
• Objective: Create a datapath that executes all instructions (R-format, load/store, branch) in one clock cycle.
• Constraint: No datapath resource can be used more than once per instruction,
requiring duplication of elements needed multiple times (e.g., separate instruction memory and data memory).
• Sharing Elements: Elements used by multiple instruction classes can be shared using
multiplexors with control signals to select the appropriate input.
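Sharing an element through a multiplexor can be sketched as a two-way selection. The helper names are hypothetical, and the select polarity (asserted selects the second input) follows the usual textbook convention for ALUSrc and MemtoReg, which is an assumption here.

```python
def mux2(select, input0, input1):
    """Two-input multiplexor: pass input1 when select is asserted, else input0."""
    return input1 if select else input0


def alu_second_operand(alu_src, read_data_2, sign_extended_offset):
    """ALU input mux: register value for R-format, offset for load/store."""
    return mux2(alu_src, read_data_2, sign_extended_offset)


def register_write_data(mem_to_reg, alu_result, memory_read_data):
    """Register write mux: ALU result for R-format, memory data for loads."""
    return mux2(mem_to_reg, alu_result, memory_read_data)


# R-format: the ALU takes its second operand from the register file.
print(alu_second_operand(False, read_data_2=7, sign_extended_offset=100))  # 7
# Load: the ALU takes the sign-extended offset instead.
print(alu_second_operand(True, read_data_2=7, sign_extended_offset=100))   # 100
```

One ALU therefore serves both instruction classes; only the control signal differs.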
01 Instruction Fetch
(from Figure 4.6)
Includes instruction memory, program counter (PC),
and an adder to compute PC + 4 (next instruction address).
03 Branch Instructions
(from Figure 4.9)
• Uses the register file (to read two registers), ALU (for comparison), sign-extension unit,
shift-left-2 logic, and an adder (to compute branch target address).
• The main ALU is reused for branch comparison (checking equality),
so the branch target adder (from Figure 4.9) is retained separately.
Creating a Single Datapath (Cont.)
Figure 4.11 (elements shown: PC, instruction memory, register file with Read register 1, Read register 2, Write register, Write data, Read data 1, and Read data 2, sign-extend from 16 to 32 bits, shift left 2, ALU with Zero and ALU result outputs, two adders, data memory with Address, Write data, and Read data, and three multiplexors; control signals: PCSrc, ALUSrc, ALU operation (4 bits), MemWrite, MemRead, MemtoReg, RegWrite)
Creating a Single Datapath (Cont.)
Integration (Figure 4.11)
• Combines components from Figures 4.6, 4.9, and 4.10 into a single datapath for the core MIPS architecture.
• Supports load/store, R-format (ALU operations), and branch instructions in one clock cycle.
Adds one additional multiplexor to select between:
• PC + 4 (sequential instruction address).
• Branch target address (for taken branches).
Control Unit Requirements
The control unit generates:
• Write signals for state elements (e.g., register file, data memory, PC).
• Selector controls for multiplexors (e.g., for PC source, ALU input, register write source).
• ALU control signals (to specify operations like add, subtract, or compare).
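One way to picture these requirements is a lookup from instruction class to control signal values. This is a sketch under assumptions: the `ControlSignals` type, the table layout, and the use of `None` for don't-care values are hypothetical, and the settings follow the usual single-cycle MIPS convention rather than anything stated in this section.

```python
from typing import NamedTuple, Optional


class ControlSignals(NamedTuple):
    reg_write: bool              # write enable for the register file
    alu_src: bool                # ALU input mux: True selects the offset
    mem_read: bool               # read enable for data memory
    mem_write: bool              # write enable for data memory
    mem_to_reg: Optional[bool]   # register write source; None is don't-care
    branch: bool                 # instruction is a branch


CONTROL_TABLE = {
    "r_format": ControlSignals(True, False, False, False, False, False),
    "lw": ControlSignals(True, True, True, False, True, False),
    "sw": ControlSignals(False, True, False, True, None, False),
    "beq": ControlSignals(False, False, False, False, None, True),
}


def pc_source(signals, alu_zero):
    """PC source mux select: take the branch target only for a taken branch."""
    return signals.branch and alu_zero


for name, signals in CONTROL_TABLE.items():
    print(name, signals)
print(pc_source(CONTROL_TABLE["beq"], alu_zero=True))  # True
```

The PC itself needs no entry in the table because it is written on every clock cycle.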