
Data Path

This chapter discusses the fundamental components of computer architecture, focusing on the MIPS processor's datapath and control unit. It outlines key performance factors such as instruction count, clock cycle time, and clock cycles per instruction, while detailing the construction of a simplified MIPS processor capable of executing basic instructions. The chapter also emphasizes design principles applicable across various processors and provides insights into the operation of the ALU, instruction fetching, and control mechanisms.

Uploaded by

yoosefelbooz

COMPUTER ARCHITECTURE
Datapath
Introduction
The chapter begins by outlining three primary factors that determine computer performance:

01 Instruction Count
The number of instructions executed by the program.

02 Clock Cycle Time
Time for a complete clock cycle.

03 Clock Cycles per Instruction (CPI)
The average number of clock cycles each instruction takes to execute.

While instruction count is influenced by the compiler and instruction set architecture (ISA) (as explained in Chapter 2),
clock cycle time and CPI are determined by the processor's implementation.
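
The three factors combine in the classic performance equation, CPU time = instruction count × CPI × clock cycle time. The sketch below only illustrates that product; the function name and the sample numbers are hypothetical and do not come from the chapter.

```python
def cpu_time_seconds(instruction_count, cpi, clock_cycle_time_s):
    """CPU time = instruction count x CPI x clock cycle time."""
    return instruction_count * cpi * clock_cycle_time_s

# Hypothetical program: 1 million instructions, average CPI of 2,
# and a 1 ns clock cycle (a 1 GHz clock).
example_time = cpu_time_seconds(1_000_000, 2.0, 1e-9)
print(f"{example_time * 1e3:.3f} ms")
```

Halving any one factor while holding the other two fixed halves the CPU time, which is why the chapter treats clock cycle time and CPI as the implementation's levers.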

This chapter focuses on how to build the datapath and control unit for two MIPS processor implementations.
It starts with a high-level abstract overview and then progresses to building a simple processor capable of executing
a subset of MIPS instructions.
The core of the chapter is dedicated to constructing a pipelined MIPS implementation,
and later sections address the design considerations needed for implementing more complex ISAs like x86.
Basic Implementation
The simplified MIPS processor in this chapter includes the following instructions:

01 Memory-Reference: lw (load word), sw (store word)
02 Arithmetic-Logical: add, sub, and, or, slt
03 Control Flow: beq (branch if equal), j (jump)

This subset omits some operations like shift, multiply, divide, and all floating-point instructions,
but is sufficient to illustrate the fundamental principles of datapath construction and control design.

The discussion emphasizes how the instruction set architecture heavily influences implementation decisions,
and how implementation strategies in turn impact clock rate and CPI.

Key design principles from Chapter 1—such as “Simplicity favors regularity”—are revisited here
in a practical context. Importantly, the design techniques introduced for this MIPS subset are applicable
across a wide range of processors—from high-performance servers to embedded systems.
Despite their differences, all instructions follow a similar initial sequence of execution:

Instruction Fetch
The program counter (PC) is used to fetch the instruction from memory.
PC: The register holding the address of the currently executing instruction.

Register Read
One or two registers are read, determined by fields in the instruction.
ALU Usage
The Arithmetic Logic Unit (ALU) is used by all instructions except jump:

Before the ALU
• Memory-reference: ALU calculates memory addresses.
• Arithmetic-logical: ALU performs operations (e.g., add, sub).
• Branch: ALU performs comparison for branching.

After the ALU
• Memory-reference: Access memory (read for lw, write for sw).
• Arithmetic-logical / Load: Write ALU/memory result to a register.
• Branch: Possibly update PC to the branch address; otherwise, increment PC by 4.

Arithmetic Logic Unit (ALU)
Hardware that performs addition, subtraction, and usually logical operations such as AND and OR.
High-Level Datapath
Figure 4.1: Provides a high-level view of the major functional units and their interconnections
for executing MIPS instructions (arithmetic-logical, memory-reference, and branch).

Figure 4.1 (diagram; labeled units: PC, instruction memory, register file, ALU, data memory, and two adders, one with a constant input of 4, plus their interconnections)

Fetch
The process of reading the next instruction from memory, using the address stored in the Program Counter (PC).

Instruction Fetch: The program counter (PC) supplies the instruction address to the instruction memory to fetch the instruction.
Register Access: Instruction fields specify register operands, which are fetched from the register file.
Arithmetic-logical instructions: Operands are processed by the ALU, and the result is written back to the register file.
Memory-reference instructions (load/store): The ALU computes a memory address. For loads,
data is read from data memory and written to the register file; for stores, data from registers is written to memory.
Branch instructions: The ALU compares operands to determine the next instruction address, which is either:
• The branch destination (PC + branch offset, computed by a dedicated adder).
• The next instruction (PC + 4, computed by a second adder).
Multiplexors and Control
Figure 4.2: Refines Figure 4.1 by adding multiplexors and control lines to manage data source selection
and unit operations for the MIPS subset.
Figure 4.2 (diagram; the units of Figure 4.1 with three multiplexors and the control lines Branch, RegWrite, ALU operation, Zero, MemRead, and MemWrite added)
Callouts: 01 Top Mux (PC Source), 02 Middle Mux (Register Write Source), 03 Bottom Mux (ALU Input), 04 Control
Multiplexors and Control (Cont.)
01 Top Mux (PC Source)
Selects between PC + 4 (next instruction) or the branch destination address.
Controlled by the AND of the ALU’s Zero output (indicating equality for beq)
and a control signal for branch instructions.

02 Middle Mux (Register Write Source)
Chooses between the ALU output (for arithmetic-logical instructions)
or data memory output (for loads) to write to the register file.

03 Bottom Mux (ALU Input)
Selects the second ALU input: either a register value (for arithmetic-logical
or branch instructions) or the offset field from the instruction
(for load/store address calculations).

04 Control
• ALU operation (e.g., add, subtract, compare).
• Data memory behavior (read for load, write for store).
• Register file write (enabled for load or arithmetic-logical instructions).

Key Insight
The control unit and multiplexors enable dynamic data routing and unit operation, tailored to the
instruction type, leveraging MIPS’s simplicity for straightforward control signal generation.

Multiplexor
Digital circuit that selects one of several input signals and forwards the selected input
to a single output line, based on the value of control signals.
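
The three multiplexors can be sketched as plain selection functions. This is an illustration only: the function names are hypothetical, and the select polarity (1 picks the second input) is an assumption rather than something this slide states.

```python
def mux2(select, input0, input1):
    """Two-input multiplexor: forwards input1 when select is 1, else input0."""
    return input1 if select else input0

def next_pc(pc, branch_target, branch, zero):
    """Top mux: the select line is the AND of the Branch signal and ALU Zero."""
    pc_src = branch & zero
    return mux2(pc_src, pc + 4, branch_target)

def register_write_data(mem_to_reg, alu_result, memory_data):
    """Middle mux: ALU output, or data memory output for loads."""
    return mux2(mem_to_reg, alu_result, memory_data)

def alu_second_input(alu_src, register_value, offset):
    """Bottom mux: a register value, or the offset field for load/store."""
    return mux2(alu_src, register_value, offset)

print(hex(next_pc(0x100, 0x200, branch=1, zero=1)))  # taken branch
print(hex(next_pc(0x100, 0x200, branch=1, zero=0)))  # not taken: PC + 4
```

A branch instruction whose operands differ (Zero = 0) and a non-branch instruction whose ALU result happens to be zero (Branch = 0) both fall through to PC + 4, which is why the two signals are ANDed.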
Logic Design Conventions
Establishes conventions for discussing computer hardware design,
including how logic operates and how the system is clocked.

Clock
Synchronizes state element updates, ensuring data is written at specific times,
critical for predictable operation in the MIPS implementation.

Combinational Element
An operational component (e.g., AND gate, ALU) whose output depends only on current inputs.
• Produces the same output for the same inputs (no internal storage).

State (Sequential) Element
A memory component (e.g., register, memory) that stores data, with outputs
influenced by both inputs and prior state.
• Contains internal storage, allowing it to retain data across clock cycles.
Clocking Methodology
The approach that determines when data is valid and stable relative to the clock.
Defines when signals can be read and when they can be written
to ensure predictable hardware behavior.
Prevents unpredictability from simultaneous read/write operations,
which could result in reading the old value, new value, or an undefined mix.

Edge-Triggered Clocking
A scheme where state changes occur only on a clock edge.
Assumes an edge-triggered clocking methodology, where state elements
(e.g., registers, memory) update their stored values only on a clock edge
(a rapid transition from low to high or high to low).
In this case, all state elements are positive edge-triggered,
meaning updates occur on the rising clock edge.

Interaction of Logic and Clock

Combinational logic operates between state elements within a single clock cycle:
• Inputs come from a set of state elements (values written in a previous clock cycle).
• Outputs are written to another set of state elements (for use in the next clock cycle).
Signals must propagate from one state element (state element 1), through combinational logic,
to another state element (state element 2) within one clock cycle.
The clock cycle length is determined by the time required for signals to travel from state element 1 to state element 2.
Clocking Methodology (Cont.)
Timing Requirements
Inputs to a state element must reach a stable value (unchanging until
after the clock edge) before the active (rising) clock edge triggers the state update.
This ensures reliable operation in a synchronous digital system,
where the clock dictates when state elements write values to internal storage.

Figure 4.3 (diagram; state element 1, combinational logic, state element 2, spanning one clock cycle)
Depicts combinational logic sandwiched between two state elements:
• State element 1 provides inputs to combinational logic.
• Combinational logic processes the inputs and sends outputs to state element 2.
• The clock synchronizes the write operation to state element 2 on the rising edge.

Figure 4.4 (diagram; a state element feeding combinational logic whose output returns to the same state element)
• Illustrates how edge-triggered clocking enables safe read-process-write operations within one clock cycle.
• Shows a state element feeding combinational logic, with the output written back to the state element on
the rising clock edge, avoiding feedback issues.
• Emphasizes that the clock cycle must allow enough time for stable inputs.
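
The read-process-write loop of Figure 4.4 can be imitated in software. The sketch below is a hypothetical behavioral model (the `Register` class and its `tick` method are illustrative names, not part of the chapter): the stored value changes only on a 0 to 1 clock transition, so the combinational logic always reads the value written in an earlier cycle.

```python
class Register:
    """Positive edge-triggered state element (hypothetical behavioral model)."""

    def __init__(self, value=0):
        self.q = value    # stored value, visible on the output
        self.d = value    # value waiting at the input
        self._clk = 0     # last clock level seen

    def tick(self, clk):
        """Apply a clock level; capture d only on a 0 -> 1 transition."""
        if self._clk == 0 and clk == 1:
            self.q = self.d
        self._clk = clk

counter = Register(0)
trace = []
for clk in [0, 1, 0, 1, 0, 1]:
    # Combinational logic: read the current output and compute the next value.
    counter.d = counter.q + 1
    counter.tick(clk)
    trace.append(counter.q)

print(trace)
```

The stored value advances once per rising edge even though the input is recomputed at every step, which is the property that makes feeding a state element's output back to its own input safe.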
Building a Datapath
Begins by identifying the major components required for each MIPS instruction class
(e.g., R-format, memory-reference, branch).

Instruction Fetch Components

Instruction Memory (Figure 4.5a)
A state element that stores the program’s instructions and provides
read-only access during execution, given an address.
Treated as combinational logic for simplicity (outputs reflect
the addressed location without a read control signal).
Writing to instruction memory (e.g., loading a program) is ignored for simplicity but can be added.
Figure 4.5a (diagram; instruction memory with an instruction address input and an instruction output)

Program Counter (PC) (Figure 4.5b)
A 32-bit register (state element) that holds the address of the current instruction.
Written at the end of every clock cycle to point to the next instruction,
so no explicit write control signal is needed.

Adder (Figure 4.5c)
Adds 4 to the PC to compute the address of the next sequential instruction (since MIPS instructions are 4 bytes).
Instruction Fetch Datapath
Figure 4.6: Combines the instruction memory, PC, and adder to:
• Fetch the instruction from memory using the PC’s address.
• Increment the PC by 4 (via the adder) to prepare for the next instruction.

Datapath Element
A processor unit that operates on or holds data, including instruction/data memories,
register file, ALU, and adders.

Figure 4.6 (diagram; the PC supplies the read address to the instruction memory while an adder computes PC + 4)
Forms the initial datapath for fetching instructions and updating the PC.
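
A minimal sketch of the fetch datapath, assuming the instruction memory is modeled as a Python list of 32-bit words indexed by word number. The `fetch` function and the three sample words are hypothetical placeholders.

```python
def fetch(pc, instruction_memory):
    """Return (instruction, next_pc) for one fetch cycle."""
    instruction = instruction_memory[pc // 4]   # byte address -> word index
    next_pc = (pc + 4) & 0xFFFFFFFF             # the adder; the PC is 32 bits wide
    return instruction, next_pc

# Hypothetical program: three arbitrary 32-bit words standing in for instructions.
program = [0x012A4020, 0x8D490004, 0xAD490008]
pc = 0
fetched = []
for _ in range(len(program)):
    instruction, pc = fetch(pc, program)
    fetched.append(instruction)

print([hex(word) for word in fetched], pc)
```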
R-Format Instructions
Include arithmetic-logical instructions (e.g., add, sub, AND, OR, slt), also called R-type instructions.
Execution steps (e.g., for add $t1, $t2, $t3):
• Read two registers ($t2 and $t3) from the register file.
• Perform an ALU operation (e.g., addition) on the register contents.
• Write the result to a register ($t1).
These instructions share a common pattern, requiring the register file and ALU.

Register File
A state element containing the processor’s 32 general-purpose registers,
allowing any register to be read or written by specifying its register number.

Register File (Figure 4.7a)
Function: Stores the register state of the computer, critical for R-format instructions.
Inputs (total of four):
• Two 5-bit inputs for Read register numbers (to select two registers from 32).
• One 5-bit input for the Write register number.
• One 32-bit input for the Write data.
Outputs (two):
• Two 32-bit outputs for the data read from the two selected registers.
Figure 4.7a (diagram; register file with 5-bit inputs Read register 1, Read register 2, and Write register, a Write data input, outputs Read data 1 and Read data 2, and the RegWrite control signal)
R-Format Instructions (Cont.)
Control (RegWrite):
• Always outputs the contents of the registers specified by the
Read register inputs (no read control signal needed).
• Writes are edge-triggered, controlled by a write control signal
that must be asserted on the clock edge.
• All write inputs (write data, register number, write control)
must be valid at the clock edge.
Edge-Triggered Advantage:
Allows reading and writing the same register in one clock cycle:
• Read retrieves the value from a previous clock cycle.
• Write updates the register for use in a subsequent clock cycle.
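
The register file's ports can be sketched as a small class. This is a hypothetical model, not the chapter's design: `clock_edge` stands in for the rising clock edge, and the hard-wired zero in MIPS register 0 is deliberately not modeled.

```python
class RegisterFile:
    """32 x 32-bit registers, two read ports, one write port (sketch)."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, read_reg1, read_reg2):
        """Reads need no control signal: outputs follow the register numbers."""
        return self.regs[read_reg1], self.regs[read_reg2]

    def clock_edge(self, reg_write, write_reg, write_data):
        """Models the rising clock edge; writes only if RegWrite is asserted."""
        if reg_write:
            self.regs[write_reg] = write_data & 0xFFFFFFFF

rf = RegisterFile()
rf.clock_edge(1, 9, 5)                # earlier cycle: register 9 <- 5
old_a, old_b = rf.read(9, 9)          # this cycle reads the previously written value
rf.clock_edge(1, 9, old_a + old_b)    # same register written at the end of the cycle
new_a, _ = rf.read(9, 9)
rf.clock_edge(0, 9, 123)              # RegWrite deasserted: contents unchanged
print(old_a, new_a, rf.read(9, 9)[0])
```

Reading register 9 and then writing it at the clock edge mirrors an instruction that uses one register as both a source and the destination.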

ALU (Figure 4.7b)
Function: Performs arithmetic or logical operations on two 32-bit inputs
for R-format instructions, producing a 32-bit result.
Figure 4.7b (diagram; ALU with a 4-bit ALU operation control input and an ALU result output)
Inputs (three):
• Two 32-bit data inputs (from the register file).
• A 4-bit control signal (ALU operation) to specify the operation (e.g., add, subtract, AND).
Outputs (two):
• One 32-bit result.
• A 1-bit Zero signal (indicates if the result is zero, used for branch instructions).
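
A behavioral sketch of the ALU interface described above. The 4-bit operation codes used here (0000 AND, 0001 OR, 0010 add, 0110 subtract, 0111 set on less than) are the encoding commonly used with this textbook datapath; this section does not list them, so treat them as an assumption.

```python
MASK32 = 0xFFFFFFFF

def to_signed(value):
    """Interpret a 32-bit pattern as a two's-complement integer."""
    return value - (1 << 32) if value & 0x80000000 else value

def alu(control, a, b):
    """Return (result, zero) for a 4-bit ALU operation code."""
    if control == 0b0000:
        result = a & b
    elif control == 0b0001:
        result = a | b
    elif control == 0b0010:
        result = a + b
    elif control == 0b0110:
        result = a - b
    elif control == 0b0111:
        result = 1 if to_signed(a) < to_signed(b) else 0
    else:
        raise ValueError(f"unsupported ALU control: {control:04b}")
    result &= MASK32                   # keep the result to 32 bits
    return result, int(result == 0)    # Zero is 1 only when the result is 0

print(alu(0b0010, 2, 3))   # add
print(alu(0b0110, 7, 7))   # subtract equal values: Zero is asserted
```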
Load/Store Instructions
(e.g., lw $t1, offset($t2), sw $t1, offset($t2)):

Operation
• Compute a memory address by adding the base register ($t2) to a 16-bit signed offset from the instruction.
• Load (lw): Reads data from memory and writes it to the register file ($t1).
• Store (sw): Reads data from the register file ($t1) and writes it to memory.
Data Memory Unit (Figure 4.8a)
• Has read (for lw) and write (for sw) control signals.
• Inputs: Address (from ALU) and write data (from register file for sw).
• Output: Read data (to register file for lw).
• Control: Data memory requires explicit read/write control signals,
unlike instruction memory.
Figure 4.8a (diagram; data memory with Address and Write data inputs, a Read data output, and the MemWrite and MemRead control signals)

Sign-Extension Unit (Figure 4.8b)
Extends the 16-bit offset to a 32-bit signed value by replicating the sign bit.
Figure 4.8b (diagram; sign-extend unit with a 16-bit input and a 32-bit output)

Datapath Elements (Figure 4.8)
• Register File (from Figure 4.7a): Reads $t2 (base address) for both instructions;
reads $t1 (data to store) for sw, writes $t1 (loaded data) for lw.
• ALU (from Figure 4.7b): Adds the base register and
sign-extended offset to compute the memory address.
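
Sign extension and the load/store address calculation can be sketched as follows. The function names are hypothetical, and values are kept as unsigned 32-bit patterns.

```python
def sign_extend16(value):
    """Extend a 16-bit field to 32 bits by replicating bit 15."""
    value &= 0xFFFF
    if value & 0x8000:
        return value | 0xFFFF0000
    return value

def effective_address(base, offset16):
    """Base register plus sign-extended offset, wrapped to 32 bits."""
    return (base + sign_extend16(offset16)) & 0xFFFFFFFF

print(hex(sign_extend16(0xFFFC)))              # the 16-bit encoding of -4
print(hex(effective_address(0x1000, 0xFFFC)))  # base 0x1000 with offset -4
```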
Branch Instruction
(e.g., beq $t1, $t2, offset):
Figure 4.9 (diagram; the instruction supplies two read register numbers to the register file; the ALU, with its 4-bit ALU operation control, takes Read data 1 and Read data 2 and sends Zero to the branch control logic; the 16-bit offset passes through Sign-extend and Shift left 2 into an adder whose other input is PC + 4 from the instruction datapath, producing the branch target)

Jump instruction format: opcode in bits 31:26, address in bits 25:0.
Branch Instruction (Cont.)
(e.g., beq $t1, $t2, offset):

Operation
• Compares two registers ($t1, $t2) for equality.
• Computes the branch target address by adding the sign-extended 16-bit offset
(shifted left by 2 bits for word alignment) to the PC + 4 (address of the instruction following the branch).
• If equal (branch taken), the PC is set to the branch target address;
if not equal (branch not taken), the PC is set to PC + 4.

Datapath Elements (Figure 4.9)


• Register File (from Figure 4.7a): Reads $t1 and $t2 for comparison (no write needed).
• ALU (from Figure 4.7b): Performs subtraction on the two register values;
the Zero output indicates equality (used for beq).
• Sign-Extension Unit (from Figure 4.8b): Extends the 16-bit offset to 32 bits.
• Adder: Adds the sign-extended, left-shifted offset to PC + 4 to compute the branch target address.
• Shift Logic: Shifts the offset left by 2 bits (multiplying by 4) to align with word boundaries.
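
The branch-target arithmetic above can be sketched as follows; the helper names are hypothetical and all values are treated as 32-bit patterns.

```python
MASK32 = 0xFFFFFFFF

def sign_extend16(value):
    """Extend a 16-bit field to 32 bits by replicating bit 15."""
    value &= 0xFFFF
    if value & 0x8000:
        return value | 0xFFFF0000
    return value

def branch_target(pc, offset16):
    """(PC + 4) + (sign-extended offset shifted left by 2)."""
    shifted = (sign_extend16(offset16) << 2) & MASK32
    return (pc + 4 + shifted) & MASK32

def beq_next_pc(pc, reg_a, reg_b, offset16):
    """The ALU subtracts; Zero = 1 means the registers are equal."""
    zero = int(((reg_a - reg_b) & MASK32) == 0)
    return branch_target(pc, offset16) if zero else (pc + 4) & MASK32

print(hex(beq_next_pc(0x100, 5, 5, 3)))   # equal: taken, 0x104 + 12
print(hex(beq_next_pc(0x100, 5, 6, 3)))   # not equal: falls through to PC + 4
```

Because the offset is counted in words relative to PC + 4, an offset of -1 (0xFFFF) branches back to the branch instruction itself.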

Jump Instruction
• Operation: Replaces the lower 28 bits of the PC with the instruction’s
26-bit offset shifted left by 2 bits (effectively appending 00).
• Datapath: Requires concatenating the upper 4 bits of PC + 4, the 26-bit address field, and two low-order 0 bits.
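
A sketch of the jump-address concatenation. It assumes the upper 4 bits are taken from PC + 4, as in the standard MIPS definition; `jump_target` is a hypothetical helper name.

```python
def jump_target(pc, instruction):
    """Upper 4 bits of PC + 4, then the 26-bit address field, then 00."""
    address26 = instruction & 0x03FFFFFF          # instruction bits 25:0
    upper4 = (pc + 4) & 0xF0000000                # bits 31:28 of PC + 4
    return upper4 | (address26 << 2)              # the shift appends two 0 bits

# j with address field 0x40: opcode 2 in bits 31:26 gives 0x08000040.
print(hex(jump_target(0x00000000, 0x08000040)))
print(hex(jump_target(0x10000008, 0x08000040)))
```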
Creating a Single Datapath
Single-Cycle Datapath Design
• Objective: Create a datapath that executes all instructions (R-format, load/store, branch) in one clock cycle.
• Constraint: No datapath resource can be used more than once per instruction,
requiring duplication of elements needed multiple times (e.g., separate instruction memory and data memory).
• Sharing Elements: Elements used by multiple instruction classes can be shared using
multiplexors with control signals to select the appropriate input.

Combining Datapath Components

01 Instruction Fetch (from Figure 4.6)
Includes instruction memory, program counter (PC),
and an adder to compute PC + 4 (next instruction address).

02 R-Format and Memory Instructions (from Figure 4.10)
Combines the register file, ALU (from Figure 4.7),
data memory, and sign-extension unit (from Figure 4.8).
Uses two multiplexors to:
• Select the ALU’s second input (register or sign-extended offset for load/store).
• Select the register write data (ALU result for R-format or memory data for load).
Creating a Single Datapath (Cont.)
Figure 4.10 (diagram; register file, ALU, data memory, and sign-extend unit joined by two multiplexors, one selected by ALUSrc on the ALU’s second input and one selected by MemtoReg on the register write data, with the control signals RegWrite, ALU operation, MemWrite, and MemRead and the ALU’s Zero output)

03 Branch Instructions
(from Figure 4.9)
• Uses the register file (to read two registers), ALU (for comparison), sign-extension unit,
shift-left-2 logic, and an adder (to compute branch target address).
• The main ALU is reused for branch comparison (checking equality),
so the branch target adder (from Figure 4.9) is retained separately.
Creating a Single Datapath (Cont.)
Figure 4.11 (diagram; the datapath of Figure 4.10 extended with the PC, the instruction memory, the PC + 4 adder, the Shift left 2 unit, the branch target adder, and a third multiplexor selected by PCSrc that chooses the next PC value)
Creating a Single Datapath (Cont.)
Integration (Figure 4.11)
• Combines components from Figures 4.6, 4.9, and 4.10 into a single datapath for the core MIPS architecture.
• Supports load/store, R-format (ALU operations), and branch instructions in one clock cycle.
Adds one additional multiplexor to select between:
• PC + 4 (sequential instruction address).
• Branch target address (for taken branches).
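
To see the combined datapath work end to end, the sketch below executes already-decoded instructions one per call, as a single-cycle design would. It is a simplified, hypothetical model: instructions are dictionaries rather than 32-bit encodings, only add, sub, lw, sw, and beq are handled, and the data memory is a dictionary keyed by byte address.

```python
MASK32 = 0xFFFFFFFF

def step(state, instr):
    """Execute one decoded instruction in a single 'cycle'."""
    regs, mem, pc = state["regs"], state["mem"], state["pc"]
    op = instr["op"]
    a = regs[instr["rs"]]
    b_reg = regs[instr["rt"]]
    pc_plus_4 = (pc + 4) & MASK32

    alu_src = op in ("lw", "sw")       # ALUSrc: offset instead of register
    b = instr["imm"] & MASK32 if alu_src else b_reg
    if op in ("add", "lw", "sw"):
        alu_result = (a + b) & MASK32
    elif op in ("sub", "beq"):
        alu_result = (a - b) & MASK32
    else:
        raise ValueError(f"unsupported op: {op}")
    zero = alu_result == 0

    if op == "lw":
        regs[instr["rt"]] = mem.get(alu_result, 0)   # MemtoReg: memory data
    elif op == "sw":
        mem[alu_result] = b_reg
    else:
        if op != "beq":
            regs[instr["rd"]] = alu_result           # R-format: ALU result

    taken = op == "beq" and zero                     # PCSrc = Branch AND Zero
    target = (pc_plus_4 + (instr.get("imm", 0) << 2)) & MASK32
    state["pc"] = target if taken else pc_plus_4

state = {"pc": 0, "regs": [0] * 32, "mem": {100: 7}}
state["regs"][10] = 96                               # hypothetical base in $t2
program = [
    {"op": "lw", "rs": 10, "rt": 9, "imm": 4},       # $t1 = mem[$t2 + 4]
    {"op": "add", "rs": 9, "rt": 9, "rd": 8},        # $t0 = $t1 + $t1
    {"op": "sw", "rs": 10, "rt": 8, "imm": 8},       # mem[$t2 + 8] = $t0
    {"op": "beq", "rs": 8, "rt": 8, "imm": -4},      # always taken: back to 0
]
for _ in range(4):
    step(state, program[state["pc"] // 4])
print(state["regs"][8], state["regs"][9], state["mem"][104], state["pc"])
```

Each call uses the ALU once and each memory at most once, matching the constraint that no datapath resource is used more than once per instruction.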
Control Unit Requirements
The control unit generates:
• Write signals for state elements (e.g., register file, data memory, PC).
• Selector controls for multiplexors (e.g., for PC source, ALU input, register write source).
• ALU control signals (to specify operations like add, subtract, or compare).
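
One way to picture the control unit's output is as a lookup from instruction class to signal values. The table below uses the signal names from Figures 4.10 and 4.11; the specific values follow the usual single-cycle design and are an assumption here, since this section does not list them. MemtoReg is a don't-care for sw and beq and is shown as 0.

```python
CONTROL = {
    "R-format": dict(RegWrite=1, ALUSrc=0, MemtoReg=0, MemRead=0, MemWrite=0, Branch=0),
    "lw":       dict(RegWrite=1, ALUSrc=1, MemtoReg=1, MemRead=1, MemWrite=0, Branch=0),
    "sw":       dict(RegWrite=0, ALUSrc=1, MemtoReg=0, MemRead=0, MemWrite=1, Branch=0),
    "beq":      dict(RegWrite=0, ALUSrc=0, MemtoReg=0, MemRead=0, MemWrite=0, Branch=1),
}

def control_signals(instruction_class):
    """Look up the control signal values for one instruction class."""
    return CONTROL[instruction_class]

for name, signals in CONTROL.items():
    print(f"{name:9s}", signals)
```

PCSrc does not appear in the table because it is derived: it is the AND of Branch and the ALU's Zero output.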
