
05 Pipelining

The document discusses pipelining in computer architecture, focusing on its impact on performance, specifically through the MIPS five-stage pipeline. It outlines different types of pipeline hazards, including structural, data, and control hazards, and explains their implications on instruction execution. Additionally, it covers techniques for managing these hazards to maintain efficient processing within the pipeline stages.


PIPELINING: HAZARDS

Mahdi Nazm Bojnordi


Assistant Professor
School of Computing
University of Utah

CS/ECE 6810: Computer Architecture


Overview
¨ Announcement
¤ Homework 1 submission deadline: Jan. 30th

¨ This lecture
¤ Impacts of pipelining on performance
¤ The MIPS five-stage pipeline

¤ Pipeline hazards
n Structural hazards
n Data hazards
Pipelining Technique
¨ Improving throughput at the expense of latency
¤ Delay: D = T + nδ
¤ Throughput: IPS = n/(T + nδ)

Example (values consistent with T = 30, δ = 1):
1 stage: Combinational Logic, critical path delay = 30: D = 31, IPS = 1/31
2 stages: Combinational Logic ×2, critical path delay = 15 each: D = 32, IPS = 2/32
3 stages: Comb. Logic ×3, delay = 10 each: D = 33, IPS = 3/33
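The delay and throughput formulas can be checked numerically. The sketch below assumes T is the total combinational logic delay and δ is the per-stage register overhead; the slides do not define the symbols, but the example values D = 31, 32, 33 are consistent with T = 30 and δ = 1. The function names are made up for this illustration.

```python
def pipeline_delay(T, n, delta):
    """Latency of one instruction through n stages: D = T + n*delta."""
    return T + n * delta


def pipeline_throughput(T, n, delta):
    """Throughput per the slide's formula: IPS = n / (T + n*delta)."""
    return n / (T + n * delta)


# Assumed values, inferred from the slide's example numbers.
T, DELTA = 30, 1
for n in (1, 2, 3):
    d = pipeline_delay(T, n, DELTA)
    ips = pipeline_throughput(T, n, DELTA)
    print(f"n={n}: D={d}, IPS={n}/{d} = {ips:.4f}")
```

Each added stage costs one extra δ of latency while throughput rises from 1/31 to 3/33.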
Pipelining Latency vs. Throughput
¨ Theoretical delay and throughput models for perfect pipelining

[Figure: Delay (D) and Throughput (IPS) plotted as Relative Performance (0 to 20) against Number of Pipeline Stages (0 to 200)]
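The limiting behavior of the two models follows directly from the formulas D = T + nδ and IPS = n/(T + nδ). This is a short derivation under that model only, not a reproduction of the plotted curves:

```latex
\begin{align*}
D(n) &= T + n\delta
  && \text{(grows linearly with } n\text{)} \\
\mathrm{IPS}(n) &= \frac{n}{T + n\delta} = \frac{1}{T/n + \delta} \\
\lim_{n \to \infty} \mathrm{IPS}(n) &= \frac{1}{\delta}
  && \text{(throughput saturates)}
\end{align*}
```

So, under this model, adding stages keeps increasing latency while throughput approaches a ceiling set by the per-stage overhead δ.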
Five Stage MIPS Pipeline
Simple Five Stage Pipeline
¨ A pipelined load-store architecture that processes up to one instruction per cycle

[Figure: five-stage datapath with PC, Inst. Memory, Register File, ALU, and Data Memory across the stages Inst. Fetch, Inst. Decode, Execute, Memory, and Write Back]


Instruction Fetch
¨ Read an instruction from memory (I-Cache)
¤ Use the program counter (PC) to index into the I-Memory
¤ Compute NPC by incrementing current PC
n What about branches?

¨ Update pipeline registers
¤ Write the instruction into the pipeline registers
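The fetch steps above can be sketched in a few lines. This is an illustration, not the lecture's implementation: `imem` is a hypothetical dict from byte address to instruction word, and `branch_taken` / `branch_target` stand in for signals that would come from a later stage.

```python
WORD_BYTES = 4  # each MIPS instruction is 32 bits = 4 bytes


def fetch(pc, imem, branch_taken=False, branch_target=None):
    """Return (IF/ID pipeline register contents, next PC).

    imem is a hypothetical dict mapping byte addresses to instruction words.
    """
    instruction = imem[pc]
    npc = pc + WORD_BYTES
    # Next-PC mux: a taken branch overrides the sequential NPC.
    next_pc = branch_target if branch_taken else npc
    if_id = {"instruction": instruction, "npc": npc}
    return if_id, next_pc


imem = {0: "lw", 4: "add", 8: "sub"}  # placeholder instruction words
if_id, pc = fetch(0, imem)
print(if_id, pc)
```

The PC advances by 4 because MIPS instructions are 4 bytes wide and memory is byte-addressed.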
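Instruction Fetch

[Figure: fetch datapath. The clocked PC indexes the Instruction Memory (label P1) and feeds the adder computing NPC = PC + 4 (label P2); the Branch Target (label P3) is the alternative next-PC input. The fetched instruction and NPC go into the Pipeline Register. Side question: why increment by 4?]

Critical Path = Max{P1, P2, P3}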
Instruction Decode
¨ Generate control signals from the opcode bits

¨ Read source operands from the register file (RF)
¤ Use the register specifiers for indexing RF
n How many read ports are required?

¨ Update pipeline registers
¤ Send the operand and immediate values to next stage
¤ Pass control signals and NPC to next stage
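The decode step can be illustrated with the standard 32-bit MIPS field layout (opcode in bits 31:26, rs in 25:21, rt in 20:16, rd in 15:11, funct in 5:0, immediate in 15:0). The function names and the dict-based pipeline register are assumptions made for this sketch.

```python
def decode_fields(word):
    """Split a 32-bit MIPS instruction word into its standard fields."""
    imm16 = word & 0xFFFF
    return {
        "opcode": (word >> 26) & 0x3F,
        "rs": (word >> 21) & 0x1F,   # first source register specifier
        "rt": (word >> 16) & 0x1F,   # second source (or I-type destination)
        "rd": (word >> 11) & 0x1F,   # R-type destination
        "funct": word & 0x3F,
        # Sign-extend the 16-bit immediate.
        "imm": imm16 - 0x10000 if imm16 & 0x8000 else imm16,
    }


def decode(word, regfile):
    """Read both source operands in the same cycle (two read ports)."""
    f = decode_fields(word)
    return {"a": regfile[f["rs"]], "b": regfile[f["rt"]],
            "imm": f["imm"], "rd": f["rd"]}


regs = [0] * 32
regs[1], regs[2] = 7, 5
print(decode(0x00221820, regs))  # encoding of: add $3, $1, $2
```

Because an R-type instruction names two source registers, reading both in one cycle takes two read ports.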
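Instruction Decode

[Figure: decode datapath between two pipeline registers. The Instruction feeds the Register File and the decode logic; labeled signals are NPC, target, two register values (reg), and control signals (ctrl).]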
Execute Stage
¨ Perform ALU operation
¤ Compute the result of ALU
n Operation type: control signals
n First operand: contents of a register
n Second operand: either a register or the immediate value
¤ Compute branch target
n Target = NPC + immediate

¨ Update pipeline registers
¤ Control signals, branch target, ALU results, and destination
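A minimal sketch of the execute stage as listed above. The ALU operation names, the `use_imm` control bit, and the dict layout of the pipeline register are hypothetical; the slides do not define a control encoding. The branch target follows the slide's formula, Target = NPC + immediate.

```python
import operator

# Hypothetical ALU control encoding; the slides do not define one.
ALU_OPS = {"add": operator.add, "sub": operator.sub,
           "and": operator.and_, "or": operator.or_}


def execute(id_ex):
    """Compute the ALU result and the branch target for one instruction."""
    # Second operand: a register value or the immediate, chosen by control.
    b = id_ex["imm"] if id_ex["use_imm"] else id_ex["b"]
    return {
        "result": ALU_OPS[id_ex["alu_op"]](id_ex["a"], b),
        "target": id_ex["npc"] + id_ex["imm"],  # Target = NPC + immediate
        "dest": id_ex["dest"],
    }


ex_mem = execute({"alu_op": "sub", "a": 9, "b": 4, "imm": 16,
                  "use_imm": False, "npc": 104, "dest": 6})
print(ex_mem)
```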
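Execute Stage

[Figure: execute datapath between two pipeline registers. Inputs are NPC, register values (reg), and control signals (ctrl); the ALU produces the result (Res), and the outputs also include the branch Target, a register value (reg), and ctrl.]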
Memory Access
¨ Access data memory
¤ Load/store address: ALU outcome
¤ Control signals determine read or write access

¨ Update pipeline registers
¤ ALU results from execute
¤ Loaded data from D-Memory
¤ Destination register
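The memory stage described above might look like the following sketch. The control names `mem_read` and `mem_write`, the `store_data` field, and the dict-based data memory are assumptions for illustration.

```python
def memory_access(ex_mem, dmem):
    """MEM stage: the ALU result is the address; control picks read or write."""
    addr = ex_mem["result"]
    loaded = None
    if ex_mem["mem_read"]:        # load
        loaded = dmem.get(addr, 0)
    elif ex_mem["mem_write"]:     # store
        dmem[addr] = ex_mem["store_data"]
    # Pass ALU result, loaded data, and destination on to write back.
    return {"result": ex_mem["result"], "loaded": loaded,
            "dest": ex_mem["dest"]}


dmem = {8: 99}
mem_wb = memory_access({"result": 8, "mem_read": True, "mem_write": False,
                        "store_data": None, "dest": 1}, dmem)
print(mem_wb)
```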
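Memory Access

[Figure: memory-access datapath between two pipeline registers. The ALU result (Res) drives the Memory address (addr) and a register value (reg) supplies the data input; labeled outputs are Target, Res, the loaded data (Dat), and control signals (ctrl).]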
Register Write Back
¨ Update register file
¤ Control signals determine if a register write is needed
¤ Only one write port is required
n Write the ALU result to the destination register, or
n Write the loaded data into the register file
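A sketch of the write-back choice described above. The control names `reg_write` and `mem_to_reg` are assumptions modeled on common MIPS datapath naming, not taken from the slides.

```python
def write_back(mem_wb, regfile):
    """WB stage: write at most one value through the single write port."""
    if not mem_wb["reg_write"]:
        return  # e.g. stores and branches write no register
    # Select the loaded data for loads, otherwise the ALU result.
    value = mem_wb["loaded"] if mem_wb["mem_to_reg"] else mem_wb["result"]
    regfile[mem_wb["dest"]] = value


regs = [0] * 32
write_back({"reg_write": True, "mem_to_reg": True,
            "loaded": 99, "result": 8, "dest": 1}, regs)
write_back({"reg_write": True, "mem_to_reg": False,
            "loaded": None, "result": 5, "dest": 6}, regs)
print(regs[1], regs[6])
```

Only one value is written per instruction, which is why a single write port suffices.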
Five Stage Pipeline
¨ Ideal pipeline: IPC=1
¤ Are there enough resources to keep the pipeline stages busy all the time?

[Figure: five-stage pipeline: Inst. Fetch (PC, +4 adder, Mem), Decode (Reg. File), Execute (ALU), Memory (Mem), Writeback (Reg. File)]
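To see why the ideal pipeline approaches IPC = 1, count cycles for a stall-free run. The sketch assumes no hazards at all, so N instructions finish in (stages + N - 1) cycles: the pipeline fills once, then one instruction completes per cycle.

```python
def pipelined_cycles(num_instructions, stages=5):
    """Cycles to finish N instructions with no stalls."""
    return stages + num_instructions - 1


for n in (1, 10, 1000):
    ipc = n / pipelined_cycles(n)
    print(f"{n} instructions: {pipelined_cycles(n)} cycles, IPC = {ipc:.3f}")
```

The fill time is amortized over long instruction streams, so IPC tends toward 1 only when every stage stays busy.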
Pipeline Hazards
Pipeline Hazards
¨ Structural hazards: multiple instructions compete for the same resource

¨ Data hazards: a dependent instruction cannot proceed because it needs a value that hasn’t been produced

¨ Control hazards: the next instruction cannot be fetched because the outcome of an earlier branch is unknown
Structural Hazards
¨ 1. Unified memory for instruction and data

R1 ← Mem[R2]
R3 ← Mem[R20]
R6 ← R4-R5
R7 ← R1+R0

Separate inst. and data memories.
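The conflict in this example can be located mechanically. The sketch assumes an ideal five-stage pipeline in which instruction i is fetched in cycle i and reaches the Memory stage in cycle i + 3, so with a unified memory it collides with the fetch of instruction i + 3. The tuple encoding of the program is hypothetical.

```python
def unified_memory_conflicts(program):
    """Return (memory_instr_index, fetching_instr_index) pairs that collide.

    program is a list of (text, accesses_data_memory) tuples.
    """
    conflicts = []
    for i, (_, uses_mem) in enumerate(program):
        j = i + 3  # instruction being fetched while i is in MEM
        if uses_mem and j < len(program):
            conflicts.append((i, j))
    return conflicts


program = [("R1 <- Mem[R2]", True), ("R3 <- Mem[R20]", True),
           ("R6 <- R4-R5", False), ("R7 <- R1+R0", False)]
print(unified_memory_conflicts(program))
```

Within these four instructions, the first load's memory access coincides with the fetch of the fourth instruction; separate instruction and data memories remove the collision.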


Structural Hazards
¨ 1. Unified memory for instruction and data
¨ 2. Register file with shared read/write access ports

R1 ← Mem[R2]
R3 ← Mem[R20]
R6 ← R4-R5
R7 ← R1+R0

Register access in half cycles.
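Half-cycle register access is commonly realized by writing in the first half of a cycle and reading in the second. The sketch below assumes that ordering, which the slide does not spell out; the class and method names are made up for illustration.

```python
class HalfCycleRegisterFile:
    """Register file that writes in the first half of a cycle and reads in the second."""

    def __init__(self, size=32):
        self.regs = [0] * size

    def cycle(self, write=None, reads=()):
        """write is (reg, value) or None; reads is a tuple of register numbers."""
        if write is not None:          # first half: write-back stage writes
            reg, value = write
            self.regs[reg] = value
        # Second half: decode stage reads, seeing the value just written.
        return tuple(self.regs[r] for r in reads)


rf = HalfCycleRegisterFile()
print(rf.cycle(write=(1, 42), reads=(1, 0)))
```

With this ordering, an instruction in decode can read a register in the same cycle that an older instruction writes it.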


Data Hazards
¨ True dependence: read-after-write (RAW)
¤ Consumer has to wait for producer

Loading data from memory. Loaded data will be available two cycles later.

R1 ← Mem[R2]
R3 ← R1+R0
R4 ← R1-R3

Inserting two bubbles.

R1 ← Mem[R2]
Nothing
Nothing
R3 ← R1+R0
R4 ← R1-R3

Inserting single bubble + RF bypassing.

R1 ← Mem[R2]
Nothing
R3 ← R1+R0
R4 ← R1-R3

Load delay slot.
SW vs. HW management?
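Bubble insertion for a load followed by a dependent instruction can be sketched as a pass over the program. The bubble counts (two, or one with RF bypassing) are taken from the slides; the tuple encoding `(dest, sources, is_load)` and the `NOP` placeholder are assumptions for this sketch, and only the instruction immediately after a load is checked.

```python
NOP = (None, (), False)  # hypothetical bubble: no destination, no sources


def insert_load_bubbles(program, bubbles):
    """Insert `bubbles` NOPs after each load whose result the next instruction reads.

    program is a list of (dest, sources, is_load) tuples.
    """
    out = []
    for i, instr in enumerate(program):
        out.append(instr)
        dest, _, is_load = instr
        nxt = program[i + 1] if i + 1 < len(program) else None
        if is_load and nxt is not None and dest in nxt[1]:
            out.extend([NOP] * bubbles)
    return out


program = [("R1", ("R2",), True),         # R1 <- Mem[R2]
           ("R3", ("R1", "R0"), False),   # R3 <- R1+R0
           ("R4", ("R1", "R3"), False)]   # R4 <- R1-R3
print(len(insert_load_bubbles(program, 2)), len(insert_load_bubbles(program, 1)))
```

Whether a compiler fills the slot (software) or the pipeline interlocks (hardware) is the management question the slide raises; the pass above models only the inserted bubbles.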
Data Hazards
¨ True dependence: read-after-write (RAW)
¤ Consumer has to wait for producer
Using the result of an ALU instruction.

R1 ← R2+R3
R5 ← R1+R0
R3 ← R1+R0
R4 ← R1-R3

Forwarding ALU result.
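Forwarding can be pictured as a selection in front of each ALU input. The sketch below follows the usual textbook scheme (prefer the EX/MEM value, then MEM/WB, else the register-file value) and is not necessarily the exact logic used in the lecture. The dict keys are hypothetical, and register 0 is skipped because it is hardwired to zero in MIPS.

```python
def forward_operand(src_reg, regfile_value, ex_mem, mem_wb):
    """Pick the newest value for src_reg.

    ex_mem and mem_wb are hypothetical dicts with keys
    'reg_write', 'dest', and 'value'.
    """
    for stage in (ex_mem, mem_wb):  # youngest producer has priority
        if stage["reg_write"] and stage["dest"] == src_reg and src_reg != 0:
            return stage["value"]
    return regfile_value


# R1 <- R2+R3 has just left the ALU (value 12); R5 <- R1+R0 needs R1 now.
ex_mem = {"reg_write": True, "dest": 1, "value": 12}
mem_wb = {"reg_write": False, "dest": 0, "value": 0}
print(forward_operand(1, 0, ex_mem, mem_wb))
```

The consumer gets the producer's result straight from a pipeline register instead of waiting for write back.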


Data Hazards
¨ True dependence: read-after-write (RAW)
¨ Anti dependence: write-after-read (WAR)
¤ Write must wait for earlier read

R1 ← R2+R1
R2 ← R8+R9

No WAR hazards in 5-stage pipeline!


Data Hazards
¨ True dependence: read-after-write (RAW)
¨ Anti dependence: write-after-read (WAR)
¨ Output dependence: write-after-write (WAW)
¤ An older write must not overwrite a younger write

R1 ← R2+R3
R1 ← R8+R9

No WAW hazards in 5-stage pipeline!
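A small timing check illustrates why WAR and WAW hazards do not arise in this pipeline. It assumes in-order execution where every instruction reads registers in the decode stage (one cycle after fetch) and writes them in the write-back stage (four cycles after fetch); the function names are made up for this sketch.

```python
ID_STAGE, WB_STAGE = 1, 4  # stage offsets from the fetch cycle (IF = 0)


def read_cycle(fetch_cycle):
    return fetch_cycle + ID_STAGE


def write_cycle(fetch_cycle):
    return fetch_cycle + WB_STAGE


def war_possible(earlier, later):
    """A WAR hazard needs the later write to happen before the earlier read."""
    return write_cycle(later) < read_cycle(earlier)


def waw_possible(earlier, later):
    """A WAW hazard needs the later write to happen before the earlier write."""
    return write_cycle(later) < write_cycle(earlier)


print(any(war_possible(i, j) or waw_possible(i, j)
          for i in range(10) for j in range(i + 1, 20)))
```

Because all reads happen early, all writes happen late, and instructions stay in order, a younger write can never overtake an older read or write.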


Data Hazards
¨ Forwarding with additional hardware
Data Hazards
¨ How to detect and resolve data hazards
¤ Show all of the data hazards in the code below

R1 ← Mem[R2]
R2 ← R1+R0
R1 ← R1-R2
Mem[R3] ← R2

[Figure annotations: arrows between the instructions are labeled WAR, WAW, and RAW]
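The exercise can be approached by comparing every pair of instructions. The sketch below lists all pairwise register dependences for the four-instruction example. It is a naive enumeration: it does not filter out pairs separated by an intervening write, and whether each dependence becomes an actual hazard depends on the pipeline. The `(dest, sources)` encoding is an assumption.

```python
def find_dependences(program):
    """List (kind, register, earlier_index, later_index) for every instruction pair.

    program is a list of (dest, sources) tuples; dest is None when the
    instruction writes no register (e.g. a store).
    """
    deps = []
    for i, (dest_i, srcs_i) in enumerate(program):
        for j in range(i + 1, len(program)):
            dest_j, srcs_j = program[j]
            if dest_i is not None and dest_i in srcs_j:
                deps.append(("RAW", dest_i, i, j))
            if dest_j is not None and dest_j in srcs_i:
                deps.append(("WAR", dest_j, i, j))
            if dest_i is not None and dest_i == dest_j:
                deps.append(("WAW", dest_i, i, j))
    return deps


program = [
    ("R1", ("R2",)),        # R1 <- Mem[R2]
    ("R2", ("R1", "R0")),   # R2 <- R1+R0
    ("R1", ("R1", "R2")),   # R1 <- R1-R2
    (None, ("R3", "R2")),   # Mem[R3] <- R2
]
for dep in find_dependences(program):
    print(dep)
```

For this program the enumeration finds all three kinds of dependence: RAW on R1 and R2, WAR on R2 and R1, and WAW on R1.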
