4.5 An Overview of Pipelining
Pipelining Analogy
Four loads: Speedup = 8/3.5 = 2.3
Non-stop: Speedup = 2n/(0.5n + 1.5) ≈ 4 = number of stages
32 CS/ECE 3330 – Fall 2009
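The speedup figures for the laundry analogy can be checked with a few lines of arithmetic. This is a minimal sketch that assumes four stages of 0.5 hours each (the values implied by 8/3.5); the function and parameter names are illustrative only.

```python
def laundry_speedup(n_loads, stage_hours=0.5, n_stages=4):
    """Speedup of pipelined laundry over doing each load start to finish."""
    # Sequential: every load passes through all stages before the next starts.
    sequential = n_loads * n_stages * stage_hours
    # Pipelined: fill the pipeline once, then finish one load per stage time.
    pipelined = (n_stages + n_loads - 1) * stage_hours
    return sequential / pipelined


print(round(laundry_speedup(4), 1))       # 2.3 (8 hours vs 3.5 hours)
print(round(laundry_speedup(10_000), 2))  # 4.0 (approaches the number of stages)
```

As the number of loads grows, the fill time is amortized and the speedup approaches the stage count.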
MIPS Pipeline
Instr 1   IF  ID  EX  MEM WB
Instr 2       IF  ID  EX  MEM WB
Instr 3           IF  ID  EX  MEM WB
Time →
https://www.coursehero.com/file/5657130/cs3330-chap4-pipeline-1/
Pipeline Performance
Instr     Instr fetch  Register read  ALU op  Memory access  Register write  Total time
lw        200 ps       100 ps         200 ps  200 ps         100 ps          800 ps
sw        200 ps       100 ps         200 ps  200 ps                         700 ps
R-format  200 ps       100 ps         200 ps                 100 ps          600 ps
beq       200 ps       100 ps         200 ps                                 500 ps
Pipeline Performance
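The per-instruction latencies listed above determine the clock period of each design. The sketch below assumes the five values per row map to instruction fetch, register read, ALU, memory access, and register write (0 where an instruction skips a stage); that column assignment and all names are assumptions for illustration.

```python
# Stage latencies in ps: (fetch, register read, ALU, memory access, register write).
LATENCY_PS = {
    "lw":       (200, 100, 200, 200, 100),
    "sw":       (200, 100, 200, 200, 0),
    "R-format": (200, 100, 200, 0, 100),
    "beq":      (200, 100, 200, 0, 0),
}


def single_cycle_clock_ps(table):
    # The clock must fit the slowest whole instruction.
    return max(sum(stages) for stages in table.values())


def pipelined_clock_ps(table):
    # The clock must fit the slowest single stage.
    return max(max(stages) for stages in table.values())


def total_time_ps(n_instr, clock_ps, n_stages=1):
    # n_stages=1 models the single-cycle design; 5 models the pipeline fill.
    return (n_stages + n_instr - 1) * clock_ps


print(single_cycle_clock_ps(LATENCY_PS))                       # 800
print(pipelined_clock_ps(LATENCY_PS))                          # 200
print(total_time_ps(3, single_cycle_clock_ps(LATENCY_PS)))     # 2400
print(total_time_ps(3, pipelined_clock_ps(LATENCY_PS), 5))     # 1400
```

For three instructions the gain is well below 4x because the stages are unbalanced and the pipeline still has to fill.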
Pipeline Speedup
Speedup is due to increased throughput; latency (time for each instruction) does not decrease
Load/store addressing
– Can calculate address in 3rd stage, access
memory in 4th stage
Alignment of memory operands
– Memory access takes only one cycle
Hazards
Deciding on control action depends on previous instruction
Structure Hazards
Pipelined datapaths require separate instruction/data memories
Or separate instruction/data caches
Data Hazards
Load-Use Data Hazard
Reorder code to avoid use of load result in the next instruction
C code for A = B + E; C = B + F;
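The effect of reordering on load-use stalls can be counted with a small model. This is a sketch under stated assumptions: the MIPS sequence is one plausible compilation of A = B + E; C = B + F (register choices and offsets are illustrative, not taken from the slide), forwarding is available, and a load followed immediately by a consumer costs exactly one stall cycle.

```python
# Each instruction: (opcode, destination register or None, source registers).
ORIGINAL = [
    ("lw",  "$t1", ("$t0",)),        # load B
    ("lw",  "$t2", ("$t0",)),        # load E
    ("add", "$t3", ("$t1", "$t2")),  # A = B + E (uses $t2 right after its load)
    ("sw",  None,  ("$t3", "$t0")),  # store A
    ("lw",  "$t4", ("$t0",)),        # load F
    ("add", "$t5", ("$t1", "$t4")),  # C = B + F (uses $t4 right after its load)
    ("sw",  None,  ("$t5", "$t0")),  # store C
]

# Same instructions with the third load hoisted above the first add.
REORDERED = [ORIGINAL[0], ORIGINAL[1], ORIGINAL[4],
             ORIGINAL[2], ORIGINAL[3], ORIGINAL[5], ORIGINAL[6]]


def load_use_stalls(program):
    """Count loads whose result is read by the very next instruction."""
    stalls = 0
    for prev, cur in zip(program, program[1:]):
        op, dest, _ = prev
        if op == "lw" and dest in cur[2]:
            stalls += 1
    return stalls


def total_cycles(program, n_stages=5):
    return n_stages + len(program) - 1 + load_use_stalls(program)


print(load_use_stalls(ORIGINAL), total_cycles(ORIGINAL))    # 2 13
print(load_use_stalls(REORDERED), total_cycles(REORDERED))  # 0 11
```

The model treats any immediate consumer of a load as a stall, which is a simplification; it ignores other hazard types.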
Control Hazards
In MIPS pipeline
Need to compare registers and compute target
early in the pipeline
Add hardware to do it in ID stage
Stall on Branch
Wait until branch outcome determined before fetching next instruction
Branch Prediction
Figure: pipeline timing when the prediction is correct and when it is incorrect
More-Realistic Branch Prediction
– When wrong, stall while re-fetching, and update history
Pipeline Summary
Subject to hazards
Hazards
Structure Hazards
Data Hazards
Forwarding (aka Bypassing)
Code Scheduling to Avoid Stalls
Control Hazards
Stall on Branch
slt $t0, $t1, $t2     # always executes
beq END_OF_LOOP
add $t0, $t1, $t2     # control dependent
add $t0, $t0, $t1
END_OF_LOOP:
Branch Prediction
Figure: pipeline timing when the prediction is correct and when it is incorrect
More-Realistic Branch Prediction
MIPS Pipelined Datapath
Figure labels: MEM, WB. Right-to-left flow leads to hazards
Pipeline registers
Pipeline Operation
ID for Load, Store, …
EX for Load
MEM for Load
WB for Load
Wrong register number
Corrected Datapath for Load
EX for Store
MEM for Store
WB for Store
Multi-Cycle Pipeline Diagram
Traditional form
Single-Cycle Pipeline Diagram
Pipelining Demo
http://bellerofonte.dii.unisi.it/WEBMIPS/
4.7 Data Hazards: Forwarding vs. Stalling
Data Hazards in ALU Instructions
Consider this sequence:
sub $2, $1, $3
and $12, $2, $5
or $13, $6, $2
add $14, $2, $2
sw $15, 100($2)
We can resolve hazards with forwarding
How do we detect when to forward?
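One way to see which instructions in the sequence above need forwarding is to look at how far each consumer of $2 sits behind the sub. The sketch below assumes a five-stage pipeline in which the register file is written in the first half of a cycle and read in the second half, so a consumer three or more instructions later reads the correct value without forwarding. The function names are illustrative.

```python
# Each instruction: (opcode, destination register or None, source registers).
SEQUENCE = [
    ("sub", "$2",  ("$1", "$3")),
    ("and", "$12", ("$2", "$5")),
    ("or",  "$13", ("$6", "$2")),
    ("add", "$14", ("$2", "$2")),
    ("sw",  None,  ("$15", "$2")),
]


def forwarding_source(distance):
    """Pipeline register that must supply the value, or None if none is needed."""
    if distance == 1:
        return "EX/MEM"   # producer has just left the EX stage
    if distance == 2:
        return "MEM/WB"   # producer has just left the MEM stage
    return None           # register file already holds the new value


def classify(program, producer_index=0):
    """Map each consumer's index to the forwarding source it needs."""
    dest = program[producer_index][1]
    needs = {}
    for i in range(producer_index + 1, len(program)):
        if dest in program[i][2]:
            needs[i] = forwarding_source(i - producer_index)
    return needs


print(classify(SEQUENCE))
# {1: 'EX/MEM', 2: 'MEM/WB', 3: None, 4: None}
```

The model ignores the case where an intervening instruction overwrites the same register, which does not occur in this sequence.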
Detecting the Need to Forward
Forwarding Paths
a. Without forwarding
Forwarding Conditions
EX hazard
if (EX/MEM.RegWrite and (EX/MEM.RegisterRd ≠ 0)
    and (EX/MEM.RegisterRd = ID/EX.RegisterRs))
  ForwardA = 10
if (EX/MEM.RegWrite and (EX/MEM.RegisterRd ≠ 0)
    and (EX/MEM.RegisterRd = ID/EX.RegisterRt))
  ForwardB = 10
Forwarding Conditions
MEM hazard
if (MEM/WB.RegWrite and (MEM/WB.RegisterRd ≠ 0)
    and (MEM/WB.RegisterRd = ID/EX.RegisterRs))
  ForwardA = 01
if (MEM/WB.RegWrite and (MEM/WB.RegisterRd ≠ 0)
    and (MEM/WB.RegisterRd = ID/EX.RegisterRt))
  ForwardB = 01
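The EX-hazard and MEM-hazard conditions can be combined into one small function to experiment with. This is a sketch, not the datapath's actual logic: the `PipeReg` class and `forwarding_unit` name are made up for illustration, and checking the EX hazard first (so the most recent result wins when both conditions match) is an assumption added here to cover the double data hazard case.

```python
from dataclasses import dataclass


@dataclass
class PipeReg:
    """The two fields of a pipeline register that the forwarding logic reads."""
    reg_write: bool = False
    rd: int = 0


def forwarding_unit(ex_mem, mem_wb, id_ex_rs, id_ex_rt):
    """Return (ForwardA, ForwardB) as 2-bit mux selects."""
    def select(src):
        # EX hazard: the result computed one instruction ago.
        if ex_mem.reg_write and ex_mem.rd != 0 and ex_mem.rd == src:
            return 0b10
        # MEM hazard: the result from two instructions ago.
        if mem_wb.reg_write and mem_wb.rd != 0 and mem_wb.rd == src:
            return 0b01
        return 0b00  # no hazard: use the register file value

    return select(id_ex_rs), select(id_ex_rt)


# sub $2,$1,$3 is in EX/MEM while and $12,$2,$5 is in ID/EX.
print(forwarding_unit(PipeReg(True, 2), PipeReg(), 2, 5))   # (2, 0)
# sub $2,$1,$3 is in MEM/WB while or $13,$6,$2 is in ID/EX.
print(forwarding_unit(PipeReg(), PipeReg(True, 2), 6, 2))   # (0, 1)
```

The check against register 0 matters because $0 is hardwired to zero and must never be forwarded.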
Double Data Hazard
Initially:
$1 = 1
$2 = 2
$3 = 3
$4 = 4
Datapath with Forwarding
Need to stall for one cycle
Load-Use Hazard Detection
How to Stall the Pipeline
Force control values in ID/EX register
to 0
EX, MEM and WB do nop (no-operation)
Prevent update of PC and IF/ID register
Using instruction is decoded again
Following instruction is fetched again
1-cycle stall allows MEM to read data for lw
– Can subsequently forward to EX stage
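The stall steps above can be sketched as a small hazard detection function. The detection condition (a load in EX whose destination matches a source of the instruction in ID) is the usual textbook one and is not spelled out in the slide text, and the signal names `PCWrite`, `IF/IDWrite`, and `ZeroIDEXControls` are assumptions for illustration.

```python
def load_use_hazard(id_ex_mem_read, id_ex_rt, if_id_rs, if_id_rt):
    """True when the instruction in ID needs the result of a load now in EX."""
    return bool(id_ex_mem_read) and id_ex_rt in (if_id_rs, if_id_rt)


def stall_controls(hazard):
    """Control outputs for one cycle, given whether a stall is needed."""
    return {
        "PCWrite": not hazard,        # hold the PC so the next fetch repeats
        "IF/IDWrite": not hazard,     # hold IF/ID so the decode repeats
        "ZeroIDEXControls": hazard,   # inject a nop (bubble) into EX
    }


# lw $2, 20($1) in EX followed by and $4, $2, $5 in ID.
hazard = load_use_hazard(True, 2, 2, 5)
print(hazard)                  # True
print(stall_controls(hazard))
# {'PCWrite': False, 'IF/IDWrite': False, 'ZeroIDEXControls': True}
```

After the one-cycle bubble, the loaded value is available in MEM/WB and can be forwarded to the EX stage.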
Stall inserted here
Stall/Bubble in the Pipeline
Or, more accurately…
The Big Picture
Stalls reduce performance
But are required to get correct results
Compiler can arrange code to avoid hazards and stalls
Requires knowledge of the pipeline structure
Flush these instructions (Set control values to 0)
Reducing Branch Delay
Example: Branch Taken
Really, sll $0, $0, 0
… IF ID EX MEM WB
Data Hazards for Branches
add $4, $5, $6   IF ID EX MEM WB
beq stalled      IF ID
beq stalled      ID
Dynamic Branch Prediction
outer: …
…
inner: …
…
beq …, …, inner
…
beq …, …, outer
2-Bit Predictor
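The nested-loop pattern shows why a second bit helps: the inner-loop branch is taken repeatedly and then falls through once per outer iteration. The sketch below compares a 1-bit predictor with a 2-bit saturating counter, which is one common realization of a 2-bit predictor and may differ in detail from the slide's state diagram. The iteration counts and initial states are arbitrary assumptions.

```python
def one_bit_mispredictions(outcomes, state=True):
    """1-bit predictor: predict whatever the branch did last time."""
    misses = 0
    for taken in outcomes:
        if state != taken:
            misses += 1
        state = taken
    return misses


def two_bit_mispredictions(outcomes, counter=3):
    """2-bit saturating counter: predict taken when the counter is 2 or 3."""
    misses = 0
    for taken in outcomes:
        if (counter >= 2) != taken:
            misses += 1
        counter = min(counter + 1, 3) if taken else max(counter - 1, 0)
    return misses


# Inner-loop branch: taken 3 times, then not taken, for 3 outer iterations.
inner_branch = ([True] * 3 + [False]) * 3

print(one_bit_mispredictions(inner_branch))  # 5
print(two_bit_mispredictions(inner_branch))  # 3
```

The 1-bit scheme mispredicts both the loop exit and the first iteration of the next pass; the 2-bit counter mispredicts only the exit.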
Last Time
Data Hazards
Detection
Classification
Handling
Control Hazards and Branch Prediction
4.9 Exceptions
Exceptions and Interrupts
Interrupt
From an external I/O controller
Dealing with them without sacrificing performance is hard
Sample Exceptions
I/O request
Invoke the operating system from user
program
Arithmetic overflow
Undefined instruction
Hardware malfunction
Handling Exceptions
An Alternate Mechanism
Vectored Interrupts
Handler address determined by the cause
Example:
Undefined opcode: C000 0000
Overflow: C000 0020
…: C000 0040
Instructions either
Deal with the interrupt, or
Jump to real handler
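The example addresses are spaced 0x20 bytes apart, so a handler address can be computed from a cause number. This is a minimal sketch: the base and stride come from the example addresses, while the numbering of causes (0 for undefined opcode, 1 for overflow) is a hypothetical choice made for illustration.

```python
VECTOR_BASE = 0xC0000000
VECTOR_STRIDE = 0x20  # 32 bytes: room for eight 4-byte MIPS instructions


def handler_address(cause_index):
    """Address of the vector slot for a given cause number (hypothetical numbering)."""
    return VECTOR_BASE + cause_index * VECTOR_STRIDE


print(hex(handler_address(0)))  # 0xc0000000
print(hex(handler_address(1)))  # 0xc0000020
print(hex(handler_address(2)))  # 0xc0000040
```

With only eight instructions per slot, the code at each vector either handles the interrupt directly or jumps to a longer handler elsewhere.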
Handler Actions
Exceptions in a Pipeline
Exception Properties
Restartable exceptions
Pipeline can flush the instruction
Handler executes, then returns to the instruction
– Refetched and executed from scratch
Exception Example
Exception on add in
40        sub $11, $2, $4
44        and $12, $2, $5
48        or  $13, $2, $6
4C        add $1, $2, $1
50        slt $15, $6, $7
54        lw  $16, 50($7)
…
Handler
80000180  sw  $25, 1000($0)
80000184  sw  $26, 1004($0)
…
Exception Example
Multiple Exceptions
Imprecise Exceptions
4.10 Parallelism and Advanced Instruction-Level Parallelism
Instruction-Level Parallelism (ILP)
Pipelining: executing multiple instructions in parallel
To increase ILP
• 16 BIPS, peak CPI = 0.25, peak IPC = 4
– But dependencies reduce this in practice
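The peak figures follow from simple arithmetic. The sketch below assumes a 4 GHz clock and a 4-way multiple-issue processor; those two inputs are inferred from the quoted numbers (16 BIPS at IPC 4) rather than stated in the surviving text.

```python
def peak_rates(clock_hz, issue_width):
    """Return (peak IPC, peak CPI, peak billions of instructions per second)."""
    ipc = issue_width               # at best, every issue slot is filled each cycle
    cpi = 1 / ipc
    bips = clock_hz * ipc / 1e9
    return ipc, cpi, bips


print(peak_rates(4e9, 4))  # (4, 0.25, 16.0)
```

These are upper bounds; dependencies between instructions leave issue slots empty in practice.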
Multiple Issue
Speculation
Compiler/Hardware Speculation
Speculation and Exceptions
Scheduling Static Multiple Issue
Two-issue packets
One ALU/branch instruction
One load/store instruction
64-bit aligned
– ALU/branch, then load/store
– Pad an unused instruction with nop
MIPS with Static Dual Issue
Scheduling Example
Loop: lw   $t0, 0($s1)        # $t0=array element
      addu $t0, $t0, $s2      # add scalar in $s2
      sw   $t0, 0($s1)        # store result
      addi $s1, $s1, -4       # decrement pointer
      bne  $s1, $zero, Loop   # branch $s1!=0
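One possible dual-issue schedule for this loop can be written down and checked mechanically. The schedule below is not given in the slide text; it is one valid arrangement assuming a one-cycle load-use delay, and it moves the addi ahead of the sw, so the store offset becomes 4($s1). The slot classification sets and function name are illustrative.

```python
ALU_BRANCH_OPS = {"addu", "addi", "bne"}
MEM_OPS = {"lw", "sw"}

# Each cycle issues one packet: (ALU/branch slot, load/store slot); None is a nop.
SCHEDULE = [
    (None,                    "lw   $t0, 0($s1)"),
    ("addi $s1, $s1, -4",     None),
    ("addu $t0, $t0, $s2",    None),
    ("bne  $s1, $zero, Loop", "sw   $t0, 4($s1)"),
]


def ipc(schedule):
    """Validate slot usage and return instructions per cycle (nops excluded)."""
    issued = 0
    for alu_slot, mem_slot in schedule:
        if alu_slot is not None:
            if alu_slot.split()[0] not in ALU_BRANCH_OPS:
                raise ValueError(f"not an ALU/branch instruction: {alu_slot}")
            issued += 1
        if mem_slot is not None:
            if mem_slot.split()[0] not in MEM_OPS:
                raise ValueError(f"not a load/store instruction: {mem_slot}")
            issued += 1
    return issued / len(schedule)


print(ipc(SCHEDULE))  # 1.25
```

Five instructions in four cycles gives an IPC of 1.25 against a peak of 2, because three of the eight slots are padded with nops.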
Loop Unrolling
Loop Unrolling Example
“Superscalar” processors
CPU decides whether to issue 0, 1, 2, … each
cycle
Avoiding structural and data hazards
Avoids the need for compiler scheduling
Though it may still help
Code semantics ensured by the CPU
Dynamic Pipeline Scheduling
Preserves dependencies
Hold pending operands
Register Renaming
Speculation
Why Do Dynamic Scheduling?
Power Efficiency