
Instruction Pipelining (Zelalem Birhanu, AAiT)

This document provides an overview of instruction pipelining and pipelining hazards. It discusses how pipelining can reduce the clock cycles needed to execute instructions by overlapping the fetch, decode, and execute stages. However, pipelining can introduce hazards like resource hazards when instructions compete for hardware resources and data hazards when instructions depend on results from previous instructions. The document outlines different types of data hazards and approaches to handling hazards like stalling the pipeline or forwarding data between stages.

Uploaded by

tesfu

Lecture 10

Instruction Pipelining

Zelalem Birhanu, AAiT 1


In this lecture:

Pipelining
Pipelining hazards
    Resource hazards
    Data hazards


Review

$$\text{Execution time} = \text{Instruction count} \times \text{CPI} \times \text{Clock period}$$
CPI: average number of clock cycles per instruction

e.g. Suppose a program has 10 instructions, with the following relationship between the number of instructions and the clock cycles required to execute each instruction:

No. of instructions    Clock cycles
4                      1
3                      2
3                      3

The CPI for this program is given by:
$$\text{CPI} = \frac{4 \times 1 + 3 \times 2 + 3 \times 3}{10} = \frac{19}{10} = 1.9$$
(10 instructions with 19 clock cycles)
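The CPI arithmetic above can be checked with a few lines of Python. This is only a sketch: the function name `average_cpi` and the list-of-pairs representation of the instruction mix are assumptions made for illustration, not part of the lecture.

```python
def average_cpi(mix):
    """mix: list of (number of instructions, clock cycles per instruction)."""
    total_instructions = sum(count for count, _ in mix)
    total_cycles = sum(count * cycles for count, cycles in mix)
    return total_cycles / total_instructions

# Instruction mix from the example: 4 one-cycle, 3 two-cycle and
# 3 three-cycle instructions (10 instructions, 19 clock cycles).
mix = [(4, 1), (3, 2), (3, 3)]
print(average_cpi(mix))
```

Changing the mix (for example, replacing the three-cycle instructions with one-cycle ones) shows directly how the CPI falls.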
Review (cont'd)

To reduce execution time:

Reduce the clock period, i.e. increase the clock frequency (improves response time)
Reduce CPI, i.e. execute more instructions in the same number of clock cycles (improves throughput)

One approach to reducing CPI is to overlap the execution of instructions (pipelining)


Pipelining

The instruction cycle has several stages (fetch, decode, execute)
Let instructions execute one after the other
(assume one clock cycle per stage, i.e. 3 clock cycles per instruction)

Clk             1        2        3        4        5        6        7        8        9
Instruction 1   Fetch    Decode   Execute
Instruction 2                              Fetch    Decode   Execute
Instruction 3                                                         Fetch    Decode   Execute

9 clock cycles for 3 instructions, 3n clock cycles for n instructions
Pipelining (cont'd)

Let the instruction stages overlap
While instruction 1 is being decoded, instruction 2 is fetched, and so on

Clk             1        2        3        4        5
Instruction 1   Fetch    Decode   Execute
Instruction 2            Fetch    Decode   Execute
Instruction 3                     Fetch    Decode   Execute

5 clock cycles for 3 instructions (CPI is reduced)
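The two timing diagrams (one instruction after the other, then overlapped) can be reproduced with a short Python sketch. It assumes an ideal pipeline with one clock cycle per stage and no stalls; the function names are hypothetical and chosen only for this illustration.

```python
STAGES = ["Fetch", "Decode", "Execute"]

def sequential_cycles(n, stages=STAGES):
    """Cycles when each instruction finishes before the next one starts."""
    return n * len(stages)

def pipelined_schedule(n, stages=STAGES):
    """Map instruction number -> {clock cycle: stage} for an ideal pipeline."""
    schedule = {}
    for i in range(1, n + 1):
        # Instruction i enters the pipeline in cycle i, one cycle after i-1.
        schedule[i] = {i + s: name for s, name in enumerate(stages)}
    return schedule

def pipelined_cycles(n, stages=STAGES):
    """Cycle in which the last instruction leaves the last stage."""
    return max(max(row) for row in pipelined_schedule(n, stages).values())

def print_diagram(schedule):
    """Print one row per instruction, one column per clock cycle."""
    last = max(max(row) for row in schedule.values())
    for i, row in schedule.items():
        cells = [row.get(c, "").ljust(8) for c in range(1, last + 1)]
        print(f"Instruction {i}  " + "".join(cells).rstrip())

print_diagram(pipelined_schedule(3))
print(sequential_cycles(3), pipelined_cycles(3))  # 9 5
```

For n instructions the sequential count grows as 3n, while the overlapped count grows as n + 2, so the average number of cycles per instruction approaches 1 as n increases.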
Pipelining (cont'd)

Additional hardware is required for a pipelined processor (pipeline registers between the stages)

PC -> Fetch (FI) -> [FI/DI registers] -> Decode (DI) -> [DI/EI registers] -> Execute (EI)


More stages

In practice the three stages may take different amounts of time (clock cycles): execution may take more time than decoding. This reduces the effectiveness of the pipeline

Fetch    Decode   Execute
10ns     10ns     30ns

The instruction currently being decoded has to wait until the previous instruction has been executed

Throughput is limited by the slowest stage


More stages (cont'd)

If we have more stages:
The stages will be of more nearly equal duration
Program execution time is reduced further

e.g. 5-stage pipeline

Fetch Instr.   Decode Instr.   Fetch Operands   Execute Instr.   Write Operand
(FI)           (DI)            (FO)             (EI)             (WO)
10ns           10ns            10ns             10ns             10ns

Operands can be fetched from memory or from registers
The result operand can be written to memory or to registers
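The effect of unbalanced stages can be estimated numerically. The sketch below assumes a clocked pipeline whose clock period equals the duration of the slowest stage, and a hypothetical program of n = 100 instructions; neither the function names nor the value of n come from the lecture.

```python
def unpipelined_time(stage_ns, n):
    """Each instruction passes through every stage before the next starts."""
    return n * sum(stage_ns)

def pipelined_time(stage_ns, n):
    """All stages advance together, so the slowest stage sets the clock."""
    clock_period = max(stage_ns)
    return (len(stage_ns) + n - 1) * clock_period

n = 100  # assumed program length, for illustration only
three_stage = [10, 10, 30]          # Fetch, Decode, Execute
five_stage = [10, 10, 10, 10, 10]   # FI, DI, FO, EI, WO

for name, stages in [("3-stage", three_stage), ("5-stage", five_stage)]:
    t_seq = unpipelined_time(stages, n)
    t_pipe = pipelined_time(stages, n)
    print(f"{name}: {t_seq} ns unpipelined, {t_pipe} ns pipelined, "
          f"speedup {t_seq / t_pipe:.2f}")
```

Under these assumptions both organisations need 5000 ns without pipelining, but the unbalanced 3-stage pipeline takes 3060 ns while the balanced 5-stage pipeline takes 1040 ns, which is the sense in which throughput is limited by the slowest stage.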
5-stage Pipeline

Assume:
All instructions require all five stages
Equal duration for each stage

Time   1    2    3    4    5    6    7
I1     FI   DI   FO   EI   WO
I2          FI   DI   FO   EI   WO
I3               FI   DI   FO   EI   WO

Assuming one clock cycle per stage, 3 instructions would require 7 clock cycles
Pipeline Performance

Assume an instruction goes through k stages and each stage has a duration of $\tau$

Without pipelining, the execution time for n instructions (T) will be:
$$T = k n \tau$$
With pipelining:
$$T_{k,n} = [k + (n - 1)]\tau$$

e.g. For $\tau = 1$, k = 5, n = 10:
$$T = 5 \times 10 = 50$$
$$T_{k,n} = 5 + (10 - 1) = 14$$
Speed up factor of $\frac{50}{14} \approx 3.57$

With pipelining the program is executed 3.57 times faster than without pipelining
Pipeline Performance (cont'd)

Speed up factor (S):
$$S = \frac{T}{T_{k,n}} = \frac{k n}{k + (n - 1)}$$
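The performance formulas can be evaluated directly. The following sketch uses hypothetical function names and simply restates the expressions above (k stages, n instructions, stage duration tau); it then tabulates the speed up factor for growing n to show how it behaves for long programs.

```python
def t_unpipelined(k, n, tau=1):
    """Execution time without pipelining: every instruction takes k stages."""
    return k * n * tau

def t_pipelined(k, n, tau=1):
    """k cycles for the first instruction, then one more per instruction."""
    return (k + (n - 1)) * tau

def speedup(k, n):
    """Ratio of the two times; tau cancels out."""
    return (k * n) / (k + (n - 1))

# Worked example from the slide: tau = 1, k = 5, n = 10.
print(t_unpipelined(5, 10), t_pipelined(5, 10), round(speedup(5, 10), 2))

# The speed up factor approaches k as n grows.
for n in (10, 100, 1000, 10000):
    print(n, round(speedup(5, n), 3))
```

For a fixed number of stages the speed up factor stays below k and gets closer to it as the program becomes longer, because the k - 1 cycles needed to fill the pipeline are spread over more instructions.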


Pipeline Hazards

Some things can go wrong in real pipelined execution
A pipeline hazard occurs when the pipeline, or some portion of the pipeline, must stall (be idle) because conditions do not permit continued execution

Pipeline hazards:
Resource (structural) hazards
Data hazards
Control hazards


Resource Hazards

Occur when two or more instructions that are already in the pipeline need the same resource
e.g. Memory access
Consider a 5-stage pipeline (each stage takes one cycle)

[Figure: CPU and Memory connected by Address, Instructions and Data lines]

Time   1    2    3    4    5    6    7
I1     FI   DI   FO   EI   WO
I2          FI   DI   FO   EI   WO
I3               FI   DI   FO   EI   WO

If the operand of the first instruction is to be fetched from memory at stage 3 (FO), a resource hazard occurs when the processor tries to fetch the third instruction in the same cycle (both operations need to use the same bus)
Resource Hazards (cont'd)

Therefore the fetch instruction stage of the pipeline must stall (be idle) for one cycle (one more clock cycle is required to execute the 3 instructions)

Time   1    2    3     4    5    6    7    8
I1     FI   DI   FO    EI   WO
I2          FI   DI    FO   EI   WO
I3               Idle  FI   DI   FO   EI   WO

(Assume all other operands are in registers)

Another solution for resource hazards is to increase the available resources (e.g. have separate data and instruction memories with separate buses)
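The one-cycle stall can be reproduced with a small scheduling sketch. It assumes a single memory port shared by instruction fetch (FI) and operand fetch (FO), that an operand fetch always wins the port, and that an instruction proceeds without further stalls once it has been fetched. The function names and the boolean-list encoding of the program are hypothetical.

```python
STAGES = ["FI", "DI", "FO", "EI", "WO"]
FO_OFFSET = STAGES.index("FO")  # FO happens two cycles after FI

def fetch_cycles(operand_in_memory):
    """operand_in_memory[i] is True if instruction i reads its operand
    from memory. Returns the cycle in which each instruction does FI."""
    port_busy = set()   # cycles in which the port serves an operand fetch
    cycles = []
    cycle = 1
    for uses_memory in operand_in_memory:
        # Stall the fetch stage while the port is busy with an operand.
        while cycle in port_busy:
            cycle += 1
        cycles.append(cycle)
        if uses_memory:
            port_busy.add(cycle + FO_OFFSET)
        cycle += 1
    return cycles

def total_cycles(operand_in_memory):
    """Cycle in which the last instruction completes WO."""
    return fetch_cycles(operand_in_memory)[-1] + len(STAGES) - 1

# I1 fetches its operand from memory; I2 and I3 use registers only.
print(fetch_cycles([True, False, False]), total_cycles([True, False, False]))
# With separate instruction and data memories there would be no conflict,
# which corresponds to every entry being False in this model.
print(fetch_cycles([False, False, False]), total_cycles([False, False, False]))
```

In the first case I3 is fetched in cycle 4 instead of cycle 3 and the program needs 8 cycles; in the second case it needs 7.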


Data Hazards

Occur when one instruction depends on a data value produced by a preceding instruction
e.g.
Initial register values: R1 = 0, R2 = 1, R3 = 2

ADD R1,R2    (R1=1)
ADD R3,R1    (R3=3)

Time            1    2    3    4    5    6    7
ADD R1,R2       FI   DI   FO   EI   WO
ADD R3,R1            FI   DI   FO   EI   WO
(next instr.)             FI   DI   FO   EI   WO

ADD R1,R2 reads R1=0 at FO (cycle 3) and writes R1=1 at WO (cycle 5)
ADD R3,R1 reads R1 at FO (cycle 4), before that write: the wrong value of R1 (R1=0) is read
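The wrong result can be demonstrated with a toy simulation of the example. The sketch assumes that `ADD Rx,Ry` computes Rx = Rx + Ry, that both operands are read in FO (two cycles after FI), that the result is written in WO (four cycles after FI), and that the pipeline has no hazard detection at all. All names are hypothetical.

```python
def run_sequential(program, regs):
    """Reference result: each instruction completes before the next starts."""
    regs = dict(regs)
    for dst, src in program:
        regs[dst] = regs[dst] + regs[src]
    return regs

def run_pipelined_no_interlock(program, regs):
    """5-stage pipeline (FI DI FO EI WO) with no stalls and no forwarding."""
    regs = dict(regs)
    latched = {}                    # operand values captured at FO
    last_cycle = len(program) + 4   # WO of the last instruction
    for cycle in range(1, last_cycle + 1):
        # Instruction i (0-based) does FI in cycle i + 1.
        for i, (dst, src) in enumerate(program):
            if cycle == i + 3:      # FO: read both operands
                latched[i] = (regs[dst], regs[src])
        for i, (dst, src) in enumerate(program):
            if cycle == i + 5:      # WO: write the result
                a, b = latched[i]
                regs[dst] = a + b
    return regs

program = [("R1", "R2"), ("R3", "R1")]   # ADD R1,R2 ; ADD R3,R1
initial = {"R1": 0, "R2": 1, "R3": 2}

print(run_sequential(program, initial))              # R3 ends up as 3
print(run_pipelined_no_interlock(program, initial))  # R3 ends up as 2
```

The second instruction latches R1 = 0 in cycle 4, one cycle before the first instruction writes R1 = 1, so it computes R3 = 2 + 0 instead of 2 + 1.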


Data Hazards (cont'd)

Such a hazard is termed a read after write (RAW) hazard, since the current instruction must wait to read the data until after a previous instruction has written the correct data

The hazard occurs if the read takes place before the write operation is complete

Other types of data hazards:
Write after read (WAR)
Write after write (WAW)

Approaches for handling data hazards:
Avoid hazard
Detect and stall
Detect and forward


Data Hazards (cont'd)

Write after Read (WAR) hazard
The hazard occurs if a write takes place before a read operation is complete
The next instruction modifies (writes) an operand before the current instruction uses (reads) it, so the current instruction reads the wrong value
e.g. Add R4,R1,R3 (R4=R1+R3)
     Add R3,R1,R2 (R3=R1+R2)   <- if this write to R3 happens first, a WAR hazard occurs

Write after Write (WAW) hazard
The next instruction modifies (writes) an operand before the current instruction modifies (writes) it, so the operand is left holding the older, wrong value

These hazards occur with multiple pipelines (superscalar processors)
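The three kinds of data hazard can be told apart by comparing the registers each instruction reads and writes. The sketch below is a simplified dependence check, not a model of any particular processor: it reports which hazards could arise between two instructions if their reads and writes were reordered. The tuple encoding (destination, sources) and the function name are assumptions.

```python
def classify(first, second):
    """first precedes second in program order; each is (dest, sources)."""
    dest1, sources1 = first
    dest2, sources2 = second
    found = set()
    if dest1 in sources2:
        found.add("RAW")   # second reads what first writes
    if dest2 in sources1:
        found.add("WAR")   # second writes what first reads
    if dest1 == dest2:
        found.add("WAW")   # both write the same register
    return found

# ADD R1,R2 ; ADD R3,R1 (two-operand form: the destination is also a source)
add_r1_r2 = ("R1", ("R1", "R2"))
add_r3_r1 = ("R3", ("R3", "R1"))
print(classify(add_r1_r2, add_r3_r1))  # {'RAW'}

# Add R4,R1,R3 ; Add R3,R1,R2
add_r4 = ("R4", ("R1", "R3"))
add_r3 = ("R3", ("R1", "R2"))
print(classify(add_r4, add_r3))        # {'WAR'}
```

A pair such as `("R1", ("R2", "R3"))` followed by `("R1", ("R4", "R5"))` would be reported as WAW, since both instructions write R1.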


Data Hazards (cont'd)

Avoid hazard
Make sure there are no hazards in the code
Put no-operation instructions between dependent instructions (done by the programmer or the compiler)

ADD R1,R2
NOP          (no operation)
ADD R3,R1

Detect and stall (wait until the write operation is over)

Time            1    2    3    4     5     6    7
ADD R1,R2       FI   DI   FO   EI    WO
ADD R3,R1            FI   DI   idle  idle  FO   EI
(next instr.)                        FI    DI   FO

ADD R1,R2 reads R1=0 at FO (cycle 3) and writes R1=1 at WO (cycle 5)
ADD R3,R1 reads the correct value R1=1 at FO (cycle 6)
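The number of no-operation slots (or idle cycles) needed can be worked out from the stage positions. The sketch below assumes the 5-stage pipeline used above, with operands read in FO (stage 3) and results written in WO (stage 5). By default it also assumes, as in the stall diagram, that a read must happen in a later cycle than the write; the `same_cycle_ok` flag is a hypothetical switch for designs in which a register can be written and then read within one cycle. All names are illustrative.

```python
FO_STAGE, WO_STAGE = 3, 5   # positions in FI, DI, FO, EI, WO

def nops_needed(distance, same_cycle_ok=False):
    """distance: instruction slots between producer and consumer
    (1 means the consumer immediately follows the producer)."""
    gap = WO_STAGE - FO_STAGE
    required = gap if same_cycle_ok else gap + 1
    return max(0, required - distance)

def insert_nops(program, same_cycle_ok=False):
    """program: list of (dest, sources). Returns the instruction stream
    with "NOP" entries inserted before each dependent instruction."""
    out = []
    last_writer = {}   # register -> slot in `out` of its latest writer
    for dest, sources in program:
        pad = 0
        for reg in sources:
            if reg in last_writer:
                distance = len(out) - last_writer[reg]
                pad = max(pad, nops_needed(distance, same_cycle_ok))
        out.extend(["NOP"] * pad)
        last_writer[dest] = len(out)
        out.append((dest, sources))
    return out

program = [("R1", ("R1", "R2")), ("R3", ("R3", "R1"))]  # ADD R1,R2 ; ADD R3,R1
print(insert_nops(program))
print(insert_nops(program, same_cycle_ok=True))
```

Under the default assumption two NOPs are inserted, which matches the two idle cycles in the stall diagram; with `same_cycle_ok=True` a single NOP is enough, because the consumer's FO then coincides with the producer's WO.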
More Readings

1. William Stallings, Computer Organization and Architecture, 8th edition (section 12.4)
