Presentation 35191 Content Document 20250423021246PM
Dr Shashikiran V , Associate Professor, Dept of CSE, Dayananda Sagar University and Team
1
2
Why pipelining? How much speedup in the instruction execution rate?
PIPELINE
Pipelining – Basic concepts
Pipelining – Basic concepts contd.
5 stage – pipelined processor
5 stages:
Stage 1: Instruction Fetch (IF):
• In this stage, the processor fetches the instruction from memory using the program counter (PC) and increments the PC to point to the next instruction.
• The fetched instruction is then placed into an instruction register.
Stage 2: Instruction Decode (ID):
• In this stage, the fetched instruction is decoded to determine what operation it specifies and what operands are involved.
• Any necessary register values are also read in this stage.
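As a rough illustration of what decoding involves, the sketch below splits a 32-bit MIPS R-format instruction word into its fields. The function name `decode_r_type` and the sample word are illustrative assumptions, not taken from the slides.

```python
def decode_r_type(word):
    """Split a 32-bit MIPS R-format instruction word into its fields."""
    return {
        "opcode": (word >> 26) & 0x3F,  # bits 31..26
        "rs":     (word >> 21) & 0x1F,  # bits 25..21, first source register
        "rt":     (word >> 16) & 0x1F,  # bits 20..16, second source register
        "rd":     (word >> 11) & 0x1F,  # bits 15..11, destination register
        "shamt":  (word >> 6) & 0x1F,   # bits 10..6, shift amount
        "funct":  word & 0x3F,          # bits 5..0, selects the ALU operation
    }


# Assumed sample: the encoding of "add $1, $2, $3" (ADD R1, R2, R3).
fields = decode_r_type(0x00430820)
print(fields)
```

Once the fields are known, the register numbers in `rs` and `rt` can be used to read the operands, which is why the register read also happens in this stage.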
Stage 3: Execution (EX):
Stage 4: Memory Access (MEM):
Stage 5: Write Back (WB):
Example:
Executing a single instruction, "Add R1, R2, R3," in a five-stage pipelined processor.
• IF: Fetches the instruction "Add R1, R2, R3" from memory.
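• ID: Decodes the instruction and reads operands from the register file (R2 and R3).
• EX: The ALU adds the values read from R2 and R3.
• MEM: No memory access is needed for this instruction.
• WB: Writes the result of the addition to register R1.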
Register Files
(Required to understand Data Path)
Signals for handling read and write operations within the register file:
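The slide's figure is not available here, so the following is only a minimal sketch of a register file with two read ports and one write port. The signal names (`read_reg1`, `read_reg2`, `write_reg`, `write_data`, `reg_write`) follow common MIPS textbook usage and are assumptions rather than names confirmed by the slides.

```python
class RegisterFile:
    """32 general-purpose registers with two read ports and one write port."""

    def __init__(self, num_regs=32):
        self.regs = [0] * num_regs

    def read(self, read_reg1, read_reg2):
        # Two read ports: both source registers are read at once.
        return self.regs[read_reg1], self.regs[read_reg2]

    def write(self, reg_write, write_reg, write_data):
        # The write port only takes effect when RegWrite is asserted.
        # Register 0 is hard-wired to zero in MIPS, so writes to it are ignored.
        if reg_write and write_reg != 0:
            self.regs[write_reg] = write_data & 0xFFFFFFFF


rf = RegisterFile()
rf.write(reg_write=True, write_reg=2, write_data=7)
rf.write(reg_write=False, write_reg=3, write_data=9)  # ignored: RegWrite is low
print(rf.read(2, 3))  # (7, 0)
```

The write-enable check is the key point: an instruction that does not produce a register result leaves the register file unchanged.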
Datapath for MIPS processor – Single cycle
How a five-stage pipelined datapath executes a register-to-register ALU instruction (such as ADD, SUB, AND, OR) in a CPU architecture:
1. Fetch (IF):
   1. The Fetch stage retrieves the instruction from memory based on the current value of the program counter (PC).
   2. The fetched instruction is then stored in the instruction register (IR).
2. Decode (ID):
   1. In the Decode stage, the instruction in the IR is decoded.
   2. The control unit interprets the instruction and generates control signals for subsequent stages based on the instruction's operation code (opcode).
   3. The source and destination registers (Rs, Rt, and Rd) are identified during this stage.
3. Execute (EX):
   1. The Execute stage performs the actual ALU operation specified by the instruction.
   2. It takes the values from the source registers (Rs and Rt), performs the ALU operation, and produces the result.
   3. For example, if the instruction is "ADD R1, R2, R3," the ALU would add the values stored in registers R2 and R3.
4. Memory Access (MEM):
   1. For a register-to-register ALU instruction, there is no memory access involved. Therefore, this stage performs no operation, and the ALU result simply passes through to the next stage.
5. Write Back (WB):
   1. In the Write Back stage, the result of the ALU operation is written back to the destination register (Rd).
   2. If the instruction is "ADD R1, R2, R3," the result of the addition operation would be written back to register R1.
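The five steps above can be sketched in a few lines of Python. This is a behavioural illustration only: the names `ALU_OPS` and `run_r_type`, the tuple layout of an instruction, and the register values are all assumptions made for the example.

```python
ALU_OPS = {
    "ADD": lambda a, b: a + b,
    "SUB": lambda a, b: a - b,
    "AND": lambda a, b: a & b,
    "OR":  lambda a, b: a | b,
}


def run_r_type(instruction, regs):
    """Walk one register-to-register ALU instruction through the five stages."""
    ir = instruction              # IF: the fetched instruction is held in the IR
    op, rd, rs, rt = ir           # ID: decode the fields ...
    a, b = regs[rs], regs[rt]     # ... and read the two source registers
    result = ALU_OPS[op](a, b)    # EX: perform the ALU operation
    mem_out = result              # MEM: no memory access, the result passes through
    regs[rd] = mem_out            # WB: write the result to the destination register
    return result


regs = {"R1": 0, "R2": 5, "R3": 7}
run_r_type(("ADD", "R1", "R2", "R3"), regs)
print(regs["R1"])  # 12
```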
Datapath for MIPS processor – Single cycle
Sign extension of the immediate operand ensures consistency in operand sizes and proper execution of instructions.
How a five-stage pipelined datapath executes a register-immediate ALU instruction (such as ADDI, SUBI, ANDI, ORI) in a CPU architecture:
1. Fetch (IF):
   1. The Fetch stage retrieves the instruction from memory based on the current value of the program counter (PC).
   2. The fetched instruction is then stored in the instruction register (IR).
2. Decode (ID):
   1. In the Decode stage, the instruction in the IR is decoded.
   2. The control unit interprets the instruction and generates control signals for subsequent stages based on the instruction's operation code (opcode).
   3. The source register (Rs), the destination register (the Rt field in the MIPS I-type format), and the immediate value (imm) are identified during this stage.
3. Execute (EX):
   1. The Execute stage performs the actual ALU operation specified by the instruction, using the immediate value as the second operand.
   2. It takes the value from the source register (Rs) and the immediate value (imm), performs the ALU operation, and produces the result.
   3. For example, if the instruction is "ADDI R1, R2, 10," the ALU would add the value stored in register R2 to the immediate value 10.
4. Memory Access (MEM):
   1. For a register-immediate ALU instruction, there is no memory access involved. Therefore, this stage performs no operation, and the ALU result simply passes through to the next stage.
5. Write Back (WB):
   1. In the Write Back stage, the result of the ALU operation is written back to the destination register.
   2. If the instruction is "ADDI R1, R2, 10," the result of the addition operation would be written back to register R1.
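A small sketch of the register-immediate case, assuming a 16-bit immediate field (as in MIPS I-type instructions) that is sign-extended before it reaches the ALU. The helper names `sign_extend_16` and `execute_addi` and the register contents are hypothetical.

```python
def sign_extend_16(imm16):
    """Interpret a 16-bit field as a two's-complement signed value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def execute_addi(regs, dest, src, imm16):
    """ADDI dest, src, imm: add the sign-extended immediate to the source register."""
    operand_a = regs[src]                        # ID: read the source register
    operand_b = sign_extend_16(imm16)            # ID: widen the immediate
    result = (operand_a + operand_b) & 0xFFFFFFFF  # EX: 32-bit ALU addition
    regs[dest] = result                          # WB: MEM does nothing for ADDI
    return result


regs = {"R1": 0, "R2": 32}
print(execute_addi(regs, "R1", "R2", 10))      # 42
print(execute_addi(regs, "R1", "R2", 0xFFFF))  # 31, because 0xFFFF encodes -1
```

Without the sign extension, the second call would add 65535 instead of -1, which is why the immediate has to be widened to the ALU's operand size first.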
Datapath for MIPS processor – Single cycle
Datapath for MIPS processor – Single cycle
Harvard style: separate read-only program memory and read/write data memory
How a five-stage pipelined datapath executes load and store instructions:
1. Fetch (IF):
   1. Fetches the instruction from memory using the program counter (PC).
   2. The fetched instruction is stored in the instruction register (IR).
2. Decode (ID):
   1. Decodes the instruction in the IR.
   2. Identifies the operation (load or store), the base register and offset that form the memory address, and the register to be loaded or stored.
   3. Generates control signals based on the instruction's opcode.
3. Execute (EX):
   1. For load and store instructions, this stage is used for calculating the effective address.
   2. If the instruction is a load, the effective address is calculated by adding the offset specified in the instruction to the base address.
   3. If the instruction is a store, the effective address calculation is similar, but it may involve additional steps depending on the addressing mode.
4. Memory Access (MEM):
   1. Performs the memory access operation.
   2. For load instructions, the data is read from memory at the calculated effective address.
   3. For store instructions, the data from the source register is written to memory at the calculated effective address.
5. Write Back (WB):
   1. For load instructions, the data read from memory is written back to the destination register.
   2. For store instructions, there is typically no operation in the Write Back stage, as the operation is already completed during the Memory Access stage.
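The load and store paths above can be sketched as follows, assuming base-plus-offset addressing and a data memory modelled as a Python dictionary. The function names, register names, and addresses are illustrative assumptions.

```python
def load_word(regs, memory, dest, base, offset):
    """Load: EX computes the address, MEM reads, WB writes the register."""
    address = regs[base] + offset   # EX: effective address = base + offset
    data = memory[address]          # MEM: read data memory
    regs[dest] = data               # WB: write the loaded value to the register
    return address


def store_word(regs, memory, src, base, offset):
    """Store: EX computes the address, MEM writes, WB has nothing to do."""
    address = regs[base] + offset   # EX: effective address = base + offset
    memory[address] = regs[src]     # MEM: write the register value to memory
    return address                  # WB: no register is updated


regs = {"R1": 0, "R2": 100, "R3": 55}
memory = {108: 1234}
load_word(regs, memory, "R1", "R2", 8)    # reads address 108
store_word(regs, memory, "R3", "R2", 12)  # writes address 112
print(regs["R1"], memory[112])  # 1234 55
```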
Pipelined Datapath for MIPS processor – Multiple cycle
Pipelining in Multi-cycle MIPS processor
The pipeline can be thought of as a series of data paths shifted in time
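One way to picture "data paths shifted in time" is to print which stage each instruction occupies in each clock cycle. The sketch below assumes an ideal five-stage pipeline with no stalls; `STAGES` and `pipeline_diagram` are names chosen for this example.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def pipeline_diagram(num_instructions):
    """Return one row per instruction; row[c] is the stage occupied in cycle c + 1."""
    total_cycles = num_instructions + len(STAGES) - 1
    rows = []
    for i in range(num_instructions):
        row = [""] * total_cycles
        for s, stage in enumerate(STAGES):
            row[i + s] = stage  # instruction i starts one cycle after instruction i - 1
        rows.append(row)
    return rows


for i, row in enumerate(pipeline_diagram(4), start=1):
    print(f"I{i}: " + " ".join(f"{cell:<3}" for cell in row))
```

Under these assumptions, n instructions on a k-stage pipeline finish in n + k - 1 cycles, compared with n × k cycles if each instruction ran through all stages before the next one started.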
Major hurdles in Pipelining
• Structural hazards arise from resource conflicts when the hardware cannot support all possible combinations of instructions simultaneously in overlapped execution.
• Data hazards arise when an instruction depends on the results of a previous instruction in a way that is exposed by the overlapping of instructions in the pipeline.
• Control hazards arise from the pipelining of branches and other instructions that change the PC.
Pipelining – Structural Hazards
Pipelining – Data Hazards
• Data hazards occur when the pipeline changes the order of read/write accesses to operands so that the order differs from the order seen by sequentially executing instructions on an unpipelined processor.
Consider the example:
1) DADD R1,R2,R3
2) DSUB R4,R1,R5
3) AND R6,R1,R7
4) OR R8,R1,R9
5) XOR R10,R1,R11
• All the instructions after the DADD use the result of the DADD instruction.
• The DADD instruction writes the value of R1 in the WB pipe stage, but the DSUB instruction reads the value during its ID stage. This problem is called a data hazard.
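The dependences in this example can be found mechanically by tracking which instruction last wrote each register. The sketch below is illustrative only; the tuple layout `(opcode, destination, sources)` and the name `raw_dependences` are assumptions.

```python
PROGRAM = [
    ("DADD", "R1", ("R2", "R3")),
    ("DSUB", "R4", ("R1", "R5")),
    ("AND", "R6", ("R1", "R7")),
    ("OR", "R8", ("R1", "R9")),
    ("XOR", "R10", ("R1", "R11")),
]


def raw_dependences(program):
    """Return (producer_index, consumer_index, register) for each read-after-write pair."""
    deps = []
    last_writer = {}  # register name -> index of the most recent instruction writing it
    for i, (_, dest, srcs) in enumerate(program):
        for reg in srcs:
            if reg in last_writer:
                deps.append((last_writer[reg], i, reg))
        last_writer[dest] = i
    return deps


for producer, consumer, reg in raw_dependences(PROGRAM):
    print(f"{PROGRAM[consumer][0]} reads {reg} written by {PROGRAM[producer][0]}")
```

Every later instruction depends on the DADD through R1. Whether a dependence actually becomes a hazard depends on how close the two instructions are in the pipeline.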
Pipelining – Example for Data Hazards
Pipelining with forwarding is the solution to data hazards such as the one in the DADD example.
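To give a feel for what forwarding buys, the sketch below counts stall cycles for a sequence of ALU instructions with and without forwarding. It rests on several assumptions not stated in the slides: a five-stage pipeline, a register file that is written in the first half of a cycle and read in the second half, ALU instructions only (no loads), and forwarding from the end of EX to the start of the next EX. The function name `total_stalls` is hypothetical.

```python
def total_stalls(program, forwarding):
    """Count stall cycles for ALU-only instructions given as (dest, sources) pairs."""
    writer_id_cycle = {}  # register -> cycle in which its latest writer was in ID
    prev_id = 1           # so that the first instruction is decoded in cycle 2
    stalls = 0
    for dest, srcs in program:
        earliest = prev_id + 1  # ID cycle if no stall is needed
        id_cycle = earliest
        for reg in srcs:
            if reg in writer_id_cycle:
                # Without forwarding the reader's ID must line up with the
                # writer's WB (ID + 3). With forwarding the EX result is
                # available to the very next instruction (ID + 1).
                ready = writer_id_cycle[reg] + (1 if forwarding else 3)
                id_cycle = max(id_cycle, ready)
        stalls += id_cycle - earliest
        writer_id_cycle[dest] = id_cycle
        prev_id = id_cycle
    return stalls


program = [
    ("R1", ("R2", "R3")),    # DADD R1,R2,R3
    ("R4", ("R1", "R5")),    # DSUB R4,R1,R5
    ("R6", ("R1", "R7")),    # AND  R6,R1,R7
    ("R8", ("R1", "R9")),    # OR   R8,R1,R9
    ("R10", ("R1", "R11")),  # XOR  R10,R1,R11
]
print(total_stalls(program, forwarding=False))  # 2
print(total_stalls(program, forwarding=True))   # 0
```

Under these assumptions the DSUB must wait two cycles without forwarding, and the instructions behind it are delayed with it. With forwarding, none of them stalls.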
Pipelining – Control Hazards
The instruction after the branch is fetched, but the instruction is ignored, and the fetch is restarted once the branch target is known. It is probably obvious that if the branch is not taken, the second IF for the branch successor is redundant. This is shown below.
Pipelining – Branch Prediction schemes
Branch prediction: predicted-not-taken scheme
• In the simple five-stage pipeline, this predicted-not-taken or predicted-untaken scheme is implemented by continuing to fetch instructions as if the branch were a normal instruction. The pipeline looks as if nothing out of the ordinary is happening. If the branch is taken, however, we need to turn the fetched instruction into a no-op and restart the fetch at the target address. The figure below illustrates this.
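The difference between always refetching after a branch and predicting not taken can be sketched by counting lost cycles. The one-cycle penalty, the scheme labels, and the function name `branch_penalty_cycles` are assumptions made for this example.

```python
def branch_penalty_cycles(branch_outcomes, scheme, penalty=1):
    """Count cycles lost to branches.

    branch_outcomes: list of booleans, True meaning the branch was taken.
    scheme: "stall" refetches after every branch;
            "predict_not_taken" only pays when the branch is taken.
    """
    if scheme == "stall":
        return penalty * len(branch_outcomes)
    if scheme == "predict_not_taken":
        return penalty * sum(branch_outcomes)
    raise ValueError(f"unknown scheme: {scheme}")


outcomes = [True, False, False, True, False]  # hypothetical branch behaviour
print(branch_penalty_cycles(outcomes, "stall"))              # 5
print(branch_penalty_cycles(outcomes, "predict_not_taken"))  # 2
```

Predicting not taken never costs more than always refetching, and it costs nothing for the branches that fall through.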
How Pipelining is implemented : Instruction Fetch
How Pipelining is implemented : Instruction execution
How Pipelining is implemented : Memory access
How Pipelining is implemented : write back
Data path for 5-stage pipelined processor
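What Makes Pipelining Hard to Implement?
Exceptional situations, which change the normal order of instruction execution, are hard to handle in a pipeline. Examples include: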
1) I/O device request
2) Invoking an operating system service from a user program
3) Tracing instruction execution
4) Breakpoint (programmer-requested interrupt)
5) Integer arithmetic overflow
6) FP arithmetic anomaly
7) Page fault (not in main memory)
8) Misaligned memory accesses (if alignment is required)
9) Memory protection violation
10) Using an undefined or unimplemented instruction
11) Hardware malfunctions
12) Power failure
MIPS Pipeline to Handle Multicycle Operations
Static scheduling
The compiler can attempt to schedule instructions to avoid hazards (structural, data, and control); this approach is called compiler or static scheduling.
Dynamic scheduling
Several early processors used another approach, called dynamic scheduling, whereby the hardware rearranges the instruction execution to reduce the stalls.
Dynamic scheduling with a scoreboard was adopted in the CDC 6600 machine.