Problem
Consider a pipeline having 4 phases with duration 60, 50, 90 and 80 ns. Given latch delay is
10 ns. Calculate-
1. Pipeline cycle time
2. Non-pipeline execution time
3. Speed up ratio
4. Pipeline time for 1000 tasks
5. Sequential time for 1000 tasks
6. Throughput
Solution-
Given-
• Four stage pipeline is used
• Delay of stages = 60, 50, 90 and 80 ns
• Latch delay or delay due to each register = 10 ns
Cycle time
= Maximum delay due to any stage + Delay due to its register
= Max { 60, 50, 90, 80 } + 10 ns
= 90 ns + 10 ns
= 100 ns
Speed up
= Non-pipeline execution time / Pipeline execution time
= (60 + 50 + 90 + 80) ns / Cycle time
= 280 ns / 100 ns
= 2.8
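The six quantities asked for in this problem can be checked with a short script. This is a minimal sketch, not part of the original solution: the variable names are illustrative, and it assumes the usual model in which k stages need (k + n - 1) cycles to finish n tasks.

```python
# Sketch for the 4-phase pipeline: phases of 60, 50, 90, 80 ns, latch delay 10 ns.
# Assumption: n tasks on a k-stage pipeline take (k + n - 1) clock cycles.
stage_delays_ns = [60, 50, 90, 80]
latch_delay_ns = 10
num_tasks = 1000
num_stages = len(stage_delays_ns)

# 1. The slowest phase plus its latch sets the clock period.
cycle_time_ns = max(stage_delays_ns) + latch_delay_ns

# 2. Without pipelining, one task passes through every phase (no latches).
non_pipeline_time_ns = sum(stage_delays_ns)

# 3. Steady-state speed up: one task completes per cycle when pipelined.
speed_up = non_pipeline_time_ns / cycle_time_ns

# 4. Pipeline time for all tasks, including the initial fill.
pipeline_time_ns = (num_stages + num_tasks - 1) * cycle_time_ns

# 5. Sequential time for all tasks.
sequential_time_ns = num_tasks * non_pipeline_time_ns

# 6. Throughput in tasks per nanosecond.
throughput_per_ns = num_tasks / pipeline_time_ns

print(cycle_time_ns, non_pipeline_time_ns, speed_up)
print(pipeline_time_ns, sequential_time_ns, throughput_per_ns)
```

Under these assumptions the pipeline needs 100300 ns (100.3 microseconds) for 1000 tasks, against 280000 ns (280 microseconds) sequentially.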
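Part-06: Throughput-
Throughput
= Number of tasks / Pipeline time for those tasks
= 1000 tasks / { (4 + 1000 – 1) x 100 ns }
= 1000 tasks / 100.3 microseconds
= 9.97 tasks per microsecond (approximately)
Problem-02: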
A four stage pipeline has the stage delays as 150, 120, 160 and 140 ns respectively.
Registers are used between the stages and have a delay of 5 ns each. Assuming constant
clocking rate, the total time taken to process 1000 data items on the pipeline will be-
1. 120.4 microseconds
2. 160.5 microseconds
3. 165.5 microseconds
4. 590.0 microseconds
Solution-
Given-
• Four stage pipeline is used
• Delay of stages = 150, 120, 160 and 140 ns
• Delay due to each register = 5 ns
• 1000 data items or instructions are processed
Cycle Time-
Cycle time
= Maximum delay due to any stage + Delay due to its register
= Max { 150, 120, 160, 140 } + 5 ns
= 160 ns + 5 ns
= 165 ns
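With the cycle time known, the total time for 1000 data items follows from the fill-then-stream behaviour of a pipeline. The sketch below is illustrative only: the function name `pipeline_total_time_ns` is hypothetical, and it assumes the first item needs one cycle per stage and every later item completes one cycle after the previous one.

```python
def pipeline_total_time_ns(stage_delays_ns, register_delay_ns, num_items):
    """Total time for num_items on a pipeline clocked at a constant rate.

    Assumes (k + n - 1) cycles for n items on k stages.
    """
    cycle_time_ns = max(stage_delays_ns) + register_delay_ns
    num_stages = len(stage_delays_ns)
    return (num_stages + num_items - 1) * cycle_time_ns


total_ns = pipeline_total_time_ns([150, 120, 160, 140], 5, 1000)
total_us = total_ns / 1000  # convert nanoseconds to microseconds
print(total_ns, total_us)
```

Under these assumptions the total is (4 + 999) x 165 ns = 165495 ns, which rounds to 165.5 microseconds, the third option in the list.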
Problem-03:
Consider a non-pipelined processor with a clock rate of 2.5 gigahertz and average cycles
per instruction of 4. The same processor is upgraded to a pipelined processor with five
stages but due to the internal pipeline delay, the clock speed is reduced to 2 gigahertz.
Assume there are no stalls in the pipeline. The speed up achieved in this pipelined
processor is-
1. 3.2
2. 3.0
3. 2.2
4. 2.0
Solution-
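Since there are no stalls in the pipeline, ideally one instruction is completed per clock
cycle.
Non-pipeline execution time
= Cycles per instruction x Clock cycle time
= 4 x (1 / 2.5 gigahertz)
= 4 x 0.4 ns
= 1.6 ns
Pipeline execution time
= 1 clock cycle
= 1 / 2 gigahertz
= 0.5 ns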
Speed Up-
Speed up
= Non-pipeline execution time / Pipeline execution time
= 1.6 ns / 0.5 ns
= 3.2
Thus, Option (A) is correct.
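The same comparison can be written in terms of clock rate and cycles per instruction. This is a small sketch with a hypothetical helper name; it assumes an effective CPI of 1 for the stall-free pipeline, as the solution does.

```python
def time_per_instruction_ns(clock_ghz, cycles_per_instruction):
    """Average time per instruction in ns; a clock of f GHz has a period of 1/f ns."""
    return cycles_per_instruction / clock_ghz


non_pipelined_ns = time_per_instruction_ns(2.5, 4)  # 4 cycles of 0.4 ns each
pipelined_ns = time_per_instruction_ns(2.0, 1)      # 1 cycle of 0.5 ns, no stalls
speed_up = non_pipelined_ns / pipelined_ns
print(non_pipelined_ns, pipelined_ns, speed_up)
```

Note that the five stages do not appear in the calculation: with no stalls, only the clock period and the CPI of each design matter.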
Problem-04:
The stage delays in a 4 stage pipeline are 800, 500, 400 and 300 picoseconds. The first
stage is replaced with a functionally equivalent design involving two stages with respective
delays 600 and 350 picoseconds.
The throughput increase of the pipeline is _____%.
Solution-
Cycle time in the original 4 stage pipeline
= Maximum delay due to any stage + Delay due to its register
= Max { 800, 500, 400, 300 } + 0
= 800 picoseconds
Thus, Execution time in 4 stage pipeline = 1 clock cycle = 800 picoseconds.
Throughput
= Number of instructions executed per unit time
= 1 instruction / 800 picoseconds
Cycle time after replacing the first stage
= Maximum delay due to any stage + Delay due to its register
= Max { 600, 350, 500, 400, 300 } + 0
= 600 picoseconds
Thus, Execution time in the new 5 stage pipeline = 1 clock cycle = 600 picoseconds.
Throughput
= Number of instructions executed per unit time
= 1 instruction / 600 picoseconds
Throughput Increase-
Throughput increase
= { (Final throughput – Initial throughput) / Initial throughput } x 100
= { (1 / 600 – 1 / 800) / (1 / 800) } x 100
= { (800 / 600) – 1 } x 100
= (1.3333 – 1) x 100
= 0.3333 x 100
= 33.33 %
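The throughput comparison depends only on the slowest stage before and after the change. The sketch below, with a hypothetical function name, builds the new stage list by replacing the 800 ps stage with the 600 ps and 350 ps stages and keeping the other three unchanged; register delay is taken as zero, as in the solution.

```python
def throughput_increase_percent(old_stage_delays, new_stage_delays):
    """Percentage gain in throughput, where throughput = 1 / slowest stage delay."""
    old_cycle = max(old_stage_delays)
    new_cycle = max(new_stage_delays)
    # (1/new - 1/old) / (1/old) simplifies to old/new - 1.
    return (old_cycle / new_cycle - 1) * 100


old_stages_ps = [800, 500, 400, 300]
# The first stage is split into two; the remaining stages are untouched.
new_stages_ps = [600, 350] + old_stages_ps[1:]

increase = throughput_increase_percent(old_stages_ps, new_stages_ps)
print(len(new_stages_ps), max(new_stages_ps), round(increase, 2))
```

The new pipeline has five stages, and its slowest stage (600 ps) is one of the two replacement stages, which is why the gain is 800 / 600 - 1.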
Problem-05:
Solution-
Cycle Time in Pipelined Processor-
Cycle time
= Maximum delay due to any stage + Delay due to its register
= Max { 2.5, 1.5, 2, 1.5, 2.5 } + 0.5 ns
= 2.5 ns + 0.5 ns
= 3 ns
Speed up
= Non-pipeline execution time / Pipeline execution time
= (2.5 + 1.5 + 2 + 1.5 + 2.5) ns / 3 ns
= 10 ns / 3 ns
= 3.33
Thus, Option (C) is correct.
Problem-06:
Solution-
Cycle time in design D1
= Maximum delay due to any stage + Delay due to its register
= Max { 3, 2, 4, 2, 3 } + 0
= 4 ns
Cycle time in design D2
= Delay due to a stage + Delay due to its register
= 2 ns + 0
= 2 ns
Time Saved-
Time saved
= Execution time in design D1 – Execution time in design D2
= 416 ns – 214 ns
= 202 ns
Thus, Option (B) is correct.
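The execution times of 416 ns and 214 ns are quoted above without their derivation. The sketch below reproduces them under two assumptions that are not stated in this section: 100 instructions are executed, and D2 has 8 stages of 2 ns each. Both values are chosen only because they are consistent with the quoted figures and the (k + n - 1) x cycle time formula; treat them as hypothetical.

```python
def execution_time_ns(stage_delays_ns, num_instructions):
    """Time for num_instructions on a pipeline with zero register delay."""
    cycle_time_ns = max(stage_delays_ns)
    return (len(stage_delays_ns) + num_instructions - 1) * cycle_time_ns


NUM_INSTRUCTIONS = 100   # assumption, not given in the text above
d1_stages_ns = [3, 2, 4, 2, 3]
d2_stages_ns = [2] * 8   # assumption: 8 equal stages of 2 ns

d1_time = execution_time_ns(d1_stages_ns, NUM_INSTRUCTIONS)
d2_time = execution_time_ns(d2_stages_ns, NUM_INSTRUCTIONS)
print(d1_time, d2_time, d1_time - d2_time)
```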
Problem-07:
Consider an instruction pipeline with four stages (S1, S2, S3 and S4) each with
combinational circuit only. The pipeline registers are required between each stage and at
the end of the last stage. Delays for the stages and for the pipeline registers are as given in
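the figure (not reproduced here; the solution below uses stage delays of 5, 6, 11 and 8 ns
and a register delay of 1 ns)-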
What is the approximate speed up of the pipeline in steady state under ideal conditions
when compared to the corresponding non-pipeline implementation?
1. 4.0
2. 2.5
3. 1.1
4. 3.0
Solution-
Cycle time
= Maximum delay due to any stage + Delay due to its register
= Max { 5, 6, 11, 8 } + 1 ns
= 11 ns + 1 ns
= 12 ns
Speed Up-
Speed up
= Non-pipeline execution time / Pipeline execution time
= (5 + 6 + 11 + 8) ns / 12 ns
= 30 ns / 12 ns
= 2.5
Thus, Option (B) is correct.
Problem-08:
Consider a 4 stage pipeline processor. The number of cycles needed by the four
instructions I1, I2, I3 and I4 in stages S1, S2, S3 and S4 is shown below-
      S1   S2   S3   S4
I1    2    1    1    1
I2    1    3    2    2
I3    2    1    1    3
I4    1    2    2    2
Solution-
From here, number of clock cycles required to execute the loop = 23 clock cycles.
Thus, Option (B) is correct.
Problem-09:
The IF, ID and WB stages take one clock cycle each to complete the operation. The number
of clock cycles for the EX stage depends on the instruction. The ADD and SUB instructions
need 1 clock cycle and the MUL instruction needs 3 clock cycles in the EX stage. Operand
forwarding is used in the pipelined processor. What is the number of clock cycles taken to
complete the following sequence of instructions?
1. 7
2. 8
3. 10
4. 14
Solution-
Problem-10:
Consider the following processors. Assume that the pipeline registers have zero latency.
P1 : 4 stage pipeline with stage latencies 1 ns, 2 ns, 2 ns, 1 ns
P2 : 4 stage pipeline with stage latencies 1 ns, 1.5 ns, 1.5 ns, 1.5 ns
P3 : 5 stage pipeline with stage latencies 0.5 ns, 1 ns, 1 ns, 0.6 ns, 1 ns
P4 : 5 stage pipeline with stage latencies 0.5 ns, 0.5 ns, 1 ns, 1 ns, 1.1 ns
Solution-
Cycle time of P1
= Max { 1 ns, 2 ns, 2 ns, 1 ns }
= 2 ns
Clock frequency
= 1 / Cycle time
= 1 / 2 ns
= 0.5 gigahertz
Cycle time of P2
= Max { 1 ns, 1.5 ns, 1.5 ns, 1.5 ns }
= 1.5 ns
Clock frequency
= 1 / Cycle time
= 1 / 1.5 ns
= 0.67 gigahertz
Cycle time of P3
= Max { 0.5 ns, 1 ns, 1 ns, 0.6 ns, 1 ns }
= 1 ns
Clock frequency
= 1 / Cycle time
= 1 / 1 ns
= 1 gigahertz
Cycle time of P4
= Max { 0.5 ns, 0.5 ns, 1 ns, 1 ns, 1.1 ns }
= 1.1 ns
Clock frequency
= 1 / Cycle time
= 1 / 1.1 ns
= 0.91 gigahertz
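The four calculations follow one pattern: with zero register latency, the peak clock frequency is the reciprocal of the slowest stage. A minimal sketch that tabulates them and picks out the fastest clock (the dictionary layout and names are illustrative):

```python
# Stage latencies in ns; pipeline registers are assumed to have zero latency.
pipelines_ns = {
    "P1": [1, 2, 2, 1],
    "P2": [1, 1.5, 1.5, 1.5],
    "P3": [0.5, 1, 1, 0.6, 1],
    "P4": [0.5, 0.5, 1, 1, 1.1],
}

# A period of T ns corresponds to a frequency of 1/T GHz.
frequency_ghz = {name: 1 / max(stages) for name, stages in pipelines_ns.items()}
fastest = max(frequency_ghz, key=frequency_ghz.get)

for name, freq in frequency_ghz.items():
    print(name, round(freq, 2))
print("Highest clock frequency:", fastest)
```

The number of stages does not enter the result; only the slowest stage of each design does, which is why P3 reaches 1 gigahertz.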
Problem-11:
Consider a 3 GHz (gigahertz) processor with a three-stage pipeline and stage latencies T1,
T2 and T3 such that T1 = 3T2/4 = 2T3. If the longest pipeline stage is split into two pipeline
stages of equal latency, the new frequency is ____ GHz, ignoring delays in the pipeline
registers.
Solution-
Let T1 = 3T2/4 = 2T3 = t, then-
• T1 = t
• T2 = 4t / 3
• T3 = t / 2
Frequency Of Pipeline-
Frequency
= 1 / Pipeline cycle time
= 1 / (4t / 3)
= 3 / (4t)
Given frequency = 3 GHz, so 3 / (4t) = 3 GHz, which gives t = 0.25 ns.
The stage with the longest latency i.e. stage-02 is split into two stages of equal latency,
giving a 4 stage pipeline.
After splitting, the latency of different stages are-
• Latency of stage-01 = 0.25 ns
• Latency of stage-02 = 0.167 ns
• Latency of stage-03 = 0.167 ns
• Latency of stage-04 = 0.125 ns
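The stage latencies and the resulting clock can be checked with exact fractions, which avoids rounding the 1/6 ns stages. This is a sketch under the problem's own assumption that register delays are ignored; the variable names are illustrative.

```python
from fractions import Fraction

old_frequency_ghz = 3

# Latencies in units of t, from T1 = 3*T2/4 = 2*T3 = t.
ratios = [Fraction(1), Fraction(4, 3), Fraction(1, 2)]

# The longest stage equals the old clock period of 1/3 ns, which fixes t.
t_ns = Fraction(1, old_frequency_ghz) / max(ratios)
stages_ns = [r * t_ns for r in ratios]

# Split the longest stage into two stages of equal latency.
longest = max(stages_ns)
i = stages_ns.index(longest)
new_stages_ns = stages_ns[:i] + [longest / 2, longest / 2] + stages_ns[i + 1:]

# The new clock period is the longest remaining stage.
new_frequency_ghz = 1 / max(new_stages_ns)
print([str(s) for s in new_stages_ns], new_frequency_ghz)
```

After the split the longest stage is stage-01 at 1/4 ns, so the new frequency works out to 4 GHz.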