Pipeline in ARM

The document explains the three-stage and five-stage pipeline architectures of ARM processors, detailing their components such as the register bank, ALU, and barrel shifter. It highlights the advantages of pipelining, including increased instruction throughput and CPU performance, while also addressing disadvantages like complexity, hazards, and potential stalls during execution. Additionally, it discusses types of hazards (structural, data, and control) and solutions like forwarding and stalling to manage these issues in pipelined processors.

Uploaded by

Barath Ganesh

Three Stage Pipeline in ARM

The three stage pipeline architecture of ARM is given above.

1. Register Bank – It stores the state of the processor. It is used for arithmetic operands,
intermediate variable storage, temporary address storage and so on. The register bank pictured
above has 5 ports. The ports with red dots are the outgoing read ports. The ports with the
blue dots are the incoming write ports. The ports with the yellow dots are the read and
write ports of the program counter. The program counter requires special read and
write ports because it is responsible for holding the address of the next instruction. Through
the special write port, the updated address is written into the program counter. Through
the special read port, the current instruction address is read from the program counter.

2. ALU – The Arithmetic Logic Unit performs the arithmetic and logical operations. It does
not increment the PC; that is the job of the incrementer (item 4).

3. Barrel shifter – It can shift or rotate one operand by any number of bits using pure
combinational logic instead of sequential logic. In ARM it sits on one ALU input, so the
second operand of a data processing instruction can be shifted or rotated before the ALU
uses it. Barrel shifters are also used in floating point hardware, for example to align
mantissas.

4. Address register and incrementer – They select and hold all memory addresses and
generate sequential addresses when required.

Three stage pipeline:
There are three stages in this pipeline method:

1. Fetch – The instruction is fetched from the memory and stored in the instruction register.

2. Decode – The instruction is moved to the decoder which decodes the instruction. It
activates the appropriate control signals and takes the necessary steps for the next
(execute) stage.

3. Execute – The instruction is executed. Data transfer, logical and arithmetic operations all
take place during this stage.

Three stage pipeline

In a microprocessor without pipelining, each simple instruction takes three cycles to
complete, and the next instruction is fetched only when the current one has finished
executing. With three stage pipelining, as soon as the fetch of one instruction is
over, the next instruction is fetched. The different stages of consecutive instructions
therefore happen simultaneously.

For simple instructions, the execute stage lasts only one clock cycle. But some
instructions, such as STR, have a multi-cycle execute stage.

In the above example, the first instruction (ADD) is fetched during the first cycle. The second
instruction (STR) is fetched during the second cycle. During the third cycle, the second
instruction is decoded while the third instruction (ADD) is fetched. But the second
instruction, STR, is a multi-cycle execute instruction, so the pipeline stalls temporarily
during the fourth cycle. During the fifth cycle, the pipeline resumes. Because the third and
fourth instructions are both waiting for the decode stage, the fourth instruction has to wait
one more cycle: the third instruction is decoded during the fifth cycle and the fourth
instruction during the sixth cycle. The rest of the pipeline proceeds normally.
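
The cycle counts in this walk-through can be checked with a small sketch. It assumes a simplified in-order model in which fetch and decode take one cycle each and the execute stages run back to back. The function names and the list of execute lengths are illustrative only and do not come from any ARM tool.

```python
def execute_windows(execute_cycles):
    """Return (first_cycle, last_cycle) of the execute stage of each instruction.

    Simplified model: the first instruction is fetched in cycle 1, decoded in
    cycle 2 and starts executing in cycle 3; later executes follow back to back.
    """
    windows = []
    start = 3
    for length in execute_cycles:
        windows.append((start, start + length - 1))
        start += length
    return windows


def pipelined_cycles(execute_cycles):
    """Total cycles with the three-stage pipeline (two cycles to fill it)."""
    return 2 + sum(execute_cycles)


def unpipelined_cycles(execute_cycles):
    """Total cycles when each instruction is fetched only after the previous one completes."""
    return sum(2 + length for length in execute_cycles)


# ADD, STR, ADD, ADD, with a two-cycle execute stage for STR.
example = [1, 2, 1, 1]
print(execute_windows(example))
print(pipelined_cycles(example), unpipelined_cycles(example))
```

Under this model the third and fourth instructions execute in cycles 6 and 7, so their decodes fall in cycles 5 and 6, in line with the description above.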

PC behaviour:
During the execute stage of the first instruction, the third instruction is in the fetch stage.
That means the address in the program counter is pointing to the third instruction. Thus the
PC, as seen by the executing instruction, always points 8 bytes ahead of it. (The size of one
instruction in ARM is 4 bytes. If the address of the current instruction is x, the next
instruction is at x+4 and the one after that is at x+8.)
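
As a quick check of this arithmetic, here is a minimal sketch. The function name and its parameters are hypothetical; the defaults simply encode the 4-byte instruction size and the two-instruction gap between fetch and execute described above.

```python
def pc_seen_by_executing_instruction(instr_address, instr_size=4, stages_ahead=2):
    """Value held in the PC while the instruction at instr_address is executing.

    stages_ahead=2 because the fetch stage is two instructions ahead of
    the execute stage in the three-stage pipeline.
    """
    return instr_address + stages_ahead * instr_size


# An instruction at address 0x8000 sees the PC pointing at 0x8008.
print(hex(pc_seen_by_executing_instruction(0x8000)))
```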

Advantages of pipelining:

1. It increases the instruction throughput. The time it takes to complete an instruction
doesn’t change, but the number of instructions that can be in progress at the same time
increases with the depth of the pipeline. This reduces the delay between completed
instructions.

2. The CPU’s ALU can be designed to work faster, at the cost of more complex hardware.

3. It increases the performance of the processor.

Disadvantages of pipelining:

1. It is more complex and more expensive to build.

2. In a pipelined processor, the flip flops inserted between stages increase the instruction
latency compared to a non-pipelined processor.

3. The instruction throughput is difficult to predict.

4. A taken branch flushes (erases) the instructions that were already fetched into the
pipeline behind it.

5. Not all instructions are independent of each other. The output of one instruction may be
the input to another instruction. In such cases the pipeline has to stall until the first
instruction has produced the result that the subsequent dependent instruction needs.

6. Assembly code is written on the assumption that each instruction finishes before the
next one starts. When the pipeline breaks this assumption, the program can behave
unexpectedly or incorrectly. Such situations are known as hazards.
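
The trade-off between throughput (advantage 1) and latency (disadvantage 2) can be illustrated with an idealised sketch. It assumes the combinational logic divides evenly between the stages and that each pipeline register adds a fixed delay. The delay values and function names are made up for illustration.

```python
def cycle_time(logic_delay, stages, register_delay):
    """Clock period when the logic is split evenly over the given number of stages."""
    return logic_delay / stages + register_delay


def instruction_latency(logic_delay, stages, register_delay):
    """Time for one instruction to pass through every stage."""
    return stages * cycle_time(logic_delay, stages, register_delay)


# Hypothetical numbers: 30 time units of logic, 1 unit per pipeline register.
for stages in (1, 3, 5):
    period = cycle_time(30.0, stages, 1.0)
    latency = instruction_latency(30.0, stages, 1.0)
    print(stages, "stages: period", period, "latency", latency)
```

In this model a deeper pipeline shortens the clock period, so instructions complete more often, while the latency of each individual instruction grows by one register delay per added stage.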

Five Stage Pipeline in ARM

The time required to execute a program is given by

Tprog = (Ninst × CPI) / fclk

where Ninst is the number of ARM instructions executed in the course of the program, CPI is
the average number of clock cycles per instruction and fclk is the processor’s clock frequency.
To increase the performance of the processor, we need to decrease the time taken to
execute the program. To decrease Tprog, we can reduce CPI or increase fclk (for a given
program and compiler, Ninst is constant).
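
The effect of CPI and fclk on the execution time can be tried out with a short sketch. The instruction count, CPI values and clock frequencies below are hypothetical, chosen only to show the arithmetic.

```python
def program_time(n_inst, cpi, f_clk_hz):
    """Execution time in seconds: instructions x cycles per instruction / clock frequency."""
    return n_inst * cpi / f_clk_hz


# Same program (same instruction count) on two hypothetical designs.
slow = program_time(1_000_000, 2.0, 20e6)   # higher CPI, lower clock
fast = program_time(1_000_000, 1.5, 40e6)   # lower CPI, higher clock
print(slow, fast)
```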
In three stage pipelining, during the execute stage, data transfer from the source (from memory
to register or from register to ALU), data processing (an arithmetic or logical operation in the
ALU) and data transfer to the destination all have to take place within a single clock cycle.
Because so many operations need to be done in a single clock cycle, the clock
period must be sufficiently long. This means fclk cannot be increased above
a certain maximum value (as fclk increases, the clock period decreases).

But if we split the execute stage into three stages, namely Execute, Buffer and Write back,
then each of these stages has only a small number of operations to complete in
one cycle. As a result, we can shorten the clock period and
correspondingly increase fclk. This is the 5-stage pipeline method.
The five stages of the pipeline are:

1. Fetch – The instruction is fetched from the memory and stored in the instruction register.

2. Decode – The instruction is moved to the decoder which decodes the instruction. It
activates the appropriate control signals and takes the necessary steps for the next
(execute) stage.

3. Execute – An operand is shifted and the ALU result is generated. If the instruction is a load
or store, the memory address is computed in the ALU.

4. Buffer/Data – Data memory is accessed if required. Otherwise the ALU result is simply
buffered for one cycle.

5. Write back – The results generated by the instruction are written back to the register file,
including any data loaded from memory.
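
The way these five stages overlap for consecutive instructions can be sketched as follows. This assumes an ideal pipeline with no stalls, in which each instruction enters the fetch stage one cycle after the previous one. The helper names are illustrative.

```python
STAGES = ["Fetch", "Decode", "Execute", "Buffer", "Write back"]


def stage_in_cycle(instr_index, cycle):
    """Stage occupied by instruction instr_index (0-based) in a cycle (1-based).

    Returns None if the instruction has not started or has already finished.
    """
    position = cycle - 1 - instr_index
    if 0 <= position < len(STAGES):
        return STAGES[position]
    return None


def total_cycles(n_instructions):
    """Cycles needed to complete n instructions in the ideal (stall-free) pipeline."""
    return len(STAGES) + n_instructions - 1


# Print which stage each of four instructions occupies in every cycle.
for cycle in range(1, total_cycles(4) + 1):
    row = [stage_in_cycle(i, cycle) or "-" for i in range(4)]
    print(cycle, row)
```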

One of the major disadvantages of pipelining is hazards. Hazards are
situations that prevent the pipeline from functioning properly: an instruction cannot
be executed correctly in its intended clock cycle. There are three types of hazards.

1. Structural Hazards:
Structural hazards are the result of resource conflicts in the hardware, when the hardware
cannot support all possible combinations of instructions in simultaneous execution. For
example:

In clock cycle 4, the load instruction is in the data/buffer stage and instruction 3 is in the
instruction fetch stage. Both are memory access operations. In a von Neumann architecture,
where both data and instructions are stored in the same memory, this causes a resource
conflict. Both instructions cannot access the memory simultaneously.
There are two possible solutions to this problem.

1. Stall the instruction fetch of the 3rd instruction for one clock cycle.

2. Use a Harvard architecture.

With the first solution, the instruction fetch of instruction 3 is stalled for one clock cycle.
The five stage ARM cores (for example ARM9TDMI) use a Harvard arrangement, with separate
instruction and data memory ports, hence they do not face this structural hazard.
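
The cycle 4 conflict described above can be found mechanically. The sketch below assumes an ideal five stage pipeline with a single shared memory and no stalls. The function and its input format are hypothetical.

```python
def memory_conflicts(uses_data_memory):
    """Find cycles in which a fetch and a data access need the one shared memory.

    uses_data_memory[i] is True if instruction i (0-based) is a load or store.
    Instruction i is fetched in cycle i + 1 and reaches the data/buffer
    stage in cycle i + 4 (ideal five stage pipeline, no stalls).
    Returns a list of (cycle, index_of_load_or_store, index_being_fetched).
    """
    count = len(uses_data_memory)
    conflicts = []
    for index, is_memory_access in enumerate(uses_data_memory):
        if not is_memory_access:
            continue
        cycle = index + 4
        fetching = cycle - 1  # index of the instruction fetched in that cycle
        if fetching < count:
            conflicts.append((cycle, index, fetching))
    return conflicts


# A load followed by instructions 1, 2 and 3, none of which access data memory.
print(memory_conflicts([True, False, False, False]))
```

With separate instruction and data memories the two accesses go to different ports, so this check would no longer indicate a stall.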

2. Data Hazards:
Data hazards occur when successive instructions are not independent of each other. In
other words, the input of an instruction depends upon the output of a previous instruction.

In the example given below, SUB, AND, OR and XOR all take the value in R1 as
an input, and the first instruction, ADD, modifies the value in R1. The result
of the ADD operation is written back into R1 only in clock cycle 5. But SUB needs the
updated value of R1 in the fourth cycle itself. Without intervention, SUB would use the
old value of R1 and produce a wrong result. This is called a data hazard.

A data hazard can be rectified with the forwarding mechanism.

Under the forwarding mechanism, the result of each ALU operation is fed back to the
inputs of the ALU. If the processor detects that an instruction
needs the output of the previously executed instruction, it selects the fed-back result
as the ALU input instead of the value read from the register. Otherwise it simply uses
the data read from the registers during the decode stage.

For example, ADD finishes its execution in the third cycle, and its output is forwarded to the
inputs of the ALU. During the fourth cycle, the processor detects that the SUB instruction requires
the updated value of R1. R1 has not been updated yet (it will be updated in cycle 5), but
the result of the ADD operation is already available at the input of the ALU. Instead of using
the value read from the R1 register, SUB simply makes use of this input. This is called
forwarding. The code can now be executed without stalls.

Forwarding mechanism

Forwarding can also be generalised to pass the result directly to any functional unit that
requires it, rather than only passing the result back to the same unit. In the previous
example, we only passed the result from the ALU to the ALU itself. We can design the
hardware to pass the result to any other unit that requires it.

Another solution to the data hazard is stalling, but this decreases efficiency and performance.
However, there are certain data hazards that cannot be solved with forwarding alone; they
require stalling as well. For example:

The load instruction has a delay or latency that cannot be eliminated by forwarding alone:
the loaded data becomes available only at the end of the data/buffer stage, so an instruction
that needs it immediately afterwards has to stall.
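
The difference between stalling and forwarding can be put into numbers with a small sketch. It assumes a textbook five stage pipeline in which registers are read in the decode stage, a register written in a cycle can be read in that same cycle, ALU results are ready at the end of execute and loaded data at the end of the data/buffer stage. The function and its parameters are illustrative, and each producer/consumer pair is considered in isolation.

```python
def stalls_needed(distance, producer_is_load=False, forwarding=True):
    """Stall cycles for one producer/consumer pair.

    distance is how many instructions after the producer the consumer
    comes (1 = immediately after). Cycle offsets are counted from the
    producer's fetch: decode = 1, execute = 2, buffer = 3, write back = 4.
    """
    if forwarding:
        # Result is ready at the end of buffer (load) or execute (ALU).
        result_ready = 3 if producer_is_load else 2
        consumer_execute = distance + 2
        return max(0, result_ready + 1 - consumer_execute)
    # Without forwarding the consumer's decode (register read) must not
    # come before the producer's write back.
    producer_write_back = 4
    consumer_decode = distance + 1
    return max(0, producer_write_back - consumer_decode)


# ADD writes R1; SUB, AND, OR and XOR read it 1, 2, 3 and 4 instructions later.
for name, distance in [("SUB", 1), ("AND", 2), ("OR", 3), ("XOR", 4)]:
    print(name, stalls_needed(distance, forwarding=False), stalls_needed(distance))

# A load followed immediately by a user of the loaded value still needs a stall.
print(stalls_needed(1, producer_is_load=True))
```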

3. Control Hazards:
A control hazard occurs when the pipeline makes a wrong decision on branch prediction and
therefore brings instructions into the pipeline that must subsequently be discarded. The term
branch hazard also refers to a control hazard.

In the example below, beqz (Branch if Equal to Zero) is a conditional branch instruction.
When beqz is in the instruction decode stage, the sub instruction is in the execute stage. But
beqz depends on the value of R1, and R1 has not yet been written by the sub instruction.
That will happen only two cycles later.

Instructions     | 1  | 2  | 3  | 4  | 5  | 6  | 7  | 8  | 9  | 10 |

ld   r2, 0(r4)   | IF | ID | EX | MM | WB |    |    |    |    |    |
ld   r3, 4(r4)   |    | IF | ID | EX | MM | WB |    |    |    |    |
sub  r1, r2, r3  |    |    | IF | ID | EX | MM | WB |    |    |    |
beqz r1, L1      |    |    |    | IF | ID | EX | MM | WB |    |    |

Further, we don’t know which instruction to fetch after the beqz instruction. If the condition
is true, we have to fetch the instruction at the branch target (L1). If the condition is false,
we have to continue with the next sequential instruction. Ideally, at cycle 5 we should
fetch a new instruction, but at that point we still don’t know the result of the condition.

The simplest method is to stall the pipeline until the MEM stage, when we will know the
result of the condition. But if we don’t do that and instead go ahead with the
pipelining, assuming one result or the other, then there is a chance that our assumption is
wrong. In other words, we have made a wrong decision on the branch prediction. This is
called the control hazard.
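
The cost of the two approaches can be compared with a rough sketch. It assumes an otherwise ideal pipeline (base CPI of 1) and a penalty of three cycles, which corresponds to resolving the branch in the MEM stage as described above. The branch frequency and the misprediction rate are hypothetical numbers.

```python
def effective_cpi(base_cpi, branch_fraction, penalised_fraction, penalty_cycles):
    """Average CPI when a fraction of the branches pays a fixed penalty.

    branch_fraction    : share of executed instructions that are branches
    penalised_fraction : share of those branches that pay the penalty
                         (1.0 when we always stall, the misprediction
                         rate when we guess the outcome)
    """
    return base_cpi + branch_fraction * penalised_fraction * penalty_cycles


# Hypothetical workload: 20% branches, 3-cycle penalty.
always_stall = effective_cpi(1.0, 0.2, 1.0, 3)
predict = effective_cpi(1.0, 0.2, 0.1, 3)   # assumes 10% of the guesses are wrong
print(always_stall, predict)
```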
