
UNIT 5

Pipeline Hazards
Earlier we mentioned that memory limits the speed of the CPU. Pipelining adds one more case. In a pipelined design, several instructions are simultaneously in some stage of execution, and there can be dependencies among this set of instructions that limit the speed of the pipeline. These dependencies arise for a few reasons, which we discuss shortly. Dependencies in the pipeline are called hazards because they endanger the execution; the words dependency and hazard are used interchangeably in computer architecture. Essentially, the occurrence of a hazard prevents an instruction in the pipe from being executed in its designated clock cycle. We say clock cycle because each of these instructions may be in a different machine cycle of its own.

There are three kinds of hazards:

 Structural Hazards
 Data Hazards
 Control Hazards

There are many specific solutions to dependencies. The simplest is introducing a bubble, which stalls the pipeline and reduces throughput. The bubble makes the next instruction wait until the earlier instruction has finished with the conflicting stage.
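The effect of a bubble is easiest to see on a timing chart. Below is a minimal Python sketch, not taken from the text, that prints stage occupancy cycle by cycle for a hypothetical 4-stage pipeline (the stage names IF, ID, IE, RW are assumed from the stages mentioned later in this unit); starting one instruction a cycle late shows how every instruction behind it slips as well.

# Minimal sketch (assumed 4-stage pipeline): print which stage each
# instruction occupies in each clock cycle.
STAGES = ["IF", "ID", "IE", "RW"]

def timetable(start_cycles):
    """start_cycles[i] = clock cycle in which instruction i enters IF."""
    rows = []
    for i, start in enumerate(start_cycles):
        rows.append((f"I{i+1}", {start + s: STAGES[s] for s in range(len(STAGES))}))
    last = max(max(row) for _, row in rows)
    print("     " + " ".join(f"t{t:<3}" for t in range(1, last + 1)))
    for name, row in rows:
        print(f"{name}:  " + " ".join(f"{row.get(t, '--'):<4}" for t in range(1, last + 1)))

timetable([1, 2, 3])   # ideal overlap: one instruction enters the pipe per cycle
print()
timetable([1, 3, 4])   # I2 held back by one bubble; I3 and everything after it slip too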

Structural Hazards
Structural hazards arise from hardware resource conflicts among the instructions in the pipeline. A resource here could be memory, a register in the GPR file, or the ALU. A resource conflict is said to occur when more than one instruction in the pipe requires access to the same resource in the same clock cycle; this is a situation the hardware cannot handle for every possible combination of instructions in overlapped pipelined execution. Rather than always stalling, a better solution is to increase the structural resources in the system using one of the choices below:

 The pipeline may be extended to 5 or more stages, with the functionality of the stages suitably redefined and the clock frequency adjusted. This eliminates the hazard that otherwise recurs at every 4th instruction in a 4-stage pipeline.
 The memory may be physically separated into Instruction memory and Data memory. A better choice is to design these as cache memories inside the CPU rather than dealing with main memory. IF uses the Instruction memory and result writing uses the Data memory, so the two become separate resources and the dependency is avoided.
 It is also possible to have multiple levels of cache in the CPU.
 The ALU can itself be the contended resource. One instruction may require the ALU in its IE machine cycle while another requires it in the IF stage to calculate an effective address, depending on the addressing mode. The solution is either stalling or providing an exclusive ALU for address calculation.
 Register files are used in place of GPRs. Register files have multiport access with exclusive read and write ports, which enables simultaneous access to one write register and one read register.

The last two methods are implemented in modern CPUs. Beyond these, if a dependency still arises, stalling is the only option. Keep in mind that increasing resources increases cost, so the trade-off is the designer's choice.
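To make the notion of a resource conflict concrete, here is a minimal Python sketch, not from the text, assuming a single shared memory, one register file and one ALU in a 4-stage pipeline; it simply counts how many overlapped instructions touch each resource in each clock cycle. Splitting instruction and data memory, or giving the register file separate read and write ports as described above, removes the corresponding conflicts.

# Minimal sketch: count competing uses of each resource per clock cycle.
# The stage-to-resource mapping below is an illustrative assumption.
NEEDS = {"IF": {"memory"}, "ID": {"registers"}, "IE": {"alu"}, "RW": {"memory", "registers"}}
STAGES = ["IF", "ID", "IE", "RW"]

def conflicts(n_instructions):
    """Return (cycle, resource) pairs where more than one instruction competes."""
    usage = {}                                    # cycle -> {resource: count}
    for i in range(n_instructions):
        for s, stage in enumerate(STAGES):
            cycle = i + s + 1                     # instruction i enters IF in cycle i+1
            for res in NEEDS[stage]:
                usage.setdefault(cycle, {}).setdefault(res, 0)
                usage[cycle][res] += 1
    return sorted((c, r) for c, by_res in usage.items() for r, n in by_res.items() if n > 1)

print(conflicts(4))   # with one shared memory, the IF of the 4th instruction collides
                      # with the result write of the 1st in the same cycle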

Data Hazards
Data hazards occur when an instruction's execution depends on the result of a previous instruction that is still being processed in the pipeline; in other words, an instruction depends on data from an instruction ahead of it in the pipe. Consider, as an example, an ADD that writes a register followed immediately by a SUB that reads that register (the same pattern as the RAW example later in this unit).

Solution 1: Introduce three bubbles at the SUB instruction's IF stage. This allows SUB's ID stage to function at t6. Subsequently, all the following instructions in the pipe are delayed as well.

Solution 2: Data forwarding – Forwarding passes a result directly to the functional unit that requires it: the result is forwarded from the output of one unit to the input of another. The purpose is to make the result available to the next instruction early.
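A minimal Python sketch of the idea follows (the register values and the two-instruction program are assumptions made for the illustration, and the single-entry bypass models only the adjacent-instruction case): the result produced by ADD is handed to the following SUB through a bypass rather than through the register file.

# Minimal sketch of result forwarding: the previous instruction's ALU output
# is bypassed to the next instruction's ALU input instead of waiting for
# the register write-back.
regs = {"R0": 0, "R1": 7, "R2": 5, "R3": 20, "R4": 0}

def execute(program):
    bypass = {}                               # register -> value still in the pipeline
    for op, dst, src1, src2 in program:
        a = bypass.get(src1, regs[src1])      # take the forwarded value if one exists
        b = bypass.get(src2, regs[src2])
        result = a + b if op == "ADD" else a - b
        bypass = {dst: result}                # forward this result to the next instruction
        regs[dst] = result                    # write-back (happens later in real hardware)

execute([("ADD", "R0", "R1", "R2"),           # R0 = R1 + R2 = 12
         ("SUB", "R4", "R3", "R0")])          # R4 = R3 - R0, using the forwarded R0
print(regs["R4"])                             # 20 - 12 = 8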
Solution 3: The compiler can detect the data dependency and reorder (resequence) the instructions suitably while generating the executable code. This eases the burden on the hardware.

Solution 4: In the event that such reordering is infeasible, the compiler may detect the dependency and introduce NOP (no-operation) instruction(s). A NOP is a dummy instruction, equivalent to a bubble, introduced by the software.

The compiler looks into data dependencies in the code-optimisation stage of the compilation process.
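A minimal Python sketch of Solution 4 (the instruction encoding and the three-cycle delay, chosen to match the three bubbles of Solution 1, are assumptions for the illustration): the "compiler" scans the instruction list and pads a dependent instruction with enough NOPs to cover the delay.

# Minimal sketch: insert NOPs when an instruction reads a register written by
# an instruction issued fewer than DELAY positions earlier.  Real compilers
# try to reorder useful instructions first and fall back to NOPs.
DELAY = 3                                 # assumed result latency, matching Solution 1

def insert_nops(program):                 # program: list of (op, dst, src1, src2)
    out = []
    for op, dst, s1, s2 in program:
        gap = 0
        for prev in reversed(out):
            gap += 1
            if prev[0] != "NOP" and prev[1] in (s1, s2) and gap <= DELAY:
                out.extend([("NOP", None, None, None)] * (DELAY - gap + 1))
                break
        out.append((op, dst, s1, s2))
    return out

code = [("ADD", "R0", "R1", "R2"), ("SUB", "R4", "R3", "R0")]
print([ins[0] for ins in insert_nops(code)])   # ['ADD', 'NOP', 'NOP', 'NOP', 'SUB']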

Data Hazards classification

Data hazards are classified into three categories based on the order of READ and WRITE operations on the register, as follows:

1. RAW (Read after Write) [Flow/True data dependency]

Given two instructions I and J, where I comes before J, instruction J should read an operand only after it has been written by I. This is called a true data dependence in compiler terminology. It is the case where an instruction uses data produced by a previous one. Example:

ADD R0, R1, R2
SUB R4, R3, R0

2. WAR (Write after Read) [Anti data dependency]

This is the case where the second instruction writes to a register before the first instruction reads it. This is rare in a simple pipeline structure; however, in some machines with complex and special instructions, WAR can happen.

ADD R2, R1, R0
SUB R0, R3, R4

3. WAW (Write after Write) [Output data dependency]

This is the case where two instructions executing in parallel write to the same register and must do so in the order in which they were issued.

ADD R0, R1, R2
SUB R0, R4, R5

WAW and WAR hazards can occur only when instructions are executed in parallel or out of order. They arise because the compiler has allotted the same register numbers even though this is avoidable. The situation is fixed either by the compiler renaming one of the registers or by delaying the update of the register until the appropriate value has been produced. Modern CPUs have incorporated not only parallel execution with multiple ALUs but also out-of-order issue and execution of instructions, along with pipelines of many stages.
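The three cases can be checked mechanically. The following Python sketch (the tuple encoding of instructions is an assumption for the illustration) applies the definitions above to the three register examples just given.

# Minimal sketch: classify the hazard between two instructions I and J,
# where I is issued before J.  Each instruction is (op, destination, src1, src2).
def classify(i, j):
    hazards = []
    if i[1] in (j[2], j[3]):
        hazards.append("RAW")        # J reads what I writes  (true dependence)
    if j[1] in (i[2], i[3]):
        hazards.append("WAR")        # J writes what I reads  (anti-dependence)
    if j[1] == i[1]:
        hazards.append("WAW")        # J writes what I writes (output dependence)
    return hazards or ["none"]

print(classify(("ADD", "R0", "R1", "R2"), ("SUB", "R4", "R3", "R0")))  # ['RAW']
print(classify(("ADD", "R2", "R1", "R0"), ("SUB", "R0", "R3", "R4")))  # ['WAR']
print(classify(("ADD", "R0", "R1", "R2"), ("SUB", "R0", "R4", "R5")))  # ['WAW']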

Control Hazards
Control hazards, also called branch hazards, are caused by branch instructions. Branch instructions control the flow of program execution. Recall that we use conditional statements in a higher-level language either for iterative loops or for condition checking (correlate with for, while, if and case statements). These are transformed into one of the variants of BRANCH instructions. The value of the condition being checked must be known before the program flow is known. Life gets complicated for you; so it does for the CPU!

Thus a control hazard occurs when the decision to execute an instruction depends on the result of another instruction, such as a conditional branch, which checks the condition's resulting value.

The branch and jump instructions decide the program flow by loading the appropriate location into the Program Counter (PC). The PC holds the address of the next instruction to be fetched and executed by the CPU. Consider what happens when a conditional branch enters the pipe: until the condition is resolved, the CPU does not know which address belongs in the PC.
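A minimal Python sketch of the PC update (the 4-byte instruction size and the parameter names are assumptions for the illustration): the two candidate values below are what the fetch stage has to choose between, and the correct choice is unknown until the condition is evaluated.

# Minimal sketch: what goes into the PC after a (possibly conditional) branch.
def next_pc(pc, is_branch, condition_true, target):
    if is_branch and condition_true:
        return target                 # branch taken: the PC gets the branch target
    return pc + 4                     # otherwise: fall through to the next instruction

print(hex(next_pc(0x100, is_branch=True, condition_true=True,  target=0x200)))  # 0x200
print(hex(next_pc(0x100, is_branch=True, condition_true=False, target=0x200)))  # 0x104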

Solutions for Conditional Hazards


1. Stall the pipeline as soon as any kind of branch instruction is decoded – simply do not allow any further IF. As always, stalling reduces throughput. Statistics say that in a program at least 30% of the instructions are branches, so with stalling the pipeline essentially operates at around 50% capacity.
2. Prediction – Imagine a for or while loop executing 100 times. We know that for 100 iterations the program flows on without the branch condition being met; only on the 101st time does the program come out of the loop. So it is wiser to let the pipeline proceed and undo/flush only when the branch condition is met. This does not hurt the throughput of the pipeline as much as stalling.
3. Dynamic Branch Prediction – A history record is maintained with the help of a Branch Target Buffer (BTB). The BTB is a kind of cache with a set of entries, each holding the PC address of a branch instruction and the corresponding effective branch (target) address. An entry is maintained for every branch instruction encountered. Whenever a conditional branch instruction is encountered, a lookup of the BTB is made for a matching branch instruction address. On a hit, the corresponding target address is used for fetching the next instruction. This is called dynamic branch prediction (a minimal sketch appears after this list).

Figure 16.6
Branch Target Buffer

This method is successful to the extent of the temporal locality of reference in the program. When the prediction fails, flushing has to take place.

4. Reordering instructions – Delayed branch, i.e. reordering the instructions so that the branch instruction sits later in the order, with safe and useful instructions that are not affected by the result of the branch brought in earlier in the sequence, thus delaying the branch instruction fetch. If no such instructions are available, a NOP is introduced. This delayed branch is applied with the help of the compiler.
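For solution 3, here is a minimal Python sketch of the BTB idea (a dictionary keyed by the branch instruction's PC; the 4-byte instruction size and the addresses are assumptions, and a real BTB is a small hardware cache, usually with extra prediction bits): on a hit the stored target is used for the next fetch, and a wrong guess costs a flush, which is why the scheme works only to the extent that a branch behaves as it did on earlier encounters.

# Minimal sketch of a branch target buffer.
btb = {}                              # branch PC -> last observed target address

def next_fetch_pc(pc):
    """Predict the next PC: the BTB target on a hit, the fall-through otherwise."""
    return btb.get(pc, pc + 4)

def on_branch_resolved(pc, taken, target):
    """Update the BTB once the outcome is known; flush on a misprediction."""
    predicted = next_fetch_pc(pc)
    actual = target if taken else pc + 4
    if taken:
        btb[pc] = target              # remember the target for the next encounter
    else:
        btb.pop(pc, None)
    if predicted != actual:
        print(f"misprediction at {pc:#x}: flush and refetch from {actual:#x}")

# A loop-closing branch at 0x100 that is taken twice and then falls through:
for taken in (True, True, False):
    on_branch_resolved(0x100, taken, target=0x080)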

------------------------------------------------------------------------------------

Instruction Level Parallelism:-

Instruction Level Parallelism (ILP) refers to architectures in which multiple operations can be performed in parallel within a particular process, with its own set of resources – address space, registers, identifiers, state and program counter. It covers the compiler-design techniques and processor designs that execute operations such as memory load and store, integer addition and floating-point multiplication in parallel to improve processor performance. Examples of architectures that exploit ILP are VLIW and superscalar architectures.
ILP processors have much the same execution hardware as RISC processors, whereas machines without ILP rely on complex hardware that is hard to implement. A typical ILP processor allows multiple-cycle operations to be pipelined.

Architecture:

Instruction-level parallelism is achieved when multiple operations are performed in a single cycle, either by executing them simultaneously or by utilising the gaps between two successive operations that arise from their latencies.
The decision of when to execute an operation now depends largely on the compiler rather than the hardware. However, the extent of the compiler's control depends on the type of ILP architecture, since the amount of information about parallelism that the compiler conveys to the hardware through the program varies. ILP architectures can be classified in the following ways –
1. Sequential Architecture:
Here, the program is not expected to convey any explicit information regarding parallelism to the hardware, as in a superscalar architecture.
2. Dependence Architectures:
Here, the program explicitly states the dependencies between operations, as in a dataflow architecture.
3. Independence Architecture:
Here, the program states which operations are independent of one another, so that they can be executed in place of the 'nop's.
In order to apply ILP, the compiler and hardware must determine the data dependencies, identify the independent operations, and handle the scheduling of these operations, the assignment of functional units, and the registers used to store data.
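A minimal Python sketch of an independence-style decision (the operation encoding, the two-slot issue width and the greedy in-order grouping are illustrative assumptions, not a real VLIW scheduler): mutually independent operations are packed into the same issue group, and a dependency or a full group forces the start of a new one.

# Minimal sketch: pack mutually independent operations into the same issue group.
# An operation is (name, destination, set_of_source_registers).
def independent(a, b):
    return a[1] not in b[2] and b[1] not in a[2] and a[1] != b[1]

def schedule(ops, width=2):               # 'width' issue slots per cycle (assumed)
    groups = [[]]
    for op in ops:
        group = groups[-1]
        if len(group) < width and all(independent(op, other) for other in group):
            group.append(op)              # issue alongside the current group
        else:
            groups.append([op])           # dependency or full group: new cycle
    return groups

ops = [("load",  "r1", {"r0"}),
       ("add",   "r2", {"r1", "r3"}),     # depends on the load
       ("fmul",  "r4", {"r5", "r6"}),     # independent of both
       ("store", None, {"r2", "r7"})]     # depends on the add
for cycle, group in enumerate(schedule(ops), 1):
    print(cycle, [name for name, _, _ in group])   # 1 ['load']  2 ['add', 'fmul']  3 ['store']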

-----------------------------------------------------------------------------------------------------------------
What is the indirect instruction cycle? Explain the data flow in it.

• The execution of an instruction may involve one or more operands in memory, each of which requires a memory access.
• Further, if indirect addressing is used, then additional memory accesses are required.
• We can think of the fetching of indirect addresses as one more instruction stage.
• The main line of activity consists of alternating instruction fetch and instruction execution activities.
• After an instruction is fetched, it is examined to determine whether any indirect addressing is involved.
• If so, the required operands are fetched using indirect addressing.
• Following execution, an interrupt may be processed before the next instruction fetch.

DATA Flow:-
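The data flow is a sequence of register transfers, sketched below in minimal Python based on the bullet points above (MAR, MBR, IR, the accumulator, the tiny memory image and the ADD_INDIRECT encoding are all illustrative assumptions): the extra step in the indirect cycle is the additional memory access that turns the instruction's address field into the operand's real address.

# Minimal sketch of the data flow for one instruction with indirect addressing.
memory = {0x10: ("ADD_INDIRECT", 0x20),   # instruction: its address field points to 0x20
          0x20: 0x30,                     # that location holds the operand's real address
          0x30: 42}                       # the operand itself
acc, pc = 5, 0x10

# Fetch: PC -> MAR, memory read -> MBR -> IR, PC incremented
mar = pc; mbr = memory[mar]; ir = mbr; pc += 1
opcode, address = ir

# Indirect cycle: one extra memory access to resolve the operand's address
if "INDIRECT" in opcode:
    mar = address
    address = memory[mar]

# Execute: fetch the operand and perform the operation
mar = address
acc += memory[mar]
print(acc)                                # 5 + 42 = 47
# (If an interrupt were pending, it would be serviced here, before the next fetch.)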
