CS 3351 Digital Principles and Computer Organization

UNIT IV PROCESSOR

Uploaded by

Dr.Kalaivazhi

UNIT 4 PROCESSOR

Instruction Execution – Building a Data Path – Designing a Control Unit – Hardwired Control,
Microprogrammed Control – Pipelining – Data Hazard – Control Hazards.

4.1 Instruction Cycle

Instructions are part of a program stored in memory. To execute an instruction, the processor first
fetches it from memory, then decodes it, and then executes it. This whole process is known as the
instruction cycle.

In the basic computer, each instruction cycle consists of the following steps −

• Fetch the instruction from memory.
• Decode the instruction.
• Read the effective address from memory if the instruction uses an indirect address.
• Execute the instruction.

1. After these four steps are done, control returns to the first step and the process repeats for the
next instruction.
2. The cycle therefore continues until a Halt condition is met.
3. The figure shows the phases of the instruction cycle.

Figure 4.1 Instruction Cycle

Fetch Cycle

• The address of the instruction to be executed is held in the program counter (PC).
• The processor fetches the instruction from the memory location pointed to by the PC.
• Next, the PC is incremented to point to the next instruction.
• The fetched instruction is loaded into the instruction register (IR).
• The processor decodes the instruction and performs the required operations.
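
The fetch, decode and execute steps can be sketched as a small loop. This is only an illustrative toy, not a real instruction set: the opcode names (LOAD, ADD, STORE, HALT), the single accumulator, and the memory layout are all assumptions made for the example.

```python
# Toy model of the instruction cycle. The opcodes, the accumulator and the
# memory layout are illustrative assumptions, not a real instruction set.
def run(memory):
    pc = 0    # program counter: address of the next instruction
    acc = 0   # a single accumulator register (assumed for this sketch)
    while True:
        ir = memory[pc]        # fetch: load the instruction register from memory[PC]
        pc += 1                # increment the PC to point to the next instruction
        opcode, operand = ir   # decode the instruction held in the IR
        if opcode == "LOAD":   # execute
            acc = memory[operand]
        elif opcode == "ADD":
            acc += memory[operand]
        elif opcode == "STORE":
            memory[operand] = acc
        elif opcode == "HALT":
            return acc, pc
        else:
            raise ValueError(f"unknown opcode {opcode!r}")

# Addresses 0-3 hold instructions, addresses 10-12 hold data.
program = {
    0: ("LOAD", 10),
    1: ("ADD", 11),
    2: ("STORE", 12),
    3: ("HALT", None),
    10: 7, 11: 5, 12: 0,
}
result, final_pc = run(program)
```

The loop keeps cycling through fetch, decode and execute until the HALT instruction is reached, which mirrors the Halt condition described above.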

Dr.V.Kalaivaazhi B.E.,M.Tech.,Ph.D Page 1

Execute Cycle

During execution, data is transferred in one of two ways −

Processor-memory − Data is transferred from the processor to memory or from memory to the processor.
Processor-Input/Output − Data is transferred between the processor and a peripheral (I/O) device.

• Together, these two kinds of transfer complete the execute cycle.

Instruction cycle state transition diagram

Figure 4.2 State transition Diagram for Instruction Cycle

Instruction execution:

Instruction execution involves the following steps:

• The PC (program counter) register supplies the address of the instruction to be fetched from memory.
• Once the instruction is fetched, its opcode is decoded.
• On decoding, the processor identifies the number of operands. If any operand has to be fetched from
memory, its address is calculated.
• Operands are fetched from memory. If there is more than one operand, the address calculation and
operand fetch are repeated.
• The data operation is then performed on the operands, and a result is generated.
• If the result has to be stored in a register, the instruction ends here.
• If the destination is memory, the destination address is calculated first, and the result is then
stored in memory. If multiple results need to be stored, this process (destination address calculation
and store) is repeated.
• The current instruction has now been executed. Meanwhile, the PC is incremented to give the address
of the next instruction.
• The instruction cycle then repeats for subsequent instructions.

4.2 Building a Data Path

Data path element:
• A unit used to operate on or hold data within a processor.
• In the MIPS implementation, the datapath elements include the instruction and data memories,
the register file, the ALU, and adders.

1. Figure 4.3a shows the first element needed: a memory unit that stores the instructions of a program
and supplies an instruction given an address.
• The instruction memory need only provide read access because the data path does not write
instructions.
• Because it is only read, the instruction memory is treated as combinational logic:
• its output at any time reflects the contents of the location specified by the address input,
• so no read control signal is needed.
2. Figure 4.3b shows the program counter (PC), a register that holds the address of the current instruction.
• The program counter is a 32-bit register that is written at the end of every clock cycle,
• so it does not need a write control signal.
3. Figure 4.3c shows the adder needed to increment the PC to the address of the next instruction.
• The adder is an ALU wired to always add its two 32-bit inputs and place the sum on its output.

Figure 4.3 Elements needed for data path design
Figure 4.4 Combination of three elements

4. Figure 4.4 shows how to combine the three elements from Figure 4.3 to form a data path that fetches
instructions and increments the PC to obtain the address of the next sequential instruction.

R-format instructions
1. R-type (arithmetic-logical) instructions all read two registers, perform an ALU operation on the
contents of the registers, and write the result to a register.
2. This instruction class includes add, sub, AND, OR, and slt.
3. The processor’s 32 general-purpose registers are stored in a structure called a register file.
• It is a collection of registers in which any register can be read or written by specifying the number of
the register in the file.
• The register file holds the register state of the computer; an ALU is also needed to operate on the
values read from the registers.
4. R-format instructions have three register operands, so each instruction must
a. Read two data words from the register file. For each word to be read, this requires
• an input to the register file that specifies the register number to be read, and
• an output from the register file that carries the value read from that register.
b. Write one data word into the register file. This requires
• two inputs: one to specify the register number to be written and one to supply the data to be
written into the register.
• The register file always outputs the contents of whatever register numbers are on the Read register inputs.
5. Writes are controlled by the write control signal, which must be asserted for a write to occur at the clock
edge.
6. Figure 4.5a shows the register file needed for R-format instructions: a total of four inputs (three for
register numbers and one for data) and two outputs (both for data).
7. The register number inputs are 5 bits wide to specify one of 32 registers (32 = 2^5), whereas the data input
and two data output buses are each 32 bits wide.
8. Figure 4.5b shows the ALU, which takes two 32-bit inputs and produces a 32-bit result, as well as a 1-bit
signal that is asserted if the result is 0.
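
The register file and ALU described above can be modelled in a few lines. This is a minimal sketch under stated assumptions: the class and function names are hypothetical, only four ALU operations are shown, and register 0 is treated as hard-wired to zero as in MIPS.

```python
MASK32 = 0xFFFFFFFF  # keep every value within 32 bits

class RegisterFile:
    """32 registers with two read ports and one write port (names are illustrative)."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, read_reg1, read_reg2):
        # Reads need no control signal: the outputs always follow the register numbers.
        return self.regs[read_reg1], self.regs[read_reg2]

    def write(self, write_reg, write_data, reg_write):
        # A write happens only when the write control signal is asserted.
        # Register 0 is assumed to be hard-wired to zero, as in MIPS.
        if reg_write and write_reg != 0:
            self.regs[write_reg] = write_data & MASK32

def alu(op, a, b):
    """Return (32-bit result, Zero flag) for a small subset of ALU operations."""
    if op == "add":
        result = (a + b) & MASK32
    elif op == "sub":
        result = (a - b) & MASK32
    elif op == "and":
        result = a & b
    elif op == "or":
        result = a | b
    else:
        raise ValueError(f"unsupported ALU operation {op!r}")
    return result, int(result == 0)

# add $10, $8, $9 with $8 = 5 and $9 = 7
rf = RegisterFile()
rf.write(8, 5, reg_write=True)
rf.write(9, 7, reg_write=True)
a, b = rf.read(8, 9)
total, zero = alu("add", a, b)
rf.write(10, total, reg_write=True)
```

The three 5-bit register numbers and one 32-bit data input correspond to the four inputs counted in item 6; the two values returned by `read` are the two data outputs.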

Figure 4.5 Elements of R-Format instruction
Figure 4.6 Elements of loads and stores
Implementation of loads and stores
1. The MIPS load word and store word instructions compute a memory address by adding the base register,
which is $t2 in the examples below, to the 16-bit signed offset field contained in the instruction.
lw $t1,offset_value($t2)
sw $t1,offset_value($t2)
2. A sign-extend unit is needed to extend the 16-bit offset field in the instruction to a 32-bit signed value,
along with a data memory unit to read from or write to, as shown in Figure 4.6.
3. Sign-extend: to increase the size of a data item by replicating the high-order sign bit of the original data
item in the high-order bits of the larger, destination data item.
4. The data memory must be written on store instructions; hence, it has read and write control signals, an
address input, and an input for the data to be written into memory.
5. Diagram explanation
• The memory unit is a state element with inputs for the address and the write data, and a single output
for the read result.
• There are separate read and write controls, although only one of these may be asserted on any given
clock.
• The memory unit needs a read signal because, unlike the register file, reading the value of an invalid
address can cause problems.
• The sign extension unit has a 16-bit input that is sign-extended into a 32-bit result appearing on the
output.
• The data memory is assumed to be edge-triggered for writes. Standard memory chips actually have a
write enable signal that is used for writes.
• Although the write enable is not edge-triggered, our edge-triggered design could easily be adapted to
work with real memory chips.
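
The address calculation for lw and sw can be illustrated as follows. This is a sketch only; the function names are hypothetical and the base and offset values in the example are made up.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    # If the sign bit (bit 15) is set, the value is negative.
    return value - 0x10000 if value & 0x8000 else value

def effective_address(base, offset16):
    """Base register contents plus the sign-extended 16-bit offset, as a 32-bit address."""
    return (base + sign_extend16(offset16)) & 0xFFFFFFFF

# lw $t1, 8($t2) and lw $t1, -4($t2) with $t2 = 0x1000 (example values)
addr_pos = effective_address(0x1000, 0x0008)
addr_neg = effective_address(0x1000, 0xFFFC)  # 0xFFFC encodes -4 in 16 bits
```

Masking the sign-extended value with 0xFFFFFFFF gives the 32-bit pattern the hardware would see: the upper 16 bits are copies of the original sign bit.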
Branch instructions
beq $t1,$t2,offset
1. The beq instruction has three operands:
• two registers that are compared for equality, and
• a 16-bit offset used to compute the branch target address relative to the branch instruction address.
2. Branch target address
• The address specified in a branch, which becomes the new program counter (PC) if the branch is
taken.
• In the MIPS architecture, the branch target is the sum of the offset field of the instruction and
the address of the instruction following the branch.
3. Definition of branch instructions
a. The ISA specifies that the base for the branch address calculation is the address of the instruction
following the branch.
b. It also states that the offset field is shifted left 2 bits so that it is a word offset; this shift increases the
effective range of the offset field by a factor of 4.
4. Branch taken
• A branch where the branch condition is satisfied and the program counter (PC) becomes the branch
target. All unconditional jumps are taken branches.
5. Branch not taken (untaken branch)
• A branch where the branch condition is false and the program counter (PC) becomes the address of
the instruction that sequentially follows the branch.
Thus, the branch datapath must do two operations:
i. Compute the branch target address.
ii. Compare the register contents.
6. Figure 4.7 shows the structure of the datapath segment that handles branches. The data path for a branch
uses
• the ALU, which evaluates the branch condition, and
• a separate adder, which computes the branch target as the sum of the incremented PC and the
sign-extended offset shifted left 2 bits.
7. Shift left 2 unit: simply routes the signals between input and output so that 00 (binary) is appended to the
low-order end of the sign-extended offset field; no actual shift hardware is needed, since the amount of the
“shift” is constant.
8. Since the offset was sign-extended from 16 bits, the shift throws away only “sign bits.”
9. Control logic: decides whether the incremented PC or the branch target should replace the PC, based on the
Zero output of the ALU.

Figure 4.7. The data path for a branch instruction
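
The two operations of the branch datapath (target computation and register comparison) can be sketched as below. The function names and the example addresses are assumptions for illustration; the ALU comparison is modelled as a subtraction whose Zero output selects the next PC.

```python
MASK32 = 0xFFFFFFFF

def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def next_pc_for_beq(pc, offset16, rs_value, rt_value):
    """Return the next PC for beq: branch target if the registers are equal, else PC + 4."""
    pc_plus_4 = (pc + 4) & MASK32
    # Separate adder: incremented PC + (sign-extended offset shifted left 2 bits).
    target = (pc_plus_4 + (sign_extend16(offset16) << 2)) & MASK32
    # Main ALU: subtract the operands; Zero is asserted when they are equal.
    zero = ((rs_value - rt_value) & MASK32) == 0
    # The multiplexor picks the target when the branch is taken.
    return target if zero else pc_plus_4

taken = next_pc_for_beq(0x40, 3, rs_value=5, rt_value=5)
not_taken = next_pc_for_beq(0x40, 3, rs_value=5, rt_value=6)
backward = next_pc_for_beq(0x40, 0xFFFF, rs_value=1, rt_value=1)  # offset of -1 word
```

With an offset of 3 words the taken branch lands 12 bytes past PC + 4, which shows the factor-of-4 range increase from the left shift.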

Creating a Single Data path
1. The simplest data path, shown in Figure 4.8, attempts to execute all instructions in one clock cycle.
2. No data path resource can be used more than once per instruction, so any element needed more than
once must be duplicated.
3. Therefore a separate memory is needed for instructions and data. Although some of the functional units
need to be duplicated, many of the elements can be shared by different instruction flows.

Figure 4.8. MIPS architecture data path for different instruction classes.
4. To share a datapath element between two different instruction classes,
• a multiplexor is used to allow multiple connections to the input of an element, and
• a control signal is used to select one among the multiple inputs.
5. The branch instruction uses the main ALU to compare the register operands, so a separate adder is used
for computing the branch target address.
6. An additional multiplexor is required to select either the sequentially following instruction address (PC +
4) or the branch target address to be written into the PC.
7. The control unit must be able to take inputs and generate a write signal for each state element, the
selector control for each multiplexor, and the ALU control.

4.3 Designing a Control Unit:

Control Unit:
• It is the part of the computer’s central processing unit (CPU) that directs the operation of
the processor.
• It was included as part of the Von Neumann Architecture by John von Neumann.
• It is the responsibility of the control unit to tell the computer’s memory, arithmetic/logic unit,
and input and output devices how to respond to the instructions that have been sent to the
processor.
• It fetches the instructions of a program from main memory into the processor’s
instruction register and, based on the contents of this register, generates the control
signals that supervise the execution of these instructions.
• A control unit works by receiving input information, which it converts into control signals
that are then sent to the central processor.
• The computer’s processor then tells the attached hardware what operations to perform.
• The functions that a control unit performs depend on the type of CPU, because the
architecture of the CPU varies from manufacturer to manufacturer.
Examples of devices that require a CU are:
i. Central Processing Units (CPUs)
ii. Graphics Processing Units (GPUs)

Figure 4.9 Block Diagram Of Control Unit

Functions of the Control Unit –

1. It coordinates the sequence of data movements into, out of, and between a processor’s many sub-units.
2. It interprets instructions.
3. It controls data flow inside the processor.
4. It receives external instructions or commands, which it converts into a sequence of control signals.
5. It controls the many execution units (e.g. ALU, data buffers and registers) contained within a CPU.
6. It also handles multiple tasks, such as fetching, decoding, execution handling and storing results.

Types of Control Unit –

There are two types of control units:

i. Hardwired control unit
ii. Microprogrammed control unit

4.4 Hardwired Control Unit

In a hardwired control unit, the control signals needed for instruction execution are generated
by specially designed hardware logic circuits; the signal generation method cannot be modified
without physically changing the circuit structure.

The operation code of an instruction contains the basic data for control signal generation. The
operation code is decoded in the instruction decoder, which consists of a set of decoders that
decode different fields of the instruction opcode.

As a result, a few output lines going out from the instruction decoder take active signal values.
These output lines are connected to the inputs of the matrix that generates control signals for the
execution units of the computer.

This matrix implements logical combinations of the decoded signals from the instruction
opcode with the outputs from the matrix that generates signals representing consecutive control unit states
and with signals coming from outside the processor, e.g. interrupt signals.

The matrices are built in a similar way to programmable logic arrays.

Figure 4.10 Block Diagram Of a hardwired control unit

Control signals for an instruction have to be generated not at a single point in time but throughout the
entire time interval that corresponds to the instruction execution cycle.

Following the structure of this cycle, a suitable sequence of internal states is organized in the control
unit.

A number of signals generated by the control signal generator matrix are sent back to the inputs of the next
control state generator matrix.

This matrix combines these signals with the timing signals, which are generated by the timing unit from
the rectangular pulses usually supplied by a quartz generator. When a new instruction arrives at the
control unit, the control unit is in the initial state of new instruction fetching.

Instruction decoding allows the control unit to enter the first state relating to execution of the new
instruction, which lasts as long as the timing signals and other input signals, such as flags and state
information of the computer, remain unaltered.

A change in any of these signals causes a change of the control unit state.

As a result, a new input is generated for the control signal generator matrix. When an external signal
appears (e.g. an interrupt), the control unit enters the next control state concerned with the reaction
to this external signal (e.g. interrupt processing).

The values of flags and state variables of the computer are used to select suitable states for the instruction
execution cycle.

The last states in the cycle are control states that commence fetching the next instruction of the program:
sending the program counter content to the main memory address buffer register and then reading the
instruction word into the instruction register of the computer.

When the ongoing instruction is the stop instruction that ends program execution, the control unit enters
an operating system state, in which it waits for the next user directive.
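
The idea that hardwired control signals are fixed logical combinations of decoded opcode lines and timing (state) lines can be sketched as follows. Everything concrete here is an assumption for illustration: the two opcodes, the four timing steps T0 to T3, and the signal names are hypothetical and do not describe any particular machine.

```python
def control_signals(opcode, step):
    """Return the set of control signals active for a given opcode and timing step."""
    # Outputs of the instruction decoder: one active line per opcode.
    is_load = opcode == "LOAD"
    is_add = opcode == "ADD"
    uses_memory_operand = is_load or is_add
    # Outputs of the state/timing generator: one active line per step.
    t0, t1, t2, t3 = (step == n for n in range(4))
    # The "matrix": fixed AND/OR combinations of the decoded and timing lines.
    lines = {
        "MAR_from_PC": t0,                               # send PC to the address register
        "MEM_read": t1 or (t3 and uses_memory_operand),  # read instruction, then operand
        "IR_load": t1,                                   # latch the instruction word
        "PC_inc": t1,                                    # advance to the next instruction
        "MAR_from_IR": t2 and uses_memory_operand,       # operand address from the IR
        "AC_load": t3 and is_load,                       # LOAD: accumulator gets operand
        "ALU_add": t3 and is_add,                        # ADD: accumulator plus operand
    }
    return {name for name, active in lines.items() if active}

fetch_step = control_signals("ADD", 1)
execute_step = control_signals("ADD", 3)
```

Because each signal is a fixed Boolean expression, changing the behaviour of an instruction means changing the expressions themselves, which corresponds to the physical circuit change mentioned at the start of this section.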

4.5 Microprogrammed Control Unit

The fundamental difference between the structure of a microprogrammed control unit and that of
the hardwired control unit is the existence of a control store, which holds words containing the
encoded control signals needed for instruction execution.

In microprogrammed control units, subsequent instruction words are fetched into the instruction
register in the normal way. However, the operation code of each instruction is not directly decoded to
enable immediate control signal generation; instead, it supplies the initial address of a microprogram
contained in the control store.

With a single-level control store:

Here, the instruction opcode from the instruction register is sent to the control store address register.
Based on this address, the first microinstruction of the microprogram that interprets the execution of this
instruction is read into the microinstruction register.

The operation part of this microinstruction contains encoded control signals, normally as a few bit
fields. The fields are decoded in a set of microinstruction field decoders. The microinstruction also
contains the address of the next microinstruction of the given instruction microprogram and a control
field used to control the activities of the microinstruction address generator.

Figure 4.11 Block Diagram Of Microprogrammed control unit with a single level control store

The last-mentioned field determines the addressing mode (addressing operation) to be applied to the
address embedded in the ongoing microinstruction.

In microinstructions with a conditional addressing mode, this address is modified using the
processor condition flags that represent the status of computations in the current program.

The last microinstruction of a given microprogram is the one that fetches the next instruction
from main memory into the instruction register.
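
A single-level control store can be pictured as a table of microinstructions, each holding a set of control signals and the address of the next microinstruction. The sketch below is illustrative only: the control store addresses, the signal names and the two opcodes are hypothetical, and conditional addressing is left out.

```python
# Hypothetical fetch microinstruction shared as the last step of every microprogram.
FETCH = ("MAR_from_PC", "MEM_read", "IR_load", "PC_inc")

# Control store: address -> (control signals, address of next microinstruction).
# A next address of None marks the end of the microprogram.
CONTROL_STORE = {
    10: (("MAR_from_IR",), 11),          # LOAD microprogram
    11: (("MEM_read", "AC_load"), 12),
    12: (FETCH, None),                   # last step: fetch the next instruction
    20: (("MAR_from_IR",), 21),          # ADD microprogram
    21: (("MEM_read", "ALU_add"), 22),
    22: (FETCH, None),
}

# The opcode supplies the initial address of its microprogram.
START_ADDRESS = {"LOAD": 10, "ADD": 20}

def microprogram_trace(opcode):
    """Return the control signals issued, step by step, while interpreting one opcode."""
    address = START_ADDRESS[opcode]      # loaded into the control store address register
    steps = []
    while address is not None:
        signals, address = CONTROL_STORE[address]  # read into the microinstruction register
        steps.append(signals)
    return steps

load_steps = microprogram_trace("LOAD")
```

Changing what an instruction does only requires editing entries of `CONTROL_STORE`, which is the practical difference from the hardwired approach.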

With a two-level control store:

In a control unit with a two-level control store, a nano-instruction memory is included in addition to
the control memory for microinstructions.

In such a control unit, microinstructions do not contain encoded control signals.

The operation part of a microinstruction contains the address of the word in the nano-instruction memory
that contains the encoded control signals.

The nano-instruction memory contains all combinations of control signals that appear in the microprograms
that interpret the complete instruction set of a given computer, each written once in the form of a
nano-instruction.

Figure 4.12 Block Diagram Of Micro programmed control unit with a two- level control store

In this way, unnecessary storing of the same operation parts of microinstructions is avoided. The
microinstruction word can therefore be much shorter than with a single-level control store.

This gives a much smaller microinstruction memory in bits and, as a result, a much smaller control
memory overall.

The microinstruction memory contains the control for selecting consecutive microinstructions, while
the control signals themselves are generated from the nano-instructions.

In nano-instructions, control signals are frequently encoded using the 1 bit per signal method, which
eliminates decoding.
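
The size saving of a two-level control store can be estimated with simple arithmetic. The sketch below assumes one bit per control signal in both organisations and uses made-up figures (1024 microinstructions, 64 control signals, 10 next-address bits, 128 distinct control words); real control stores also carry fields that are ignored here.

```python
from math import ceil, log2

def single_level_bits(n_micro, n_signals, next_addr_bits):
    """Every microinstruction stores the full control word plus a next address."""
    return n_micro * (n_signals + next_addr_bits)

def two_level_bits(n_micro, n_distinct, n_signals, next_addr_bits):
    """Microinstructions store only a nano-address; each distinct control word is stored once."""
    nano_addr_bits = ceil(log2(n_distinct))
    micro_memory = n_micro * (nano_addr_bits + next_addr_bits)
    nano_memory = n_distinct * n_signals
    return micro_memory + nano_memory

# Illustrative figures only.
single = single_level_bits(n_micro=1024, n_signals=64, next_addr_bits=10)
double = two_level_bits(n_micro=1024, n_distinct=128, n_signals=64, next_addr_bits=10)
```

With these figures the single-level store needs 75,776 bits and the two-level store 25,600 bits, because each microinstruction shrinks from 74 bits to 17 bits while the 128 distinct control words are stored only once.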

4.6 Pipelining
Pipelining is an implementation technique in which multiple instructions are overlapped in execution,
much like an assembly line.

MIPS instructions classically take five steps:

1. Fetch instruction from memory.
2. Read registers while decoding the instruction. The regular format of MIPS instructions allows
reading and decoding to occur simultaneously.
3. Execute the operation or calculate an address.
4. Access an operand in data memory.
5. Write the result into a register.

Design goal: To balance the length of each pipeline stage. If the stages are perfectly balanced, then

$$\text{Time between instructions}_{\text{pipelined}} = \frac{\text{Time between instructions}_{\text{nonpipelined}}}{\text{Number of pipe stages}}$$

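
A small numeric check of this relation, using made-up timings (1000 ps per non-pipelined instruction, five balanced stages). The function names are hypothetical, and the total-time formula assumes an ideal pipeline with no stalls, in which the first instruction takes one cycle per stage and each later instruction completes one cycle after the previous one.

```python
def pipelined_interval(nonpipelined_time, stages):
    """Time between instructions in a perfectly balanced pipeline."""
    return nonpipelined_time / stages

def total_pipelined_time(n_instructions, stages, cycle_time):
    """Total time for n instructions in an ideal pipeline with no stalls."""
    return (stages + n_instructions - 1) * cycle_time

cycle = pipelined_interval(1000, 5)                 # picoseconds, illustrative value
three_instructions = total_pipelined_time(3, 5, cycle)
million = total_pipelined_time(1_000_000, 5, cycle)
speedup = (1_000_000 * 1000) / million              # versus 1000 ps per instruction
```

For a long instruction stream the speedup approaches the number of stages; for three instructions it is much smaller, because the time to fill the pipeline dominates.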
Designing Instruction Sets for Pipelining

1. All MIPS instructions are the same length. This makes it easier to fetch instructions in the first
pipeline stage and to decode them in the second stage.
2. MIPS has only a few instruction formats, with the source register fields located in the same
place in each instruction.
• Because of this symmetry, the second stage can begin reading the register file at the same time
that the hardware is determining what type of instruction was fetched.
• If MIPS instruction formats were not symmetric, stage 2 would have to be split, resulting in six
pipeline stages.
3. Memory operands appear only in loads or stores in MIPS.
• The execute stage can therefore be used to calculate the memory address, with memory accessed in
the following stage.
4. Operands must be aligned in memory, so the requested data can be transferred between processor
and memory in a single pipeline stage.

Figure 4.13 Single-cycle, non - pipelined execution in top versus pipelined execution in bottom.

4.7 Hazards

A situation in pipelining in which the next instruction cannot execute in the following clock cycle is
called a hazard. There are three types of hazards:
1. Structural hazards
2. Data hazards
3. Control hazards

1. Structural hazard

i. A structural hazard occurs when a planned instruction cannot execute in the proper clock cycle because
the hardware does not support the combination of instructions that are set to execute.
ii. The MIPS instruction set was designed so that structural hazards are easy to avoid.
• Suppose the pipeline in Figure 4.13 had a fourth instruction: in the same clock cycle, the first
instruction would be accessing data from memory while the fourth instruction would be fetching an
instruction from that same memory.
• Without two memories, the pipeline could have a structural hazard.

2. Data Hazards

i. A data hazard occurs when a planned instruction cannot execute in the proper clock cycle because data
needed to execute the instruction is not yet available.
ii. It occurs when the pipeline must be stalled because one step must wait for another to complete.
iii. Data hazards arise from the dependence of one instruction on an earlier one that is still in the pipeline.

Example: the sub instruction below uses the sum in $s0 produced by the add.
add $s0, $t0, $t1
sub $t2, $s0, $t3
Forwarding or bypassing:
• Adding extra hardware to retrieve the missing item early from the internal resources.
• A method of resolving a data hazard by retrieving the missing data element from internal buffers
rather than waiting for it to arrive from programmer-visible registers or memory.
Load-use data hazard:
• A specific form of data hazard in which the data being loaded by a load instruction has not yet
become available when it is needed by another instruction.
Pipeline stall or bubble:
• A stall initiated in order to resolve a hazard.
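
The difference between a dependence that forwarding can cover and a load-use hazard that needs a stall can be sketched as below. The sketch assumes a classic five-stage pipeline with forwarding, in which only a load immediately followed by a consumer of its result forces a bubble; the instruction tuples and the helper name are hypothetical.

```python
def load_use_stalls(program):
    """Return the indices of instructions that must stall behind a preceding load.

    Each instruction is a tuple (op, dest, sources). Forwarding is assumed to
    handle every other dependence between adjacent instructions.
    """
    stalls = []
    for i in range(1, len(program)):
        prev_op, prev_dest, _ = program[i - 1]
        _, _, sources = program[i]
        # The loaded value is not available until after the load's MEM stage,
        # so an immediately following consumer has to wait one cycle.
        if prev_op == "lw" and prev_dest in sources:
            stalls.append(i)
    return stalls

# add followed by a dependent sub: forwarding suffices, no stall.
alu_pair = [("add", "$s0", ("$t0", "$t1")),
            ("sub", "$t2", ("$s0", "$t3"))]
# lw followed by a dependent sub: load-use hazard, one bubble.
load_pair = [("lw", "$s0", ("$t1",)),
             ("sub", "$t2", ("$s0", "$t3"))]

alu_stalls = load_use_stalls(alu_pair)
load_stalls = load_use_stalls(load_pair)
```

Here `alu_stalls` is empty while `load_stalls` flags the sub instruction, which is where a bubble would be inserted.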
3. Control hazards (branch hazards)
i. A control hazard occurs when the proper instruction cannot execute in the proper pipeline clock cycle
because the instruction that was fetched is not the one that is needed.
ii. The flow of instruction addresses is not what the pipeline expected.
iii. Computers use prediction to handle branches.
Branch prediction
i. A method of resolving a branch hazard that assumes a given outcome for the branch and proceeds
from that assumption rather than waiting to ascertain the actual outcome.
• One simple approach is to always predict that branches will be untaken. When the prediction is
right, the pipeline proceeds at full speed.
• Only when branches are taken does the pipeline stall.
ii. Dynamic hardware predictors make their guesses depending on the behavior of each branch and may
change predictions for a branch over the life of a program.
iii. One approach keeps a history of each branch as taken or untaken and then uses the recent past
behavior to predict the future.
iv. When the guess is wrong, the pipeline control must ensure that the instructions following the wrongly
guessed branch have no effect and must restart the pipeline from the proper branch address.
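
The two prediction schemes mentioned above can be compared on a small trace. This is a sketch under assumptions: the trace is a made-up loop branch that is taken nine times and then falls through, and the dynamic predictor is the simplest possible one, which predicts that a branch will do whatever it did last time.

```python
def mispredictions_always_untaken(outcomes):
    """Static scheme: every taken branch is a misprediction."""
    return sum(1 for taken in outcomes if taken)

def mispredictions_last_outcome(outcomes, initial_prediction=False):
    """Dynamic scheme with one bit of history per branch: predict the previous outcome."""
    prediction = initial_prediction
    misses = 0
    for taken in outcomes:
        if taken != prediction:
            misses += 1
        prediction = taken   # remember the most recent behavior of this branch
    return misses

# A loop-closing branch: taken 9 times, then not taken on loop exit.
loop_branch = [True] * 9 + [False]
static_misses = mispredictions_always_untaken(loop_branch)
dynamic_misses = mispredictions_last_outcome(loop_branch)
```

On this trace the static scheme mispredicts nine times, while the history-based scheme mispredicts only on the first iteration and on loop exit.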
Advantages of pipelining
• Pipelining increases the number of simultaneously executing instructions and the rate at which
instructions are started and completed.
• Pipelining does not reduce the time it takes to complete an individual instruction, called the latency.
• Pipelining improves instruction throughput rather than individual instruction execution time or
latency.
• Latency (pipeline):
o The number of stages in a pipeline or the number of stages between two instructions during
execution.
4.8 Pipelined data path and control
In a five-stage pipeline, instruction execution is divided into five stages, so up to five instructions
are in execution during any single clock cycle.
1. IF: Instruction fetch
2. ID: Instruction decode and register file read
3. EX: Execution or address calculation
4. MEM: Data memory access
5. WB: Write back

Exceptions to the left-to-right flow of instructions:

• The write-back stage, which places the result back into the register file in the middle of the
data path.
• The selection of the next value of the PC, choosing between the incremented PC and the
branch address from the MEM stage.

Data flowing from right to left does not affect the current instruction; these reverse data movements
influence only later instructions in the pipeline.
• The first right-to-left flow of data can cause data hazards.
• The second can cause control hazards.
Figure 4.14 shows the pipelined version of the data path, with pipeline registers that hold information
produced in the previous cycle.

Figure 4.14 The pipelined version of the data path with registers.

Pipelined data path for Load and Store instructions

In the pipeline figures:
• The right half of a register or memory is highlighted when it is read.
• The left half of a register or memory is highlighted when it is written.

Stage-by-stage operation of the load and store instructions:

Instruction fetch
  Load instruction:
  • The instruction is read from memory using the address in the PC and placed in the IF/ID pipeline register.
  • The PC address is incremented by 4 and written back into the PC to be ready for the next clock cycle.
  Store instruction:
  • The instruction is read from memory using the address in the PC and placed in the IF/ID pipeline register.

Instruction decode and register file read
  Load instruction:
  • The IF/ID pipeline register supplies the 16-bit immediate field, which is sign-extended to 32 bits, and the
  register numbers to read the two registers.
  • All three values are stored in the ID/EX pipeline register, along with the incremented PC address.
  Store instruction:
  • The IF/ID pipeline register supplies the register numbers for reading two registers, and the 16-bit
  immediate is sign-extended.
  • These three 32-bit values are all stored in the ID/EX pipeline register.

Execute or address calculation
  Load instruction:
  • The load instruction reads the contents of register 1 and the sign-extended immediate from the ID/EX
  pipeline register and adds them using the ALU.
  • That sum is placed in the EX/MEM pipeline register.
  Store instruction:
  • The effective address is placed in the EX/MEM pipeline register.

Memory access
  Load instruction:
  • The address from the EX/MEM pipeline register is used to read the data memory.
  • The data is loaded into the MEM/WB pipeline register.
  Store instruction:
  • The data to be stored is written to the data memory at the effective address. It was placed into the
  EX/MEM pipeline register in the EX stage to make it available during the MEM stage.
  • The register containing the data to be stored was read in an earlier stage and stored in ID/EX.

Write-back
  Load instruction:
  • The data is read from the MEM/WB pipeline register and written into the register file.
  Store instruction:
  • Nothing happens in the write-back stage of a store instruction.
  • Since every instruction behind the store is already in progress, there is no way to speed those
  instructions up, so the store still passes through this stage.
Load and store illustrate a key point:
1. Each logical component of the data path can be used only within a single pipeline stage; otherwise, a
structural hazard occurs.
2. The logical components of the data path are
• instruction memory,
• register read ports,
• ALU,
• data memory, and
• register write port.
Pipelined Control
It is useful to group the control signals by the pipeline stage in which they are used, as shown in Figure 4.15:
1. Instruction fetch: no control signals, because the same thing happens every time.
2. Instruction decode/register file read: no control signals, because the same thing happens every time.
3. Execution/address calculation:
• RegDst, ALUOp, and ALUSrc select the write register, the ALU operation, and either
read data 2 or the sign-extended 16-bit immediate as the second ALU input.
4. Memory access:
• Branch, MemRead, and MemWrite select whether the branch address will be written
to the PC, memory will be read, or memory will be written.
5. Write-back:
• MemToReg and RegWrite select the source of the value to be written (ALU or memory) and whether
or not the register is written.
After the instruction decode in stage 2, control information is passed along via the pipeline registers.
Explanation of Figure 4.15
i. Four of the nine control lines are used in the EX stage.
ii. The remaining five control lines are passed on to the EX/MEM pipeline register, which is extended to
hold them.
iii. Three are used during the MEM stage.
iv. The last two are passed to MEM/WB for use in the WB stage.

Figure 4.15 The control lines for the final three stages

The control information follows the instruction with which it’s associated.

Figure 4.16 The pipelined data path with the control signals

Explanation of Figure 4.16
i. The control values for the last three stages are created during the instruction decode stage and then
placed in the ID/EX pipeline register.
ii. Each pipeline stage uses its own control lines, and the remaining control lines are passed on to the
next pipeline stage.
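
The way the nine control lines travel with an instruction can be sketched for a load word instruction. The grouping follows the text above; the specific 0/1 values are the usual textbook settings for lw and should be read as an assumption here, as should the function name and the split of the 2-bit ALUOp into ALUOp1 and ALUOp0.

```python
# Control lines for lw, grouped by the stage that uses them (values assumed).
LW_CONTROL = {
    "EX": {"RegDst": 0, "ALUOp1": 0, "ALUOp0": 0, "ALUSrc": 1},
    "MEM": {"Branch": 0, "MemRead": 1, "MemWrite": 0},
    "WB": {"RegWrite": 1, "MemToReg": 1},
}

def pipeline_register_contents(control):
    """Return the control lines held in ID/EX, EX/MEM and MEM/WB for one instruction."""
    # ID/EX holds all nine lines created during instruction decode.
    id_ex = {**control["EX"], **control["MEM"], **control["WB"]}
    # The EX stage consumes its four lines; five are passed on to EX/MEM.
    ex_mem = {**control["MEM"], **control["WB"]}
    # The MEM stage consumes three; the last two reach MEM/WB.
    mem_wb = dict(control["WB"])
    return id_ex, ex_mem, mem_wb

id_ex, ex_mem, mem_wb = pipeline_register_contents(LW_CONTROL)
```

The shrinking dictionaries (nine, five, then two entries) mirror how each pipeline register only needs to carry the control lines that later stages still use.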
4.9 Handling Data hazards
Data Hazards
A data hazard occurs when a planned instruction cannot execute in the proper clock cycle because data
needed to execute the instruction is not yet available.
add $s0, $t0, $t1
sub $t2, $s0, $t3
Forwarding or Bypassing
A method of resolving a data hazard
• by retrieving the missing data element from internal buffers rather than waiting for it to arrive from
registers or memory.
Consider the sequence:
sub $2, $1, $3
and $12, $2, $5
or $13, $6, $2
add $14, $2, $2
sw $15, 100($2)
The last four instructions are all dependent on the result in register $2 of the first instruction.

Figure 4.17 Pipelined dependences

4.9.1 Figure 4.17 illustrates the execution of the above instructions using a multiple-clock-cycle pipeline
representation.
4.9.2 When a register is read and written in the same clock cycle, the write is assumed to occur in the
first half of the clock cycle and the read in the second half, so the read delivers the value just
written.
4.9.2.1 The value of register $2 changes during the middle of clock cycle 5, when the sub
instruction writes its result.


4.9.2.2 The values read for register $2 would not be the result of the sub instruction unless
the read occurred during clock cycle 5 or later.
4.9.2.3 Thus, taking register $2 to hold 10 before the sub and −20 after it, the instructions that
would get the correct value of −20 are add and sw; the and and or instructions would get the
incorrect value 10.
4.9.3 Naming the fields of the pipeline registers allows dependences to be stated more precisely.
4.9.4 Using this notation, the two pairs of hazard conditions are
1a. EX/MEM.RegisterRd = ID/EX.RegisterRs
1b. EX/MEM.RegisterRd = ID/EX.RegisterRt
2a. MEM/WB.RegisterRd = ID/EX.RegisterRs
2b. MEM/WB.RegisterRd = ID/EX.RegisterRt
4.9.4.1 For example,
i. “ID/EX.RegisterRs” refers to the number of one register whose value is found in the pipeline
register ID/EX.
 The first hazard in the sequence is on register $2, between the result of sub $2,$1,$3 and the first read
operand of and $12,$2,$5.
 This hazard can be detected when the and instruction is in the EX stage and the prior instruction is
in the MEM stage, so this is hazard 1a:
EX/MEM.RegisterRd = ID/EX.RegisterRs = $2
 The sub-or pair is a type 2b hazard:
MEM/WB.RegisterRd = ID/EX.RegisterRt = $2
 The two dependences on sub-add are not hazards because the register file supplies the proper data
during the ID stage of add.
 There is no data hazard between sub and sw because sw reads $2 the clock cycle after sub writes $2.
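The classification above can be reproduced with a short Python sketch. It assumes the five-stage pipeline of the text, in which an instruction one position behind the producer sees it in EX/MEM (type 1) and an instruction two positions behind sees it in MEM/WB (type 2). The tuple layout and the function name classify are illustrative choices, and the sketch ignores any later instruction that overwrites the same register.

```python
# Each instruction: (name, destination register or None, rs, rt).
# sw $15,100($2) uses $2 as its base register (rs) and writes no register.
program = [
    ("sub", 2, 1, 3),
    ("and", 12, 2, 5),
    ("or", 13, 6, 2),
    ("add", 14, 2, 2),
    ("sw", None, 2, 15),
]


def classify(program, producer_index=0):
    """Label each later instruction's dependence on the producer's result."""
    rd = program[producer_index][1]
    labels = {}
    for i in range(producer_index + 1, len(program)):
        name, _, rs, rt = program[i]
        distance = i - producer_index
        found = []
        for suffix, src in (("a", rs), ("b", rt)):
            if src != rd:
                continue
            if distance == 1:
                found.append("1" + suffix)   # producer result is in EX/MEM
            elif distance == 2:
                found.append("2" + suffix)   # producer result is in MEM/WB
        labels[name] = found or ["none"]     # farther away: register file is up to date
    return labels


print(classify(program))
```

For the sequence in the text this labels and as 1a, or as 2b, and add and sw as having no hazard.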
Conditions for detecting hazards and the control signals to resolve them:
1. EX hazard:
if (EX/MEM.RegWrite
and (EX/MEM.RegisterRd ≠ 0)
and (EX/MEM.RegisterRd = ID/EX.RegisterRs)) ForwardA = 10
if (EX/MEM.RegWrite
and (EX/MEM.RegisterRd ≠ 0)
and (EX/MEM.RegisterRd = ID/EX.RegisterRt)) ForwardB = 10
 EX/MEM.RegisterRd field is the register destination for either an ALU instruction (which comes from
the Rd field of the instruction) or a load (which comes from the Rt field).
 This case forwards the result from the previous instruction to either input of the ALU.
 If the previous instruction is going to write to the register file, and the write register number matches
the read register number of ALU inputs A or B, provided it is not register 0, then steer the multiplexor
to pick the value instead from the pipeline register EX/MEM.
2. MEM hazard:
if (MEM/WB.RegWrite
and (MEM/WB.RegisterRd ≠ 0)
and ( MEM/WB.RegisterRd = ID/EX.RegisterRs)) ForwardA = 01
if (MEM/WB.RegWrite
and (MEM/WB.RegisterRd ≠ 0)
and (MEM/WB.RegisterRd = ID/EX.RegisterRt)) ForwardB = 01


No hazard in the WB stage, because
o It is assumed that the register file supplies the correct result if the instruction in the ID stage
reads the same register written by the instruction in the WB stage.
o Such a register file performs another form of forwarding, but it occurs within the register file.
Complication in Data hazards
1. A complication arises when potential data hazards exist between the result of the instruction in the WB
stage, the result of the instruction in the MEM stage, and the source operand of the instruction in the ALU
stage.
2. Example:
When summing a vector of numbers in a single register, a sequence of instructions will all read and
write to the same register:
add $1,$1,$2
add $1,$1,$3
add $1,$1,$4
3. In this case, the result is forwarded from the MEM stage because the result in the MEM stage is the more
recent result.
4. Thus, the control for the MEM hazard adds a check that no EX hazard applies to the same operand:
if (MEM/WB.RegWrite
and (MEM/WB.RegisterRd ≠ 0)
and not(EX/MEM.RegWrite and (EX/MEM.RegisterRd ≠ 0)
and (EX/MEM.RegisterRd = ID/EX.RegisterRs))
and (MEM/WB.RegisterRd = ID/EX.RegisterRs)) ForwardA = 01

if (MEM/WB.RegWrite
and (MEM/WB.RegisterRd ≠ 0)
and not(EX/MEM.RegWrite and (EX/MEM.RegisterRd ≠ 0)
and (EX/MEM.RegisterRd = ID/EX.RegisterRt))
and (MEM/WB.RegisterRd = ID/EX.RegisterRt)) ForwardB = 01
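The EX-hazard and MEM-hazard conditions can be combined into one forwarding function. The Python sketch below is illustrative, not a description of any particular hardware: the dictionary-based pipeline registers and the function name are assumptions, and the value 00 (no forwarding, operand taken from the register file) is assumed as the default. The EX hazard is tested first, so a MEM/WB result is forwarded only when the more recent EX/MEM result does not match the same operand.

```python
def forwarding_unit(ex_mem, mem_wb, id_ex):
    """Return (ForwardA, ForwardB) as 2-bit strings.

    ex_mem, mem_wb: dicts with 'RegWrite' (bool) and 'RegisterRd' (int).
    id_ex: dict with 'RegisterRs' and 'RegisterRt' (ints).
    """
    def select(src):
        ex_hazard = (ex_mem["RegWrite"] and ex_mem["RegisterRd"] != 0
                     and ex_mem["RegisterRd"] == src)
        mem_hazard = (mem_wb["RegWrite"] and mem_wb["RegisterRd"] != 0
                      and mem_wb["RegisterRd"] == src)
        if ex_hazard:
            return "10"   # operand comes from EX/MEM (the most recent result)
        if mem_hazard:
            return "01"   # operand comes from MEM/WB
        return "00"       # assumed default: operand comes from the register file

    return select(id_ex["RegisterRs"]), select(id_ex["RegisterRt"])


# Third instruction of: add $1,$1,$2 / add $1,$1,$3 / add $1,$1,$4
forward = forwarding_unit(
    ex_mem={"RegWrite": True, "RegisterRd": 1},   # second add, now in MEM
    mem_wb={"RegWrite": True, "RegisterRd": 1},   # first add, now in WB
    id_ex={"RegisterRs": 1, "RegisterRt": 4},     # third add, now in EX
)
print(forward)
```

For the vector-sum example both earlier adds write $1, and the function selects the EX/MEM value for input A, which is the more recent result.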
Data Hazards and Stalls
1. Forwarding alone cannot resolve the hazard when an instruction tries to read a register immediately after
a load instruction that writes the same register.
2. Something must stall the pipeline for the combination of a load followed by an instruction that reads its
result.
3. A hazard detection unit is used in addition to the forwarding unit to deal with such cases.
 It operates during the ID stage so that it can insert the stall between the load and its use.
 Checking for load instructions, the control for the hazard detection unit is this single condition:
if (ID/EX.MemRead and
((ID/EX.RegisterRt = IF/ID.RegisterRs) or
(ID/EX.RegisterRt = IF/ID.RegisterRt)))
stall the pipeline
 First line checks whether the instruction is a load: the only instruction that reads data memory is a
load.
 The next two lines check whether the destination register field of the load in the EX stage matches
either source register of the instruction in the ID stage.
 If the condition holds, the instruction stalls one clock cycle.
 After this 1-cycle stall, the forwarding logic can handle the dependence and execution proceeds.
 To insert the stall, the control values in the ID/EX register are forced to 0:
o EX, MEM and WB then do a nop (no-operation).
o nop: an instruction that does no operation to change state.
o No registers or memories are written if the control values are all 0.
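The load-use test and the zeroing of control values can be sketched as two small Python functions. The dictionary fields mirror the pipeline-register names used above; the function names and the lw/and pair in the usage lines are assumptions chosen for illustration.

```python
def load_use_hazard(id_ex, if_id):
    """True when the instruction in ID must stall behind a load in EX."""
    return bool(id_ex["MemRead"] and
                id_ex["RegisterRt"] in (if_id["RegisterRs"], if_id["RegisterRt"]))


def id_stage_control(decoded_control, stall):
    """Pass the real control values, or all 0s (a nop bubble) when stalling."""
    if stall:
        return {name: 0 for name in decoded_control}
    return dict(decoded_control)


# Assumed example: lw $2,20($1) is in EX while and $4,$2,$5 is in ID.
stall = load_use_hazard(
    id_ex={"MemRead": 1, "RegisterRt": 2},
    if_id={"RegisterRs": 2, "RegisterRt": 5},
)
bubble = id_stage_control({"RegWrite": 1, "MemRead": 0, "MemWrite": 0}, stall)
print(stall, bubble)
```

With every control value at 0 the bubble writes neither a register nor memory, which is what makes it behave as a nop.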
Figure 4.18 shows how stalls are actually inserted into the pipeline:
1. A bubble is inserted beginning in clock cycle 4, by changing the and instruction that follows the load to a
nop.
2. Note that the and instruction is really fetched and decoded in clock cycles 2 and 3, but its EX stage is
delayed until clock cycle 5.
3. Likewise the or instruction is fetched in clock cycle 3, but its ID stage is delayed until clock cycle 5.
4. After insertion of the bubble, all the dependences go forward in time and no further hazards occur.
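The clock-cycle numbers quoted above can be checked with a small timing sketch in Python. It assumes a five-stage pipeline, a single one-cycle stall, and a sequence in which the load is instruction 0 and its dependent instruction is instruction 1. The function name and the table layout are illustrative.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def schedule(n_instructions, stall_before=None):
    """Map each instruction index to {stage: clock cycle}, with one bubble.

    stall_before is the index of the instruction that uses the load result.
    That instruction is held in ID for one extra cycle, the next one is held
    in IF, and every later instruction is fetched one cycle late.
    """
    table = []
    for i in range(n_instructions):
        cycles = {}
        for s, stage in enumerate(STAGES):
            cycle = i + s + 1                          # ideal pipeline timing
            if stall_before is not None and i >= stall_before:
                first_delayed = max(0, 2 - (i - stall_before))
                if s >= first_delayed:
                    cycle += 1                         # slip by the one-cycle stall
            cycles[stage] = cycle
        table.append(cycles)
    return table


timing = schedule(4, stall_before=1)
for index, row in enumerate(timing):
    print(index, row)
```

Instruction 1 (the and) has IF and ID in cycles 2 and 3 and EX in cycle 5, and instruction 2 (the or) has IF in cycle 3 and ID in cycle 5.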

Figure 4.18 the way stalls are really inserted into the pipeline

Figure 4.19 Datapath with Hazard Detection and forwarding unit

Figure 4.19 highlights the pipeline connections for both the hazard detection unit and the forwarding unit.
1. The forwarding unit controls the ALU multiplexors to replace the value from a general-purpose
register with the value from the proper pipeline register.
2. The hazard detection unit controls the writing of the PC and IF/ID registers plus the multiplexor that
chooses between the real control values and all 0s.
3. The hazard detection unit stalls and deasserts the control fields if the load-use hazard test above is
true.
4.10 Handling Control Hazards
Control Hazards or Branch hazards
A control hazard arises when the proper instruction cannot execute in the proper pipeline clock cycle because
a. The instruction that was fetched is not the one that is needed.
b. The flow of instruction addresses is not what the pipeline expected.
Solutions to control hazards
1. Stall.
2. Predict.
Assume Branch Not Taken
4.10.1 Stalling until the branch is complete is too slow.
4.10.2 So predict that the branch will not be taken and continue execution down the sequential
instruction stream.
4.10.3 If the branch is taken, the instructions that are being fetched and decoded must be discarded, and
execution continues at the branch target.
4.10.4 To discard instructions, the original control values are changed to 0s for the three instructions in
the IF, ID, and EX stages when the branch reaches the MEM stage.
4.10.5 This differs from a load-use stall, where the control is changed to 0 only in the ID stage and the
resulting bubble is left to move through the pipeline.
4.10.6 Discarding instructions therefore means flushing the instructions in the IF, ID, and EX stages of
the pipeline.
4.10.7 Flush: To discard instructions in a pipeline, usually due to an unexpected event.
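As a rough illustration of the flush, the Python sketch below models the pipeline as a mapping from stage to the instruction it currently holds, with the branch in MEM. The instruction labels, the NOP marker, and the function name are hypothetical. A real pipeline zeros control values instead of replacing names.

```python
NOP = "nop"


def resolve_branch(pipeline, taken):
    """Resolve a predicted-not-taken branch that has reached the MEM stage.

    pipeline: dict mapping stage name to the instruction it holds.
    Returns (new pipeline contents, number of instructions discarded).
    """
    if not taken:
        return dict(pipeline), 0       # prediction was right; nothing to undo
    flushed = dict(pipeline)
    for stage in ("IF", "ID", "EX"):
        flushed[stage] = NOP           # stands for control values forced to 0
    return flushed, 3


# Hypothetical snapshot: three sequential instructions follow the branch.
snapshot = {"IF": "i3", "ID": "i2", "EX": "i1", "MEM": "beq", "WB": "i0"}
after, discarded = resolve_branch(snapshot, taken=True)
print(after, discarded)
```

A taken branch resolved in MEM therefore costs three instruction slots under this scheme, while a not-taken branch costs none.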
Reducing the Delay of Branches
1. One way to improve branch performance is to reduce the cost of the taken branch.
2. The MIPS architecture was designed to support fast single-cycle branches that could be pipelined
with a small branch penalty.
3. Many branches rely only on simple tests (equality or sign, for example) and such tests do not require
a full ALU operation but can be done with at most a few gates.
4. When a more complex branch decision is required, a separate instruction that uses an ALU to perform
a comparison is required.
5. Moving the branch decision up requires two actions to occur earlier:
a. Computing the branch target address
 The PC value is already known and the immediate field is in the IF/ID pipeline register, so
the branch adder is just moved from the EX stage to the ID stage.
 The branch target address calculation will be performed for all instructions, but only used
when needed.
b. Evaluating the branch decision.
 The harder part is the branch decision itself.
 For branch equal, the two registers read during the ID stage are compared to see if they are
equal.
 Equality can be tested by first exclusive ORing their respective bits and then ORing all the
results; a zero output of the OR means the two registers are equal.
 Moving the branch test to the ID stage implies additional forwarding and hazard detection
hardware, since a branch dependent on a result in the pipeline must still work properly
with this optimization.


For example,
 To implement branch on equal (and its inverse), results must be forwarded to the equality test
logic that operates during ID.
 There are two complicating factors:
1. During ID, the instruction must be decoded, the need for a bypass to the equality unit must be
decided, and the equality comparison must be completed, so that if the instruction is a branch, the PC
can be set to the branch target address.
2. Because the values in a branch comparison are needed during ID but may be produced later in
time, a data hazard can occur and a stall will be needed.
6. To flush instructions in the IF stage, an additional control line is used, called IF.Flush, that zeros the
instruction field of the IF/ID pipeline register.
7. Clearing the register transforms the fetched instruction into a nop, an instruction that has no action and
changes no state.
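The XOR-then-OR equality test and the effect of IF.Flush can be sketched in Python. The 32-bit width, the function names, and the use of plain integers for register and instruction values are assumptions made for illustration.

```python
def registers_equal(a, b, width=32):
    """Equality via bitwise XOR followed by an OR over all result bits."""
    diff = (a ^ b) & ((1 << width) - 1)     # a 1 wherever the two values differ
    any_diff = 0
    for bit in range(width):
        any_diff |= (diff >> bit) & 1       # OR all the XOR outputs together
    return any_diff == 0                    # zero OR output means equal


def if_flush(if_id_instruction, flush):
    """IF.Flush zeros the instruction field of IF/ID, which gives a nop."""
    return 0 if flush else if_id_instruction


branch_taken = registers_equal(0x1234, 0x1234)           # beq-style comparison in ID
print(branch_taken, if_flush(0x8C220014, branch_taken))  # fetched word is discarded
```

Because the comparison needs only XOR and OR gates, it can finish within the ID stage, so a taken branch discards only the one instruction that is in IF.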
Dynamic Branch Prediction
1. Dynamic branch prediction is the prediction of branches at runtime using runtime information.
2. One approach is to look up the address of the instruction to see if the branch was taken the last
time this instruction was executed and, if so, to begin fetching new instructions from the same place
as the last time.
3. A simple static prediction scheme will probably waste too much performance; with more
hardware it is possible to try to predict branch behavior during program execution.
4. One implementation of that approach is a branch prediction buffer, also called a branch history table.
5. A branch prediction buffer is a small memory indexed by the lower portion of the address of the
branch instruction, that contains one or more bits indicating whether the branch was recently
taken or not.
6. It is not known if the prediction is the right one—it may have been put there by another branch that
has the same low-order address bits. However, this doesn’t affect correctness.
7. Prediction is just a hint that we hope is correct, so fetching begins in the predicted direction.
8. If the hint turns out to be wrong, the incorrectly predicted instructions are deleted, the prediction
bit is inverted and stored back, and the proper sequence is fetched and executed.
9. This simple 1-bit prediction scheme has a performance shortcoming: even if a branch is almost
always taken, it can be predicted incorrectly twice, rather than once, when it is not taken.

Figure 4.20 the states in a 2-bit prediction scheme

10. By using 2 bits rather than 1, a branch that strongly favors taken or not taken—as many branches
do—will be mispredicted only once. The 2 bits are used to encode the four states in the system.
11. The 2-bit scheme is a general instance of a counter-based predictor,
a. It is incremented when the prediction is accurate
b. Otherwise decremented
c. Uses the mid-point of its range as the division between taken and not taken.
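The difference between the 1-bit and 2-bit schemes can be seen on a loop branch that is taken nine times and then falls through once. The Python sketch below uses the common saturating-counter formulation, in which the counter moves up on a taken branch and down on a not-taken branch and the upper half of its range predicts taken. That formulation, the starting state, and the branch pattern are assumptions made for illustration.

```python
def mispredictions(outcomes, bits):
    """Count mispredictions of a saturating counter with the given width.

    outcomes: iterable of booleans (True means the branch was taken).
    The counter predicts taken when it is in the upper half of its range.
    """
    top = (1 << bits) - 1
    counter = top                        # assumed start state: strongly taken
    wrong = 0
    for taken in outcomes:
        predict_taken = counter > top // 2
        if predict_taken != taken:
            wrong += 1
        if taken:
            counter = min(top, counter + 1)
        else:
            counter = max(0, counter - 1)
    return wrong


# A loop branch: taken 9 times, then not taken once, for three runs of the loop.
pattern = ([True] * 9 + [False]) * 3
print(mispredictions(pattern, bits=1), mispredictions(pattern, bits=2))
```

With 1 bit, every run of the loop after the first is mispredicted twice (at loop exit and again at re-entry). With 2 bits, only the exit is mispredicted.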
Correlating predictor:

A branch predictor that combines local behavior of a particular branch and global information about the
behavior of some recent number of executed branches.
Tournament branch predictor

A branch predictor with multiple predictions for each branch and a selection mechanism that chooses which
predictor to enable for a given branch.

University Question:

Problem 1 Consider a 4 stage pipeline processor. The number of cycles needed by the four instructions I1,
I2, I3 and I4 in stages S1, S2, S3 and S4 is shown below-

      S1   S2   S3   S4
I1     2    1    1    1
I2     1    3    2    2
I3     2    1    1    3
I4     1    2    2    2

What is the number of cycles needed to execute the following loop?

for(i=1 to 2) { I1; I2; I3; I4; }
A. 16
B. 23
C. 28
D. 30


Solution-

The phase-time diagram is-

From here, number of clock cycles required to execute the loop = 23 clock cycles.
Thus, Option (B) is correct.
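The count of 23 can be cross-checked with a short Python sketch. It assumes instructions pass through the stages in order, that an instruction enters a stage once it has finished the previous stage and the preceding instruction has left that stage, and that a finished instruction can wait in a buffer between stages. The function name is illustrative.

```python
def pipeline_cycles(stage_times):
    """Total cycles for instructions issued in order through a linear pipeline.

    stage_times[i][s] is the number of cycles instruction i spends in stage s.
    """
    n_stages = len(stage_times[0])
    prev_finish = [0] * n_stages         # when the previous instruction left each stage
    for times in stage_times:
        finish = []
        ready = 0                        # when this instruction left its previous stage
        for s, t in enumerate(times):
            start = max(ready, prev_finish[s])
            ready = start + t
            finish.append(ready)
        prev_finish = finish
    return prev_finish[-1]


loop_body = [
    [2, 1, 1, 1],   # I1
    [1, 3, 2, 2],   # I2
    [2, 1, 1, 3],   # I3
    [1, 2, 2, 2],   # I4
]
print(pipeline_cycles(loop_body * 2))    # two iterations of the loop
```

Under these assumptions the two iterations finish in 23 cycles, which agrees with option (B).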

Problem 2 Consider a pipelined processor with the following four stages-

IF : Instruction Fetch
ID : Instruction Decode and Operand Fetch
EX : Execute
WB : Write Back

The IF, ID and WB stages take one clock cycle each to complete the operation. The number of clock cycles
for the EX stage depends on the instruction. The ADD and SUB instructions need 1 clock cycle and the MUL
instruction needs 3 clock cycles in the EX stage. Operand forwarding is used in the pipelined processor. What
is the number of clock cycles taken to complete the following sequence of instructions?

ADD R2, R1, R0 R2 ← R0 + R1

MUL R4, R3, R2 R4 ← R3 * R2
SUB R6, R5, R4 R6 ← R5 − R4


A. 7
B. 8
C. 10
D. 14

Solution-

The phase-time diagram is-

From here, number of clock cycles required to execute the instructions = 8 clock cycles.
Thus, Option (B) is correct.
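The result of 8 cycles can be checked with a small Python sketch of the IF-ID-EX-WB pipeline. It assumes each instruction depends on the result of the one before it (as in the ADD, MUL, SUB chain) and that operand forwarding lets a dependent instruction start EX in the cycle after the previous EX finishes. The function name is illustrative.

```python
def cycles_with_forwarding(ex_latencies):
    """Clock cycles for a chain of dependent instructions in IF-ID-EX-WB.

    IF, ID and WB take one cycle each; ex_latencies[i] is the EX time of
    instruction i. Cycle numbers start at 1.
    """
    ex_busy_until = 0      # last cycle in which the previous instruction used EX
    wb_busy_until = 0      # cycle in which the previous instruction used WB
    finish = 0
    for i, latency in enumerate(ex_latencies):
        id_done = i + 2                            # IF in cycle i+1, ID in cycle i+2
        ex_start = max(id_done, ex_busy_until) + 1
        ex_end = ex_start + latency - 1
        wb = max(ex_end, wb_busy_until) + 1
        ex_busy_until, wb_busy_until, finish = ex_end, wb, wb
    return finish


print(cycles_with_forwarding([1, 3, 1]))   # ADD, MUL, SUB
```

ADD writes back in cycle 4, MUL occupies EX in cycles 4 to 6 and writes back in cycle 7, and SUB waits for the forwarded MUL result, executes in cycle 7 and writes back in cycle 8.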
