COA - Chapter # 9
FUNCTION
CHAPTER # 9 Computer Organization & Architecture
CPU Function
SHEHERYAR MALIK
CPU must
Fetch instruction
The processor reads an instruction from memory (register, cache, main
memory)
Interpret instruction
The instruction is decoded to determine what action is required
Fetch data
The execution of an instruction may require reading data from memory or an
I/O module
Process data
The execution of an instruction may require performing some arithmetic or
logical operation on data
Write data
The results of an execution may require writing data to memory or an I/O
module
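The five steps above can be sketched as a loop over a toy machine. This is only an illustration: the instruction format, the opcode names (LOAD, ADD, STORE, HALT), and the single accumulator are assumptions made for the example, not part of any real instruction set.

```python
# Toy machine: every name and the instruction format are hypothetical,
# chosen only to illustrate the five steps listed above.
def run(memory):
    """Run the program stored in `memory` (a list) and return the final memory."""
    pc = 0    # program counter
    acc = 0   # accumulator
    while True:
        opcode, operand = memory[pc]      # fetch instruction
        pc += 1
        if opcode == "HALT":              # interpret (decode) instruction
            return memory
        if opcode == "LOAD":
            acc = memory[operand]         # fetch data
        elif opcode == "ADD":
            acc = acc + memory[operand]   # fetch data, then process data
        elif opcode == "STORE":
            memory[operand] = acc         # write data
        else:
            raise ValueError(f"unknown opcode {opcode!r}")


program = [
    ("LOAD", 4),    # acc <- memory[4]
    ("ADD", 5),     # acc <- acc + memory[5]
    ("STORE", 6),   # memory[6] <- acc
    ("HALT", 0),
    2, 3, 0,        # data words at addresses 4, 5, 6
]
result = run(list(program))
print(result[6])
```

Here instructions and data share one memory, so the same `memory` list is read during both the instruction fetch and the data fetch.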
User-visible registers
Enable the machine or assembly language programmer to
minimize main memory references by optimizing use of
registers
Control and status registers
Used by the control unit to control the operation of the
processor and by privileged, operating system programs to
control the execution of programs
General Purpose
Data
Address
Condition Codes
Number of general purpose registers: typically between 8 and 32
Fewer registers = more memory references
More registers do not noticeably reduce memory references and take
up processor real estate
R8, R9, R10, R11, R12, R13, R14, R15 are the new
registers and have no other names
R0D – R15D are the lowermost 32 bits of each
register
For example, R0D is EAX
R0W – R15W are the lowermost 16 bits of each
register
For example, R0W is AX
R0L – R15L are the lowermost 8 bits of each register
For example, R0L is AL
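The D, W, and L names refer to the low-order part of the same 64-bit register, so each view can be modelled as a bit mask. A minimal sketch, assuming a register is just a 64-bit integer; the function name `sub_registers` and the sample value are made up for the illustration:

```python
MASK32 = 0xFFFF_FFFF   # lowermost 32 bits (the "D" view)
MASK16 = 0xFFFF        # lowermost 16 bits (the "W" view)
MASK8 = 0xFF           # lowermost 8 bits  (the "L" view)


def sub_registers(value64):
    """Return the low 32-, 16- and 8-bit views of a 64-bit register value."""
    return {
        "D": value64 & MASK32,
        "W": value64 & MASK16,
        "L": value64 & MASK8,
    }


r0 = 0x1122_3344_5566_7788          # hypothetical contents of R0
views = sub_registers(r0)
print({name: hex(v) for name, v in views.items()})
```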
IR is examined
If indirect addressing, indirect cycle is performed
Rightmost N bits of MBR transferred to MAR
Control unit requests memory read
Result (address of operand) moved to MBR
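The register transfers of the indirect cycle can be written out step by step. This is a sketch under stated assumptions: registers are held in a dictionary, memory is a dictionary from address to word, and the 8-bit address field in the usage lines is chosen arbitrarily.

```python
def indirect_cycle(regs, memory, n_bits):
    """Replace the indirect reference held in MBR with the operand address."""
    address_mask = (1 << n_bits) - 1
    regs["MAR"] = regs["MBR"] & address_mask   # rightmost N bits of MBR -> MAR
    regs["MBR"] = memory[regs["MAR"]]          # memory read; result -> MBR
    return regs


# Hypothetical word: opcode 0x1 in the upper bits, address field 0x0A.
memory = {0x0A: 0x3C}              # location 0x0A holds the operand address 0x3C
regs = {"MBR": 0x10A, "MAR": 0}
indirect_cycle(regs, memory, n_bits=8)
print(hex(regs["MAR"]), hex(regs["MBR"]))
```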
Simple
Predictable
Current PC saved to allow resumption after interrupt
Contents of PC copied to MBR
Special memory location (e.g. stack pointer) loaded to MAR
MBR written to memory
PC loaded with address of interrupt handling routine
Next instruction (first of interrupt handler) can be fetched
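The same sequence, written as register transfers. A minimal sketch: the downward-growing stack, the dictionary-based registers and memory, and the addresses in the usage lines are assumptions made for the illustration.

```python
def interrupt_cycle(regs, memory, handler_address):
    """Save the current PC and redirect execution to the interrupt handler."""
    regs["MBR"] = regs["PC"]            # contents of PC copied to MBR
    regs["SP"] -= 1                     # assumption: the stack grows downward
    regs["MAR"] = regs["SP"]            # save location (stack pointer) -> MAR
    memory[regs["MAR"]] = regs["MBR"]   # MBR written to memory
    regs["PC"] = handler_address        # PC loaded with handler address
    return regs


memory = {}
regs = {"PC": 0x120, "SP": 0x200, "MAR": 0, "MBR": 0}
interrupt_cycle(regs, memory, handler_address=0x400)
print(hex(regs["PC"]), hex(memory[0x1FF]))
```

After the call, the next fetch cycle reads the first instruction of the handler, and the saved value at the top of the stack is what allows the interrupted program to resume.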
Laundry Example
Nazim, Botir, Babar, and Temur each have one load of clothes
(A, B, C, D) to wash, dry, and fold
“Washing” takes 30 minutes
[Figure: laundry timeline. Task order (A, B, C, D) against time, 6 PM to 2 AM, sixteen 30-minute slots]
[Figure: laundry timeline. Task order (A, B, C, D) against time, seven 30-minute slots]
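The two time axes (sixteen 30-minute slots versus seven) match the usual timing formulas for doing the loads one after another and for overlapping them. A minimal sketch, assuming four loads and four equal 30-minute stages per load; the text names only three activities, so the stage count of four is inferred from the slot counts and should be treated as an assumption.

```python
def sequential_minutes(loads, stages, stage_minutes):
    """Total time when each load finishes all stages before the next starts."""
    return loads * stages * stage_minutes


def pipelined_minutes(loads, stages, stage_minutes):
    """Total time when a new load enters the first stage every stage_minutes."""
    return (stages + loads - 1) * stage_minutes


seq = sequential_minutes(loads=4, stages=4, stage_minutes=30)
pipe = pipelined_minutes(loads=4, stages=4, stage_minutes=30)
print(seq // 30, "slots sequential;", pipe // 30, "slots pipelined")
```

Pipelining does not shorten the time for any single load; it raises throughput by keeping every stage busy.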
Fetch instruction
Decode instruction
Calculate operand addresses
Fetch operands
Execute instruction
Write result
ADD instruction does not update EAX until end of stage 5, at clock cycle 5
SUB instruction needs value at beginning of its stage 2, at clock cycle 4
Pipeline must stall for two clock cycles
Without special hardware and specific avoidance algorithms, such a
hazard results in inefficient pipeline usage
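The two-cycle stall follows directly from the two cycle numbers given above. A minimal sketch, assuming the producer's result becomes visible at the end of a cycle, the consumer reads at the beginning of a cycle, and there is no forwarding hardware; the function name is made up for the illustration.

```python
def stall_cycles(write_done_cycle, read_needed_cycle):
    """Cycles the consumer must wait before it can read the new value.

    The producer writes at the END of write_done_cycle, so the earliest
    cycle whose beginning sees the value is write_done_cycle + 1.
    """
    earliest_read_cycle = write_done_cycle + 1
    return max(0, earliest_read_cycle - read_needed_cycle)


# ADD updates EAX at the end of cycle 5; SUB wants it at the start of cycle 4.
stalls = stall_cycles(write_done_cycle=5, read_needed_cycle=4)
print(stalls)
```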
Data Hazard Diagram
Multiple Streams
Prefetch Branch Target
Loop buffer
Branch prediction
Delayed branching
Predict by Opcode
Some instructions are more likely to result in a jump than
others
Can get up to 75% success
Taken/Not taken switch
Based on previous history
Good for loops
Refined by two-level or correlation-based branch history
Correlation-based
In more complex structures, branch direction correlates
with that of related branches
Use recent branch history as well
Delayed Branch
Do not take jump until you have to
Rearrange instructions
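The taken/not-taken switch described above can be modelled as one bit of history per branch: predict whatever the branch did last time. A minimal sketch; the function name and the loop pattern (a branch taken nine times, then not taken once, repeated three times) are assumptions made for the illustration.

```python
def predict_with_switch(outcomes, initial=False):
    """Count correct predictions when each prediction is the previous outcome.

    outcomes: iterable of booleans, True meaning the branch was taken.
    initial:  state of the switch before the first branch is seen.
    """
    state = initial
    correct = 0
    for taken in outcomes:
        if state == taken:
            correct += 1
        state = taken        # remember what the branch actually did
    return correct


loop = ([True] * 9 + [False]) * 3   # three runs of a ten-iteration loop
hits = predict_with_switch(loop)
print(hits, "of", len(loop))
```

With a single bit, each run of the loop costs two mispredictions: one at the loop exit and one when the loop is entered again. This is the weakness that schemes with more history are meant to reduce.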
Fetch
From cache or external memory
Put in one of two 16-byte prefetch buffers
Fill buffer with new data as soon as old data consumed
Average 5 instructions fetched per load
Independent of other stages to keep buffers full
Decode stage 1
Opcode & address-mode info
At most first 3 bytes of instruction
Can direct D2 stage to get rest of instruction
Decode stage 2
Expand opcode into control signals
Computation of complex address modes
Execute
ALU operations, cache access, register update
Writeback
Update registers & flags
Results sent to cache & bus interface write buffers