Coa III Unit
TECHNOLOGY , Ghaziabad
Computer Organization And Architecture
BCS-302
Unit: III
COA
Instruction Format
Bits:   15 | 14-12  | 11-0
Field:  I  | Opcode | Address
(I is the addressing mode bit)
ADDRESSING MODES
• The address field of an instruction can represent either
– Direct address: the address in memory of the data to use (the address of the
operand), or
– Indirect address: the address in memory of the address in memory of the data to use
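[Figure: direct versus indirect addressing. Direct: the address field 457 is the location of the operand. Indirect: the address field 300 is the location of a word containing 1350, the address of the operand. In both cases the operand is added to AC.]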
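The difference between the two modes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: memory is modeled as a dictionary, the helper names `split_instruction` and `effective_address` are hypothetical, and the operand values 5 and 7 are made up; the addresses 457, 300, and 1350 are the ones used in the figure.

```python
def split_instruction(word):
    """Split a 16-bit instruction into (I, opcode, address)."""
    i_bit = (word >> 15) & 0x1      # bit 15: addressing mode
    opcode = (word >> 12) & 0x7     # bits 14-12: operation code
    address = word & 0x0FFF         # bits 11-0: address field
    return i_bit, opcode, address


def effective_address(word, memory):
    """Return the address of the operand for a memory-reference instruction."""
    i_bit, _, address = split_instruction(word)
    if i_bit == 0:
        return address                  # direct: the field is the operand address
    return memory[address] & 0x0FFF     # indirect: the field points to the address


# Hypothetical memory contents (operand values 5 and 7 are invented).
memory = {457: 0x0005, 300: 1350, 1350: 0x0007}
direct_add = (0 << 15) | (1 << 12) | 457      # ADD 457, direct
indirect_add = (1 << 15) | (1 << 12) | 300    # ADD 300, indirect

print(effective_address(direct_add, memory))    # 457
print(effective_address(indirect_add, memory))  # 1350
```

The indirect case costs one extra memory read before the operand itself can be fetched.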
List of Basic Computer Registers
Register  Bits  Name                  Function
DR        16    Data Register         Holds memory operand
AR        12    Address Register      Holds address for memory
AC        16    Accumulator           Processor register
IR        16    Instruction Register  Holds instruction code
PC        12    Program Counter       Holds address of instruction
TR        16    Temporary Register    Holds temporary data
INPR      8     Input Register        Holds input character
OUTR      8     Output Register       Holds output character
COMMON BUS SYSTEM
The registers in the Basic Computer are connected using a bus
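[Figure: common bus system. A 16-bit common bus connects the 4096 x 16 memory unit (bus select 7) and the registers AR (1), PC (2), DR (3), AC (4), IR (5), and TR (6); the select lines S2 S1 S0 choose which of them drives the bus. AR, PC, DR, AC, and TR have LD, INR, and CLR inputs; IR and OUTR have LD only. The memory has Read and Write inputs and takes its address from AR. INPR and E are connected to the ALU that feeds AC. All registers share the clock.]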
S2 S1 S0 Register
0 0 0 x
0 0 1 AR
0 1 0 PC
0 1 1 DR
1 0 0 AC
1 0 1 IR
1 1 0 TR
1 1 1 Memory
• Either one of the registers will have its load signal activated, or the memory will
have its write signal activated
– This determines where the data from the bus gets loaded
• The 12-bit registers, AR and PC, have 0’s loaded onto the bus in the high-order 4 bit
positions
• When the 8-bit register OUTR is loaded from the bus, the data comes from the low-order
8 bits on the bus
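The bus selection rules can be sketched as a small Python model. It is a simplification that assumes register contents are plain integers held in a dictionary; the names `BUS_SOURCE`, `bus_value`, and `load_outr` are hypothetical and not part of the Basic Computer description.

```python
# S2 S1 S0 value -> source that drives the bus (000 selects nothing).
BUS_SOURCE = {
    0b001: "AR", 0b010: "PC", 0b011: "DR", 0b100: "AC",
    0b101: "IR", 0b110: "TR", 0b111: "Memory",
}
# AR and PC are 12 bits wide; every other bus source is 16 bits wide.
WIDTH = {"AR": 12, "PC": 12}


def bus_value(select, sources):
    """Return the 16-bit value placed on the bus, or None if nothing is selected."""
    name = BUS_SOURCE.get(select)
    if name is None:
        return None
    mask = (1 << WIDTH.get(name, 16)) - 1
    # Masking a 12-bit register leaves zeros in the high-order 4 bit positions.
    return sources[name] & mask


def load_outr(bus):
    """OUTR is 8 bits wide, so it takes the low-order 8 bits of the bus."""
    return bus & 0xFF


sources = {"AR": 0xABC, "PC": 0x123, "DR": 0xBEEF, "AC": 0x1241,
           "IR": 0x2ABC, "TR": 0x0000, "Memory": 0x7800}
print(hex(bus_value(0b010, sources)))             # 0x123
print(hex(load_outr(bus_value(0b100, sources))))  # 0x41
```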
BASIC COMPUTER INSTRUCTIONS
• Basic Computer Instruction Format
Bits:   15 | 14-12  | 11-0
Field:  I  | Opcode | Address
Symbol  Hex code (I=0)  Hex code (I=1)  Description
AND     0xxx            8xxx            AND memory word to AC
ADD     1xxx            9xxx            Add memory word to AC
LDA     2xxx            Axxx            Load AC from memory
STA     3xxx            Bxxx            Store content of AC into memory
BUN     4xxx            Cxxx            Branch unconditionally
BSA     5xxx            Dxxx            Branch and save return address
ISZ     6xxx            Exxx            Increment and skip if zero
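The hex codes in the table follow directly from the instruction format: the leading hex digit packs the I bit and the 3-bit opcode. A minimal Python sketch of this lookup is shown here; the function name `decode_memory_reference` and the tuple it returns are illustrative choices, not part of the source.

```python
MEMORY_REFERENCE = ["AND", "ADD", "LDA", "STA", "BUN", "BSA", "ISZ"]


def decode_memory_reference(word):
    """Return (symbol, indirect, address), or None when the opcode is 111."""
    indirect = bool((word >> 15) & 0x1)   # I bit
    opcode = (word >> 12) & 0x7           # opcode 0 to 6 selects the table row
    address = word & 0x0FFF
    if opcode == 7:
        return None                       # not a memory-reference instruction
    return MEMORY_REFERENCE[opcode], indirect, address


print(decode_memory_reference(0x2ABC))  # ('LDA', False, 2748)
print(decode_memory_reference(0xA123))  # ('LDA', True, 291)
```

Opcode 111 is reserved for register-reference and input-output instructions, which is why the sketch returns `None` for it.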
INSTRUCTION CYCLE
• After an instruction is executed, the cycle starts again at step 1, for the
next instruction
T0: AR ← PC
T1: IR ← M[AR], PC ← PC + 1
T2: D0, ..., D7 ← Decode IR(12-14), AR ← IR(0-11), I ← IR(15)
D7'IT3: AR ← M[AR] (memory-reference instruction, indirect address)
D7'I'T3: Nothing (memory-reference instruction, direct address)
D7I'T3: Execute a register-reference instruction, SC ← 0
D7IT3: Execute an input-output instruction, SC ← 0
From T4 (D7 = 0): Execute the memory-reference instruction, then SC ← 0
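The fetch and decode phases (T0 to T3) can be traced with a short Python sketch. It assumes memory is a dictionary of 16-bit words and returns the register values as a dictionary; the function name `fetch_and_decode` and the sample memory contents are hypothetical.

```python
def fetch_and_decode(pc, memory):
    """Run timing steps T0 to T3 and return the resulting register values."""
    ar = pc                        # T0: AR <- PC
    ir = memory[ar]                # T1: IR <- M[AR]
    pc = (pc + 1) & 0x0FFF         #     PC <- PC + 1
    opcode = (ir >> 12) & 0x7      # T2: decode IR(12-14)
    ar = ir & 0x0FFF               #     AR <- IR(0-11)
    i = (ir >> 15) & 0x1           #     I <- IR(15)
    d7 = opcode == 7
    if d7 and i:
        kind = "input-output"          # D7 I T3
    elif d7:
        kind = "register-reference"    # D7 I' T3
    else:
        kind = "memory-reference"
        if i:
            ar = memory[ar] & 0x0FFF   # D7' I T3: AR <- M[AR]
    return {"PC": pc, "AR": ar, "IR": ir, "I": i, "kind": kind}


# Hypothetical program: ADD indirect through location 0x300.
memory = {0x010: 0x9300, 0x300: 0x0546}
state = fetch_and_decode(0x010, memory)
print(state["kind"], hex(state["AR"]), hex(state["PC"]))
# memory-reference 0x546 0x11
```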
REGISTER REFERENCE INSTRUCTIONS
Register-reference instructions are identified when
- D7 = 1, I = 0
- The specific instruction is specified in bits 0-11 of IR
- Execution starts with timing signal T3
r = D7I'T3 (common to all register-reference instructions)
Bi = IR(i), i = 0, 1, …, 11
r: SC ← 0
CLA rB11: AC ← 0
CLE rB10: E ← 0
CMA rB9: AC ← AC’
CME rB8: E ← E’
CIR rB7: AC ← shr AC, AC(15) ← E, E ← AC(0)
CIL rB6: AC ← shl AC, AC(0) ← E, E ← AC(15)
INC rB5: AC ← AC + 1
SPA rB4: if (AC(15) = 0) then (PC ← PC + 1)
SNA rB3: if (AC(15) = 1) then (PC ← PC + 1)
SZA rB2: if (AC = 0) then (PC ← PC + 1)
SZE rB1: if (E = 0) then (PC ← PC + 1)
HLT rB0: S ← 0 (S is a start-stop flip-flop)
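The circulate instructions are the least obvious entries in this list, because AC and E together behave as a 17-bit ring. A small Python sketch of CIR, CIL, CMA, and INC follows; the function names mirror the mnemonics, and representing AC and E as plain integers is an assumption made for illustration.

```python
MASK16 = 0xFFFF


def cir(ac, e):
    """Circulate right: AC <- shr AC, AC(15) <- E, E <- AC(0)."""
    new_e = ac & 0x1
    new_ac = (ac >> 1) | (e << 15)
    return new_ac, new_e


def cil(ac, e):
    """Circulate left: AC <- shl AC, AC(0) <- E, E <- AC(15)."""
    new_e = (ac >> 15) & 0x1
    new_ac = ((ac << 1) & MASK16) | e
    return new_ac, new_e


def cma(ac):
    """Complement AC."""
    return ~ac & MASK16


def inc(ac):
    """Increment AC, wrapping around at 16 bits."""
    return (ac + 1) & MASK16


print(cir(0x0001, 0))  # (0, 1): the bit shifted out of AC(0) lands in E
print(cil(0x8000, 0))  # (0, 1): the bit shifted out of AC(15) lands in E
```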
MEMORY REFERENCE INSTRUCTIONS
Symbol  Operation decoder  Symbolic description
AND     D0                 AC ← AC ∧ M[AR]
ADD     D1                 AC ← AC + M[AR], E ← Cout
LDA     D2                 AC ← M[AR]
STA     D3                 M[AR] ← AC
BUN     D4                 PC ← AR
BSA     D5                 M[AR] ← PC, PC ← AR + 1
ISZ     D6                 M[AR] ← M[AR] + 1, if M[AR] + 1 = 0 then PC ← PC + 1
- The effective address of the instruction is in AR and was placed there during timing signal
T2 when I = 0, or during timing signal T3 when I = 1
- Memory cycle is assumed to be short enough to complete in a CPU cycle
- The execution of MR instruction starts with T4
AND to AC
D0T4: DR ← M[AR]                          Read operand
D0T5: AC ← AC ∧ DR, SC ← 0                AND with AC
ADD to AC
D1T4: DR ← M[AR]                          Read operand
D1T5: AC ← AC + DR, E ← Cout, SC ← 0      Add to AC and store carry in E
LDA: Load to AC
D2T4: DR ← M[AR]
D2T5: AC ← DR, SC ← 0
STA: Store AC
D3T4: M[AR] ← AC, SC ← 0
BUN: Branch Unconditionally
D4T4: PC ← AR, SC ← 0
BSA: Branch and Save Return Address
M[AR] ← PC, PC ← AR + 1
Example: with AR = 135 and return address PC = 21, BSA stores 21 in M[135] and sets PC = 136, the first location of the subroutine.
D5T4: M[AR] ← PC, AR ← AR + 1
D5T5: PC ← AR, SC ← 0
ISZ: Increment and Skip if Zero
D6T4: DR ← M[AR]
D6T5: DR ← DR + 1
D6T6: M[AR] ← DR, if (DR = 0) then (PC ← PC + 1), SC ← 0
[Figure: flowchart for memory-reference instructions, summarizing the register transfers for D0 through D6 at timing signals T4 to T6]
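BSA and ISZ are the two memory-reference instructions that modify both memory and PC, so they are worth tracing step by step. The Python sketch below follows the D5 and D6 register transfers; the function names `bsa` and `isz`, the dictionary memory, and the sample values at location 200 are assumptions made for illustration. The BSA call reuses AR = 135 and PC = 21 from the example.

```python
def bsa(memory, ar, pc):
    """Branch and save return address; returns the new PC."""
    memory[ar] = pc              # D5T4: M[AR] <- PC
    ar = (ar + 1) & 0x0FFF       #       AR <- AR + 1
    return ar                    # D5T5: PC <- AR


def isz(memory, ar, pc):
    """Increment M[AR] and skip the next instruction if the result is zero."""
    dr = memory[ar]              # D6T4: DR <- M[AR]
    dr = (dr + 1) & 0xFFFF       # D6T5: DR <- DR + 1 (16-bit wraparound)
    memory[ar] = dr              # D6T6: M[AR] <- DR
    if dr == 0:
        pc = (pc + 1) & 0x0FFF   #       if (DR = 0) then PC <- PC + 1
    return pc


memory = {135: 0, 200: 0xFFFF}
new_pc = bsa(memory, ar=135, pc=21)
print(new_pc, memory[135])       # 136 21

skip_pc = isz(memory, ar=200, pc=50)
print(skip_pc, memory[200])      # 51 0
```

A subroutine entered this way can return with an indirect BUN through location 135, since that word now holds the return address.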
INPUT-OUTPUT AND INTERRUPT
• Input-Output Configuration
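[Figure: input-output configuration. The input-output terminal (printer and keyboard) is connected by serial communication paths to the serial communication interface (receiver interface and transmitter interface), which is connected by parallel communication paths to the computer registers and flip-flops. The receiver interface is paired with OUTR and FGO, the transmitter interface with INPR and FGI, and both registers exchange data with AC.]
INPR Input register - 8 bits
OUTR Output register - 8 bits
FGI Input flag - 1 bit
FGO Output flag - 1 bit
IEN Interrupt enable - 1 bit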
[Figure: flowcharts for program-controlled input and output. Input (FGI initially 0): clear FGI, wait while FGI = 0, then AC ← INPR. Output (FGO initially 1): load AC with the data, wait while FGO = 0, then OUTR ← AC.]
p = D7IT3 (common to all input-output instructions)
Bi = IR(i), i = 6, …, 11
p: SC ← 0                                   Clear SC
INP pB11: AC(0-7) ← INPR, FGI ← 0           Input char. to AC
OUT pB10: OUTR ← AC(0-7), FGO ← 0           Output char. from AC
SKI pB9: if (FGI = 1) then (PC ← PC + 1)    Skip on input flag
SKO pB8: if (FGO = 1) then (PC ← PC + 1)    Skip on output flag
ION pB7: IEN ← 1                            Interrupt enable on
IOF pB6: IEN ← 0                            Interrupt enable off
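How the flags and the skip instructions cooperate can be seen in a small Python sketch. It assumes the registers are fields of a hypothetical `IOState` dataclass and that the keyboard has already delivered the character 'A'; none of these names come from the source.

```python
from dataclasses import dataclass


@dataclass
class IOState:
    ac: int = 0
    inpr: int = 0
    outr: int = 0
    fgi: int = 0   # set to 1 by the input device when INPR holds a character
    fgo: int = 1   # set to 1 by the output device when OUTR can accept data
    pc: int = 0


def inp(s):
    """INP: AC(0-7) <- INPR, FGI <- 0 (the upper byte of AC is unchanged)."""
    s.ac = (s.ac & 0xFF00) | (s.inpr & 0xFF)
    s.fgi = 0


def out(s):
    """OUT: OUTR <- AC(0-7), FGO <- 0."""
    s.outr = s.ac & 0xFF
    s.fgo = 0


def ski(s):
    """SKI: skip the next instruction if the input flag is set."""
    if s.fgi == 1:
        s.pc = (s.pc + 1) & 0x0FFF


def sko(s):
    """SKO: skip the next instruction if the output flag is set."""
    if s.fgo == 1:
        s.pc = (s.pc + 1) & 0x0FFF


state = IOState(ac=0x1200)
state.inpr, state.fgi = ord("A"), 1   # assumption: keyboard delivered 'A'
ski(state)                            # FGI = 1, so PC is incremented
inp(state)
print(hex(state.ac), state.fgi, state.pc)  # 0x1241 0 1
```

In a program, SKI is typically followed by a branch back to itself, so the skip is what breaks the waiting loop once a character is available.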
The Central Processing Unit (CPU)
• The CPU has four main components:
1. The Control Unit (along with the IR) interprets the machine language instruction
and issues the control signals to make the CPU execute that instruction.
2. The ALU (Arithmetic Logic Unit) that does the arithmetic and logic.
3. The Register Set (Register File) that stores temporary results related to the
computations. There are also Special Purpose Registers used by the Control
Unit.
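[Figure: control unit of the Basic Computer. A 3 x 8 decoder produces D0 to D7, a 4 x 16 decoder produces the timing signals T0 to T15, and these outputs together with I feed the combinational control logic that generates the control signals.]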
• Step 1: Identify the distinct phases in the flowchart. Employ log2 p flip-flops
(rounded up) to handle p distinct phases.
• Step 2: Identify the maximum number of distinct steps, k, in each of
the phases. Employ a mod-k counter to generate control signals for
each of the k steps.
• Step 3: Design a combinational logic circuit to generate the
sequence of control signals that control the micro-operations of each
phase.
[Figure: a modulo-k counter with Start, End, Clock, and Reset inputs drives a 1/k decoder that produces the control signals c1, c2, …, ck]
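Delay element method
[Figure: chain of delay blocks D1 to D4 between time points t0 and t4, each delay block Di triggering the next group of control signals]
Begin (t0)
C01: MAR ← PC                       (delay D1, d1 = t1 - t0)
C02: MDR ← M(MAR), C03: PC ← PC + 1 (delay D2, d2 = t2 - t1)
C04: IR ← MDR                       (delay D3, d3 = t3 - t2)
C05: F ← 0, E ← 1                   (delay D4, d4 = t4 - t3)
Execute (t4)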
State table method
[Figure: combinational logic together with flip-flops (FFs) that hold the control state]
Hard-wired Control Unit
Advantages :
1. Minimizes the average number of clock cycles needed per instruction
2. High efficiency in terms of operation speed
3. Occupies a relatively small area (typically 10%) of the CPU chip area
4. Minimizes the cost of the circuit
Disadvantages :
1. Complex sequencing & micro-operation logic
2. Difficult to design and test
3. Inflexible design
4. Long design turnaround time for complex designs
5. Difficult to add new instructions
Micro programmed Control
Microprogrammed Control: A control unit whose binary control variables are
stored in memory (the control memory) is called a microprogrammed control unit.
A microprogram consists of microinstructions.
Microinstruction: - Contains a control word and a sequencing word
Control Word - All the control information required for one clock cycle
Sequencing Word - Information needed to decide the next microinstruction
address
- The vocabulary used to write a microprogram
Control Memory(Control Storage: CS)
- Storage in the microprogrammed control unit to store the microprogram
Writeable Control Memory(Writeable Control Storage:WCS)
- CS whose contents can be modified
-> Allows the microprogram to be changed
-> Instruction set can be changed or modified
Dynamic Microprogramming
- Computer system whose control unit is implemented with a microprogram in
WCS
- Microprogram can be changed by a systems programmer or a user
Microprogrammed control employs the following steps:
1. Any instruction to be executed on a CPU can be broken down into a set of
sequential micro-operations, each specifying an RTL operation on the data path.
The set of micro-operations to be executed on the RTL components at any time
step is referred to as a microinstruction.
3. To implement an instruction on the data path , the control signals stored in the
ROM can be accessed
4. The control signals read from the ROM are used to control the micro operations
associated with a microinstruction to be executed at any time step
6. The steps 3,4 and 5 are repeated till the set of sequential microinstructions
associated with the instruction is executed
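The read-and-apply loop described in these steps can be sketched in Python. This is a toy model under loud assumptions: the control memory is a list of (control word, next address) pairs, the signal names such as `PC_to_AR` are invented for illustration, and a next address of `None` is an assumed convention marking the end of the routine.

```python
# Hypothetical control memory (ROM): each entry is one microinstruction,
# made of a control word (set of active signals) and a sequencing word.
CONTROL_MEMORY = [
    ({"PC_to_AR"}, 1),
    ({"read_memory", "increment_PC"}, 2),
    ({"load_IR"}, None),   # assumption: None marks the end of the routine
]


def run_microprogram(control_memory, start):
    """Step through microinstructions and return the signals issued per step."""
    car = start                     # control address register
    trace = []
    while car is not None:
        control_word, next_address = control_memory[car]
        trace.append(sorted(control_word))   # signals applied in this time step
        car = next_address                   # sequencing word picks the next one
    return trace


for step, signals in enumerate(run_microprogram(CONTROL_MEMORY, start=0)):
    print(step, signals)
```

Changing the instruction set in such a design means rewriting the table, not the logic that steps through it, which is the flexibility the writable control memory provides.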
[Figure: organization of a microprogrammed control unit. The IR feeds a starting-address generator; a sequencer (next-address generator) loads the control address register, which addresses the control memory (ROM); the microinstruction read from the control memory supplies the control word (CW).]
No. | RISC | CISC
6 | Pipelining is simple to use in RISC. | Pipelining is difficult to use in CISC.
7 | It uses a limited number of instructions that require less time to execute. | It uses a large number of instructions that require more time to execute.
8 | It uses LOAD and STORE as independent instructions in the register-to-register interaction of a program. | It uses LOAD and STORE instructions in the memory-to-memory interaction of a program.
9 | RISC has more transistors on memory registers. | CISC has transistors to store complex instructions.
10 | The execution time of RISC is very short. | The execution time of CISC is longer.
11 | RISC architecture can be used with high-end applications like telecommunication, image processing, video processing, etc. | CISC architecture can be used with low-end applications like home automation, security systems, etc.
12 | It has fixed-format instructions. | It has variable-format instructions.
13 | A program written for RISC architecture tends to take more space in memory. | A program written for CISC architecture tends to take less space in memory.
Pipelining
To improve the performance of a CPU we have two options:
1) Improve the hardware by introducing faster circuits.
2) Arrange the hardware such that more than one operation can be performed at
the same time.
Since there is a limit on the speed of hardware and the cost of faster circuits is
quite high, we have to adopt the 2nd option.
Consider a pipeline with ‘k’ stages and a clock cycle time of Tp that executes ‘n’ instructions.
The first instruction takes ‘k’ cycles to come out of the pipeline,
but each of the other ‘n – 1’ instructions takes only ‘1’ cycle,
i.e., a total of ‘n – 1’ cycles.
So, time taken to execute ‘n’ instructions in a pipelined processor:
ETpipeline = k + n – 1 cycles = (k + n – 1) Tp
In the same case, for a non-pipelined processor, the execution time of ‘n’
instructions will be:
ETnon-pipeline = n * k * Tp
So, the speedup (S) of the pipelined processor over the non-pipelined processor, when
‘n’ tasks are executed on the same processor, is:
S = Performance of pipelined processor / Performance of non-pipelined processor
As the performance of a processor is inversely proportional to the execution time,
we have:
S = ETnon-pipeline / ETpipeline
S = [n * k * Tp] / [(k + n – 1) * Tp]
S = [n * k] / [k + n – 1]
When the number of tasks ‘n’ is significantly larger than ‘k’, that is, n >> k:
S ≈ n * k / n
S ≈ k
where ‘k’ is the number of stages in the pipeline.
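A quick numerical check shows how the speedup approaches k as n grows. The Python sketch below only evaluates the formulas above; the function names and the sample values k = 4 and n = 100 are chosen for illustration.

```python
def pipeline_cycles(k, n):
    """Cycles for n instructions on a k-stage pipeline: k + n - 1."""
    return k + n - 1


def non_pipeline_cycles(k, n):
    """Cycles for n instructions without pipelining: n * k."""
    return n * k


def speedup(k, n):
    """S = (n * k) / (k + n - 1); the cycle time Tp cancels out."""
    return non_pipeline_cycles(k, n) / pipeline_cycles(k, n)


k = 4
for n in (1, 10, 100, 1_000_000):
    print(n, pipeline_cycles(k, n), round(speedup(k, n), 3))
```

With k = 4, a single instruction gains nothing (S = 1), 100 instructions take 103 cycles instead of 400, and for a million instructions the speedup is within a fraction of a percent of 4.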