Chapter N°2 Main Components of Computers
Computer Architecture
Mr. A. LAHMISSI - January 2025
2 Buses
3 CPU Registers
Communication Bus
[Figure: tri-state bus logic — state 0, state 1, and a third high-impedance state (HZ). A bidirectional bus with an R/W control line links registers (D flip-flops clocked by CK), replacing two unidirectional links and giving a 50% reduction in wires.]
Communication Bus — Exercise
Identify the components and buses below (R/W control line, Mux/Demux).
[Figure: exercise diagram.]
Communication Bus
Bus wires
[Figure: an 8-wire bus (8 bits) carrying the value 10110100, one bit per wire.]
Type of Buses
[Figure: the CPU, main memory, and I/O unit connected to peripherals via three buses — the data bus (bidirectional), the address bus (unidirectional), and the control bus (primarily unidirectional).]
Address Bus for Memory Addressing (Addressing Capability)
The memory address bus is a unidirectional bus that carries the address code generated by the CPU for addressing the main memory.
A memory address bus of n bits can address 2^n bytes (2^n data registers).
One data register can be a byte, a word, or a multiple of them:
- a data register of one byte contains 8 bits; 1 KByte is 2^10 bytes (1024 B)
- a data register of one word contains 16 bits; 1 MByte is 2^20 bytes (1,048,576 B)
- a data register of a double word contains 32 bits; 1 GByte is 2^30 bytes
- a data register of a quadruple word contains 64 bits; 1 TByte is 2^40 bytes
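The addressing capability of an n-bit address bus can be checked with a short sketch (the sizes match the list above):

```python
def addressable_bytes(n_bits: int) -> int:
    """Number of byte locations an n-bit address bus can select: 2**n."""
    return 2 ** n_bits

# Common address-bus widths and the memory they can address.
for n, label in [(10, "1 KB"), (20, "1 MB"), (30, "1 GB"), (40, "1 TB")]:
    print(f"{n}-bit bus -> {addressable_bytes(n)} bytes ({label})")
```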
Data bus: a bidirectional bus that ensures the transfer of data between the microprocessor and its environment, and vice versa. The size of this bus determines the range of values the data carried on it can take.
If the size of the data bus is n bits, then:
o The possible unsigned DATA values that can be carried by the bus range from 0 to +2^n − 1.
o The possible signed DATA values that can be carried by the bus range from −2^(n−1) to +2^(n−1) − 1.

DATA bus size | Unsigned DATA range                     | Signed DATA range
4 bits        | 0 – 15 (0H – FH)                        | −8 ~ +7 (8H ~ 7H)
8 bits        | 0 – 255 (00H – FFH)                     | −128 ~ +127 (80H ~ 7FH)
16 bits       | 0 – 65535 (0000H – FFFFH)               | −32768 ~ +32767 (8000H ~ 7FFFH)
32 bits       | 0 – 2^32 − 1 (0H – FFFF FFFFH)          | −2^31 ~ +2^31 − 1 (8000 0000H ~ 7FFF FFFFH)
64 bits       | 0 – 2^64 − 1 (0H – FFFF FFFF FFFF FFFFH)| −2^63 ~ +2^63 − 1 (8000 0000 0000 0000H ~ 7FFF FFFF FFFF FFFFH)
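The ranges in the table follow directly from the two formulas; a small sketch reproduces them:

```python
def data_bus_ranges(n_bits: int):
    """Value ranges an n-bit data bus can carry.

    Unsigned: 0 .. 2**n - 1
    Signed (two's complement): -2**(n-1) .. 2**(n-1) - 1
    """
    unsigned = (0, 2 ** n_bits - 1)
    signed = (-(2 ** (n_bits - 1)), 2 ** (n_bits - 1) - 1)
    return unsigned, signed

for n in (4, 8, 16, 32):
    u, s = data_bus_ranges(n)
    print(f"{n:2d} bits  unsigned {u}  signed {s}")
```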
Control Bus
The control bus is typically unidirectional, but can sometimes be bidirectional. It carries control signals between the microprocessor and other components, such as memory and I/O devices.
M/IO = 1: memory access; = 0: peripheral access
R/W = 1: read operation; = 0: write operation
DEN = 1: invalid data; = 0: valid data
It is worth noting that, while the control bus is primarily unidirectional, some control signals need a response from the receiving devices; this is generally not considered bidirectional communication in the same sense as data buses, which are often bidirectional.
The ALU
Arithmetical operations (+, −, ×, ÷, …) and logical operations (AND, OR, XOR, …).
Example
Instruction 1: ADD A, B // ADD: binary operator; A, B: operands
Instruction 2: NOT A // NOT: unary operator; A: operand
A combinational circuit combines a series of logic gates, including AND, OR, and NOT gates.
The ALU
[Figure: ALU block diagram — operand inputs A (Op1) and B (Op2), a command input C, a result output S, and the flags output PSW.]
[Figure: ALU internals — combinational circuits (C.C.ADD, C.C.SUB, C.C.AND, C.C.CMP) compute in parallel on A and B; the command code (C0C1) selects which result is routed to S, and the flags go to the PSW.]
C.C.: a combinational circuit, which combines a series of logic gates, including AND, OR, and NOT gates.
ALU Design Example
[Figure: an ALU with four combinational circuits — C.C.OR, C.C.SUB, C.C.NAND, C.C.ADD — whose outputs are selected by the 2-bit command code (C0C1) and routed to the result S.]
C.C.: a combinational circuit, which combines a series of logic gates, including AND, OR, and NOT gates.
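The design example can be modelled in software: four operations indexed by the 2-bit command code, with the selection playing the role of the MUX. This is only a behavioural sketch (an 8-bit word size and the operation order are assumptions):

```python
MASK = 0xFF  # assume an 8-bit ALU

# The four combinational circuits, indexed by the command code (C1 C0).
OPERATIONS = {
    0b00: lambda a, b: a | b,            # C.C.OR
    0b01: lambda a, b: (a - b) & MASK,   # C.C.SUB
    0b10: lambda a, b: ~(a & b) & MASK,  # C.C.NAND
    0b11: lambda a, b: (a + b) & MASK,   # C.C.ADD
}

def alu(command: int, a: int, b: int) -> int:
    """Route operands A and B through the circuit selected by the command code."""
    return OPERATIONS[command](a, b)

print(alu(0b11, 5, 3))  # ADD -> 8
```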
Multiplexer (4 to 1) and Multiplexer (8 to 1)

4-to-1 multiplexer truth table:
C1 C0 | S
 0  0 | E0
 0  1 | E1
 1  0 | E2
 1  1 | E3
Multiplexer — Table of codes assigned to the command lines (3 command bits ⇒ an 8-to-1 multiplexer)

C2 C1 C0 | S
 0  0  0 | Add
 0  0  1 | Sub
 0  1  0 | Multiply
 0  1  1 | Div
 1  0  0 | Shift Right
 1  0  1 | Shift Left
 1  1  0 | (A or B) xor A
 1  1  1 | Not

Example: with C = 011, the MUX output is S = Div.
Logic gate designs
XNOR = 2 × NAND + OR
XNOR = NOR or AND
XOR = OR and NAND (3 gates)
[Figure: gate-level constructions of XOR and XNOR from OR, NOR, AND, and NAND gates.]
Half-Adder (H.A.) — 1 bit
A circuit that adds two one-bit binary numbers, A and B, and returns two one-bit outputs, S and R.
S: the result (sum) of the addition
R: the carry out of the addition

A0 B0 | S0 R0
 0  0 |  0  0
 0  1 |  1  0
 1  0 |  1  0
 1  1 |  0  1
Full-Adder (F.A.) — 1 bit
A circuit that adds three one-bit inputs — An, Bn, and the incoming carry Rn−1 — and returns two one-bit outputs, Sn and Rn. A full adder can be built from two half-adders (H.A.).
S: the result of the addition.
R: the carry out of the addition.

An Bn Rn−1 | Sn Rn
 0  0  0   |  0  0
 0  0  1   |  1  0
 0  1  0   |  1  0
 0  1  1   |  0  1
 1  0  0   |  1  0
 1  0  1   |  0  1
 1  1  0   |  0  1
 1  1  1   |  1  1
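The half-adder and full-adder equations can be sketched directly, with the full adder built from two half-adders as described above:

```python
def half_adder(a: int, b: int):
    """One-bit half adder: sum S = A XOR B, carry R = A AND B."""
    return a ^ b, a & b

def full_adder(a: int, b: int, carry_in: int):
    """One-bit full adder built from two half adders."""
    s1, r1 = half_adder(a, b)
    s2, r2 = half_adder(s1, carry_in)
    return s2, r1 | r2

# Reproduce the full-adder truth table above.
for a in (0, 1):
    for b in (0, 1):
        for c in (0, 1):
            s, r = full_adder(a, b, c)
            print(a, b, c, "->", s, r)
```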
Full-Adder (F.A.) — 4 bits
[Figure: a 4-bit ripple-carry adder built from four 1-bit full adders, the carry out of each stage feeding the carry in of the next.]
The PSW (Register) — the FLAGS
[Figure: ALU with combinational circuits (C.C.ADD, C.C.SUB, C.C.AND, C.C.CMP, C.C.XOR) selected by a MUX under command C; the result goes to S and the status bits (ZF, SF, CF, OF, …) are stored in the PSW.]
PSW = Program Status Word (register)
C.C.: a combinational circuit, which combines a series of logic gates, including AND, OR, and NOT gates.
Signed and unsigned values
In signed mode, values are represented in two's complement (C2). The byte 1111 1111, read in signed mode, is the decimal value −1.
Example: (−1), written in sign-and-magnitude as 1000 0001, is represented in two's complement (C2) as 1111 1111.
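The two's-complement (C2) encoding of a signed value can be checked with a one-line helper:

```python
def twos_complement(value: int, n_bits: int = 8) -> str:
    """Two's-complement (C2) bit pattern of a signed value on n_bits."""
    return format(value & (2 ** n_bits - 1), f"0{n_bits}b")

print(twos_complement(-1))    # 11111111
print(twos_complement(-128))  # 10000000
```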
ALU (generalized)
[Figure: an N-bit ALU — operand inputs A and B (N bits each), a command input C selecting among combinational circuits C.C.OP1 … C.C.OPi via a MUX, a result output S (N bits), and flag outputs Z, S, C, O.]
CPU Registers — Types of Registers
Accumulator (AX)
Base (BX)
Counter (CX)
Data (DX)
The EU (Execution Unit) and the BIU (Bus Interface Unit) work simultaneously.
EBX (Extended Base register): used for addressing, particularly when dealing with arrays and strings; it can also be used as a data register when not used for addressing.
EDX (Extended Data register): used for IN and OUT instructions (input/output), and to store partial results of MUL and DIV operations; in other cases it can be used as a data register.
EAX = Extended Accumulator
EBX = Extended Base
ECX = Extended Counter (CX can be used by some instructions as a counter)
EDX = Extended Data register
CPU Registers — Types of Registers

General registers (16 bits; each splits into an 8-bit high part H and low part L):
  15            0
  AX  Accumulator Register   AH | AL
  BX  Base Register          BH | BL
  CX  Counter Register       CH | CL
  DX  Data Register          DH | DL

Specific registers (16 bits):
  SP  Stack Pointer
  BP  Base Pointer
  DI  Destination Index
  SI  Source Index
CPU Registers — The PSW Register (the FLAGS)
 15                                          0
  X  X  X  X OF DF IF TF SF ZF  X AF  X PF  X CF   PSW

Signed Mode of Operations
CPU Registers — The PSW Register (the FLAGS)
 15                                          0
  X  X  X  X OF DF IF TF SF ZF  X AF  X PF  X CF   PSW
DF: Direction Flag, IF: Interrupt Flag, TF: Trap Flag
Control bits (blue): writeable bits, configured by the programmer.
DF=0: SI and DI are incremented by the concerned instructions (instruction: CLD // DF=0)
DF=1: SI and DI are decremented by the concerned instructions (instruction: STD // DF=1)
IF=0: interrupts ignored (instruction: CLI // IF=0)
IF=1: interrupts allowed (instruction: STI // IF=1)
TF=1: step-by-step (single-step) execution
TF=0: normal (continuous) execution
CPU Registers — Segment Registers
DS (Data Segment Register): contains the data segment address; it points to the data memory area.
Internal Memories
[Figure: the CPU (UCT) registers and the main memory; there is no direct access between them.]
The Central Memory — RAM (Random Access Memory)
The Central Memory
Central memory (RAM) is organized as cells, each with an address (cell address 0, 1, 2, …).
Example of a cell's content: 0 0 0 1 1 1 1 0
Cell length = number of bits per cell (here, 8 bits).
Memory size = cell length × number of cells (unit: bytes, « octets »).
Communication between CPU & RAM — Reading
[Figure: read cycle. The control bus carries the read order and the address bus carries the cell address (0010). The memory places the content of the addressed cell (10111110) on the data bus, and the CPU loads it into a register.]
Memory Segmentation
Segmentation is the subdivision (logical, not physical) of RAM into multiple equal-sized areas (typically 64-KByte memory segments). It is done by partitioning the bits of the address bus: any memory address consists of a segment address plus an offset address.
The segment address defines the beginning address of a 64-KByte memory segment.
The offset address selects the desired location within that 64-KByte segment.
Address syntax: [Segment : Offset]
[Figure: a 16-cell memory (addresses 0000–1111) divided into four segments (00, 01, 10, 11) of four cells each; for example, [00:10] addresses cell 0010, [01:11] addresses cell 0111, and [11:01] addresses cell 1101.]
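The toy [Segment:Offset] scheme of the figure can be sketched as follows — note this simply concatenates the segment and offset bits, unlike the real 8086, which computes segment × 16 + offset:

```python
def physical_address(segment: int, offset: int, offset_bits: int = 2) -> int:
    """Toy [Segment:Offset] scheme from the figure: segment bits are the
    high-order address bits, offset bits the low-order ones.
    (The real 8086 instead computes segment * 16 + offset.)"""
    return (segment << offset_bits) | offset

# [00:10] -> 0010, [01:11] -> 0111, [11:01] -> 1101
for seg, off in [(0b00, 0b10), (0b01, 0b11), (0b11, 0b01)]:
    print(format(physical_address(seg, off), "04b"))
```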
Memory Segmentation
The Memory words are numbered 0,1,2…. The processor dispose of an instruction set and
Each instruction in the instruction set fits into one memory word.
The processor also has a number of General Purpose Registers. There are three special
registers :
Stack Pointer (SP): generally used to point to the last record of the stack and is
normally initialized immediately below the global data of the program. When data is
pushed on to the stack (using the PUSH instruction) the (SP) gets
automatically incremented. Thus, the stack grows towards higher memory locations.
Instruction Pointer (IP) carries the address of the current instruction under execution
and is automatically incremented to point to the next instruction to be executed. The IP
register always hold the address of the next instruction to be executed.
Base Pointer (BP) is generally used to store the base address of an activation record for
procedure evocations. Although any other register can act as the base pointer, but the
availability of an explicit base pointer gives better structure and clarity to program
compilation.
MOV AX, [BP] // indirect addressing
MOV AX, [BX+0002H] //data are accessed via a base register with displacement
Memory Segmentation — Stack Operations
Stack Segment Register (SS): used for addressing the stack segment in memory.
The stack segment is the memory segment used to store stack data.
One of the most ingenious temporary uses of the stack is calling sub-procedures: the stack data contains the return addresses after interrupts and subroutine calls. It can also store temporary data, so a sub-procedure can use the stack for its own local variables. Initially, SP points to the top of the stack segment.
The stack operations follow the LIFO principle (Last In, First Out).
Stack operations are performed by:
- the PUSH and POP instructions,
- interrupts or subroutine calls.
[Figure: a 16-cell memory (addresses 0000–1111); the stack segment SS=10 spans addresses 1000–1011, and the stack pointer SP=01 marks the current top of stack at [SS:SP] = 1001.]
Memory Segmentation — Stack Operations using PUSH and POP instructions
[Figure: PUSH AX saves AX on the stack; after some operations AX has changed; POP AX restores the saved value.]
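The save/restore pattern can be sketched with a downward-growing stack, mirroring the slides (word size of 2 bytes; the initial SP value 0x0100 is an assumption):

```python
# Minimal sketch of PUSH/POP: PUSH does SP = SP - 2 then stores,
# POP loads then does SP = SP + 2, so the stack grows downwards.
memory = {}
SP = 0x0100  # assumed initial top of the stack segment

def push(value: int):
    global SP
    SP -= 2
    memory[SP] = value

def pop() -> int:
    global SP
    value = memory[SP]
    SP += 2
    return value

AX = 0x1234
push(AX)        # save AX on the stack
AX = 0x0000     # ... AX changed by some operations ...
AX = pop()      # restore AX
print(hex(AX))  # 0x1234
```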
Memory Segmentation — Stack Operations using Subroutine Calls
For an intrasegment (near) return, the address on the stack is a segment offset that is popped into the IP.
For an intersegment (far) return, the address on the stack is a long pointer: the offset is popped first, followed by the segment selector.
Memory Segmentation — Stack Operations using Subroutine Calls
[Figure: a segmented memory with DS=00 (data segment), CS=01 (code segment), SS=10 (stack segment), and ES=11 (extra segment). The code segment contains:
  01:offset1:  Ins 1
  01:offset2:  Ins 2
  01:offset3:  CALL PROC
  01:offset4:  Ins 4
  PROC:
  01:offset31: Inst 1
  01:offset32: Inst 2
  RET
The stack pointer SP=01 points into the stack segment.]
Near Procedure (Near Call)
CALL: stores the return address (IP_Return) from IP on the stack ⇒ (SP = SP − 2), then loads IP with the called procedure's address.
RET: reloads IP_Return from the stack into IP and frees the stack location ⇒ (SP = SP + 2).
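The near CALL/RET mechanism above can be sketched in a few lines (addresses are hypothetical; instructions are assumed to occupy one word each for simplicity):

```python
# Sketch of a near CALL/RET using the stack, following the slide:
# CALL: push the return IP (SP = SP - 2), load IP with the procedure address.
# RET:  pop the return IP back into IP (SP = SP + 2).
stack = {}
SP = 0x0100
IP = 0x0003  # address of the CALL instruction (hypothetical)

def call(proc_addr: int):
    global SP, IP
    SP -= 2
    stack[SP] = IP + 1   # IP_Return: the instruction after the CALL
    IP = proc_addr

def ret():
    global SP, IP
    IP = stack[SP]
    SP += 2

call(0x0031)      # jump into PROC
ret()             # come back
print(hex(IP))    # 0x4 -> the instruction after the CALL
```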
SRAM vs DRAM
SRAM: used in cache memory; short access time (short latency).
DRAM: used in central memory; long access time (long latency).
SRAM | DRAM
Stores information as long as power is supplied. | Stores information while power is supplied (with refresh), or only a few milliseconds after power is switched off.
Transistors are used to store information. | Capacitors are used to store data.
No capacitors are used, hence no refreshing is required. | The capacitors' contents must be refreshed periodically to store information for a longer time.
Faster than DRAM (high data transfer rate). | Slower access speed (lower data transfer rate).
Bits are stored in voltage form. | Bits are stored in the form of electrical charges.
Used in cache memories. | Used in main memories.
Consumes less power and generates less heat. | Uses more power and generates more heat.
Lower latency. | Higher latency than SRAM.
Used in high-speed cache memory. | Used in lower-speed main memory.
Used in high-performance applications. | Used in general-purpose applications.
Cache Memory — Mapping Functions — Important Concepts
Cache memory operates between 10 and 100 times faster than RAM, requiring only a few nanoseconds to respond to a CPU request.
[Figure: the CPU with input units, output units, central memory, and secondary memory.]
Cache Memory — Mapping Functions — Important Concepts
Levels of Memory
Level 1 — Registers: memory located inside the CPU that holds the data being processed immediately. The most commonly used registers are the Accumulator, the Program Counter, the Address Register, etc.
Level 2 — Cache memory: the fastest memory, with the shortest access time, where data is temporarily stored for faster access.
Level 3 — Main memory: the memory the computer currently works on. It is small in size, and once power is off the data no longer stays in this memory.
Level 4 — Secondary memory: external memory that is not as fast as main memory, but where data stays permanently.
Cache Memory — Mapping Functions — Important Concepts — Principle of Locality
The locality principle states that the information the processor is about to access has a high probability of being located within a spatial window and a temporal window.
Temporal locality — Spatial locality.
Cache Memory — Mapping Functions — Important Concepts
When the processor needs to read or write a location in main memory, it first checks for a corresponding entry in the cache.
A cache hit occurs when the required word is found in cache memory; the word is then delivered to the CPU from the cache.
A cache miss occurs when the required word is not present in the cache; the block containing the required word then has to be mapped in from main memory. This mapping is performed using cache mapping techniques.
On a cache miss, the cache allocates a new entry, copies the requested data from main memory, and then the request is fulfilled from the contents of the cache.
Cache mapping defines how the contents of main memory are brought into cache memory — that is, how a block from main memory is mapped to the cache memory in case of a cache miss.
Cache Memory — Mapping Functions — Important Concepts
We can improve cache performance by using a larger cache line size and higher associativity, and by reducing the miss rate, the miss penalty, and the hit time.
Cache Memory — Mapping Functions — Important Concepts
Exercise
The access time of the cache memory is 100 ns and that of the main memory is 1 µs.
80% of memory operations are read requests and the rest are writes.
The hit ratio for reading is 0.9.
Calculate the total average access time of the system for both read and write requests.
[Answers]
Cache Memory — Mapping Functions — Important Concepts
Exercise
A cache memory has an access time of 30 ns and the main memory 150 ns.
What is the average access time of the CPU to read from memory (assume a hit ratio of 80%)?
[Answers]
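The average-access-time formula can be sketched under one common model, where a miss costs the cache probe plus the main-memory access (the exercise may intend a different model, e.g. one that omits the cache time on a miss):

```python
def average_access_time(hit_ratio: float, t_cache: float, t_main: float) -> float:
    """Average read access time, assuming a miss costs the cache probe
    plus the main-memory access (one common model among several)."""
    return hit_ratio * t_cache + (1 - hit_ratio) * (t_cache + t_main)

print(average_access_time(0.80, 30, 150))  # about 60 ns under this model
```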
Cache Memory Mapping functions Important Concepts
The Main_memory is divided into equal size partitions called as “blocks” or frames.
The Cache_memory is divided into partitions having the same size as that of those blocks,
and these Cache_partitions are called “Lines”.
During cache mapping, a certain number of blocks are copied to the cache and they are read
directly from the cache when they are needed for processing. So, only needed blocks are
copied into the cache and a kind of Mapping technique is needed to do that.
Block3
Page2
BlockM
71
H
Cache Memory — Mapping Functions — Important Concepts
The Tag is the higher-order bits of the memory address.
[Figure: a main-memory block (size = K cells) addressed through an address decoder (multiplexer); the Tag bits identify the block.]
Cache Memory — Mapping Functions — Cache Mapping Forms
The cache size is much smaller than the memory size, so a strategy for copying data blocks into the cache must be defined. This strategy is called mapping.
Cache Memory — Mapping Functions — Direct Mapping
The cache memory is divided into 'n' lines. A block 'j' of main memory can map only to one specific line of the cache:
Cache line number = (j) modulo (n)
Modulo is the remainder of a division operation. Ex: 12 modulo 10 = 2.

Direct mapping pattern (example with n = 5 lines):
Block 0 → Line 0
Block 1 → Line 1
Block 2 → Line 2
Block 3 → Line 3
Block 4 → Line 4   (4 modulo 5 = 4)
Block 5 → Line 0   (5 modulo 5 = 0) — the mapping pattern repeats
Block 6 → Line 1   (6 modulo 5 = 1)
…
Cache Memory Mapping functions Direct Mapping
The memory blocks are mapped to cache lines using a specified mechanism. When
requesting a memory address, the memory address components are divided into three
partitions. Those three partitions are the tag, the index, and the offset.
The number of bits of each partition depends on : the size of the cache, the size of the
main memory, and the number of blocks ( which is a multiple number of number of
cache lines).
The Tag bits represents the higher-order bits of the memory address.
The index bits indicate the cache Line number to which the memory block is mapped.
the memory block number is identified by the Tag bits and the index bits together.
While the offset bits specify the position of the Cell data within the line or the
memory Block
77
H
Cache Memory — Mapping Functions — Direct Mapping
[Figure: direct-mapped cache organization; the number of cells in memory is determined by the address width.]
In direct mapping, the index selects exactly one line, so a single comparator is enough to compare that line's tag with the requested tag.
Cache Memory — Mapping Functions — Direct Mapping
[Figure: direct mapping procedure; the cache memory is a matrix of cells (here an 8×8 matrix).]
Total number of blocks = (main memory size) divided by (block size).
Cache Memory — Mapping Functions — Direct Mapping
Division of the memory address in direct mapping:
In direct mapping, the physical address is divided into the Tag, the Line number, and the Offset, all derived from the requested memory address.
[Figure: the memory address split into Tag | Line number | Offset; the tag is compared with the stored tag of the indexed line in the cache matrix, and the main-memory address splits correspondingly into a block number and a block-cell offset.]
Cache Memory — Mapping Functions — Direct Mapping
Exercise
If there are N (= 8) lines in the cache, find the number of bits of the line index.
[Answers]
If there are N lines in the cache, the number of bits of the line index in direct mapping is:
line index = log2(N) bits; that is, N = 2^(index bits) = 2^3, so the line index is 3 bits.
Exercise
Consider an 8-bit memory address (physical address) [01000011] and a cache with 8 lines, with blocks containing 8 byte-cells. Find:
- the total number of blocks in the main memory;
- the block number of memory address 01000011, its mapped line, and its tag.
[Answers]
We use 3 bits for the index (since the cache holds 8 = 2^3 lines)
and 3 bits for the offset (since blocks contain 8 = 2^3 bytes).
The remaining two bits are for the tag.
Total number of blocks = total memory size / block size = 2^8 / 8 = 32 blocks.
Block number = 01000; line number = (01000)₂ mod 8 = 8 mod 8 = 0; Tag = 01.
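The decomposition used in the exercise can be sketched and checked programmatically:

```python
def split_address(addr: int, offset_bits: int, index_bits: int):
    """Split a physical address into (tag, line index, offset)
    for a direct-mapped cache."""
    offset = addr & (2 ** offset_bits - 1)
    index = (addr >> offset_bits) & (2 ** index_bits - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset

# The exercise: 8-bit address 01000011, 8 lines (3 index bits),
# 8-byte blocks (3 offset bits).
tag, line, offset = split_address(0b01000011, offset_bits=3, index_bits=3)
print(tag, line, offset)  # 1 0 3 -> Tag = 01, line 0, cell 3 within the block
```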
Cache Memory — Mapping Functions — Direct Mapping
Important Results
Here are a few crucial results for a direct-mapped cache:
- Block j of the main memory is mapped to only one specific line of the cache (j mod lines in cache).
- The size of every multiplexer = the total number of lines present in the cache.
- Total number of required comparators = 1.
- The size of the comparator = the total number of bits present in the tag.
- Hit latency = comparator latency + multiplexer latency.
- There is no requirement for a replacement algorithm (the mapped line is simply overwritten).
Cache Memory — Mapping Functions — Direct Mapping
Advantages of direct mapping:
- Simple method of implementation.
- Low hardware complexity (only one comparator).
- Short access time.
- No replacement algorithm required.
Disadvantages of direct mapping:
- Unpredictable cache performance.
- Poor handling of spatial locality.
- Inefficient use of cache space.
- High rate of conflict misses.
Cache Memory — Mapping Functions — Fully Associative Mapping
Fully associative mapping allows a memory block to be loaded into any cache line, which makes it more flexible than direct mapping. It is considered the fastest and most flexible mapping form.
Fully associative mappings tend to have the fewest cache misses for a given cache capacity, but they require more hardware, since an additional tag comparison is needed for every line's tag. They are best suited to relatively small caches because of the large number of comparators.
A fully associative cache contains a single set with K ways, where K is the number of lines in the set. A memory block can map to any of these ways.
Cache Memory — Mapping Functions — Fully Associative Mapping
In fully associative mapping, a block of main memory can map to any cache line that is freely available at that moment, which makes fully associative mapping more flexible than direct mapping.
[Figure: main memory blocks (Block 1 … Block M), each free to map to any cache line.]
Cache Memory — Mapping Functions — Fully Associative Mapping
Division of the memory address in fully associative mapping:
The memory address is divided into a Tag field and a block/line offset.
- The Tag uniquely identifies the block number; it is compared with the tags of all cache lines to find a match (a hit).
- The Offset identifies the required word within the tagged block/line.
In fully associative mapping there are no index bits: the line number is not indicated.
[Figure: the address split into Tag | Offset, matched against every line of the cache matrix; the main-memory address splits into block number and block offset.]
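The tag search described above can be sketched in software; in hardware every comparison happens in parallel, one comparator per line (the cache contents here are hypothetical):

```python
# Sketch of a fully associative lookup: the requested tag is compared
# against every line's tag (one comparator per line in hardware).
cache = [
    {"tag": 0b1101, "data": "block A"},
    {"tag": 0b0010, "data": "block B"},
    {"tag": None,   "data": None},       # empty line
]

def lookup(tag: int):
    """Return ('hit', data) if any line holds the tag, else ('miss', None)."""
    for line in cache:                   # done in parallel in hardware
        if line["tag"] == tag:
            return ("hit", line["data"])
    return ("miss", None)

print(lookup(0b0010))  # ('hit', 'block B')
print(lookup(0b1111))  # ('miss', None)
```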
Cache Memory — Mapping Functions — Fully Associative Mapping — Mapping Performance
The performance of cache memory is frequently measured by a quantity called the hit ratio (HR); a cache mapping's performance is directly proportional to its hit ratio.
Fully associative mapping is more flexible than direct mapping and has a higher possible hit ratio for a given cache capacity, but it requires more hardware, with a longer access time. It is considered the most flexible mapping form, at the cost of extra hardware, and is best suited to relatively small caches because of the large number of hardware comparators.
We can improve cache performance by using a larger cache line size and higher associativity, and by reducing the miss rate, the miss penalty, and the hit time.
Cache Memory — Mapping Functions — K-way Set-Associative Mapping
The cache memory is divided into many sets of lines, with full associativity within each set.
This form of mapping is an enhanced form of direct mapping in which the drawbacks of direct mapping are removed. Set-associative mapping addresses the problem of possible thrashing in the direct mapping method: instead of having exactly one line that a block can map to in the cache, a few lines are grouped together to create a set, and a block in memory can map to any one of the lines of a specific set.
Set-associative cache mapping thus combines the best of the direct and associative cache mapping techniques.
Cache Memory — Mapping Functions — K-way Set-Associative Mapping
Associated cache set number = (block number) modulo (number of sets)
Special cases:
• If k = 1, k-way set-associative mapping becomes direct mapping.
• If k = the total number of lines in the cache, k-way set-associative mapping becomes fully associative mapping.
Cache Memory — Mapping Functions — K-way Set-Associative Mapping
In set-associative mapping, each cache set maps the main-memory blocks that share its set number.
Cache Memory — Mapping Functions — K-way Set-Associative Mapping
Exercise
From the presented cache mapping, find:
- the total size of the main memory,
- the total number of sets,
- the total number of ways,
- the total number of blocks,
- the total number of bytes.
Redo the same question if there are 8 ways instead of two, and explain the advantage.
[Answers] Memory size = 16 GB; 2 ways; number of sets = 4; number of blocks = 2^30; number of bytes = 4 × 2^32 = 16 GB. With 8 ways, nothing changes except that cache performance will be better (the miss rate will decrease).
Cache Memory — Mapping Functions — K-way Set-Associative Mapping
Exercise
If the total number of sets in a cache is 4 and the total number of ways (lines) within one set is 2, find the set number for the blocks 00, 01, 06, 11, 54.
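The set numbers in the exercise follow from set = block mod (number of sets); reading the block numbers as decimal (an assumption — the exercise does not say), a sketch gives:

```python
def set_number(block: int, n_sets: int = 4) -> int:
    """k-way set-associative placement: set = block mod number of sets."""
    return block % n_sets

# Blocks from the exercise, read as decimal numbers.
for block in (0, 1, 6, 11, 54):
    print(f"block {block:2d} -> set {set_number(block)}")
```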
Cache Memory — Mapping Functions — K-way Set-Associative Mapping
Set-associative mapping is a combination of direct mapping and fully associative mapping: it uses fully associative mapping within each set. Thus, set-associative mapping needs a replacement algorithm.
Summary
Levels of Memory
Level 1 — Registers: memory located inside the CPU that holds the data being processed immediately. The most commonly used registers are the Accumulator, the Program Counter, the Address Register, etc.
Level 2 — Cache memory: the fastest memory, with the shortest access time, where data is temporarily stored for faster access.
Level 3 — Main memory: the memory the computer currently works on. It is small in size, and once power is off the data no longer stays in this memory.
Level 4 — Secondary memory: external memory that is not as fast as main memory, but where data stays permanently.
Cache memory operates between 10 and 100 times faster than RAM, requiring only a few nanoseconds to respond to a CPU request. The actual hardware used for cache memory is high-speed static random access memory (SRAM).
Summary
Cache mapping is the technique that defines how the contents of main memory are brought into the cache. The cache mapping techniques are:
- Direct mapping: simple to implement, with low hardware complexity (only one tag comparator) and a short access time.
- Fully associative mapping: more flexible than direct mapping, with a higher possible hit ratio (HR) for a given cache capacity, but requiring more hardware. It is considered the fastest and most flexible mapping form, at the cost of extra hardware, and is best suited to relatively small caches because of the large number of hardware comparators.
We can improve cache performance by using a larger cache line size and higher associativity, and by reducing the miss rate, the miss penalty, and the hit time.
Summary
Direct-mapped caches are commonly used in microcontrollers and simple embedded systems with limited hardware resources.
The direct-mapped cache is simple and fast, but it may not be suitable for all memory access patterns and workloads, due to its higher conflict and miss rates and limited associativity. With a direct-mapped cache, there is no requirement for a replacement algorithm.
One of the major use cases is in embedded systems where power consumption, simplicity, and determinism are critical considerations. These applications often operate with limited resources and require efficient memory access with predictable timing.
Direct-mapped caches offer constant access times and predictable cache behavior, making them suitable for real-time systems such as aerospace, automotive, or industrial control applications.
We also use them when low latency is a critical requirement: thanks to their simple and straightforward design, direct-mapped caches can offer low access times for specific memory access patterns.
Summary
Temporal and spatial locality ensure that nearly all references can be found in smaller memories, while at the same time giving the illusion of a large, fast memory presented to the processor.
The concept of cache locality (both temporal and spatial) helps you make your programs run faster by keeping frequently accessed data close by in fast-access memory. By understanding and applying this concept, you can write more efficient code that takes full advantage of the computer's caching system.
In this code example, if result, data1, and data2 are pointers to 0x00, 0x40, and 0x80 respectively, then the loop causes repeated accesses to memory locations that all map to the same line in a basic direct-mapped cache. The same thing happens on each iteration of the loop, and the software performs poorly. Direct-mapped caches are therefore not typically used in the main caches.
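The loop referred to above is not reproduced in the source; the following sketch (with hypothetical cache dimensions) simulates the conflict — result (0x00), data1 (0x40), and data2 (0x80) all mapping to line 0 of a tiny direct-mapped cache:

```python
# A minimal simulation of the thrashing described above:
# a direct-mapped cache with 4 lines of 16 bytes (hypothetical sizes).
N_LINES, LINE_SIZE = 4, 16
tags = [None] * N_LINES
misses = 0

def access(addr: int):
    """Touch one byte; count a miss when the line holds a different block."""
    global misses
    block = addr // LINE_SIZE
    line, tag = block % N_LINES, block // N_LINES
    if tags[line] != tag:
        misses += 1
        tags[line] = tag

for i in range(8):          # result[i] = data1[i] + data2[i]
    access(0x40 + i)        # read data1[i]
    access(0x80 + i)        # read data2[i]
    access(0x00 + i)        # write result[i]

print(misses)  # 24 -> every single access misses (cache thrashing)
```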