Cache Memory Virtual Memory
Chapter 8
Chapter 8 :: Topics
• Introduction
• Memory System Performance Analysis
• Caches
• Virtual Memory
• Memory-Mapped I/O
• Summary
Introduction
• Computer performance depends on:
– Processor performance
– Memory system performance
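[Figure: Memory interface. The processor drives Address, WriteData, and MemWrite (connected to the memory's WE input); the memory returns ReadData. Both processor and memory are clocked by CLK.]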
Processor-Memory Gap
In prior chapters, we assumed memory could be accessed in 1 clock
cycle, but that hasn’t been true since the 1980s
Memory Hierarchy
Technology    Price / GB    Access Time (ns)    Bandwidth (GB/s)
DRAM          $10           10 - 50             10
[Figure: memory hierarchy levels ordered by speed; main memory is one of the levels.]
Locality
Exploit locality to make memory accesses fast
• Temporal Locality:
– Locality in time
– If data used recently, likely to use it again soon
– How to exploit: keep recently accessed data in higher
levels of memory hierarchy
• Spatial Locality:
– Locality in space
– If data used recently, likely to use nearby data soon
– How to exploit: when access data, bring nearby data
into higher levels of memory hierarchy too
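The two kinds of locality can be seen in an ordinary array loop. The sketch below is only an illustration: the array size and the function names `sum_row_major` and `sum_col_major` are assumptions, not part of the slides. Both functions reuse `sum` on every iteration (temporal locality), but only the row-major version touches adjacent addresses on consecutive iterations (spatial locality).

```c
#include <assert.h>
#include <stddef.h>

#define ROWS 64   /* illustrative size, not from the slides */
#define COLS 64

/* Row-major traversal: consecutive iterations read adjacent addresses,
   so data brought in alongside one element is used right away. */
long sum_row_major(int a[ROWS][COLS]) {
    assert(a != NULL);
    long sum = 0;                      /* reused every iteration */
    for (size_t i = 0; i < ROWS; i++)
        for (size_t j = 0; j < COLS; j++)
            sum += a[i][j];
    return sum;
}

/* Column-major traversal: same result, but each access is COLS ints
   away from the previous one, so nearby data is used much later. */
long sum_col_major(int a[ROWS][COLS]) {
    assert(a != NULL);
    long sum = 0;
    for (size_t j = 0; j < COLS; j++)
        for (size_t i = 0; i < ROWS; i++)
            sum += a[i][j];
    return sum;
}
```

Both functions return the same sum; on a machine with a cache, the row-major order would typically be expected to miss less often, although the actual difference depends on the cache and array sizes.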
Memory Performance
• Hit: data found in that level of memory hierarchy
• Miss: data not found (must go to next level)
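Hits and misses are usually summarized as rates: the miss rate is the fraction of accesses that miss, and the hit rate is 1 minus the miss rate. A minimal sketch, assuming the access and miss counts are already known (the function names are illustrative):

```c
#include <assert.h>

/* Fraction of accesses not found in this level of the hierarchy. */
double miss_rate(unsigned long misses, unsigned long accesses) {
    assert(accesses > 0 && misses <= accesses);
    return (double)misses / (double)accesses;
}

/* Fraction of accesses found in this level: 1 - miss rate. */
double hit_rate(unsigned long misses, unsigned long accesses) {
    return 1.0 - miss_rate(misses, accesses);
}
```

For example, 250 misses out of 1000 accesses gives a miss rate of 0.25 and a hit rate of 0.75.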
Cache
• Highest level in memory hierarchy
• Fast (typically ~ 1 cycle access time)
• Ideally supplies most data to processor
• Usually holds most recently accessed data
Cache Terminology
• Capacity (C):
– number of data bytes in cache
• Block size (b):
– bytes of data brought into cache at once
• Number of blocks (B):
– number of blocks in cache: B = C/b
• Degree of associativity (N):
– number of blocks in a set
• Number of sets (S = B/N):
– each memory address maps to exactly one cache set
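The relationships B = C/b and S = B/N can be checked with a few lines of arithmetic. This is a small sketch with illustrative function names; capacity and block size only need to be in the same unit (bytes or words).

```c
#include <assert.h>

/* B = C / b: number of blocks in the cache. */
unsigned num_blocks(unsigned capacity, unsigned block_size) {
    assert(block_size > 0 && capacity % block_size == 0);
    return capacity / block_size;
}

/* S = B / N: number of sets for N-way associativity. */
unsigned num_sets(unsigned capacity, unsigned block_size, unsigned assoc) {
    unsigned blocks = num_blocks(capacity, block_size);
    assert(assoc > 0 && blocks % assoc == 0);
    return blocks / assoc;
}
```

With C = 8 words and b = 1 word, N = 1 (direct mapped) gives 8 sets, N = 2 gives 4 sets, and N = 8 (fully associative) gives a single set.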
Address          Data              Set Number
00...00100100    mem[0x00...24]    1 (001)
00...00100000    mem[0x00...20]    0 (000)
00...00011100    mem[0x00...1C]    7 (111)
00...00011000    mem[0x00...18]    6 (110)
00...00010100    mem[0x00...14]    5 (101)
00...00010000    mem[0x00...10]    4 (100)
00...00001100    mem[0x00...0C]    3 (011)
00...00001000    mem[0x00...08]    2 (010)
00...00000100    mem[0x00...04]    1 (001)
00...00000000    mem[0x00...00]    0 (000)
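The mapping in the table above can be written as bit operations on the address. This sketch assumes the configuration the figure implies: 8 sets, one-word (4-byte) blocks, and 32-bit addresses, so bits 1:0 are the byte offset, bits 4:2 select the set, and the remaining 27 bits are the tag. The function names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_SETS 8u   /* 8 one-word blocks, direct mapped */

/* Set number: address bits 4:2 (skip the 2-bit byte offset). */
uint32_t set_index(uint32_t addr) {
    uint32_t set = (addr >> 2) % NUM_SETS;
    assert(set < NUM_SETS);
    return set;
}

/* Tag: the remaining upper 27 bits, address bits 31:5. */
uint32_t tag_of(uint32_t addr) {
    return addr >> 5;
}
```

Addresses 0x04 and 0x24 both map to set 1 but have different tags (0 and 1), which is how the cache tells them apart.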
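[Figure: Direct-mapped cache hardware. An 8-entry × (1+27+32)-bit SRAM holds a valid bit, a 27-bit tag, and 32 bits of data per entry; the outputs are Hit and Data.]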
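[Figure: Cache with two ways, each holding a 28-bit tag and 32 bits of data; two tag comparators produce Hit and select Data.]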
[Figure: Cache with eight ways; each way holds V, Tag, and Data.]
Spatial Locality?
• Increase block size:
– Block size, b = 4 words
– C = 8 words
– Direct mapped (1 block per set)
– Number of blocks, B = 2 (C/b = 8/4 = 2)
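[Figure: Direct-mapped cache with 4-word blocks. The memory address is split into a 27-bit Tag, a Set bit, a 2-bit Block Offset, and a 2-bit Byte Offset (00). Each of the two sets (Set 1, Set 0) holds V, a 27-bit Tag, and four 32-bit data words; the block offset (11, 10, 01, 00) selects one 32-bit word as Data, and a tag comparison produces Hit.]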
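With 4-word blocks, the address gains a block offset field. The sketch below splits a 32-bit address into the four fields for this configuration (C = 8 words, b = 4 words, direct mapped, so 2 sets). The struct and function names are illustrative assumptions.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t tag;           /* bits 31:5 (27 bits) */
    uint32_t set;           /* bit 4 (2 sets)      */
    uint32_t block_offset;  /* bits 3:2, word within the block */
    uint32_t byte_offset;   /* bits 1:0, byte within the word  */
} addr_fields_t;

addr_fields_t split_address(uint32_t addr) {
    addr_fields_t f;
    f.byte_offset  = addr & 0x3;
    f.block_offset = (addr >> 2) & 0x3;
    f.set          = (addr >> 4) & 0x1;
    f.tag          = addr >> 5;
    assert(f.tag < (1u << 27));   /* tag fits in 27 bits */
    return f;
}
```

For example, address 0x1C falls in set 1 at block offset 3, so a miss on 0x10 also brings in 0x14, 0x18, and 0x1C, which is how the larger block exploits spatial locality.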
Capacity Misses
• Cache is too small to hold all data of interest at once
• If cache full: program accesses data X & evicts data Y
• Capacity miss when access Y again
• How to choose Y to minimize chance of needing it again?
• Least recently used (LRU) replacement: the least recently
used block in a set is evicted
Types of Misses
• Compulsory: first time data accessed
• Capacity: cache too small to hold all data of
interest
• Conflict: data of interest maps to same
location in cache
LRU Replacement
# MIPS assembly
lw $t0, 0x04($0)
lw $t1, 0x24($0)
lw $t2, 0x54($0)
[Figure: contents of a two-way cache (Way 1, Way 0) for the loads above.]
Cache Summary
• What data is held in the cache?
– Recently used data (temporal locality)
– Nearby data (spatial locality)
• How is data found?
– Set is determined by address of data
– Word within block also determined by address
– In associative caches, data could be in one of several
ways
• What data is replaced?
– Least-recently used way in the set
Adapted from Patterson & Hennessy, Computer Architecture: A Quantitative Approach, 2011
Multilevel Caches
• Larger caches have lower miss rates, longer
access times
• Expand memory hierarchy to multiple levels of
caches
• Level 1: small and fast (e.g. 16 KB, 1 cycle)
• Level 2: larger and slower (e.g. 256 KB, 2-6
cycles)
• Most modern PCs have L1, L2, and L3 cache
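One common way to reason about the benefit of a second cache level is the average memory access time (AMAT). The formula below is the standard two-level form, not something stated on these slides, and the miss rates and main-memory time used in the example are made-up values for illustration.

```c
#include <assert.h>

/* AMAT = t_L1 + MR_L1 * (t_L2 + MR_L2 * t_MM), all times in cycles.
   mr_l1 and mr_l2 are the miss rates of each level (0.0 to 1.0). */
double amat_two_level(double t_l1, double mr_l1,
                      double t_l2, double mr_l2, double t_mm) {
    assert(mr_l1 >= 0.0 && mr_l1 <= 1.0);
    assert(mr_l2 >= 0.0 && mr_l2 <= 1.0);
    return t_l1 + mr_l1 * (t_l2 + mr_l2 * t_mm);
}
```

For instance, `amat_two_level(1, 0.05, 6, 0.2, 100)` evaluates 1 + 0.05 × (6 + 0.2 × 100), or about 2.3 cycles, under those assumed numbers.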
Virtual Memory
• Gives the illusion of bigger memory
• Main memory (DRAM) acts as cache for hard
disk
[Figure: Hard disk, showing the magnetic disks and the read/write head.]
Virtual Memory
• Virtual addresses
– Programs use virtual addresses
– Entire virtual address space stored on a hard drive
– Subset of virtual address data in DRAM
– CPU translates virtual addresses into physical addresses
(DRAM addresses)
– Data not in DRAM fetched from hard drive
• Memory Protection
– Each program has own virtual to physical mapping
– Two programs can use same virtual address for different data
– Programs don’t need to be aware others are running
– One program (or virus) can’t corrupt memory used by
another
Address Translation
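[Figure: Page table. The VPN is the index into the page table; each entry holds a valid bit (V) and a physical page number. In the access shown, the entry is valid (Hit), and the 15-bit physical page number 0x7FFF is joined with the 12-bit page offset 0x47C to form the physical address 0x7FFF47C.]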
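• What is the physical address of virtual address 0x5F20?
– VPN = 5
– Entry 5 in page table: VPN 5 => physical page 1
– Physical address: 0x1F20
[Figure: Page table with valid bit (V) and physical page number columns. Entry 5 is valid and holds physical page number 0x0001; the 15-bit physical page number joined with the 12-bit page offset 0xF20 gives the physical address (Hit).]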
• What is the physical address of virtual address 0x73E0?
– VPN = 7
– Entry 7 is invalid
– Virtual page must be paged into physical memory from disk
[Figure: Page table with valid bit (V) and physical page number columns; entry 7 has V = 0, so the access does not hit.]
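The two page table lookups above can be reproduced with a short translation routine. This is a sketch under stated assumptions: a 12-bit page offset (as the figure's 15/12 split suggests), and a table that models only virtual pages 0 to 7. The entry for VPN 5 and the invalid entry 7 come from the examples; the entry for VPN 2 is read off the figure and should be treated as an assumption. All names are illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_OFFSET_BITS 12u
#define PAGE_OFFSET_MASK ((1u << PAGE_OFFSET_BITS) - 1u)

typedef struct {
    bool     valid;  /* V bit */
    uint32_t ppn;    /* physical page number */
} pte_t;

/* Only virtual pages 0..7 are modeled; unlisted entries are invalid. */
static const pte_t example_page_table[8] = {
    [2] = { true, 0x7FFF },   /* assumed from the figure */
    [5] = { true, 0x0001 },   /* from the 0x5F20 example */
};

/* Returns true and writes *paddr on a hit; returns false when the
   entry is invalid (the page would have to be brought in from disk). */
bool translate(const pte_t *table, size_t entries,
               uint32_t vaddr, uint32_t *paddr) {
    assert(table != NULL && paddr != NULL);
    uint32_t vpn = vaddr >> PAGE_OFFSET_BITS;   /* index into table */
    if (vpn >= entries || !table[vpn].valid)
        return false;
    *paddr = (table[vpn].ppn << PAGE_OFFSET_BITS)
           | (vaddr & PAGE_OFFSET_MASK);
    return true;
}
```

Calling `translate(example_page_table, 8, 0x5F20, &pa)` succeeds with `pa == 0x1F20`, while the same call with 0x73E0 returns false because entry 7 is invalid.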
Translation Lookaside Buffer (TLB)
• Page table accesses: high temporal locality
– Large page size, so consecutive loads/stores likely to
access same page
• TLB
– Small: accessed in < 1 cycle
– Typically 16 - 512 entries
– Fully associative
– > 99 % hit rates typical
– Reduces # of memory accesses for most loads/stores
from 2 to 1
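A TLB lookup can be modeled as a search over a handful of cached translations. In hardware a fully associative TLB compares all entries in parallel; the loop below is only a software stand-in for that. The entry layout, the names, and the 12-bit page offset are assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_OFFSET_BITS 12u
#define PAGE_OFFSET_MASK ((1u << PAGE_OFFSET_BITS) - 1u)

typedef struct {
    bool     valid;
    uint32_t vpn;   /* virtual page number (acts as the tag) */
    uint32_t ppn;   /* cached physical page number */
} tlb_entry_t;

/* Returns true on a TLB hit and writes the physical address.
   On a miss the caller would fall back to the page table in memory. */
bool tlb_lookup(const tlb_entry_t *tlb, size_t entries,
                uint32_t vaddr, uint32_t *paddr) {
    assert(tlb != NULL && paddr != NULL);
    uint32_t vpn = vaddr >> PAGE_OFFSET_BITS;
    for (size_t i = 0; i < entries; i++) {
        if (tlb[i].valid && tlb[i].vpn == vpn) {
            *paddr = (tlb[i].ppn << PAGE_OFFSET_BITS)
                   | (vaddr & PAGE_OFFSET_MASK);
            return true;
        }
    }
    return false;
}
```

On a hit the load or store needs only its own memory access; on a miss it also needs the page table access, which is the "2 to 1" reduction described above.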
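[Figure: Two-entry TLB (Entry 1, Entry 0). The virtual page number is compared against both entries in parallel; on a hit, the 15-bit physical page number 0x7FFF is joined with the 12-bit page offset 0x47C to form the physical address.]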
Memory Protection
• Multiple processes (programs) run at once
• Each process has its own page table
• Each process can use entire virtual address
space
• A process can only access physical pages
mapped in its own page table
Memory-Mapped I/O
• Processor accesses I/O devices (like keyboards,
monitors, printers) just like memory
• Each I/O device is assigned one or more
addresses
• When one of those addresses is detected, data is
read from or written to the I/O device instead of
memory
• A portion of the address space is dedicated to
I/O devices
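[Figure: Memory-mapped I/O hardware. Address-decoding logic uses Address and MemWrite to generate the memory write enable (WEM), the enables (EN) for I/O Device 1 and I/O Device 2, and RDsel1:0. A multiplexer controlled by RDsel1:0 (inputs 00, 01, 10) selects ReadData from the memory or from one of the two I/O devices. WriteData and CLK go to the memory and both devices.]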
Digital I/O
// C Code
#include <p32xxxx.h>   // PIC32 register definitions (TRISD, PORTD)

int main(void) {
  int switches;

  TRISD = 0xFF00;      // RD[7:0] outputs (LEDs),
                       // RD[11:8] inputs (switches)
  while (1) {
    // read & mask switches, RD[11:8]
    switches = (PORTD >> 8) & 0xF;
    PORTD = switches;  // display on LEDs
  }
}
Serial I/O