CMP3010L09 MemoryII

The document covers the concepts of memory hierarchy, focusing on cache types, performance metrics, and optimization techniques. It discusses fully associative and set associative caches, cache performance measurements, and the differences between cache and virtual memory. Additionally, it explores software optimization strategies for improving cache efficiency, particularly in matrix operations.


CMP3010: Computer Architecture

L09: Memory Hierarchy II

Dina Tantawy
Computer Engineering Department
Cairo University
Agenda
• Review
• Fully Associative Cache
• Set Associative Cache
• Cache performance
• Cache vs virtual memory
• Software optimization based on caching
Review: memory hierarchy

[Figure: memory hierarchy diagram (not reproduced)]
Review: memory hierarchy

Principle of locality
States that programs access a relatively small portion of their
address space at any instant of time.

Temporal locality: if an item is referenced, it will tend to be referenced again soon.

Spatial locality: if an item is referenced, items whose addresses are close by will tend to be referenced soon.
Review: Terminologies
• Hit: data appears in some block in the upper level
– Hit Rate: the fraction of memory accesses found in the upper level
– Hit Time: time to access the upper level, which consists of
cache access time + time to determine hit/miss

• Miss: data needs to be retrieved from a block in the lower level
– Miss Rate = 1 - (Hit Rate)
– Miss Penalty = time to replace a block in the upper level + time to deliver the
block to the processor

• Hit Time << Miss Penalty


6
Review: The Basics of Caches
• How do we know if a data item is in the cache?
• How do we find it?

7
Review: Direct Mapped Cache

8
Review: Read and Write Policies
• Two write options when the data block is in the cache:
• Write Through: write to cache and memory at the same time.
• Isn’t memory too slow for this?

• Write Back: write to cache only. Write the cache block to memory
when that cache block is being replaced on a cache miss.
• Need a “dirty” bit for each cache block
• Control can be complex

9
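The write-back bookkeeping above can be sketched in C. This is an illustration, not the lecture's code: the names (cache_line, cache_write, cache_evict) are made up, and the cache is reduced to a single block so only the dirty-bit logic shows.

```c
#include <stdbool.h>
#include <string.h>

#define BLOCK_BYTES 32          /* hypothetical block size */

typedef struct {
    bool valid;
    bool dirty;                 /* the "dirty" bit from the slide */
    unsigned tag;
    unsigned char data[BLOCK_BYTES];
} cache_line;

unsigned char memory[1024];     /* toy backing store */

/* Write-back on a write hit: update only the cache and set dirty. */
void cache_write(cache_line *line, unsigned offset, unsigned char value) {
    line->data[offset] = value;
    line->dirty = true;         /* memory is now stale until eviction */
}

/* On replacement, a dirty block must first be written back to memory. */
void cache_evict(cache_line *line, unsigned block_addr) {
    if (line->valid && line->dirty)
        memcpy(&memory[block_addr], line->data, BLOCK_BYTES);
    line->valid = false;
    line->dirty = false;
}
```

With write-through there would be no dirty bit: cache_write would also store to memory immediately, which is why the slide asks whether memory is too slow for that.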
Review: Write Miss Policies
• Write allocate (also called fetch on write): data at the missed-write location is loaded into the cache, followed by a write-hit operation. In this approach, write misses are like read misses.

• No-write allocate (also called write-no-allocate or write around): data at the missed-write location is not loaded into the cache, and is written directly to the backing store. In this approach, data is loaded into the cache on read misses only.
What about other mappings?
Flexible placement of blocks: Associativity

[Figure: a 32-block memory and an 8-block cache, showing the possible placements of memory block 12]

• Fully associative: block 12 can be placed anywhere in the cache.
• (2-way) set associative: block 12 can be placed anywhere in set 0 (12 mod 4).
• Direct mapped: block 12 can be placed only into block 4 (12 mod 8).

12
Fully Associative
• Fully Associative Cache -- push the set associative idea to its limit!
• Forget about the Cache Index
• Compare the Cache Tags of all cache entries in parallel
• Example: Block Size = 32 B, so we need N 27-bit comparators
• By definition: Conflict Miss = 0 for a fully associative cache
[Figure: a fully associative lookup — the address splits into a 27-bit Cache Tag (bits 31–5) and a Byte Select (bits 4–0, e.g. 0x01); the tag is compared against every valid entry's tag in parallel to select the matching block's data (Byte 0 … Byte 31, Byte 32 … Byte 63, …)]
13
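The address split on this slide (32-bit addresses, 32 B blocks, so a 5-bit byte select and a 27-bit tag) can be sketched with two small helpers; the names are illustrative, not from the slides.

```c
#include <stdint.h>

/* For 32-byte blocks, the low 5 bits select a byte within the block;
   in a fully associative cache there is no index field, so all
   remaining 27 bits form the tag (assuming 32-bit addresses). */
enum { BYTE_SELECT_BITS = 5 };

uint32_t tag_of(uint32_t addr) {
    return addr >> BYTE_SELECT_BITS;                 /* upper 27 bits */
}

uint32_t byte_select_of(uint32_t addr) {
    return addr & ((1u << BYTE_SELECT_BITS) - 1);    /* lower 5 bits */
}
```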
Flexible placement of blocks: Associativity

[Figure: the range of cache organizations, from direct mapped through set associative to fully associative (not reproduced)]

14
A Four-way Set Associative Cache
• N-way set associative: N
entries for each Cache Index
• N direct mapped caches
operate in parallel
• Example: Four-way set
associative cache
• Cache Index selects a “set”
from the cache
• The four tags in the set are
compared in parallel
• Data is selected based on the
tag result

15
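The lookup described above can be sketched in C. The sizes are illustrative (matching the quiz on the next slide: 16 KB, 32 B blocks, 4 ways), and the tags are compared sequentially here where real hardware compares all four in parallel.

```c
#include <stdint.h>
#include <stdbool.h>

#define WAYS        4        /* 4-way set associative */
#define NUM_SETS    128      /* e.g. 16 KB / (32 B * 4 ways) */
#define OFFSET_BITS 5        /* 32-byte blocks */
#define INDEX_BITS  7        /* log2(128) sets */

typedef struct { bool valid; uint32_t tag; } way_entry;

way_entry cache[NUM_SETS][WAYS];

/* Cache Index selects a set; the four tags in it are then compared. */
bool lookup(uint32_t addr) {
    uint32_t index = (addr >> OFFSET_BITS) & ((1u << INDEX_BITS) - 1);
    uint32_t tag   = addr >> (OFFSET_BITS + INDEX_BITS);
    for (int w = 0; w < WAYS; w++)
        if (cache[index][w].valid && cache[index][w].tag == tag)
            return true;     /* hit in one of the four ways */
    return false;            /* miss */
}
```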
Replacement Policy
In an associative cache, which block from a set should be
evicted when the set becomes full?

• Random
• Least-Recently Used (LRU)
• LRU cache state must be updated on every access
• true implementation only feasible for small sets (2-way)

• First-In, First-Out (FIFO) a.k.a. Round-Robin
• used in highly associative caches

Replacement only happens on misses

16
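For the 2-way case where true LRU is feasible, the state per set is a single bit naming the least recently used way. A minimal sketch, with hypothetical names:

```c
/* One bit of LRU state per 2-way set: the way to evict next. */
typedef struct { int lru_way; } set_state;

/* LRU state must be updated on every access: the way just
   touched becomes most recent, so the other way becomes LRU. */
void touch(set_state *s, int way) { s->lru_way = 1 - way; }

/* On a miss with a full set, evict the least recently used way. */
int victim(const set_state *s)    { return s->lru_way; }
```

For higher associativity this bookkeeping grows quickly, which is why the slide notes that true LRU is only feasible for small sets.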
Quiz
Assume a 16 Kbyte cache that holds both instructions and data.
Additional specs for the 16 Kbyte cache include:
- Each block will hold 32 bytes of data
- The cache would be 4-way set associative
- Physical addresses are 32 bits

Q1: How many blocks would be in this cache?


Q2: How many bits of tag are stored with each block entry?

17
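The quiz arithmetic can be checked mechanically (spoiler: the code below computes the answers, so try the quiz first). All parameters come from the slide; the function names are made up.

```c
/* Quiz parameters from the slide. */
enum { CACHE_BYTES = 16 * 1024, BLOCK_BYTES = 32, WAYS = 4, ADDR_BITS = 32 };

static int ilog2(int x) { int b = 0; while (x > 1) { x >>= 1; b++; } return b; }

/* Q1: number of blocks = cache size / block size. */
int quiz_blocks(void) { return CACHE_BYTES / BLOCK_BYTES; }

/* Q2: tag bits = address bits - index bits - byte-offset bits,
   where the index selects one of blocks/ways sets. */
int quiz_tag_bits(void) {
    int sets = quiz_blocks() / WAYS;
    return ADDR_BITS - ilog2(sets) - ilog2(BLOCK_BYTES);
}
```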
Cache Performance

Measuring Cache Performance
Impact of cache misses on performance

19
Example

20
Example: Solution

21
Example: Solution

22
Improving Cache Performance

Average memory access time (AMAT) =
Hit time + Miss rate × Miss penalty

To improve performance:
• reduce the hit time
• reduce the miss rate
• reduce the miss penalty

23
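The AMAT formula above is a one-liner in code. The numbers in the comment are made up for illustration, not taken from the slides.

```c
/* Average memory access time: hit time plus the expected miss cost.
   Example: 1-cycle hit, 5% miss rate, 100-cycle penalty -> 6 cycles. */
double amat(double hit_time, double miss_rate, double miss_penalty) {
    return hit_time + miss_rate * miss_penalty;
}
```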
Sources of Cache Misses
• Compulsory (cold start, first reference): first access to a block
• Misses that would occur even with infinite cache
• “Cold” fact of life: not a whole lot you can do about it

• Conflict (collision):
• Multiple memory locations mapped to the same cache location
• Solution 1: increase cache size
• Solution 2: increase associativity

• Capacity:
• Cache cannot contain all blocks accessed by the program
• Solution: increase cache size
Reducing Miss Penalty Using Multilevel Caches
• Use smaller L1 if there is also L2
• Trade increased L1 miss rate for reduced L1 hit time
and reduced L1 miss penalty
• Reduces average access time

CPU ↔ L1 ↔ L2 ↔ DRAM

25
Performance of Multilevel Caches

26
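The figure for this slide is not reproduced; what such slides typically show is the standard two-level extension of AMAT, where an L1 miss pays the L2 hit time and an L2 miss additionally pays the DRAM penalty. A sketch (the formula is the standard textbook relation, not copied from the missing figure; the example numbers are made up):

```c
/* AMAT with two cache levels:
   AMAT = HitTime_L1 + MissRate_L1 * (HitTime_L2 + MissRate_L2 * MissPenalty_L2) */
double amat_two_level(double l1_hit, double l1_miss_rate,
                      double l2_hit, double l2_miss_rate,
                      double dram_penalty) {
    return l1_hit + l1_miss_rate * (l2_hit + l2_miss_rate * dram_penalty);
}
```

For example, a 1-cycle L1 hit, 10% L1 miss rate, 10-cycle L2 hit, 20% L2 miss rate, and 100-cycle DRAM penalty give 1 + 0.1 × (10 + 0.2 × 100) = 4 cycles on average.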
Effect of Cache Parameters on Performance
• Larger cache size
+ reduces capacity and conflict misses
- hit time will increase

• Higher associativity
+ reduces conflict misses
- may increase hit time

• Larger block size
+ reduces compulsory misses and reload
- increases conflict misses and miss penalty

27
Quiz
• Suppose a processor executes at
• Clock Rate = 1 GHz (1 ns per cycle), Ideal (no misses) CPI = 1.5
• 40% arith/logic, 40% ld/st, 20% control
• Suppose that 5% of memory operations (involving data) get 100 cycle miss penalty
• Suppose that 2% of instructions get same miss penalty

Determine how much faster a processor with a perfect cache (one that never misses) would run.

28
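As with the earlier quiz, the arithmetic can be checked in code (spoiler: it computes the answer). The reasoning: with 40% of instructions being loads/stores and a 5% data miss rate, data stalls add 0.40 × 0.05 × 100 = 2 cycles per instruction; instruction misses add 0.02 × 100 = 2 more. The function names are made up.

```c
/* Memory-stall cycles per instruction, from the quiz parameters. */
double stall_cpi(void) {
    return 0.40 * 0.05 * 100.0   /* data-access stalls  = 2.0 CPI */
         + 0.02 * 100.0;         /* instr-fetch stalls  = 2.0 CPI */
}

/* Speedup of a perfect cache = (ideal CPI + stalls) / ideal CPI. */
double speedup_with_perfect_cache(double ideal_cpi) {
    return (ideal_cpi + stall_cpi()) / ideal_cpi;
}
```

With the ideal CPI of 1.5, the real machine runs at CPI 5.5, so the perfect-cache machine is about 3.67× faster.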
Is Virtual Memory the same as Caching?
Virtual Memory vs Cache Memory

• Virtual memory increases the capacity of main memory; cache memory increases the accessing speed of the CPU.
• Virtual memory is not a memory unit, it is a technique; cache memory is hardware.
• The operating system manages virtual memory; hardware manages the cache memory.
• The size of virtual memory can be greater than main memory; the size of cache memory is less than main memory.
How can we benefit from cache?

Software Optimization via Blocking
• When dealing with arrays, we can get good performance from the
memory system if we store the array in memory so that accesses to
the array are sequential in memory. What about matrices?

• How is a matrix stored?
• Row major (row by row)
• Column major (column by column)

• A 512x512 matrix needs 1 MB, much bigger than the level-1
cache. It doesn't fit in the cache!
Software Optimization via Blocking
• How is matrix multiplication done?

/* C = C + A * B, column-major n x n matrices
   (the slide's snippet; the outer i loop was missing) */
for (int i = 0; i < n; ++i)
    for (int j = 0; j < n; ++j)
    {
        double cij = C[i+j*n];            /* cij = C[i][j] */
        for (int k = 0; k < n; k++)
            cij += A[i+k*n] * B[k+j*n];   /* cij += A[i][k]*B[k][j] */
        C[i+j*n] = cij;                   /* C[i][j] = cij */
    }

Software Optimization via Blocking

Do we need to store all three matrices? Isn't that increasing cache misses due to replacement?

[Figure legend: white = not accessed, light grey = old access, dark grey = new access]
Software Optimization via Blocking

[Figure: Blocked DGEMM — the blocked access pattern and resulting performance (not reproduced)]
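The blocked DGEMM the slides refer to can be sketched as follows. This is the standard textbook formulation, not the slides' own code: it uses the same column-major indexing as the earlier loop, BLOCKSIZE and the function names are illustrative, and n is assumed to be a multiple of BLOCKSIZE. The point is that each call to do_block touches only three BLOCKSIZE × BLOCKSIZE submatrices, chosen small enough to stay resident in the level-1 cache.

```c
#define BLOCKSIZE 32    /* illustrative; pick so 3 blocks fit in L1 */

/* Multiply one block: C += A * B over the BLOCKSIZE-wide submatrices
   whose top-left corners are given by (si, sj, sk), column-major. */
static void do_block(int n, int si, int sj, int sk,
                     const double *A, const double *B, double *C) {
    for (int i = si; i < si + BLOCKSIZE; ++i)
        for (int j = sj; j < sj + BLOCKSIZE; ++j) {
            double cij = C[i + j * n];               /* C[i][j] */
            for (int k = sk; k < sk + BLOCKSIZE; ++k)
                cij += A[i + k * n] * B[k + j * n];  /* A[i][k]*B[k][j] */
            C[i + j * n] = cij;
        }
}

/* Blocked DGEMM: C = C + A * B, n assumed a multiple of BLOCKSIZE. */
void dgemm_blocked(int n, const double *A, const double *B, double *C) {
    for (int sj = 0; sj < n; sj += BLOCKSIZE)
        for (int si = 0; si < n; si += BLOCKSIZE)
            for (int sk = 0; sk < n; sk += BLOCKSIZE)
                do_block(n, si, sj, sk, A, B, C);
}
```

Because each block of C is completed from cache-resident blocks of A and B before moving on, the replacement misses the previous slide asks about are largely avoided.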
Thank you

