
Understand CPU Caching Concepts
Concept of Caching
The need for a cache has come about for two reasons:

The concept of locality of reference.
-> Roughly 5 percent of the data is accessed 95 percent of the time, so
it makes sense to cache that 5 percent.

The gap between CPU and main memory speeds.
-> By analogy with the producer-consumer problem, the CPU is the
consumer and RAM and hard disks act as producers. Slow
producers limit the performance of the consumer.
Locality of Reference
 Spatial locality : If a particular memory location, say the nth location, is
referenced at a particular time, then it is likely that the (n+1)th memory
location will be referenced in the near future.
The actual piece of data that was requested is called the critical word,
and the surrounding group of bytes that gets fetched along with it is
called a cache line or cache block.

 Temporal locality : If a particular memory location is referenced at some
point in time T, then it is likely that the same location will be
referenced again at time T + delta.
This is very similar to the concept of the working set, i.e., the set of pages
which the CPU frequently accesses.
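
Both kinds of locality show up in ordinary code. A minimal sketch in C: the
sequential walk exploits spatial locality (one fetched cache line serves several
subsequent accesses), while the repeatedly used accumulator exploits temporal
locality. The array size is illustrative.

#include <stddef.h>
#include <stdio.h>

/* Sums an array by walking it sequentially.
 * Spatial locality: a[i] and a[i+1] usually share a cache line,
 * so one miss brings in data for the next several iterations.
 * Temporal locality: `sum` and `i` are reused on every iteration
 * and stay in registers or the L1 cache. */
long sum_array(const int *a, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[i];
    return sum;
}

int main(void)
{
    int a[1024];
    for (size_t i = 0; i < 1024; i++) a[i] = 1;
    printf("%ld\n", sum_array(a, 1024));   /* prints 1024 */
    return 0;
}
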
CPU Cache and its operation
A CPU cache is a smaller, faster memory which stores copies of the
data from the most frequently used main memory locations. The
concept of locality of reference drives caching: we cache the most
frequently used data and instructions for faster access.

A CPU cache may be a data cache or an instruction cache. Unlike RAM,
cache is not expandable.

The CPU first checks the L1 cache for data; if it does not find it at
L1, it moves on to L2 and finally L3. If the data is not found at L3,
it's a cache miss and RAM is searched next, followed by the hard drive.

If the CPU finds the requested data in the cache, it's a cache hit; if
not, it's a cache miss.
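
This search order can be sketched as a chain of probes. A minimal sketch in C:
probe_l1, probe_l2, probe_l3 and read_ram are toy stand-ins invented here to
show the control flow; real probes would be parallel tag comparisons in hardware.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Toy stand-ins for the hardware's tag checks: these always miss,
 * so every load falls through to RAM. A real probe would compare
 * the address's tag against the tags stored at that cache level. */
static bool probe_l1(uint64_t addr, uint32_t *out) { (void)addr; (void)out; return false; }
static bool probe_l2(uint64_t addr, uint32_t *out) { (void)addr; (void)out; return false; }
static bool probe_l3(uint64_t addr, uint32_t *out) { (void)addr; (void)out; return false; }
static uint32_t read_ram(uint64_t addr) { return (uint32_t)addr; } /* fake data */

/* Mirrors the search order described above: L1, L2, L3, then RAM. */
static uint32_t load(uint64_t addr)
{
    uint32_t v;
    if (probe_l1(addr, &v)) return v;   /* L1 hit: fastest path    */
    if (probe_l2(addr, &v)) return v;   /* L1 miss, L2 hit         */
    if (probe_l3(addr, &v)) return v;   /* L2 miss, L3 hit         */
    return read_ram(addr);              /* cache miss: main memory */
}

int main(void)
{
    printf("loaded: 0x%x\n", load(0x1000));
    return 0;
}
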
Levels of caching: speed and size comparisons

Level                     Access Time                Typical Size    Technology  Managed By
Level 1 Cache (on-chip)   2-8 ns                     8 KB - 128 KB   SRAM        Hardware
Level 2 Cache (off-chip)  5-12 ns                    0.5 MB - 8 MB   SRAM        Hardware
Main Memory               10-60 ns                   64 MB - 2 GB    DRAM        Operating System
Hard Disk                 3,000,000 - 10,000,000 ns  100 GB - 2 TB   Magnetic    Operating System
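
These latencies can be combined into an estimate of the average memory access
time (hit time plus miss rate times miss penalty, applied level by level). A
minimal sketch: the hit rates are assumptions for illustration, and the
latencies are picked from the ranges in the table.

#include <stdio.h>

int main(void)
{
    double l1 = 5.0, l2 = 10.0, ram = 60.0;   /* ns, from the table above */
    double l1_hit = 0.95, l2_hit = 0.90;      /* assumed hit rates        */

    /* On an L1 miss we pay the L2 time; on an L2 miss we also pay RAM. */
    double amat = l1 + (1.0 - l1_hit) * (l2 + (1.0 - l2_hit) * ram);
    printf("average access time: %.2f ns\n", amat);   /* 5.80 ns */
    return 0;
}
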
Cache organization

When the processor needs to read or write a location in main
memory, it first checks whether that memory location is in the cache.
This is accomplished by comparing the address of the memory location
to all tags in the cache that might contain that address.

If the processor finds that the memory location is in the cache, we say
that a cache hit has occurred; otherwise, we speak of a cache miss.
Cache Entry structure
Cache row entries usually have the following structure:

Tag | Data blocks | Index | Displacement | Valid bit

The data blocks (cache line) contain the actual data fetched from the main memory.
The memory address is split into the tag, the index and the displacement (offset),
while the valid bit denotes that this particular entry has valid data.

•The index length is log2(number of cache rows) bits and describes which row
the data has been put in.

•The displacement length is log2(block size in bytes) bits and specifies which
byte within the stored block we need.

•The tag length is: address length − index length − displacement length.
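
As a concrete sketch of this split, the C program below extracts the three
fields with shifts and masks. The geometry (32-bit addresses, 64-byte blocks,
1024 rows) is assumed for illustration, giving 6 offset bits, 10 index bits
and 16 tag bits.

#include <stdint.h>
#include <stdio.h>

#define OFFSET_BITS 6    /* log2(64-byte block)  */
#define INDEX_BITS  10   /* log2(1024 rows)      */

int main(void)
{
    uint32_t addr = 0x12345678u;

    uint32_t offset = addr & ((1u << OFFSET_BITS) - 1);             /* low 6 bits  */
    uint32_t index  = (addr >> OFFSET_BITS) & ((1u << INDEX_BITS) - 1); /* next 10 */
    uint32_t tag    = addr >> (OFFSET_BITS + INDEX_BITS);           /* top 16 bits */

    printf("tag=0x%x index=0x%x offset=0x%x\n", tag, index, offset);
    return 0;
}
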


Cache organization - 1
Cache is divided into blocks. The blocks form the basic unit of cache
organization. RAM is also organized into blocks of the same size as the
cache's blocks.

When the CPU requests a byte from a particular RAM block, it needs to
be able to determine three things very quickly:

1. Whether or not the needed block is actually in the cache

2. The location of the block within the cache

3. The location of the desired byte within the block


Mapping RAM blocks to cache blocks
 Fully associative : Any RAM block can be stored in any available block
frame. The problem with this scheme is that if you want to retrieve a
specific block from the cache, you have to check the tag of every single
block frame in the entire cache, because the desired block could be in any
of the frames.
 Direct mapping : In a direct-mapped cache, each block frame can
cache only a certain subset of the blocks in main memory. For example,
RAM blocks whose block number modulo the number of cache blocks
equals 1 are always stored in cache block 1. The problem with this
approach is that certain cache blocks could remain unused while others
suffer frequent eviction of their entries. (A sketch of the direct-mapped
case follows below.)
 N-way set associative : A compromise between the two. Each RAM
block maps to exactly one set, but within that set it can occupy any of
N block frames (e.g., in a 2-way cache, either of two frames X or Y).
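
A minimal sketch of the direct-mapped case in C, assuming a toy cache of 256
block frames: the lookup computes the single frame a RAM block may occupy and
compares one tag, in contrast to the fully associative scheme where every tag
would need checking.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define NUM_BLOCKS 256   /* illustrative cache size in block frames */

struct frame {
    bool     valid;
    uint32_t tag;
};

static struct frame cache[NUM_BLOCKS];

/* Direct mapping: RAM block b may live only in frame b % NUM_BLOCKS,
 * so a lookup checks exactly one tag. */
bool lookup(uint32_t block_number)
{
    uint32_t index = block_number % NUM_BLOCKS;
    uint32_t tag   = block_number / NUM_BLOCKS;
    return cache[index].valid && cache[index].tag == tag;
}

/* Install a block after a miss; direct mapping leaves no choice of victim. */
void install(uint32_t block_number)
{
    uint32_t index = block_number % NUM_BLOCKS;
    cache[index].valid = true;
    cache[index].tag   = block_number / NUM_BLOCKS;
}

int main(void)
{
    printf("%d\n", lookup(1000));              /* 0: miss                       */
    install(1000);
    printf("%d\n", lookup(1000));              /* 1: hit                        */
    printf("%d\n", lookup(1000 + NUM_BLOCKS)); /* 0: same frame, different tag  */
    return 0;
}
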
Handling Cache Miss
In order to make room for the new entry on a cache miss, the cache has to evict
one of the existing entries.

The heuristic that it uses to choose the entry to evict is called the replacement
policy. The fundamental problem with any replacement policy is that it must
predict which existing cache entry is least likely to be used in the future. Some
of the replacement policies are:

 Random eviction: Evicting a cache entry chosen at random.

 LIFO: Evicting the most recently added cache entry.

 FIFO: Evicting the oldest cache entry.

 LRU: Evicting the least recently used cache entry (sketched below).
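
A minimal sketch of LRU in C, assuming a 4-way set and a simple logical clock:
on a hit the entry's timestamp is refreshed, and on a miss the entry with the
oldest timestamp is the victim. The names and sizes are illustrative.

#include <stdint.h>
#include <stdio.h>

#define WAYS 4   /* illustrative: entries in one cache set */

struct entry {
    uint32_t tag;
    uint64_t last_used;   /* logical clock value of the last access */
};

static struct entry set[WAYS];
static uint64_t clock_now;   /* incremented on every access */

/* On a hit, refresh the entry's timestamp. */
void touch(struct entry *e) { e->last_used = ++clock_now; }

/* On a miss, evict the least recently used entry of the set. */
struct entry *victim_lru(void)
{
    struct entry *oldest = &set[0];
    for (int i = 1; i < WAYS; i++)
        if (set[i].last_used < oldest->last_used)
            oldest = &set[i];
    return oldest;
}

int main(void)
{
    touch(&set[2]);   /* entry 2 becomes recently used */
    touch(&set[0]);   /* entry 0 becomes most recent   */
    printf("victim is entry %d\n", (int)(victim_lru() - set));  /* entry 1 */
    return 0;
}
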


Mirroring Cache to Main memory
If data are written to the cache, they must at some point be written to
main memory and to higher-order caches as well. The timing of this
write is controlled by what is known as the write policy.

In a write-through cache, every write to the cache causes a write to main
memory and to higher-order caches such as L2 and L3.

In a write-back (or copy-back) cache, writes are not immediately mirrored
to main memory. Instead, the cache tracks which locations have been
written over (marked dirty). Such entries are written to main memory and
higher-order caches just before eviction of the cache entry.
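
A minimal sketch of the write-back policy in C: a dirty bit records that the
cached copy differs from memory, and the write to RAM is deferred until
eviction. write_to_ram is a toy stand-in invented here for the memory interface.

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct line {
    bool     valid;
    bool     dirty;     /* set when the cached copy differs from RAM */
    uint32_t tag;
    uint8_t  data[64];
};

/* Toy stand-in for the memory interface. */
static void write_to_ram(uint32_t tag, const uint8_t *data, size_t len)
{
    (void)data;
    printf("flushing line with tag 0x%x (%zu bytes)\n", tag, len);
}

/* Write-back: a store only updates the cache and marks the line dirty. */
void store_byte(struct line *l, int offset, uint8_t value)
{
    l->data[offset] = value;
    l->dirty = true;           /* RAM is now stale; defer the write */
}

/* On eviction, dirty data is flushed to RAM; clean lines are simply dropped. */
void evict(struct line *l)
{
    if (l->valid && l->dirty)
        write_to_ram(l->tag, l->data, sizeof l->data);
    l->valid = false;
    l->dirty = false;
}

int main(void)
{
    struct line l = { .valid = true, .dirty = false, .tag = 0x1234 };
    store_byte(&l, 0, 42);   /* marks the line dirty, no RAM traffic yet */
    evict(&l);               /* now the dirty data is written back       */
    return 0;
}
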
Stale data in cache
The data in main memory being cached may be changed by other entities
(e.g. peripherals using direct memory access, or another core of a
multi-core processor), in which case the copy in the cache may become
out-of-date or stale.

Alternatively, when one CPU core in a multi-core processor updates the
data in its cache, copies of the data in caches associated with other
cores become stale.

Communication protocols between the cache managers which keep the
data consistent are known as cache coherence protocols, e.g. snoop-based,
directory-based, and token-based protocols.
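
For concreteness, MESI (a widely used snoop-based protocol, not named on the
slide) keeps one of four states per cache line. A sketch of the state set only,
not of the transitions:

/* Per-cache-line states of the MESI snoop-based coherence protocol. */
enum mesi_state {
    MESI_MODIFIED,   /* only this cache holds the line, and it is dirty */
    MESI_EXCLUSIVE,  /* only this cache holds the line; matches RAM     */
    MESI_SHARED,     /* other caches may hold clean copies of the line  */
    MESI_INVALID     /* the line is stale and must not be used          */
};
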
State of the Art today
 Current research on cache design and on handling cache coherence is
biased towards multicore architectures.

References
 Wikipedia: http://en.wikipedia.org/wiki/CPU_cache
 ArsTechnica: http://arstechnica.com/
 http://software.intel.com
 What Every Programmer Should Know About Memory - Ulrich Drepper, Red Hat, Inc.
Q/A
