CS61C Spring 2015 Lecture 16: Caches, Part 3
Instructors:
Krste Asanovic & Vladimir Stojanovic
http://inst.eecs.berkeley.edu/~cs61c/
You Are Here!
• Parallel Requests: assigned to computer, e.g., search "Katz" (Warehouse Scale Computer)
• Parallel Threads: assigned to core, e.g., lookup, ads
– Together these harness parallelism & achieve high performance
• Parallel Instructions: >1 instruction @ one time, e.g., 5 pipelined instructions
• Parallel Data: >1 data item @ one time, e.g., add of 4 pairs of words (A0+B0, A1+B1, A2+B2, A3+B3)
• Hardware descriptions: all gates @ one time
• Programming Languages
[Figure: software/hardware stack from Warehouse Scale Computer and Smart Phone down through Computer, Core, Memory (Cache), Input/Output, Instruction Unit(s), Functional Unit(s), Main Memory, and Logic Gates; today's lecture covers Memory (Cache)]
Caches Review
• Direct-Mapped vs. Set-Associative vs. Fully
Associative
• AMAT = Hit Time + Miss Rate * Miss Penalty
• 3 Cs of cache misses: Compulsory, Capacity,
Conflict
• Effect of cache parameters on performance
Primary Cache Parameters
• Block size (aka line size)
– how many bytes of data in each cache entry?
• Associativity
– how many ways in each set?
– Direct-mapped => Associativity = 1
– Set-associative => 1 < Associativity < #Entries
– Fully associative => Associativity = #Entries
• Capacity (bytes) = Total #Entries * Block size
• #Entries = #Sets * Associativity
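These relations are easy to check numerically. A minimal C sketch with hypothetical parameter values (32 KiB capacity, 64-byte lines, 4-way; none of these numbers come from the slides):

    #include <stdio.h>

    int main(void) {
        int capacity   = 32 * 1024;  /* 32 KiB cache (assumed) */
        int block_size = 64;         /* 64-byte lines (assumed) */
        int assoc      = 4;          /* 4-way set-associative (assumed) */

        int entries = capacity / block_size;  /* Capacity = #Entries * Block size */
        int sets    = entries / assoc;        /* #Entries = #Sets * Associativity */

        printf("#Entries = %d, #Sets = %d\n", entries, sets);  /* 512, 128 */
        return 0;
    }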
Other Cache Parameters
• Write Policy
• Replacement policy
Write Policy Choices
• Cache hit:
– write through: writes both cache & memory on every access
• Generally higher memory traffic but simpler pipeline & cache design
– write back: writes cache only; memory is written only when a dirty entry is evicted
• A dirty bit per line reduces write-back traffic
• Must handle 0, 1, or 2 accesses to memory for each load/store
• Cache miss:
– no write allocate: only write to main memory
– write allocate (aka fetch on write): fetch into cache
• Common combinations:
– write through and no write allocate
– write back with write allocate
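To make the traffic difference concrete, here is a toy direct-mapped cache model in C that only counts memory writes under the two common combinations. The sizes and structure are assumptions for illustration, not a real cache design:

    #include <stdio.h>
    #include <stdbool.h>

    /* Toy direct-mapped cache: 4 sets of 16-byte lines (sizes assumed). */
    #define SETS 4
    #define LINE 16

    typedef struct { bool valid, dirty; unsigned tag; } CacheLine;
    static CacheLine cache[SETS];
    static int mem_writes;   /* how many times memory gets written */

    /* write back + write allocate */
    static void store_wb(unsigned addr) {
        CacheLine *l = &cache[(addr / LINE) % SETS];
        unsigned tag = addr / (LINE * SETS);
        if (!l->valid || l->tag != tag) {            /* miss: allocate line    */
            if (l->valid && l->dirty) mem_writes++;  /* write back dirty line  */
            l->valid = true; l->tag = tag;
        }
        l->dirty = true;   /* memory updated only when this line is evicted */
    }

    /* write through + no write allocate */
    static void store_wt(unsigned addr) {
        /* On a hit the cached copy would be updated; on a miss the line is */
        /* NOT brought in (no write allocate). Memory is written either way. */
        (void)addr;
        mem_writes++;
    }

    int main(void) {
        for (int i = 0; i < 8; i++) store_wb(0x100);   /* 8 stores, same line */
        printf("write-back:    %d memory writes\n", mem_writes);  /* 0 */
        mem_writes = 0;
        for (int i = 0; i < 8; i++) store_wt(0x100);
        printf("write-through: %d memory writes\n", mem_writes);  /* 8 */
        return 0;
    }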
Replacement Policy
In an associative cache, which line from a set should be
evicted when the set becomes full?
• Random
• Least-Recently Used (LRU)
– LRU cache state must be updated on every access
– True implementation only feasible for small sets (2-way)
– Pseudo-LRU binary tree often used for 4-8 way (see the sketch below)
• First-In, First-Out (FIFO) a.k.a. Round-Robin
– Used in highly associative caches
• Not-Most-Recently Used (NMRU)
– FIFO with exception for most-recently used line or lines
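A minimal C sketch of the tree pseudo-LRU idea for one 4-way set, using 3 bits. The bit convention and the access trace are assumptions for illustration:

    #include <stdio.h>

    /* Tree pseudo-LRU for one 4-way set: 3 bits. Convention (assumed    */
    /* for this sketch): bit = 0 points the victim at the left/lower     */
    /* half, bit = 1 at the right/upper half.                            */
    typedef struct { int root, left, right; } PLRU;

    static void plru_touch(PLRU *p, int way) {  /* update on every access */
        if (way < 2) {                 /* used ways 0/1: victim -> right  */
            p->root = 1;
            p->left = (way == 0);      /* point away from the used way    */
        } else {                       /* used ways 2/3: victim -> left   */
            p->root = 0;
            p->right = (way == 2);
        }
    }

    static int plru_victim(const PLRU *p) {  /* follow bits to the victim */
        return p->root ? 2 + p->right : p->left;
    }

    int main(void) {
        PLRU p = {0, 0, 0};
        int trace[] = {0, 1, 2, 3, 0};           /* made-up access order */
        for (int i = 0; i < 5; i++) plru_touch(&p, trace[i]);
        /* Prints way 2; true LRU would evict way 1 -- pseudo-LRU only */
        /* approximates LRU, which is why it is cheap for 4-8 ways.    */
        printf("victim = way %d\n", plru_victim(&p));
        return 0;
    }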
Impact of Cache Parameters on Performance
• AMAT = Hit Time + Miss Rate * Miss Penalty
– Note: we assume the cache is always searched first, so hit time must be charged on both hits and misses! (worked example below)
• For misses, characterize by 3Cs
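As a quick worked example of the AMAT formula above (all numbers are assumptions for illustration):

    #include <stdio.h>

    int main(void) {
        double hit_time     = 1.0;   /* cycles, charged on hits AND misses */
        double miss_rate    = 0.05;  /* 5% of accesses miss (assumed)      */
        double miss_penalty = 20.0;  /* cycles to fetch from next level    */

        double amat = hit_time + miss_rate * miss_penalty;
        printf("AMAT = %.2f cycles\n", amat);  /* 1 + 0.05*20 = 2.00 */
        return 0;
    }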
CPU-Cache Interaction (5-stage pipeline)
[Figure: 5-stage pipeline datapath (PC, instruction fetch, decode/register read, ALU, memory, writeback) with a primary instruction cache feeding the fetch stage and a primary data cache in the memory stage; each cache produces a hit? signal, and misses go to the memory controller]
• Stall entire CPU on data cache miss
Increasing Associativity?
• Hit time as associativity increases?
– Increases, with large step from direct-mapped to >=2 ways,
as now need to mux correct way to processor
– Smaller increases in hit time for further increases in
associativity
• Miss rate as associativity increases?
– Goes down due to reduced conflict misses, but most gain is
from 1->2->4-way with limited benefit from higher
associativities
• Miss penalty as associativity increases?
– Unchanged, replacement policy runs in parallel with
fetching missing line from memory
Increasing #Entries?
• Hit time as #entries increases?
– Increases, since reading tags and data from larger
memory structures
• Miss rate as #entries increases?
– Goes down due to reduced capacity and conflict
misses
– Architects' rule of thumb: miss rate drops ~2x for every ~4x increase in capacity (only a gross approximation; see the worked example below)
• Miss penalty as #entries increases?
– Unchanged
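For example, under this rule of thumb a cache with a 4% miss rate at 32 KiB would be expected to miss roughly 2% at 128 KiB and roughly 1% at 512 KiB (illustrative numbers, not measurements).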
Administrivia
• Project 2, Part 2 due 3/22
• No assigned work over spring break
• Next assignment, HW5, due 04/05
• Midterm II is 04/09
– Conflict? Email Sagar
– DSP will receive email about accommodations
soon
How to Reduce Miss Penalty?
• Could there be locality on misses from a
cache?
• Use multiple cache levels!
• With Moore’s Law, more room on die for
bigger L1 caches and for second-level (L2)
cache
• And in some cases even an L3 cache!
• IBM mainframes have ~1GB L4 cache off-chip.
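With a second level, the AMAT formula nests: the L1 miss penalty is itself the AMAT of the L2 cache. A short C sketch with assumed (illustrative) latencies and miss rates:

    #include <stdio.h>

    /* Two-level AMAT: the L2 term is itself an AMAT.                */
    /* All numbers are assumptions, not measurements from the slides. */
    int main(void) {
        double l1_hit  = 1,   l1_miss = 0.05;  /* 1 cycle, 5% miss rate   */
        double l2_hit  = 10,  l2_miss = 0.50;  /* L2 local miss rate      */
        double mem_penalty = 200;              /* cycles to DRAM          */

        double amat = l1_hit + l1_miss * (l2_hit + l2_miss * mem_penalty);
        printf("AMAT = %.2f cycles\n", amat);  /* 1 + 0.05*(10+0.5*200) = 6.50 */
        return 0;
    }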
Review: Memory Hierarchy
[Figure: levels in the memory hierarchy from the processor (inner) through Level 1, Level 2, Level 3, ... down to Level n (outer); speed decreases with increasing distance from the processor]
IBM z13 Memory Hierarchy
Local vs. Global Miss Rates
• Local miss rate – the fraction of references to
one level of a cache that miss
• Local miss rate of L2$ = L2$ misses / L1$ misses
• Global miss rate – the fraction of references that
miss in all levels of a multilevel cache
• L2$ local miss rate >> L2$ global miss rate
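A quick numeric check with assumed rates (illustrative only):

    #include <stdio.h>

    int main(void) {
        double l1_miss_local = 0.05;  /* 5% of all references miss in L1$ */
        double l2_miss_local = 0.50;  /* 50% of L1$ misses also miss in L2$ */

        /* Global L2 miss rate: fraction of ALL references missing both levels */
        double l2_miss_global = l1_miss_local * l2_miss_local;
        printf("L2 global miss rate = %.1f%%\n", l2_miss_global * 100);  /* 2.5% */
        return 0;
    }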
L1 Cache: 32KB I$, 32KB D$
L2 Cache: 256 KB
L3 Cache: 4 MB
Clickers/Peer Instruction
• Overall, what are L2 and L3 local miss rates?
A: L2 > 50%, L3 > 50%
B: L2 ~ 50%, L3 < 50%
C: L2 ~ 50%, L3 ~ 50%
D: L2 < 50%, L3 < 50%
E: L2 > 50%, L3 ~50%
CPI/Miss Rates/DRAM Access (SpecInt2006)
[Figure: CPI, miss-rate, and DRAM-access plots for SpecInt2006, with panels for data only and for instructions and data]

Cache Design Space
– Cache size
– Block size
– Associativity
– Replacement policy
– Write-through vs. write-back
– Write-allocation
• Optimal choice is a compromise
– Depends on access characteristics
• Workload
• Use (I-cache, D-cache)
– Depends on technology / cost
[Figure: cache design space; a metric varies from Good to Bad as Factor A and Factor B change, with cache size, block size, and associativity as the interacting factors]