02c BranchPred

The document discusses branch prediction in computer architecture, highlighting the importance of accurately predicting branch directions and target addresses to improve instruction fetching efficiency. It covers various prediction techniques, including static and dynamic methods, one-bit and two-bit predictors, and advanced hybrid predictors that utilize both local and global history. Additionally, it explains the use of branch target buffers and return address stacks to enhance prediction accuracy and reduce execution delays.

Uploaded by

Hu Da

CS 6290

Branch Prediction
Outline
• Control Dependence and Branch
• Static Branch Prediction
• Dynamic Branch Prediction
– One Bit
– Two Bits
• Global Branch History
• Hybrid Branch Predictor
• Micro-architecture
– Branch Target Buffer
– Return Address Stack
• Branch Prediction in Real World
Control Dependencies

• Branches are very frequent
– Approx. 20% of all instructions
• Cannot wait until we know where it goes
– Long pipelines
• Branch outcome known after B cycles
• No scheduling past the branch until outcome known
– Superscalars (e.g., 4-way)
• Branch every cycle or so!
• One cycle of work, then bubbles for ~B cycles?
Surviving Branches: Prediction

• Predict Branches
– And predict them well!
• Fetch, decode, etc. on the predicted path
– Option 1: No execute until branch resolved
– Option 2: Execute anyway (speculation)
• Recover from mispredictions
– Restart fetch from correct path
Branch Prediction
• Need to know two things
– Whether the branch is taken or not (direction)
– The target address if it is taken (target)

• Direct jumps, function calls
– Direction known (always taken), target easy to compute
• Conditional branches (typically PC-relative)
– Direction difficult to predict, target easy to compute
• Indirect jumps, function returns
– Direction known (always taken), target difficult to predict
Branch Prediction: Direction

• Needed for conditional branches
– Most branches are of this type
• Many, many kinds of predictors for this
– Static: fixed rule, or compiler annotation (e.g. “BEQL” is “branch if equal likely”)
– Dynamic: hardware prediction
• Dynamic prediction usually history-based
– Example: predict direction is the same as the last time this branch was executed
Static Prediction
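• Always predict NT (not taken)
– easy to implement
– 30-40% accuracy … not so good
• Always predict T (taken)
– 60-70% accuracy
• BTFNT (backward taken, forward not taken)
– loops usually run for several iterations, so a backward branch is usually taken; this is like always predicting that the loop is taken
– needs the target to tell backward from forward, and we don’t know the target until decode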
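The BTFNT rule can be sketched in a few lines of C. This is only an illustration under stated assumptions: the function name `btfnt_predict` is made up here, and `pc` and `target` are assumed to be byte addresses that are already known (which, as noted above, is only true after decode).

```c
#include <stdint.h>

/* Static BTFNT rule (sketch).
 * Returns 1 (predict taken) for a backward branch,
 * 0 (predict not taken) for a forward branch.
 * Assumption: a branch to its own address counts as backward. */
int btfnt_predict(uint32_t pc, uint32_t target)
{
    return target <= pc;
}
```

For example, a loop-closing branch at a hypothetical address 0xDC60 that jumps back to 0xDC08 is predicted taken, while a forward branch from 0xDC44 to 0xDC50 is predicted not taken.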
One-Bit Branch Predictor
• Index: K bits of branch instruction address
• Branch history table of 2^K entries, 1 bit per entry
• Use this entry to predict this branch:
– 0: predict not taken
– 1: predict taken
• When branch direction resolved, go back into the table and update entry: 0 if not taken, 1 if taken
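A minimal sketch of such a table in C, assuming K = 10 and 4-byte aligned instructions (so the two low address bits are dropped before indexing). The names `bht_predict` and `bht_update` are illustrative, not taken from the slides.

```c
#include <stdint.h>

#define BHT_K 10                        /* assumed index width */
#define BHT_ENTRIES (1u << BHT_K)       /* 2^K entries */

/* One bit per entry; static storage starts at 0 (predict not taken). */
static uint8_t bht[BHT_ENTRIES];

static unsigned bht_index(uint32_t pc)
{
    /* Assumption: instructions are 4-byte aligned, so skip bits 0-1. */
    return (pc >> 2) & (BHT_ENTRIES - 1u);
}

/* Returns 1 for "predict taken", 0 for "predict not taken". */
int bht_predict(uint32_t pc)
{
    return bht[bht_index(pc)];
}

/* Called once the branch direction is resolved. */
void bht_update(uint32_t pc, int taken)
{
    bht[bht_index(pc)] = taken ? 1 : 0;
}
```

Because only K address bits are used and there are no tags, two branches whose addresses share those bits use the same entry and can disturb each other's prediction.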
One-Bit Branch Predictor (cont’d)

0xDC08: for(i=0; i < 100000; i++)
        {
0xDC44:     if( ( i % 100) == 0 )
                tick( );
0xDC50:     if( (i & 1) == 1)
                odd( );
        }
The Bit Is Not Enough!

• Example: short loop (8 iterations)
– Taken 7 times, then not taken once
– Not-taken mispredicted (was taken previously)
• Execute the same loop again
– First always mispredicted (previous outcome was not taken)
– Then 6 predicted correctly
– Then last one mispredicted again
• Each fluke/anomaly in a stable pattern results in two mispredicts per loop
Examples

How often is branch outcome != previous outcome?

DC08: TTTTTTTTTTT ... TTTTTTTTTTNTTTTTTTTT … (100,000 iterations; mispredicts at the TN and NT transitions)
– 2 / 100,000 mispredicted, prediction rate 99.998%
DC44: TTTTT ... TNTTTTT … TNTTTTT …
– 2 / 100 mispredicted, prediction rate 98.0%
DC50: TNTNTNTNTNTNTNTNTNTNTNTNTNTNT …
– 2 / 2 mispredicted, prediction rate 0.0%
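These rates can be checked with a small simulation. The sketch below assumes a simplified branch model that is not in the slides: the branch is taken `period - 1` times and then not taken once, repeating. Period 100 approximates DC44, period 2 gives DC50, and period 8 gives the short loop from the previous slide. The function name `mispredicts_1bit` is made up for this example.

```c
/* Outcome i of a branch that is taken (period - 1) times,
 * then not taken once, repeating. Returns 1 for taken. */
static int periodic_outcome(unsigned i, unsigned period)
{
    return (i % period) != (period - 1u);
}

/* Count mispredictions of a last-outcome (1-bit) predictor
 * over n executions of the branch.
 * init is the starting bit: 1 = predict taken, 0 = predict not taken. */
unsigned mispredicts_1bit(unsigned period, unsigned n, int init)
{
    int bit = init ? 1 : 0;
    unsigned wrong = 0;

    for (unsigned i = 0; i < n; i++) {
        int taken = periodic_outcome(i, period);
        if (bit != taken)
            wrong++;
        bit = taken;            /* remember the last outcome */
    }
    return wrong;
}
```

With a period of 100 the predictor misses twice per period in steady state (once on the not-taken outcome and once on the taken outcome that follows it), and with a period of 2 it misses every time once it is out of phase.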
Two Bits are Better Than One

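FSM for Last-Outcome Prediction
• State 0: predict NT; state 1: predict T
• Transition to 1 on T outcome, to 0 on NT outcome

FSM for 2bC (2-bit Counter)
• States 0 and 1: predict NT; states 2 and 3: predict T
• Transition up (toward 3) on T outcome, down (toward 0) on NT outcome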
Example

1bC (initial training/warm-up, then steady state):
State:    0 1 1 1 1 1 1 0 1 1
Outcome:  T T T T T T N T T T
Correct?: ✗ ✓ ✓ ✓ ✓ ✓ ✗ ✗ ✓ ✓

2bC:
State:    0 1 2 3 3 3 3 2 3 3
Outcome:  T T T T T T N T T T
Correct?: ✗ ✗ ✓ ✓ ✓ ✓ ✗ ✓ ✓ ✓

Only 1 Mispredict per N branches now!

DC08: 99.999%, DC44: 99.0%
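The 2-bit counter can be sketched as follows, reusing the same simplified "taken `period - 1` times, then not taken once" branch model (an assumption of this example, not something the slides define). States 0 and 1 predict not taken, states 2 and 3 predict taken, and the counter saturates at both ends. All names are illustrative.

```c
/* 2-bit saturating counter: states 0,1 predict NT; states 2,3 predict T. */
static int ctr2_predict(unsigned ctr)
{
    return ctr >= 2u;
}

static unsigned ctr2_update(unsigned ctr, int taken)
{
    if (taken)
        return ctr < 3u ? ctr + 1u : 3u;   /* saturate at 3 */
    return ctr > 0u ? ctr - 1u : 0u;       /* saturate at 0 */
}

/* Count mispredictions over n executions of a branch that is taken
 * (period - 1) times, then not taken once, repeating.
 * init is the starting counter value (0..3). */
unsigned mispredicts_2bit(unsigned period, unsigned n, unsigned init)
{
    unsigned ctr = init & 3u;
    unsigned wrong = 0;

    for (unsigned i = 0; i < n; i++) {
        int taken = (i % period) != (period - 1u);
        if (ctr2_predict(ctr) != taken)
            wrong++;
        ctr = ctr2_update(ctr, taken);
    }
    return wrong;
}
```

For the 8-iteration loop this gives one mispredict per execution of the loop instead of two. For the alternating branch the result depends on the starting state: starting from 3 the counter is wrong half the time, and starting from 1 it is wrong every time.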
Still Not Good Enough

[Figure: prediction rates of the example branches, annotated “These are good”, “We can live with these”, and “This is bad!”]
Importance of Branches
• 98% → 99%
– Who cares?
– Actually, it’s 2% misprediction rate → 1%
– That’s a halving of the number of mispredictions
• So what?
– If misp rate equals 50%, and 1 in 5 insts is a branch, then number of useful instructions that we can fetch is:
5*(1 + ½ + (½)^2 + (½)^3 + … ) = 10
– If we halve the miss rate down to 25%:
5*(1 + ¾ + (¾)^2 + (¾)^3 + … ) = 20
– Halving the miss rate doubles the number of useful instructions that we can try to extract ILP from
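The arithmetic above is a geometric series: with correct-prediction probability p = 1 - misp rate, the sum 1 + p + p^2 + … equals 1 / (1 - p) = 1 / misp rate. The sketch below computes the same quantity both ways; the function names are made up for this example.

```c
/* Closed form: insts_per_branch * 1 / misp_rate. */
double useful_insts_closed(double insts_per_branch, double misp_rate)
{
    return insts_per_branch / misp_rate;
}

/* Same quantity by summing the first `terms` terms of
 * insts_per_branch * (1 + p + p^2 + ...), with p = 1 - misp_rate. */
double useful_insts_series(double insts_per_branch, double misp_rate, int terms)
{
    double p = 1.0 - misp_rate;
    double term = 1.0;
    double sum = 0.0;

    for (int k = 0; k < terms; k++) {
        sum += term;
        term *= p;      /* next power of p */
    }
    return insts_per_branch * sum;
}
```

With 5 instructions per branch, a 50% misprediction rate gives 10 and a 25% rate gives 20, matching the two lines on the slide.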
How about the Branch at 0xDC50?
• 1bC and 2bC don’t do too well (50% at best)
• But it’s still obviously predictable
• Why?
– It has a repeating pattern: (NT)*
– How about other patterns? (TTNTN)*
• Use branch correlation
– The outcome of a branch is often related to previous outcome(s)
Idea: Track the History of a Branch
An (m, n) predictor uses the behavior of the last m branches to choose from 2^m branch predictors, each of which is an n-bit predictor for a single branch.

(1, 2) predictor
• Entry selected by PC holds: previous outcome, counter if prev=0, counter if prev=1
• Example entry: prev = 1, counter if prev=0 is 3, counter if prev=1 is 0

Entry snapshots from the figure (prev, counter if prev=0, counter if prev=1):
prev = 1, counters 3 3: prediction = T
prev = 0, counters 3 2: prediction = T
prev = 1, counters 3 2: prediction = T
prev = 1, counters 3 0: prediction = N
prev = 0, counters 3 0: prediction = T
Deeper History Covers More Patterns
(3, 2) predictor
• Entry selected by PC holds: last 3 outcomes, then one counter for each of prev=000, prev=001, prev=010, …, prev=111
• Example entry: last 3 outcomes = 0 0 1; counters for prev=000 … prev=111 = 1 3 1 0 3 2 0 2
• What pattern has this branch predictor entry learned?
001 → 1; 011 → 0; 110 → 0; 100 → 1
00110011001… (0011)*
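An (m, 2) predictor entry for a single branch can be sketched as below. The assumptions here are not from the slides: m is capped at 8, all counters start at 1 (weakly not taken), and the driver `hp_steady_mispredicts` takes the branch behavior as a non-empty string of '1' (taken) and '0' (not taken) that repeats. All names are illustrative.

```c
#include <stdint.h>
#include <string.h>

#define MAX_M 8   /* assumed upper bound on history length */

typedef struct {
    unsigned m;                      /* history length in outcomes */
    unsigned history;                /* last m outcomes, newest in bit 0 */
    uint8_t  counters[1u << MAX_M];  /* one 2-bit counter per history value */
} HistPredictor;

void hp_init(HistPredictor *p, unsigned m)
{
    p->m = m > MAX_M ? MAX_M : m;
    p->history = 0;
    memset(p->counters, 1, sizeof p->counters);   /* weakly not taken */
}

/* Predict using the counter selected by the current history. */
int hp_predict(const HistPredictor *p)
{
    return p->counters[p->history] >= 2;
}

/* Train the selected counter, then shift the outcome into the history. */
void hp_update(HistPredictor *p, int taken)
{
    uint8_t *c = &p->counters[p->history];
    if (taken) { if (*c < 3) (*c)++; }
    else       { if (*c > 0) (*c)--; }
    p->history = ((p->history << 1) | (taken ? 1u : 0u)) & ((1u << p->m) - 1u);
}

/* Run warmup + n outcomes of the repeating pattern and count the
 * mispredictions among the last n. pattern must be non-empty. */
unsigned hp_steady_mispredicts(unsigned m, const char *pattern,
                               unsigned warmup, unsigned n)
{
    HistPredictor p;
    size_t len = strlen(pattern);
    unsigned wrong = 0;

    hp_init(&p, m);
    for (unsigned i = 0; i < warmup + n; i++) {
        int taken = pattern[i % len] == '1';
        if (i >= warmup && hp_predict(&p) != taken)
            wrong++;
        hp_update(&p, taken);
    }
    return wrong;
}
```

With m = 1 the alternating pattern is predicted perfectly after warm-up, and with m = 3 so is (0011)*. Setting m = 0 degenerates to a plain 2-bit counter, which from this starting state mispredicts the alternating pattern every time.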
Global vs. Local Branch History
• Local Behavior
– What is the predicted direction of Branch A given the outcomes of previous instances of Branch A?
• Global Behavior
– What is the predicted direction of Branch Z given the outcomes of all* previous branches A, B, …, X and Y?
* number of previous branches tracked limited by the history length
Why Global Correlations Exist
• Example: related branch conditions

A: p = findNode(foo);
   if ( p is parent )
       do something;

   do other stuff; /* may contain more branches */

B: if ( p is a child )
       do something else;

Outcome of second branch is always opposite of the first branch
Other Global Correlations
• Testing same/similar conditions
– code might test for NULL before a function call, and the function might test for NULL again
– in some cases it may be faster to recompute a condition rather than save a previous computation in memory and re-load it
– partial correlations: one branch could test for cond1, and another branch could test for cond1 && cond2 (if cond1 is false, then the second branch can be predicted as false)
– multiple correlations: one branch tests cond1, a second tests cond2, and a third tests cond1 ⊕ cond2 (which can always be predicted if the first two branches are known)
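A global-history predictor can exploit such correlations. The sketch below is one possible arrangement, not the one the slides prescribe: a 4-bit global history register is XORed with PC bits to index a table of 2-bit counters (the gshare style of indexing). The branch addresses `PC_A` and `PC_B`, the table sizes, and the pseudo-random generator driving branch A are all assumptions made for this example. Branch B always goes the opposite way from branch A, as in the parent/child example.

```c
#include <stdint.h>

#define HIST_BITS  4                  /* assumed global history length */
#define TABLE_BITS 10                 /* assumed table index width */
#define TABLE_SIZE (1u << TABLE_BITS)

#define PC_A 0x1000u                  /* hypothetical branch addresses */
#define PC_B 0x1040u

typedef struct {
    unsigned ghr;                     /* outcomes of the last HIST_BITS branches */
    uint8_t  ctr[TABLE_SIZE];         /* 2-bit counters */
} GlobalPredictor;

static unsigned gp_index(const GlobalPredictor *g, uint32_t pc)
{
    return ((pc >> 2) ^ g->ghr) & (TABLE_SIZE - 1u);
}

int gp_predict(const GlobalPredictor *g, uint32_t pc)
{
    return g->ctr[gp_index(g, pc)] >= 2;
}

/* Train the counter selected by (pc, history), then shift in the outcome. */
void gp_update(GlobalPredictor *g, uint32_t pc, int taken)
{
    uint8_t *c = &g->ctr[gp_index(g, pc)];
    if (taken) { if (*c < 3) (*c)++; }
    else       { if (*c > 0) (*c)--; }
    g->ghr = ((g->ghr << 1) | (taken ? 1u : 0u)) & ((1u << HIST_BITS) - 1u);
}

/* Branch A follows a pseudo-random sequence; branch B is always the
 * opposite of A. Returns the number of mispredictions of B over n rounds. */
unsigned correlated_pair_mispredicts(unsigned n)
{
    GlobalPredictor g;
    uint32_t rng = 12345u;
    unsigned wrong_b = 0;

    g.ghr = 0;
    for (unsigned i = 0; i < TABLE_SIZE; i++)
        g.ctr[i] = 1;                 /* weakly not taken */

    for (unsigned i = 0; i < n; i++) {
        rng = rng * 1103515245u + 12345u;      /* simple LCG step */
        int a = (int)((rng >> 16) & 1u);
        gp_update(&g, PC_A, a);                /* A's outcome enters the history */

        int b = !a;
        if (gp_predict(&g, PC_B) != b)
            wrong_b++;
        gp_update(&g, PC_B, b);
    }
    return wrong_b;
}
```

Branch A itself stays hard to predict, but once each history value seen at B has been trained, B is predicted from A's outcome sitting in the newest history bit. With these sizes B can only miss once per history value in which A was not taken, so at most 8 times in total.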
Tournament Predictors

• No predictor is clearly the best
– Different branches exhibit different behaviors
• Some “constant”, some global, some local
• Idea: Let’s have a predictor to predict which predictor will predict better
Tournament Hybrid Predictors
• Meta-Predictor: table of 2-/3-bit counters
• Final prediction: if meta-counter MSB = 0, use Pred0, else use Pred1
• Meta update:

Pred0  Pred1  Meta Update
✗      ✗      ---
✗      ✓      Inc
✓      ✗      Dec
✓      ✓      ---
Common Combinations
• Global history + Local history
• “easy” branches + global history
– 2bC and gshare
• short history + long history
• Many types of behaviors, many combinations
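Whatever two components are combined, the selection and meta-counter update of a tournament predictor follow the same rule: use Pred1 when the meta-counter's MSB is 1, increment when only Pred1 was right, decrement when only Pred0 was right. A minimal sketch, assuming a 2-bit meta-counter and made-up function names:

```c
/* Choose between two component predictions using a 2-bit meta-counter.
 * MSB = 0 selects pred0, MSB = 1 selects pred1. */
int tournament_select(unsigned meta, int pred0, int pred1)
{
    return (meta & 2u) ? pred1 : pred0;
}

/* Update the meta-counter once the real outcome is known.
 * It moves only when exactly one component was correct. */
unsigned tournament_update(unsigned meta, int pred0, int pred1, int taken)
{
    int ok0 = (!!pred0 == !!taken);
    int ok1 = (!!pred1 == !!taken);

    if (ok1 && !ok0 && meta < 3u)
        return meta + 1u;        /* only pred1 right: Inc */
    if (ok0 && !ok1 && meta > 0u)
        return meta - 1u;        /* only pred0 right: Dec */
    return meta;                 /* both right or both wrong: no change */
}
```

In a full design there would be a table of such counters, typically indexed by PC bits, with both components always trained on the real outcome; that part is left out of this sketch.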
Direction Predictor Accuracy
Target Address Prediction
• Branch Target Buffer
– IF stage: need to know fetch addr every cycle
– Need target address one cycle after fetching a branch
– For some branches (e.g., indirect) target known only after EX stage, which is way too late
– Even easily-computed branch targets need to wait until instruction decoded and direction predicted in ID stage (still at least one cycle too late)
– So, we have a quick-and-dirty predictor for the target that only needs the address of the branch instruction
Branch Target Buffer

• BTB indexed by instruction address
• We don’t even know if it is a branch!
• If address matches a BTB entry, it is predicted to be a branch
• BTB entry tells whether it is taken (direction) and where it goes if taken
– Direction prediction can be factored out into separate table
• BTB takes only the instruction address, so while we fetch one instruction in the IF stage we are predicting where to fetch the next one from
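A direct-mapped BTB can be sketched as below. This simplified version assumes 64 entries, 4-byte instructions, a full-address tag, and that every BTB hit is treated as a predicted-taken branch (direction prediction factored out and omitted). The names `btb_update` and `next_fetch_pc` are illustrative.

```c
#include <stdint.h>

#define BTB_ENTRIES 64u              /* assumed size */

typedef struct {
    int      valid;
    uint32_t tag;                    /* full branch address, for simplicity */
    uint32_t target;                 /* predicted next fetch address */
} BtbEntry;

static BtbEntry btb[BTB_ENTRIES];    /* static storage: all entries invalid */

static unsigned btb_index(uint32_t pc)
{
    return (pc >> 2) % BTB_ENTRIES;  /* assumption: 4-byte instructions */
}

/* Record a resolved taken branch and its target. */
void btb_update(uint32_t pc, uint32_t target)
{
    BtbEntry *e = &btb[btb_index(pc)];
    e->valid  = 1;
    e->tag    = pc;
    e->target = target;
}

/* Used in IF: needs only the address of the instruction being fetched.
 * A hit means "predicted to be a taken branch"; otherwise fall through. */
uint32_t next_fetch_pc(uint32_t pc)
{
    const BtbEntry *e = &btb[btb_index(pc)];
    if (e->valid && e->tag == pc)
        return e->target;
    return pc + 4u;
}
```

The tag check is what keeps a non-branch instruction that happens to map to the same entry from being redirected.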
Branch-Target Buffer
• Need high instruction bandwidth!
– Branch-Target buffers
• Next PC prediction buffer, indexed by current PC

Branch Folding
• Optimization:
– Larger branch-target buffer
– Add target instruction into buffer to deal with longer decoding time required by larger buffer
– “Branch folding”

Return Address Stack (RAS)
• Function returns are frequent, yet
– Address is difficult to compute (have to wait until EX stage done to know it)
– Address difficult to predict with BTB (function can be called from multiple places)
• But return address is actually easy to predict
– It is the address after the last call instruction that we haven’t returned from yet
– Hence the Return Address Stack
Return Address Stack (RAS)
• Call pushes return address into the RAS
• When a return instruction is decoded, pop the predicted return address from RAS
• Accurate prediction even w/ small RAS
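A small RAS can be sketched as a circular buffer. The assumptions here are not from the slides: 8 entries, and on overflow the oldest entry is silently overwritten, so only the deepest returns of a long call chain would be mispredicted. The names `ras_push` and `ras_pop` are illustrative.

```c
#include <stdint.h>

#define RAS_ENTRIES 8u               /* assumed size */

static uint32_t ras[RAS_ENTRIES];
static unsigned ras_top;             /* index of the next free slot */

/* On a call: push the address of the instruction after the call. */
void ras_push(uint32_t return_addr)
{
    ras[ras_top] = return_addr;
    ras_top = (ras_top + 1u) % RAS_ENTRIES;   /* wraps, overwriting the oldest */
}

/* On a return: pop the predicted return address. */
uint32_t ras_pop(void)
{
    ras_top = (ras_top + RAS_ENTRIES - 1u) % RAS_ENTRIES;
    return ras[ras_top];
}
```

Nested calls are predicted in last-in, first-out order, which is why the same function called from multiple places is no longer a problem.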
Example 1: Alpha 21264

• Hybrid predictor
– combines local history and global history components with a meta-predictor
Example 2: Pentium-M

• Also hybrid, but uses tag-based selection mechanism
Pentium-M (cont’d)

• Local component also has support for loops
– accurately predict branches of the form (T^k N)*
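One way to predict a (T^k N)* branch is to learn the trip count k and predict not taken once k taken outcomes have been seen in a row. The sketch below illustrates that general idea only; it is not a description of the Pentium-M's actual hardware, and all names are made up.

```c
typedef struct {
    unsigned trip;    /* learned number of taken outcomes before the N */
    unsigned count;   /* taken outcomes seen since the last N */
    int      valid;   /* 1 once a trip count has been learned */
} LoopPredictor;

/* Predict taken unless we have reached the learned trip count. */
int lp_predict(const LoopPredictor *p)
{
    return !(p->valid && p->count == p->trip);
}

void lp_update(LoopPredictor *p, int taken)
{
    if (taken) {
        p->count++;
    } else {
        p->trip  = p->count;   /* remember how many T preceded this N */
        p->valid = 1;
        p->count = 0;
    }
}

/* Simulate reps repetitions of (T^k N) and count mispredictions. */
unsigned lp_mispredicts(unsigned k, unsigned reps)
{
    LoopPredictor p = {0, 0, 0};
    unsigned wrong = 0;

    for (unsigned r = 0; r < reps; r++) {
        for (unsigned i = 0; i <= k; i++) {
            int taken = (i < k);
            if (lp_predict(&p) != taken)
                wrong++;
            lp_update(&p, taken);
        }
    }
    return wrong;
}
```

As long as k stays constant, the only miss is the first loop exit, where the trip count has not been learned yet.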
Pentium-M (cont’d)

• Special target prediction for indirect branches
– common in object-oriented code (vtables)
– assumes correlation with global history
