CH06

This document discusses intermediate code generation and optimization in compilers. It describes how producing an intermediate representation facilitates retargeting a compiler to different machines and allows for machine-independent optimizations. Common intermediate representations include graphs, postfix notation, and three-address code. The document outlines various machine-independent optimizations that can improve the intermediate code, such as peephole, local, global, loop, and inter-procedural optimizations. It also discusses basic blocks and how they are constructed from three-address instructions.

CHAPTER SIX

Intermediate Code Generation and


Optimization

Outline
 Introduction
 Intermediate-Code Generation
 Machine-Independent Optimizations
6.1 Introduction: Structure of a Compiler
6.2 Intermediate Code Generation

 Although a compiler can directly produce code in a target language
(i.e., machine code or assembly for the target machine),
producing a machine-independent intermediate representation
has the following benefits:
 Retargeting to another machine is facilitated.
 The intermediate representation is neutral with respect to the
target machine, so the same intermediate code generator can be
shared across all target machines.
 A compiler for a new machine can be built by attaching a new
code generator to an existing front end.
 Machine-independent code optimization can be applied to the
intermediate code.
Compiling Process without
Intermediate Representation

[Figure: four source languages (C, Pascal, FORTRAN, C++), each compiled
directly to four targets (SPARC, HP PA, x86, IBM PPC), requiring a
separate compiler for every language/machine pair.]

Compiling Process with Intermediate
Representation

[Figure: the same four source languages are each translated to a common
IR, and the IR is then translated to each of the four targets; one front
end per language plus one back end per machine suffices.]
Methods of Intermediate Code (IC) Generation

The intermediate language can be any of many different languages;
the designer of the compiler decides which one to use. Common IRs:
 Graphical representations: such as syntax trees, ASTs
(Abstract Syntax Trees), and DAGs
 Postfix notation: the abstract syntax tree is linearized as a
sequence of data references and operations.
 For instance, the tree for a * (9 + d) can be mapped to the
equivalent postfix notation: a9d+*
 Three-address code: every operation is represented as a 4-
part list, a quadruple:
 (op, arg1, arg2, result). E.g., x := y + z -> (+, y, z, x)
Directed Acyclic Graph (DAG) Representation

 Example: F = ((A + B*C) * (A*B*C)) + C

[Figure: the syntax tree and the DAG for this expression; in the DAG the
common subexpression B*C, and the shared leaves A and C, each appear
only once.]

A syntax tree depicts the natural hierarchical structure of a
source program. A DAG gives the same information but in a more
compact way because common subexpressions are identified.
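One standard way to identify common subexpressions while building the DAG (a sketch, not from the slides) is hash-consing, also called value numbering: keep a table keyed by (op, left, right) and reuse the existing node whenever the same key reappears.

```python
# Sketch of DAG construction by hash-consing ("value numbering").
# A dict maps (op, left_id, right_id) -> node id, so a repeated
# subexpression such as B*C gets the same node instead of a copy.
nodes = []      # nodes[i] = ('leaf', name) or (op, left_id, right_id)
table = {}      # structural key -> node id

def leaf(name):
    key = ('leaf', name)
    if key not in table:
        table[key] = len(nodes)
        nodes.append(key)
    return table[key]

def op(o, left, right):
    key = (o, left, right)
    if key not in table:          # only create a node for a new key
        table[key] = len(nodes)
        nodes.append(key)
    return table[key]

# F = ((A + B*C) * (A * B*C)) + C
a, b, c = leaf('A'), leaf('B'), leaf('C')
bc1 = op('*', b, c)               # first occurrence of B*C
bc2 = op('*', b, c)               # same key -> same node, no duplicate
f = op('+', op('*', op('+', a, bc1), op('*', a, bc2)), c)
print(bc1 == bc2, len(nodes))     # True 8
```

The DAG needs only 8 nodes here, while the full syntax tree would repeat A, B, C, and the B*C subtree.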
Postfix Notation: PN

 A mathematical notation wherein every operator follows all
of its operands.
 Equivalently, a listing of the nodes of a tree in which a node
appears immediately after its children.
Example: the PN of the expression a * (b + c) is abc+*
How about (a + b) / (c - d)?
 Formation rules:
 If E is a variable/constant, the PN of E is E itself.
 If E is an expression of the form E1 op E2, the PN of E is
E1′ E2′ op (where E1′ and E2′ are the PN of E1 and E2,
respectively).
 If E is a parenthesized expression of the form (E1), the PN
of E is the same as the PN of E1.
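The formation rules translate directly into a recursive function. A sketch, assuming expressions are given as nested tuples (op, E1, E2) — a representation chosen here for illustration:

```python
# Postfix notation by the formation rules: a variable is its own PN;
# E1 op E2 becomes PN(E1) PN(E2) op; parentheses simply disappear
# because the tuple structure already encodes grouping.
def pn(e):
    if isinstance(e, str):      # rule 1: variable/constant
        return e
    op, e1, e2 = e              # rule 2: E1 op E2
    return pn(e1) + pn(e2) + op

print(pn(('*', 'a', ('+', 'b', 'c'))))                # a*(b+c) -> abc+*
print(pn(('/', ('+', 'a', 'b'), ('-', 'c', 'd'))))    # (a+b)/(c-d) -> ab+cd-/
```

This also answers the exercise above: the PN of (a+b)/(c-d) is ab+cd-/.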
Three-Address Code
 The general form: x = y op z
 x, y, and z are names, constants, or compiler-generated temporaries
 op stands for any operator, such as +, -, …

 We use the term “three-address code” because each statement
usually contains three addresses (two for the operands, one for the
result).
 Three-address statements are a popular form of intermediate code
in optimizing compilers.
 It is a linearized representation of the syntax tree with explicit
names given to interior nodes.
 There is only one operator on the right-hand side. Thus a source-
language expression like a + b*c might be translated into a sequence
with temporaries t1 and t2:
t1 = b * c
t2 = a + t1
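The linearization can be sketched as a bottom-up walk that gives every interior node a fresh temporary (the tuple-based AST and naming scheme are assumptions for illustration):

```python
# Emit three-address statements bottom-up; each interior node of the
# tree gets a fresh temporary name t1, t2, ...
def gen(e, code, counter):
    if isinstance(e, str):                  # leaf: a name or constant
        return e
    op, e1, e2 = e
    left = gen(e1, code, counter)
    right = gen(e2, code, counter)
    counter[0] += 1
    t = f"t{counter[0]}"
    code.append(f"{t} = {left} {op} {right}")
    return t

code = []
gen(('+', 'a', ('*', 'b', 'c')), code, [0])   # a + b*c
print(code)   # ['t1 = b * c', 't2 = a + t1']
```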
DAG vs. Three-Address Code
 Three-address code is a linearized representation of
a syntax tree (or a DAG) in which explicit names
(temporaries) correspond to the interior nodes of the
graph.
Expression: F = ((A+B*C) * (A*B*C))+C

From the syntax tree:      From the DAG:
T1 := A                    T1 := B * C
T2 := C                    T2 := A + T1
T3 := B * T2               T3 := A * T1
T4 := T1 + T3              T4 := T2 * T3
T5 := T1 * T3              T5 := C
T6 := T4 * T5              T6 := T4 + T5
T7 := T6 + T2              F := T6
F := T7

[Figure: the syntax tree and the DAG for the expression.]

Question: Which IR code sequence is better?
Implementation of Three-Address Code

• Quadruples
 Four fields: op, arg1, arg2, result
 Array of struct {op, *arg1, *arg2, *result}
 x := y op z is represented as op y, z, x
 arg1, arg2, and result are usually pointers to symbol table
entries.
 May need to use many temporary names.
 Many assembly instructions look like quadruples, but with arg1,
arg2, and result being real registers.
• Triples
 Three fields: op, arg1, and arg2. The result becomes implicit: a
triple is referred to by its position (index).
 arg1 and arg2 can be pointers to the symbol table.
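The two encodings can be sketched side by side. This uses the classic textbook expression a = b * -c + b * -c (not from the slides); the Python tuples stand in for the struct fields:

```python
# Quadruples: the result field is explicit (a name or temporary).
quads = [
    ('uminus', 'c',  None, 't1'),
    ('*',      'b',  't1', 't2'),
    ('uminus', 'c',  None, 't3'),
    ('*',      'b',  't3', 't4'),
    ('+',      't2', 't4', 't5'),
    ('=',      't5', None, 'a'),
]

# Triples: no result field -- other triples refer to a triple by its
# index in the list, so no temporary names are needed at all.
triples = [
    ('uminus', 'c', None),   # (0)
    ('*',      'b', 0),      # (1)  arg2 refers to triple (0)
    ('uminus', 'c', None),   # (2)
    ('*',      'b', 2),      # (3)
    ('+',      1,   3),      # (4)
    ('=',      'a', 4),
]
```

The trade-off: quadruples are easier to reorder during optimization (the names move with the instruction), while triples save space but must be renumbered whenever instructions move.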
5/31/2015 \course\cpeg621-10F\Topic-1a.ppt 11
Types of Three-Address Statements

 Assignment statements:
 x := y op z, where op is a binary operator add a, b, c
 x := op y, where op is a unary operator not a, , c or inttoreal a, , c
 Copy statements:
 x := y mov a, , c
 Unconditional jumps:
 goto L jump , , L
 Conditional jumps:
 if y relop z goto L jmprelop y, z, L
 param x, call p, n, and return y, relating to procedure calls.
E.g., f(x+1, y)  add x, 1, t1
param t1, ,
param y, ,
call f, 2,
 Indexed assignments:
 x := y[i]
 x[i] := y
 Address and pointer assignments:
 x := &y, x := *y, and *x := y
6.3 Code Optimization:
Summary of Front End

Lexical Analyzer (Scanner)
→ Syntax Analyzer (Parser)
→ Semantic Analyzer
→ Abstract Syntax Tree w/ Attributes
→ Intermediate-code Generator
→ Non-optimized Intermediate Code

All of these phases form the front end; each can emit error messages.
Code Optimization

• The machine-independent code-optimization phase attempts to


improve the intermediate code so that better target code will
result.
• Usually better means faster, but other objectives may be
desired, such as shorter code, or target code that consumes less
power.
• A simple intermediate code generation algorithm followed by
code optimization is a reasonable way to generate good target
code.
How Compiler Improves Performance
• Execution time = Operation count * Machine cycles per
operation
• Minimize the number of operations
• Arithmetic operations, memory accesses
• Replace expensive operations with simpler ones
• E.g., replace a 4-cycle multiplication with a 1-cycle shift
• Minimize cache misses
• Both data and instruction accesses
• Perform work in parallel
• Instruction scheduling within a thread
• Parallel execution across multiple threads
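The multiplication-to-shift replacement above (known as strength reduction) can be sketched as a rewrite on three-address statements; the string format is an assumption for illustration:

```python
# Replace x = y * 2^k with x = y << k (power-of-two multiply -> shift).
def strength_reduce(inst):
    dst, expr = inst.split(" = ")
    parts = expr.split(" * ")
    if len(parts) == 2 and parts[1].isdigit():
        n = int(parts[1])
        if n > 0 and n & (n - 1) == 0:      # n is a power of two
            return f"{dst} = {parts[0]} << {n.bit_length() - 1}"
    return inst

print(strength_reduce("t1 = x * 4"))   # t1 = x << 2
print(strength_reduce("t2 = x * 3"))   # unchanged: t2 = x * 3
```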
Code Optimization

• There is great variation in the amount of code optimization
different compilers perform.
• Those that do the most, the so-called “optimizing compilers”,
spend significant time in this phase.
• There is a trade-off between compilation time and degree of
optimization.
Why use optimization:
• There are simple optimizations that significantly improve the
running time of the target program without slowing down
compilation too much.
Types of Optimization

• Peephole
• Local
• Global
• Loop
• Inter-procedural, whole-program or link-time
• Machine code
• ….
Basic Blocks

 Basic blocks are maximal sequences of consecutive three-
address instructions such that:
 The flow of control can only enter the basic block through the
first instruction in the block (no jumps into the middle of the
block).
 Control will leave the block without halting or
branching, except possibly at the last instruction in the
block.
 The basic blocks become the nodes of a flow graph,
whose edges indicate which blocks can follow which
other blocks.
Construction of Basic Blocks
 Input: A sequence of three-address instructions
 Output: A list of the basic blocks for that sequence in
which each instruction is assigned to exactly one basic
block
 Method: Determine instructions in the intermediate code that
are leaders:
 The rules for finding leaders are:
1. The first three-address instruction in the intermediate code
is a leader.
2. Any instruction that is the target of a conditional or
unconditional jump is a leader.
3. Any instruction that immediately follows a conditional or
unconditional jump is a leader.
Partitioning Three-Address
Instructions into Basic Blocks
1. i = 1
2. j = 1
3. t1 = 10 * i
4. t2 = t1 + j
5. j = j + 1
6. if j <= 10 goto (3)
7. i = i + 1
8. if i <= 10 goto (2)
9. i = 1
10. t3 = i - 1
11. if i <= 10 goto (10)
 First, instruction 1 is a leader by rule (1).
 Jumps are at instructions 6, 8, and 11. By rule (2), the targets
of these jumps are leaders (instructions 3, 2, and 10,
respectively).
 By rule (3), each instruction following a jump is a leader:
instructions 7 and 9.
 The leaders are therefore instructions 1, 2, 3, 7, 9, and 10.
The basic block of each leader contains all the instructions
from the leader itself until just before the next leader.
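The leader-finding rules can be sketched directly; this assumes 1-based instruction numbering and jump targets written goto (n), as in the example above:

```python
import re

# Find leaders by the three rules. code is a list of instruction
# strings, numbered 1..len(code).
def find_leaders(code):
    leaders = {1}                                   # rule 1: first instruction
    for i, inst in enumerate(code, start=1):
        m = re.search(r"goto \((\d+)\)", inst)
        if m:                                       # this instruction is a jump
            leaders.add(int(m.group(1)))            # rule 2: the jump target
            if i + 1 <= len(code):
                leaders.add(i + 1)                  # rule 3: the next instruction
    return sorted(leaders)

code = [
    "i = 1", "j = 1", "t1 = 10 * i", "t2 = t1 + j", "j = j + 1",
    "if j <= 10 goto (3)", "i = i + 1", "if i <= 10 goto (2)",
    "i = 1", "t3 = i - 1", "if i <= 10 goto (10)",
]
print(find_leaders(code))   # [1, 2, 3, 7, 9, 10]
```

Each basic block then runs from one leader up to, but not including, the next.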
Flow Graphs
 A flow graph is a representation of the control flow between
basic blocks. The nodes of the flow graph are the basic blocks.
 There is an edge from block B to block C if and only if it is
possible for the first instruction in block C to immediately
follow the last instruction in block B. There are two ways that
such an edge could be justified:
1. There is a conditional or unconditional jump from the end

of B to the beginning of C.
2. C immediately follows B in the original order of the three-
address instructions, and B does not end in an
unconditional jump.
 B is a predecessor of C, and C is a successor of B.
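The two edge rules can be sketched as follows; the sketch assumes blocks are numbered 1..n in original order, with jump targets already resolved to block numbers:

```python
# Edges of the flow graph, by the two rules: (1) an explicit jump from
# the end of B to the leader of C, and (2) fall-through when C follows
# B and B does not end in an unconditional jump.
def flow_edges(n, jumps, uncond):
    # jumps:  dict block -> block its last instruction targets (if any)
    # uncond: set of blocks ending in an unconditional jump
    edges = set()
    for b in range(1, n + 1):
        if b in jumps:
            edges.add((b, jumps[b]))       # rule 1: explicit jump
        if b < n and b not in uncond:
            edges.add((b, b + 1))          # rule 2: fall-through
    return sorted(edges)

# The six blocks of the running example: B3 jumps to itself, B4 back to
# B2, B6 to itself; no block ends in an unconditional jump.
print(flow_edges(6, {3: 3, 4: 2, 6: 6}, set()))
# [(1, 2), (2, 3), (3, 3), (3, 4), (4, 2), (4, 5), (5, 6), (6, 6)]
```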
Flow Graphs: Example
Flow graph of the program partitioned in the previous example.
The block led by the first statement of the program is the
start, or entry node.

Blocks:
B1: i = 1
B2: j = 1
B3: t1 = 10 * i; t2 = t1 + j; j = j + 1; if j <= 10 goto (3)
B4: i = i + 1; if i <= 10 goto (2)
B5: i = 1
B6: t3 = i - 1; if i <= 10 goto (10)

Edges: Entry → B1 → B2 → B3; B3 → B3 (the inner loop); B3 → B4;
B4 → B2 (the outer loop); B4 → B5; B5 → B6; B6 → B6; B6 → Exit.
Representation of Basic Blocks

• Each basic block is represented by a record


consisting of
– a count of the number of statements
– a pointer to the leader
– a list of predecessors
– a list of successors

Peephole Optimization
• Improve the performance of the target program by
examining and transforming a short sequence of
target instructions
• Depends on the window size
• May need repeated passes over the code
Examples:
• Redundant loads and stores
MOV R0, a
MOV a, R0
• Algebraic simplification
x := x + 0
x := x * 1
• Constant folding
x := 2 + 3  x := 5
y := x + 3  y := 8
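A minimal peephole pass over single statements might fold constants and drop algebraic identities. A sketch; the statement format is an assumption:

```python
# One peephole rule each for constant folding and algebraic
# simplification, applied to one "dst := a op b" statement.
def peephole(inst):
    dst, expr = inst.split(" := ")
    toks = expr.split()
    if len(toks) == 3:
        a, op, b = toks
        if a.isdigit() and b.isdigit() and op in ('+', '*'):
            val = {'+': int(a) + int(b), '*': int(a) * int(b)}[op]
            return f"{dst} := {val}"            # constant folding
        if (op == '+' and b == '0') or (op == '*' and b == '1'):
            return f"{dst} := {a}"              # x+0 -> x, x*1 -> x
    return inst

print(peephole("x := 2 + 3"))   # x := 5
print(peephole("x := x + 0"))   # x := x
print(peephole("y := x * 1"))   # y := x
```

Note that folding y := x + 3 into y := 8 additionally requires propagating the constant from x := 5, which a single-statement window cannot see; that is why peephole optimizers make repeated passes or use wider windows.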
Local Optimizations
 Analysis and transformation performed within a basic block
 No control flow information is considered
 Examples of local optimizations:
 Local common subexpression elimination
analysis: the same expression is evaluated more than once.
transformation: replace with a single calculation.
 Local constant folding or elimination
analysis: the expression can be evaluated at compile time.
transformation: replace it by the constant, compile-time value.
 Dead code elimination
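Local common-subexpression elimination can be sketched with a table mapping each expression to the temporary that already holds it (the statement format is an assumption, and a real pass must also invalidate table entries when an operand is reassigned):

```python
# Within one basic block, remember which temporary holds each
# right-hand-side expression and reuse it instead of recomputing.
def local_cse(block):
    available, out = {}, []
    for inst in block:
        dst, expr = inst.split(" = ", 1)
        if expr in available:
            out.append(f"{dst} = {available[expr]}")   # reuse earlier result
        else:
            available[expr] = dst
            out.append(inst)
    return out

block = ["t1 = b * c", "t2 = a + t1", "t3 = b * c", "t4 = t3 + t2"]
print(local_cse(block))
# ['t1 = b * c', 't2 = a + t1', 't3 = t1', 't4 = t3 + t2']
```

The copy t3 = t1 left behind is exactly what copy propagation and dead code elimination then clean up.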

Global Optimizations:

Intraprocedural
 Global versions of local optimizations
 Global common subexpression elimination
 Global constant propagation
 Dead code elimination

 Loop optimizations
 Reduce code to be executed in each iteration

Examples

• Unreachable code
#define debug 0
if (debug) { print debugging information }

    if 0 <> 1 goto L1
    print debugging information
L1:

Since 0 <> 1 always holds, this becomes:

    if 1 goto L1
    print debugging information
L1:

The jump is always taken, so the print statement is unreachable
and can be eliminated.
Examples

• Flow-of-control optimization: a jump whose target is itself a
jump can be replaced by a jump to the final destination.

goto L1
…
L1: goto L2

becomes

goto L2
…
L1: goto L2

Similarly,

goto L1
…
L1: if a < b goto L2

becomes

if a < b goto L2
…
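The goto-to-goto replacement ("jump threading") can be sketched as a pass that follows each jump through chains of forwarding labels to its final destination; the label/goto string format is an assumption:

```python
# If label L1 is immediately followed by "goto L2", retarget every
# "goto L1" straight to L2, following chains of such forwards.
def thread_jumps(code):
    forwards = {}
    for i, inst in enumerate(code):
        if inst.endswith(":") and i + 1 < len(code) \
                and code[i + 1].startswith("goto "):
            forwards[inst[:-1]] = code[i + 1].split()[1]

    def final(lbl, seen=()):
        # follow the chain; 'seen' guards against goto cycles
        if lbl in forwards and lbl not in seen:
            return final(forwards[lbl], seen + (lbl,))
        return lbl

    return [f"goto {final(inst.split()[1])}" if inst.startswith("goto ")
            else inst for inst in code]

code = ["goto L1", "x = 1", "L1:", "goto L2", "L2:", "y = 2"]
print(thread_jumps(code))
# ['goto L2', 'x = 1', 'L1:', 'goto L2', 'L2:', 'y = 2']
```

After threading, the block at L1 may become unreachable and can be removed by dead code elimination.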
