
Acd-Unit 5

The document outlines the phases of a compiler, focusing on code generation and optimization techniques such as peephole optimization, which improves target code by eliminating redundant instructions and optimizing flow control. It discusses the role of the code generator, issues in code generation including memory management and instruction selection, as well as strategies for register allocation and assignment. The document also introduces a generic code generation algorithm that utilizes address and register descriptors for efficient code generation.

Vallurupalli Nageswara Rao Vignana Jyothi Institute of Engineering & Technology

Department of Computer Science & Engineering

SUBJECT: Automata and Compiler Design
Subject Code: 22PC1CB303

Topic Name: Phases of a compiler

III year-I sem, sec: B and D

Dr. M. Gangappa
Associate Professor
Email: gangappa_m@vnrvjiet

Dr.M.Gangappa,Department of Computer Science & Engineering, VNRVJIET, Hyderabad November 21, 2024 1
Syllabus
▪ Code Generation: Machine dependent code generation, object code
forms, generic code generation algorithm, Register allocation and
assignment. Using DAG representation of Block.

Machine dependent code optimization
PEEPHOLE OPTIMIZATION

▪ A statement-by-statement code-generation strategy often produces target code that
contains redundant instructions. The quality of such target code can be improved by
applying “optimizing” transformations to the target program.

▪ A simple but effective technique for improving the target code is peephole optimization,
a method for improving the performance of the target program by examining
a short sequence of target instructions (called the peephole) and replacing these
instructions with a shorter or faster sequence whenever possible.

Characteristics of peephole optimizations
❑ Redundant-instructions elimination
❑ Flow-of-control optimizations
❑ Algebraic simplifications
❑ Use of machine idioms
❑ Unreachable-code elimination

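Redundant Loads And Stores:

If we see the instruction sequence

(1) MOV R0, a
(2) MOV a, R0

we can delete instruction (2), because whenever (2) is executed, (1) will ensure
that the value of a is already in register R0. This holds only if (2) has no label:
if some jump could reach (2) directly, (1) is not guaranteed to have executed
just before it, and (2) must be kept.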
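
The rule above can be sketched as a one-pass scan over a two-instruction peephole. This is a minimal illustration, not the method of any particular compiler: the `Instr` record, its `label` field and the function name `remove_redundant_moves` are assumptions made for the example, and operands follow the slides' `MOV source, destination` order.

```python
from typing import List, NamedTuple, Optional


class Instr(NamedTuple):
    op: str
    src: str
    dst: str
    label: Optional[str] = None  # a label means some jump may target this instruction


def remove_redundant_moves(code: List[Instr]) -> List[Instr]:
    """Drop a MOV that only copies back the value moved by the previous MOV."""
    out: List[Instr] = []
    for ins in code:
        prev = out[-1] if out else None
        if (
            prev is not None
            and prev.op == "MOV"
            and ins.op == "MOV"
            and ins.label is None      # a labelled instruction may be reached by a jump
            and ins.src == prev.dst
            and ins.dst == prev.src
        ):
            continue  # the value is already where this instruction would put it
        out.append(ins)
    return out


code = [
    Instr("MOV", "R0", "a"),   # (1) store R0 into a
    Instr("MOV", "a", "R0"),   # (2) reload a into R0: redundant
    Instr("ADD", "b", "R0"),
]
optimized = remove_redundant_moves(code)
for ins in optimized:
    print(ins.op, f"{ins.src}, {ins.dst}")
```

If instruction (2) carried a label, the scan would keep it, which matches the condition that (1) must always execute immediately before (2).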

Eliminating Multiple Jumps (flow-of-control optimization)

If we have jumps to other jumps, then the unnecessary jumps can be eliminated.

If we have the jump sequence:

goto L1
...
L1: goto L2

then this can be replaced by:

goto L2
...
L1: goto L2

If there are now no jumps to L1, then it may be possible to eliminate the statement
L1: goto L2, provided it is preceded by an unconditional jump.
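
A small sketch of this jump-to-jump rewrite follows. The statement format (a `(label, text)` pair, with jumps written as `goto <label>`) and the name `thread_jumps` are assumptions made for illustration; only unconditional jumps are handled.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical statement format: (label or None, statement text).
Stmt = Tuple[Optional[str], str]


def thread_jumps(code: List[Stmt]) -> List[Stmt]:
    # Record every label whose statement is itself an unconditional jump.
    forwards: Dict[str, str] = {}
    for label, text in code:
        if label is not None and text.startswith("goto "):
            forwards[label] = text.split()[1]

    def final_target(label: str) -> str:
        seen = set()
        # Follow the chain of jumps; `seen` guards against cycles of jumps.
        while label in forwards and label not in seen:
            seen.add(label)
            label = forwards[label]
        return label

    result: List[Stmt] = []
    for label, text in code:
        if text.startswith("goto "):
            text = "goto " + final_target(text.split()[1])
        result.append((label, text))
    return result


code = [
    (None, "goto L1"),
    (None, "x := 1"),
    ("L1", "goto L2"),
    ("L2", "y := 2"),
]
threaded = thread_jumps(code)
for label, text in threaded:
    print(f"{label + ':' if label else '':4}{text}")
```

After this pass no statement jumps to L1 any more, so a later pass could consider removing `L1: goto L2` under the condition stated above.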
Eliminating Unreachable Code
#define debug 0
...
if (debug)
{
print debugging information
}
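The above “if” statement can be translated in the intermediate code to:
if debug = 1 goto L1
goto L2
L1: print debugging information
L2:

Since debug is the constant 0, the test debug = 1 can never succeed, so the conditional jump
can be dropped and only goto L2 remains. The statements that print debugging information
then directly follow an unconditional jump and are no longer the target of any jump, so they
are unreachable and can be eliminated.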

Algebraic Simplification:

There is no end to the amount of algebraic simplification that can be attempted through
peephole optimization. Only a few algebraic identities occur frequently enough that it is worth
considering implementing them. For example, statements such as
x := x+0
or
x := x * 1

are often produced by straightforward intermediate code-generation algorithms, and they can
be eliminated easily through peephole optimization.
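
As a rough illustration of how such identities could be removed, the sketch below filters a list of three-address statements. The tuple layout `(target, left, operator, right)` and the function names are assumptions for this example; it only handles the two identities named above.

```python
from typing import List, Tuple

# Hypothetical layout of a three-address statement: target := left op right
Stmt = Tuple[str, str, str, str]


def is_identity(stmt: Stmt) -> bool:
    """True for statements such as x := x + 0 or x := x * 1, which change nothing."""
    target, left, op, right = stmt
    if op == "+":
        return (left == target and right == "0") or (right == target and left == "0")
    if op == "*":
        return (left == target and right == "1") or (right == target and left == "1")
    return False


def simplify(block: List[Stmt]) -> List[Stmt]:
    return [stmt for stmt in block if not is_identity(stmt)]


block = [
    ("x", "x", "+", "0"),   # x := x + 0, removable
    ("y", "x", "*", "2"),   # y := x * 2, kept
    ("x", "1", "*", "x"),   # x := 1 * x, removable
]
simplified = simplify(block)
print(simplified)
```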

Strength Reduction

Reduction in strength replaces expensive operations by equivalent cheaper ones
on the target machine. Certain machine instructions are considerably cheaper than
others and can often be used as special cases of more expensive operators.

For example, x² is invariably cheaper to implement as x*x than as a call to an
exponentiation routine. Fixed-point multiplication or division by a power of two
is cheaper to implement as a shift. Floating-point division by a constant can be
implemented as multiplication by a constant, which may be cheaper.

x² → x*x
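
The sketch below applies two of these rewrites to a single three-address statement. The tuple layout and the name `reduce_strength` are assumptions for illustration. Division by a power of two is deliberately left out, because a shift rounds negative values differently from a truncating division and would need extra care.

```python
from typing import Tuple

# Hypothetical layout of a three-address statement: target := left op right
Stmt = Tuple[str, str, str, str]


def reduce_strength(stmt: Stmt) -> Stmt:
    target, left, op, right = stmt
    if op == "**" and right == "2":
        return (target, left, "*", left)          # x ** 2  ->  x * x
    if op == "*" and right.isdigit():
        n = int(right)
        if n > 1 and (n & (n - 1)) == 0:           # n is a power of two
            shift = n.bit_length() - 1
            return (target, left, "<<", str(shift))  # x * 2^k  ->  x << k
    return stmt


print(reduce_strength(("y", "x", "**", "2")))
print(reduce_strength(("y", "x", "*", "8")))
print(reduce_strength(("y", "x", "*", "6")))
```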
Use of Machine Idioms

The target machine may have hardware instructions to implement certain specific
operations efficiently. For example, some machines have auto-increment and
auto-decrement addressing modes. These add or subtract one from an operand
before or after using its value. The use of these modes greatly improves the
quality of code when pushing or popping a stack, as in parameter passing. These
modes can also be used in code for statements like i := i+1.

i := i+1 → i++
i := i-1 → i--

Role of the Code generator
The final phase in the compiler model is the code generator. It takes as input an
intermediate representation of the source program and produces as output an
equivalent target program.

Issues in the design of a code generator

The code generator needs to address several issues before
actually generating the code. The following issues arise during
the code generation phase:

1. Input to code generator
2. Target program
3. Memory management
4. Instruction selection
5. Register allocation
6. Evaluation order

1. Input to code generator
The code generator gets the three-address code as input. The format in which this three-address code is fed as
input needs to be decided. The following are some of the ways of feeding input to the code generator.

• Linear – The input could be a string of characters in which postfix notation expresses the input.
• Tables – As discussed in the previous modules, the three-address code is typically represented using
Quadruples, Triples or Indirect triples, and this table could serve as input.
• Non-linear – An Abstract Syntax Tree (AST) or Directed Acyclic Graph (DAG) could be used as input to the code
generator after converting the input into AST or DAG representation.

Prior to code generation, the source program must have been scanned, parsed and translated by the front end
into an intermediate representation, along with the necessary type checking. Therefore, the input to code
generation is assumed to be error-free.

2. Target program or object code formats
The (back-end) code generator of a compiler may generate different forms of code, depending
on the machine requirements. The target program needs to be specified as absolute machine code,
relocatable machine code or assembly language code. The following are the forms.

1. Absolute machine language: Producing an absolute machine language program as output
has the advantage that it can be placed in a fixed location in memory and immediately executed.

2. Relocatable machine code: Producing a relocatable machine language program as output
allows subprograms to be compiled separately. A set of relocatable object modules can be
linked together and loaded for execution by a linking loader.

If the target machine does not handle relocation automatically, the compiler must provide
explicit relocation information to the loader, to link the separately compiled program
segments.
Target program or object code formats

3. Assembly language:
Producing an assembly language program as output makes the process of code generation
somewhat easier.

Example:

MOV R0, R1
ADD R1, R2

3. Memory management

❑ Names in the source program are mapped to addresses of data objects in run-time
memory.
❑ Address mapping defines the mapping from names in the intermediate representation to
addresses in the target code.
❑ These addresses depend on the run-time storage area used: static, stack or
heap.
❑ The identifiers are stored in the symbol table, along with their types, when variables or
functions are declared.

4. Instruction selection

The factors to be considered during instruction selection are:

❑ The uniformity and completeness of the instruction set.
❑ Instruction speed and machine idioms.
❑ Size of the instruction set.

Instruction selection is important to obtain efficient code. Suppose we need to translate the
following three-address code
x := y+z
The following would be the instructions, with a total cost of 6 (2+2+2) MW (memory words):
MOV y,R0
ADD z,R0
MOV R0,x
Each instruction costs 2 words here: 1 for the instruction itself and 1 for its memory operand.

4. Instruction selection (continued)

Consider another statement, a := a+1. If we adopt the same strategy to convert this
statement to target code, the following would be the result, with a cost of 6 (2+2+2) MW:
MOV a,R0
ADD #1,R0
MOV R0,a

If the above code is replaced by the following instruction, the cost would be 2 (1 for INC and 1
for “a”).
INC a
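
The cost figures above can be reproduced with a small helper. This is a sketch under the cost model the slides appear to use: one word per instruction plus one word for every operand that is not a register. The assumption that registers are written as `R` followed by digits, and the names `instruction_cost` and `sequence_cost`, are made up for this example.

```python
import re
from typing import List

REGISTER = re.compile(r"^R\d+$")  # assumption: registers are named R0, R1, ...


def instruction_cost(instr: str) -> int:
    """1 word for the instruction, plus 1 for each memory or literal operand."""
    _opcode, _, operand_text = instr.partition(" ")
    operands = [o.strip() for o in operand_text.split(",") if o.strip()]
    return 1 + sum(0 if REGISTER.match(o) else 1 for o in operands)


def sequence_cost(code: List[str]) -> int:
    return sum(instruction_cost(instr) for instr in code)


three_instructions = ["MOV a,R0", "ADD #1,R0", "MOV R0,a"]
single_instruction = ["INC a"]
print(sequence_cost(three_instructions), sequence_cost(single_instruction))
```

Comparing the two totals shows why choosing `INC a` is the better instruction selection for a := a+1 on a machine that offers it.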

5. Register allocation and Assignment

One of the important issues in code generation is register allocation. The number of registers
in any architecture is limited and is usually much smaller than the number of variables in a
high-level program. Efficient utilization of the limited set of registers is important to generate
good code. The use of registers is divided into two subproblems:
❑ Register allocation, which selects the set of variables that will reside in registers at a point in the
code
❑ Register assignment, which picks the specific register that a variable will reside in

However, finding an optimal register assignment in general is NP-complete.

6. Evaluation order

❑The order in which computations are performed can affect the efficiency of the
target code.
❑Some computation orders require fewer registers to hold intermediate results
than others.

Register allocation and assignment
Instructions involving only register operands are faster than those involving memory operands.
Therefore, efficient utilization of registers is vitally important in generating good code. This
section presents various strategies for deciding at each point in a program what values should
reside in registers (register allocation) and in which register each value should reside (register
assignment). The following are the register allocation and assignment strategies:

1. Global Register Allocation
2. Usage Counts
3. Register Assignment for Outer Loops
4. Register Allocation by Graph Coloring

1.Global register allocation

The following are the strategies of global register allocation:

❑ Store the most frequently used variables in fixed registers.

❑ Assign some fixed number of global registers to hold the most active
values in each inner loop.

❑ Registers not already allocated may be used to hold values local to one block.

❑ With early C compilers, a programmer could do some register allocation explicitly by using
register declarations to keep certain values in registers for the duration of a procedure.

2.Usage count

The formula to compute the usage count of a variable x over the blocks B of
a flow graph F is given by

$$\sum_{B \in F} \bigl( \mathrm{use}(x, B) + 2 \cdot \mathrm{live}(x, B) \bigr)$$

Formulas for usage count method

In the 3AC x = y + z, the variable x is said to be defined and the variables y and z are
said to be used.

use(x, B) = number of times variable x is used in basic block B before any definition of x in B.

live(x, B) is computed using the formula

$$\mathrm{live}(x, B) = \begin{cases} 1 & \text{if } x \text{ is live on exit from } B \text{ and is assigned a value in } B \\ 0 & \text{otherwise} \end{cases}$$
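
A worked sketch of the usage-count formula follows. It assumes that live(x, B) is 1 when x is assigned in B and live on exit from B, and that the set of variables live on exit from each block is already known from liveness analysis. The block contents, the `live_out` sets and all function names are made up for this example.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical statement format: (defined variable, list of used variables)
Stmt = Tuple[str, List[str]]


def use(x: str, block: List[Stmt]) -> int:
    """Number of uses of x in the block before any definition of x."""
    count = 0
    for target, operands in block:
        count += operands.count(x)   # operands are read before the target is written
        if target == x:
            break
    return count


def live(x: str, block: List[Stmt], live_on_exit: Set[str]) -> int:
    assigned = any(target == x for target, _ in block)
    return 1 if assigned and x in live_on_exit else 0


def usage_count(x: str, blocks: Dict[str, List[Stmt]], live_out: Dict[str, Set[str]]) -> int:
    return sum(use(x, b) + 2 * live(x, b, live_out[name]) for name, b in blocks.items())


blocks = {
    "B1": [("a", ["b", "c"]), ("d", ["a", "e"])],
    "B2": [("f", ["a", "d"]), ("a", ["f", "f"]), ("g", ["a"])],
}
live_out = {"B1": {"a", "d"}, "B2": {"a"}}  # assumed result of liveness analysis

for var in ("a", "b", "d"):
    print(var, usage_count(var, blocks, live_out))
```

Variables with the highest counts are the best candidates for a register, since each counted use saves a memory access and each counted definition saves a store at the end of the block.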

3.Register assignment for outer loop
Machine instructions:
1. Store: stores the data from a register to memory (a variable).
MOV R0, a or MOV R0, M
2. Load: loads the data from a variable into a register.
MOV a, R0

Register assignment for outer loop:

Consider two loops, where L1 is an outer loop and L2 is an inner loop, and
variable “a” is to be allocated to some register. The approximate scenario is shown
in the next slide.

3.Register assignment for outer loop

Loop L1
    ….          (L1 – L2)
    Loop L2
    …..         (L1 – L2)

The following are the criteria for register assignment for outer loop:
1. If “a” is allocated in loop L2, then it need not be allocated in L1 – L2.
2. If “a” is allocated in L1 and it is not allocated in L2, then store “a” on entrance to L2 and load “a” on
leaving L2.
MOV R0, a   (store on entrance to L2)
MOV a, R0   (load on exit from L2)
3. If “a” is allocated in L2 and it is not allocated in L1, then load “a” on entrance to L2 and store “a” on exit
from L2.
MOV a, R0   (load on entrance to L2)
MOV R0, a   (store on exit from L2)

4.Register allocation by graph coloring

Global register allocation can be seen as a graph coloring problem.

Basic idea:

1. Identify the live range of each variable.
2. Build an interference graph that represents conflicts between live ranges (two
nodes are connected if the variables they represent are live at the same moment).
3. Try to color the nodes of the graph using no more colors than there are registers, so
that two neighbors have different colors.
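
Step 3 can be sketched with a simple greedy coloring. This is a heuristic illustration only, not an optimal allocator and not the scheme of any specific compiler: the interference graph is given directly (steps 1 and 2 are assumed done), nodes are colored in order of decreasing degree, and a node that cannot be colored is marked `None` as a spill candidate. The graph and the name `color_graph` are made up for the example.

```python
from typing import Dict, List, Optional, Set


def color_graph(interference: Dict[str, Set[str]], registers: List[str]) -> Dict[str, Optional[str]]:
    assignment: Dict[str, Optional[str]] = {}
    # Color the most constrained nodes first; ties are broken by name for determinism.
    for node in sorted(interference, key=lambda n: (-len(interference[n]), n)):
        taken = {assignment.get(neighbor) for neighbor in interference[node]}
        free = [r for r in registers if r not in taken]
        assignment[node] = free[0] if free else None  # None marks a spill candidate
    return assignment


# a, b and c are live at the same time; d overlaps only with c.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
with_three = color_graph(graph, ["R0", "R1", "R2"])
with_two = color_graph(graph, ["R0", "R1"])
print(with_three)
print(with_two)
```

With three registers every variable gets one; with two registers the triangle a, b, c cannot be colored, so one of its variables is left for spilling to memory.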

Generic code generation algorithm
The simple code generator algorithm generates target code for a sequence of three-address
statements. The algorithm considers each basic block individually.
It uses next-use information to decide whether to keep a computed value in its register or
store it to its variable so that the register can be reused.

Data structures for the simple code generator algorithm

This algorithm uses two data structures for generating code:

1. Address descriptor: keeps track of the location(s) where the current value of a
variable can be found at run time.
2. Register descriptor: keeps track of which variable is currently stored in each register.

The Code Generation Algorithm
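Algorithm SimpleCodeGenerator( )
Input: Sequence of 3-address statements from a basic block.
Output: Assembly language code

For each statement x := y op z

1. Set location L = getreg(y, z) to store the result of y op z.
2. If the value of y is not already in L, then generate
MOV y’, L
where y’ denotes one of the locations where the value of y is available (choose a register if
possible).
3. Generate
OP z’, L
where z’ is one of the locations of z.
Update the address descriptor of x to indicate that x is in L. If L is a register, update its
register descriptor to indicate that it contains x.
4. If y or z has no next use, is not live on exit from the block and is held in a register, update
the register descriptors to indicate that those registers no longer hold y or z.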
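
The following is a minimal sketch of such a generator, under several simplifying assumptions that are not part of the slides: statements are tuples `(x, y, op, z)`, only binary operators are handled, `getreg` takes the statement index instead of `(y, z)`, and when no register is free it simply evicts the first register. All names (`generate`, `needed_after`, `reg_desc`, `addr_desc`) are chosen for illustration.

```python
from typing import Dict, List, Optional, Set, Tuple

Stmt = Tuple[str, str, str, str]  # (x, y, op, z) stands for x := y op z
OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL"}


def generate(block: List[Stmt], live_on_exit: Set[str], registers: List[str]) -> List[str]:
    code: List[str] = []
    reg_desc: Dict[str, Optional[str]] = {r: None for r in registers}  # register -> variable
    addr_desc: Dict[str, Set[str]] = {}  # variable -> places holding its current value

    def locations(v: str) -> Set[str]:
        # Until it is computed in this block, a variable lives in its own memory cell.
        return addr_desc.setdefault(v, {v})

    def register_of(v: str) -> Optional[str]:
        return next((r for r in registers if reg_desc[r] == v), None)

    def needed_after(v: str, i: int) -> bool:
        # Next-use information: is v read after statement i before being redefined?
        for x, y, _, z in block[i + 1:]:
            if v in (y, z):
                return True
            if x == v:
                return False
        return v in live_on_exit

    def getreg(i: int, y: str) -> str:
        r = register_of(y)
        if r is not None and not needed_after(y, i):
            return r                      # reuse y's register: y is dead afterwards
        for r in registers:
            if reg_desc[r] is None:
                return r                  # an empty register
        r = registers[0]                  # naive eviction choice (assumption of this sketch)
        victim = reg_desc[r]
        if victim not in locations(victim) and needed_after(victim, i - 1):
            code.append(f"MOV {r}, {victim}")   # save the value before reusing r
            locations(victim).add(victim)
        locations(victim).discard(r)
        reg_desc[r] = None
        return r

    for i, (x, y, op, z) in enumerate(block):
        L = getreg(i, y)
        if L not in locations(y):
            code.append(f"MOV {register_of(y) or y}, {L}")
        code.append(f"{OPCODES[op]} {register_of(z) or z}, {L}")
        for locs in addr_desc.values():
            locs.discard(L)               # whatever L held before is overwritten
        for r in registers:
            if reg_desc[r] == x:
                reg_desc[r] = None        # any older copy of x is stale
        reg_desc[L], addr_desc[x] = x, {L}
        for v in (y, z):                  # free registers whose values are dead
            r = register_of(v)
            if r is not None and not needed_after(v, i):
                reg_desc[r] = None
                locations(v).discard(r)

    for v in sorted(live_on_exit):        # store values that are live on exit
        r = register_of(v)
        if r is not None and v not in locations(v):
            code.append(f"MOV {r}, {v}")
    return code


block = [("t", "a", "-", "b"), ("u", "a", "-", "c"), ("v", "t", "+", "u"), ("d", "v", "+", "u")]
target = generate(block, live_on_exit={"d"}, registers=["R0", "R1"])
print("\n".join(target))
```

The block used here is the three-address form of d := (a-b) + (a-c) + (a-c), with d assumed to be the only variable live on exit and two registers available.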
Example
Generate the target code for d := (a-b) + (a-c) + (a-c)

Step 1: Intermediate code

t := a - b
u := a - c
v := t + u
d := v + u

Step 2: Code generation

Intermediate code   Target code    Register descriptor   Address descriptor
                                   Registers empty
t := a - b          MOV a, R0      R0 contains t         t in R0
                    SUB b, R0
u := a - c          MOV a, R1      R0 contains t         t in R0
                    SUB c, R1      R1 contains u         u in R1
v := t + u          ADD R1, R0     R0 contains v         u in R1
                                   R1 contains u         v in R0
d := v + u          ADD R1, R0     R0 contains d         d in R0
                    MOV R0, d                            d in R0 and memory

Cost of the machine instructions

Instruction     Cost (MW)
MOV a, R0       2
SUB b, R0       2
MOV a, R1       2
SUB c, R1       2
ADD R1, R0      1
ADD R1, R0      1
MOV R0, d       2

Total cost = 12 MW: each instruction costs 1 word, plus 1 word for each memory operand.

Generating code for other types of statements

The code for indexed assignment statements: a := b[i] and a[i] := b

Statement    i in register Ri        i in memory Mi          i in stack
             Code            Cost    Code            Cost    Code            Cost
a := b[i]    MOV b(Ri), R    2       MOV Mi, R       4       MOV Si(A), R    4
                                     MOV b(R), R             MOV b(R), R
a[i] := b    MOV b, a(Ri)    3       MOV Mi, R       5       MOV Si(A), R    5
                                     MOV b, a(R)             MOV b, a(R)

Practice Problems:
1. Generate the code sequence for the pointer assignments a = *p and *p = a, when
p is in a register, p is in memory and p is on the stack.
2. Generate the code for the conditional statement: if x < y goto z.
3. Generate the code for x := y + z
if x < 0 goto z

1. Generate the code for the assignment statement given below:

X = (a - b) + (e + (c - d))

2. Generate improved target code for the assignments below and also compute the
instruction cost:
a = b + c
d = a + e

GENERATING CODE FROM DAGs

❑ Rearranging the order – To improve the generated code, the
computations are rearranged; this is referred to as heuristic
reordering.
❑ Labeling the tree for register information – To know the
number of registers required to generate code, each node is
labeled with the number of registers required to evaluate that node.
❑ Heuristic reordering – Heuristics find a solution close to the best one,
and find it quickly and easily.

1.Rearranging the order
The order in which computations are done can affect the cost of the resulting object
code.
For example, consider the following basic block:

t1 := a + b
t2 := c - d
t3 := e + t2
t4 := t1 + t3

The generated code sequence:

MOV a, R0
ADD b, R0
MOV c, R1
SUB d, R1
MOV R0, t1
MOV e, R0
ADD R0, R1
MOV t1, R0
ADD R1, R0
MOV R0, t4

1.Rearranging the order

Rearranged basic block (now t1 occurs immediately before t4):

t2 := c - d
t3 := e + t2
t1 := a + b
t4 := t1 + t3

Revised code sequence:

MOV c, R0
SUB d, R0
MOV e, R1
ADD R0, R1
MOV a, R0
ADD b, R0
ADD R1, R0
MOV R0, t4

In this order, the two instructions MOV R0, t1 and MOV t1, R0 have been saved.

2. Heuristic ordering for Dags
The term heuristic is used for algorithms that search for a solution among all
possible ones but do not guarantee that the best one will be found; they
may therefore be considered approximate rather than exact algorithms. These
algorithms usually find a solution close to the best one, and they find it quickly and
easily.

2. Heuristic ordering for Dags
Node listing algorithm
Obtain all interior nodes and consider them unlisted.
while (unlisted interior nodes remain) do
begin
    select an unlisted node n, all of whose parents have been listed;
    list n;
    while (the leftmost child m of n has no unlisted parents and is not a leaf) do
    /* since n was just listed, m is not yet listed */
    begin
        list m;
        n = m;
    end
end
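
The pseudocode above can be sketched in Python as follows. The DAG used here is a made-up illustration, not the figure from the slides: it is given as a mapping from each interior node to its children, left to right, and any name that is not a key is treated as a leaf. The function name `node_listing` is also an assumption.

```python
from typing import Dict, List, Tuple


def node_listing(dag: Dict[str, Tuple[str, ...]]) -> List[str]:
    """dag maps each interior node to its children, left to right; leaves are not keys."""
    parents: Dict[str, List[str]] = {}
    for node, children in dag.items():
        for child in children:
            parents.setdefault(child, []).append(node)

    listed: List[str] = []

    def ready(n: str) -> bool:
        # A node may be listed only after all of its parents have been listed.
        return all(p in listed for p in parents.get(n, []))

    while len(listed) < len(dag):
        n = next(c for c in dag if c not in listed and ready(c))
        listed.append(n)
        while True:
            m = dag[n][0]                      # leftmost child of n
            if m not in dag or not ready(m):   # stop at a leaf or at a node with unlisted parents
                break
            listed.append(m)
            n = m
    return listed


# Hypothetical DAG: n1..n8 are interior nodes, a..e are leaves.
dag = {
    "n1": ("n2", "n3"),
    "n2": ("n6", "n4"),
    "n3": ("n4", "e"),
    "n4": ("n5", "n8"),
    "n5": ("n6", "c"),
    "n6": ("a", "b"),
    "n8": ("d", "e"),
}
order = node_listing(dag)
print("listing:   ", order)
print("evaluation:", list(reversed(order)))
```

The nodes are then evaluated in the reverse of the listing order, so that each node is computed, as far as possible, immediately after its leftmost operand.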
Apply the node listing algorithm and generate the target code for

{(a+(b*c)) - ((b*c)+d)} + {(a+(b*c)) - ((b*c)+d)}

[Figure: DAG of the expression, with root +, interior nodes -, + and *, and leaves a, b, c and d]

The Labelling Algorithm

This algorithm works on the tree representation of a sequence of three-address
statements. This algorithm has two parts:

❑ The first part labels each node of the tree from bottom to top with an integer that denotes
the minimum number of registers required to evaluate the subtree rooted at that node.

❑ The second part of the algorithm is a tree traversal that generates the code, guided by the labels.

Assigning the labels
▪ For leaf nodes (external nodes):
    ▪ If n is the leftmost child of its parent then
        ▪ Label(n) = 1
    ▪ else
        ▪ Label(n) = 0
▪ For internal nodes:

$$\mathrm{Label}(n) = \begin{cases} \max(l_1, l_2) & \text{if } l_1 \neq l_2 \\ l_1 + 1 & \text{if } l_1 = l_2 \end{cases}$$

▪ where $l_1$ is the label of the left child and $l_2$ is the label of the right child of the node n.
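
The labelling rules can be sketched as a short recursive function. The tree encoding (a leaf is a string, an interior node is a tuple `(operator, left, right)`) and the name `label` are assumptions made for this example; only binary operators are covered.

```python
from typing import Tuple, Union

# Hypothetical tree encoding: a leaf is a name, an interior node is (op, left, right).
Tree = Union[str, Tuple[str, "Tree", "Tree"]]


def label(node: Tree, is_leftmost: bool = True) -> int:
    if isinstance(node, str):
        # A leaf needs a register only when it is the leftmost child of its parent.
        return 1 if is_leftmost else 0
    _op, left, right = node
    l1 = label(left, True)
    l2 = label(right, False)
    return max(l1, l2) if l1 != l2 else l1 + 1


# Tree for (a - b) + (e + (c - d))
tree = ("+", ("-", "a", "b"), ("+", "e", ("-", "c", "d")))
print(label(tree))
```

For this tree the left subtree a - b gets label 1, the right subtree e + (c - d) gets label 2 because both of its children have label 1, and the root therefore gets label 2.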

COMPILER DESIGN
WIT and WIL

UNIT-1
1. Intro. to Compilers
2. Lexical analysis
3. Regular Expressions
4. Lex tool

UNIT-2
1. Parser
2. Top-down & Bottom-up parser tree

UNIT-3
1. SDD
2. Intermediate code
3. Runtime environment

UNIT-4
1. Code optimization
2. DAG representation

UNIT-5
1. Object code representation
2. Register allocation
3. Code generation algorithms

Practice Questions
▪ What are the target code representation formats?
▪ Illustrate the use of graph coloring in register allocation.
▪ List out the DAG based code generation algorithms.
▪ Do we really need a compiler?

