Unit 5 1 Basicblocks

Unit 5 covers code generation in compilers, focusing on the design of code generators, including instruction selection, register allocation, and instruction ordering. It discusses the importance of correctness and the various issues faced in generating target programs, such as choosing the right instruction set architecture and optimizing register usage. Additionally, it introduces basic blocks and flow graphs as tools for representing control flow in intermediate code.

Uploaded by

Aman Raj

UNIT-5

UNIT-5 TOPICS
• Code Generation: Issues in the design of a code generator, The target language, Basic blocks and flow graphs, DAG representation of basic blocks, A simple code generator, Peephole optimization, Transformation of basic blocks.
• Machine-Independent Optimizations: The principal sources of optimization: Global common subexpressions, Copy propagation, Dead-code elimination.
CODE GENERATION
• The final phase in the compiler model is the code generator.
• It takes as input the intermediate representation (IR) produced by the
front end of the compiler, along with relevant symbol table information,
and produces as output a semantically equivalent target program, as
shown in Fig.

• A code generator has three primary tasks: instruction selection, register allocation and assignment, and instruction ordering.
• Instruction selection involves choosing appropriate target-machine
instructions to implement the IR statements.
• Register allocation and assignment involves deciding what values to
keep in which registers.
• Instruction ordering involves deciding in what order to schedule the
execution of instructions
ISSUES IN THE DESIGN OF A CODE GENERATOR
• While the details are dependent on the specifics of the intermediate
representation, the target language, and the run-time system, tasks such as
instruction selection, register allocation and assignment, and instruction
ordering are encountered in the design of almost all code generators.
• The most important criterion for a code generator is that it produce
correct code.
• Correctness takes on special significance because of the number of special
cases that a code generator might face.
• Given the premium on correctness, designing a code generator so it can be
easily implemented, tested, and maintained is an important design goal.
1. Input to the Code Generator
2. The Target Program
3. Instruction Selection
4. Register Allocation
5. Evaluation Order
Issue-1: Input to the Code Generator
• The input to the code generator is the intermediate
representation of the source program produced by the
front end, along with information in the symbol table that
is used to determine the run-time addresses of the data
objects denoted by the names in the IR.
• The many choices for the IR include three-address
representations such as quadruples, triples, indirect
triples; virtual machine representations such as
bytecodes and stack-machine code; linear
representations such as postfix notation; and graphical
representations such as syntax trees and DAGs.
Issue-2: The Target Program
• Instruction set architecture (set of machine instructions that the
target CPU understands) of the target machine (specific hardware
or architecture for which the compiler generates machine code)
has a significant impact on the difficulty of constructing a good code
generator that produces high quality machine code.
• Most common target machine architectures are RISC (Reduced
Instruction Set Computer), CISC (Complex Instruction Set
Computer) and stack based.
• RISC has many registers, 3-address instructions, simple addressing modes
and a relatively simple instruction-set architecture.
• CISC has few registers, 2-Address instructions, a variety of addressing
modes, several register classes, variable length instructions.
• In stack-based architectures, machine operations are done by pushing
operands on a stack and then performing the operations on the
operands at the Top of the Stack. Java Virtual Machine (JVM) uses this.
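The stack-based model can be illustrated with a minimal sketch. The mnemonics `push`, `pop`, `add`, `sub` and `mul` are hypothetical names chosen for this example; they are not real JVM bytecodes, and the function `run_stack_code` is an assumption of the sketch.

```python
import operator

# Hypothetical mnemonics for illustration; these are not real JVM bytecodes.
BINARY_OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}

def run_stack_code(code, memory):
    """Execute a list of stack-machine instructions against a memory dict."""
    stack = []
    for instruction in code:
        parts = instruction.split()
        opcode = parts[0]
        if opcode == "push":            # push the value of a variable
            stack.append(memory[parts[1]])
        elif opcode == "pop":           # pop the top of stack into a variable
            memory[parts[1]] = stack.pop()
        elif opcode in BINARY_OPS:      # operate on the two topmost values
            right = stack.pop()
            left = stack.pop()
            stack.append(BINARY_OPS[opcode](left, right))
        else:
            raise ValueError(f"unknown instruction: {instruction}")
    return memory

# x = (a + b) * c
program = ["push a", "push b", "add", "push c", "mul", "pop x"]
result = run_stack_code(program, {"a": 2, "b": 3, "c": 4})
print(result["x"])  # 20
```

No instruction names a register: every operand is found at the top of the stack, which is why the instructions can be so short.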
Issue-2: The Target Program
• Producing an absolute machine-language program as
output has the advantage that it can be placed in a fixed
location in memory and immediately executed. Programs
can be compiled and executed quickly.
• Producing a relocatable machine-language program
(object module) as output allows subprograms to be
compiled separately. A set of relocatable object modules
can be linked together and loaded for execution by a
linking loader.
• Producing an assembly-language program as output
makes the process of code generation somewhat easier.
We can generate symbolic instructions and use the macro
facilities of the assembler to help generate code. The
price paid is the assembly step after code generation.
Issue-3: Instruction Selection
• Code generation maps the Intermediate Representation (IR) program into a code sequence that can be executed by the target machine.
• If the target machine provides a rich instruction set, fewer target instructions are needed.
• For example: if the target machine has an "increment" instruction (INC), then the three-address statement a = a + 1 may be implemented more efficiently by the single instruction INC a, rather than by the following sequence:
Format: Instruction Source, Destination
MOV a, R0     // R0 = a
ADD #1, R0    // R0 = R0 + 1
MOV R0, a     // a = R0
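The INC case can be sketched as a small selection function. This is an illustrative sketch only: the function name `select_instructions`, the fixed use of R0, and the opcode table are assumptions, and a real instruction selector handles far more patterns.

```python
def select_instructions(dest, left, op, right):
    """Return target instructions for the statement: dest = left op right."""
    # Special case: a = a + 1 maps to a single increment instruction.
    if op == "+" and dest == left and right == "1":
        return [f"INC {dest}"]
    opcode = {"+": "ADD", "-": "SUB", "*": "MUL"}[op]
    # Constants use the #n immediate form, as in ADD #1, R0.
    source = f"#{right}" if right.isdigit() else right
    return [
        f"MOV {left}, R0",        # R0 = left
        f"{opcode} {source}, R0", # R0 = R0 op right
        f"MOV R0, {dest}",        # dest = R0
    ]

print(select_instructions("a", "a", "+", "1"))  # ['INC a']
print(select_instructions("x", "y", "+", "1"))
```

The general path always costs three instructions, so recognizing the special pattern first is what yields the shorter code.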
Issue-4: Register Allocation
• A main issue in code generation is deciding which values to hold in
which registers.
• Registers are the fastest storage on the target machine.
• Instructions involving register operands are shorter and
faster than those involving operands in memory, so
efficient utilization of registers is important.
• Use of registers is divided into two subproblems:
• Register allocation: during which we select set of variables that
will reside in registers at each point in the program.
• Register assignment: during which we pick the specific register
that a variable will reside in.
Issue-4: Register Allocation
• Finding an optimal assignment of registers to variables is difficult.
• Certain machines require register pairs (an even register and the next odd-numbered register) for some operands and results.
• Example: MUL R2, R3 → R2 is the multiplicand (even register) and R3 is the multiplier (odd register); the product occupies the entire even/odd register pair.
• Example: DIV R2, R3 → the dividend occupies the even/odd register pair, whose even register is R2 and whose odd register is R3.
Issue-5: Evaluation Order
• In code generation, evaluation order refers to the
sequence in which expressions are evaluated and
executed in the generated machine code.
• The order can impact efficiency, correctness, and
optimization.
• Some computation orders require fewer registers to hold
intermediate results than others.
• Types of evaluation orders are:
• Left-to-Right (LTR) Order – Evaluates expressions from left to right.
• Right-to-Left (RTL) Order – Evaluates expressions from right to left.
• Operator Precedence-Based Order – Evaluates based on
precedence and associativity rules.
• Optimized Order – Reorders evaluations to optimize performance
(e.g., reducing memory access, register usage).
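To make the effect of ordering concrete, the sketch below counts how many temporaries are live at the same time for two orderings of the same computation, (a+b) * ((c+d) * (e+f)). The peak count is used here only as a rough proxy for the number of registers needed; the function name `max_live_temps` and the tuple encoding of statements are assumptions of this sketch.

```python
def max_live_temps(statements):
    """statements: list of (dest, operand1, operand2) in execution order.

    Returns the largest number of computed results that are still
    needed by a later statement at any point in the sequence.
    """
    last_use = {}
    for index, (_, first, second) in enumerate(statements):
        last_use[first] = index
        last_use[second] = index

    live = set()
    peak = 0
    for index, (dest, _, _) in enumerate(statements):
        # Drop results whose last use is this statement or earlier.
        live = {name for name in live if last_use.get(name, -1) > index}
        # The new result is live only if a later statement uses it.
        if last_use.get(dest, -1) > index:
            live.add(dest)
        peak = max(peak, len(live))
    return peak

# Left-to-right order: all three sums are computed before any product.
left_to_right = [("t1", "a", "b"), ("t2", "c", "d"), ("t3", "e", "f"),
                 ("t4", "t2", "t3"), ("t5", "t1", "t4")]
# Reordered: the inner product is finished before a+b is computed.
reordered = [("t2", "c", "d"), ("t3", "e", "f"), ("t4", "t2", "t3"),
             ("t1", "a", "b"), ("t5", "t1", "t4")]

print(max_live_temps(left_to_right))  # 3
print(max_live_temps(reordered))      # 2
```

Both sequences compute the same value, but the reordered one never holds more than two intermediate results at once.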
The Target Language
• The following kinds of instructions are present in the target language:

INSTRUCTION              FORM                  DESCRIPTION
Load operations          LD dst, addr          Loads the value in location addr into location dst (dst = addr).
Store operations         ST x, r               Stores the value in register r into location x (x = r).
Computation operations   OP dst, src1, src2    OP is an operator such as ADD or SUB; dst, src1 and src2 are locations. Example: ADD r1, r2, r3 (r1 = r2 + r3).
Unconditional jumps      BR L                  Causes control to branch to the machine instruction with label L. BR stands for branch.
Conditional jumps        Bcond r, L            r is a register, L is a label and cond is the condition. Example: BLTZ r, L branches to label L if the value in register r is less than 0.
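The instruction kinds above can be exercised with a tiny interpreter. This is a sketch under stated assumptions: instructions are encoded as Python tuples, labels are supplied as a separate name-to-index table, only ADD, SUB and BLTZ are implemented, and computation operands are restricted to registers. The names `run`, `PROGRAM` and `LABELS` are hypothetical.

```python
def run(program, labels, memory):
    """Interpret a list of instruction tuples; returns the final memory."""
    registers = {}
    pc = 0  # index of the next instruction to execute
    while pc < len(program):
        opcode, *args = program[pc]
        pc += 1
        if opcode == "LD":                 # LD dst, addr
            dst, addr = args
            registers[dst] = memory[addr]
        elif opcode == "ST":               # ST x, r
            location, reg = args
            memory[location] = registers[reg]
        elif opcode in ("ADD", "SUB"):     # OP dst, src1, src2
            dst, src1, src2 = args
            if opcode == "ADD":
                registers[dst] = registers[src1] + registers[src2]
            else:
                registers[dst] = registers[src1] - registers[src2]
        elif opcode == "BR":               # BR L
            pc = labels[args[0]]
        elif opcode == "BLTZ":             # BLTZ r, L
            reg, label = args
            if registers[reg] < 0:
                pc = labels[label]
        else:
            raise ValueError(f"unsupported instruction: {opcode}")
    return memory

# x = |a - b|
PROGRAM = [
    ("LD", "R1", "a"),
    ("LD", "R2", "b"),
    ("SUB", "R3", "R1", "R2"),   # R3 = a - b
    ("BLTZ", "R3", "NEG"),       # negative: go compute b - a
    ("BR", "DONE"),
    ("SUB", "R3", "R2", "R1"),   # NEG:  R3 = b - a
    ("ST", "x", "R3"),           # DONE: x = R3
]
LABELS = {"NEG": 5, "DONE": 6}

print(run(PROGRAM, LABELS, {"a": 3, "b": 10})["x"])   # 7
print(run(PROGRAM, LABELS, {"a": 10, "b": 3})["x"])   # 7
```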
BASIC BLOCKS AND FLOW GRAPHS

• Here, we study a graph representation of intermediate code that is helpful for code generation.
[Figure: a flow graph with nodes Basic Block B1, Basic Block B2 and Basic Block B3]
• The representation is constructed as follows:
• Partition the intermediate code into basic blocks, which are maximal sequences of consecutive three-address instructions with the properties that
  • The flow of control can only enter the basic block through the first instruction in the block. That is, there are no jumps into the middle of the block.
  • Control will leave the block without halting or branching, except possibly at the last instruction in the block.
• The basic blocks become the nodes of a flow graph, whose edges indicate which blocks can follow which other blocks.
BASIC BLOCKS
• A basic block is a sequence of consecutive statements in which the flow of control enters at the beginning and leaves at the end, without halting or branching except at the end.
FLOW GRAPHS
• Once an intermediate-code program is partitioned into basic blocks,
we represent the flow of control between them by a flow graph.
• The nodes of the flow graph are the basic blocks.
• There is an edge from block B to block C if and only if it is possible
for the first instruction in block C to immediately follow the last
instruction in block B.
• There are two ways that such an edge could be justified:
• There is a conditional or unconditional jump from the end of B to the
beginning of C.
• C immediately follows B in the original order of the three-address
instructions, and B does not end in an unconditional jump.
• We say that B is a predecessor of C, and C is a successor of B.
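The two edge rules can be sketched in a few lines. The encoding is an assumption of this sketch: each block is a dictionary giving its name, the number of its first statement, the kind of its last statement ("if", "goto" or "other") and the jump target, if any. The sample blocks describe a four-block if-else: B1 ends in a conditional jump to statement (4), B2 ends in goto (5), and B3 and B4 end in ordinary statements.

```python
def flow_graph_edges(blocks):
    """Return the set of (predecessor, successor) block-name pairs."""
    name_of_first = {block["first"]: block["name"] for block in blocks}
    edges = set()
    for position, block in enumerate(blocks):
        # Rule 1: a conditional or unconditional jump to the start of C.
        if block["target"] is not None:
            edges.add((block["name"], name_of_first[block["target"]]))
        # Rule 2: C follows B in the original order, and B does not
        # end in an unconditional jump.
        if block["kind"] != "goto" and position + 1 < len(blocks):
            edges.add((block["name"], blocks[position + 1]["name"]))
    return edges

blocks = [
    {"name": "B1", "first": 1, "kind": "if", "target": 4},
    {"name": "B2", "first": 2, "kind": "goto", "target": 5},
    {"name": "B3", "first": 4, "kind": "other", "target": None},
    {"name": "B4", "first": 5, "kind": "other", "target": None},
]

for edge in sorted(flow_graph_edges(blocks)):
    print(edge)
```

B1 gets two successors (the jump target and the fall-through block), while B2 gets only one because it ends in an unconditional jump.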
Steps for constructing basic blocks and flow graphs
1. For the given high-level code, write the three-address code.
2. Identify the leader statements, which satisfy the following:
• The first three-address statement is a leader.
• The target of a conditional or unconditional jump is a leader.
• The instruction immediately following a conditional or unconditional jump is a leader.
3. Construct the flow graph by connecting the basic blocks.
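The leader rules can be sketched as follows. The encoding is an assumption of this sketch: the code is a list of (text, jump_target) pairs with 1-based statement numbers, where jump_target is None for non-jump statements. Leaders that fall outside the listing (such as the statement after a final jump) are dropped.

```python
def find_leaders(code):
    """Return the sorted statement numbers of all leaders in the listing."""
    leaders = {1}                        # the first statement is a leader
    for number, (_, target) in enumerate(code, start=1):
        if target is not None:
            leaders.add(target)          # the target of a jump is a leader
            leaders.add(number + 1)      # so is the statement after a jump
    return sorted(n for n in leaders if n <= len(code))

def basic_blocks(code):
    """Each block runs from a leader up to, not including, the next leader."""
    leaders = find_leaders(code)
    bounds = leaders + [len(code) + 1]
    return [list(range(start, end)) for start, end in zip(bounds, bounds[1:])]

# A twelve-statement loop whose last statement jumps back to statement (3).
code = [
    ("prod := 0", None), ("i := 1", None),
    ("t1 := 4*i", None), ("t2 := a[t1]", None),
    ("t3 := 4*i", None), ("t4 := b[t3]", None),
    ("t5 := t2*t4", None), ("t6 := prod+t5", None),
    ("prod := t6", None), ("t7 := i+1", None),
    ("i := t7", None), ("if i<=20 goto (3)", 3),
]

print(find_leaders(code))   # [1, 3]
print(basic_blocks(code))   # [[1, 2], [3, 4, 5, 6, 7, 8, 9, 10, 11, 12]]
```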
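Construct basic blocks and flow graph for the following

INPUT CODE
if (a < b)
then
    x++;
else
    x--;

THREE ADDRESS CODE
(1) if (a >= b) goto (4)
(2) x = x + 1
(3) goto (5)
(4) x = x - 1

Leader statements: (1), (2), (4), (5)

BASIC BLOCKS
B1: (1) if (a >= b) goto (4)
B2: (2) x = x + 1
    (3) goto (5)
B3: (4) x = x - 1
B4: (5) the statement following the if-else (not shown)

FLOW GRAPH
B1 → B2, B1 → B3, B2 → B4, B3 → B4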
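Construct basic blocks and flow graph for the following

INPUT CODE
if (a < b) {
    if (c < d)
        x = x + 1;
    else
        x = x - 1;
} else {
    x = 0;
}

THREE ADDRESS CODE
(1) if (a >= b) goto (7)
(2) if (c >= d) goto (5)
(3) x = x + 1
(4) goto (8)
(5) x = x - 1
(6) goto (8)
(7) x = 0

Leader statements: (1), (2), (3), (5), (7), (8)

BASIC BLOCKS
B1: (1) if (a >= b) goto (7)
B2: (2) if (c >= d) goto (5)
B3: (3) x = x + 1
    (4) goto (8)
B4: (5) x = x - 1
    (6) goto (8)
B5: (7) x = 0
Statement (8), which follows the if-else, starts the next block (not shown).

FLOW GRAPH
B1 → B2, B1 → B5, B2 → B3, B2 → B4, B3 → (8), B4 → (8), B5 → (8)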
Construct basic blocks and flow graph for the following code

INPUT CODE
Begin
    prod := 0
    i := 1
    do
        prod := prod + a[i]*b[i]
        i := i+1
    while i <= 20
End

THREE ADDRESS CODE
(1) prod := 0
(2) i := 1
(3) t1 := 4*i
(4) t2 := a[t1]        // compute a[i]
(5) t3 := 4*i
(6) t4 := b[t3]        // compute b[i]
(7) t5 := t2*t4        // a[i]*b[i]
(8) t6 := prod+t5
(9) prod := t6         // prod := prod+a[i]*b[i]
(10) t7 := i+1
(11) i := t7
(12) if i <= 20 goto (3)

Leaders are:
(1), since it is the first statement
(3), since it is the target of the conditional jump at (12)
The statement following (12) is also a leader, but it is not shown.

BASIC BLOCKS
B1: (1) to (2)
B2: (3) to (12)

FLOW GRAPH
B1 → B2, B2 → B2 (the jump from (12) back to (3)), B2 → the block following (12) (not shown)
2) Construct basic blocks and flow graph for the following code:

INPUT CODE
c = 0
do
{
    if (a < b)
    then
        x++;
    else
        x--;
    c++;
} while (c < 5)

THREE ADDRESS CODE
(1) c = 0                  // Initialize c
(2) if a < b goto (5)      // Check condition a < b
(3) x = x - 1              // If false, decrement x
(4) goto (6)
(5) x = x + 1              // If true, increment x
(6) c = c + 1              // Increment c
(7) if c < 5 goto (2)      // Check loop condition, repeat if c < 5

Leaders: (1), (2), (3), (5), (6)

BASIC BLOCKS
B1: (1) c = 0
B2: (2) if a < b goto (5)
B3: (3) x = x - 1
    (4) goto (6)
B4: (5) x = x + 1
B5: (6) c = c + 1
    (7) if c < 5 goto (2)

FLOW GRAPH
B1 → B2, B2 → B3, B2 → B4, B3 → B5, B4 → B5, B5 → B2
3) Construct basic blocks and flow graph for the following code:

INPUT CODE
while (A < C and B > D)
do
    if A = 1 then C = C + 1
    else
        while A <= D
        do A = A + B

THREE ADDRESS CODE
(1) if (A < C) goto (3)
(2) goto (15)
(3) if (B > D) goto (5)
(4) goto (15)
(5) if (A = 1) goto (7)
(6) goto (10)
(7) T1 = C + 1
(8) C = T1
(9) goto (1)
(10) if (A <= D) goto (12)
(11) goto (1)
(12) T2 = A + B
(13) A = T2
(14) goto (10)

Leaders: (1), (2), (3), (4), (5), (6), (7), (10), (11), (12), (15)

BASIC BLOCKS
B1: (1) if (A < C) goto (3)
B2: (2) goto (15)
B3: (3) if (B > D) goto (5)
B4: (4) goto (15)
B5: (5) if (A = 1) goto (7)
B6: (6) goto (10)
B7: (7) T1 = C + 1
    (8) C = T1
    (9) goto (1)
B8: (10) if (A <= D) goto (12)
B9: (11) goto (1)
B10: (12) T2 = A + B
     (13) A = T2
     (14) goto (10)
B11: (15) the statement following the loop (not shown)

FLOW GRAPH
B1 → B3, B1 → B2, B2 → B11, B3 → B5, B3 → B4, B4 → B11, B5 → B7, B5 → B6, B6 → B8, B7 → B1, B8 → B10, B8 → B9, B9 → B1, B10 → B8
Construct basic blocks and flow graph for the following

THREE ADDRESS CODE
(1) i = 1
(2) j = 1
(3) a[i, j] = 0.0
(4) j = j + 1
(5) if (j <= 10) goto (4)
(6) i = i + 1
(7) if (i <= 10) goto (2)
(8) i = 1
(9) a[i, i] = 1
(10) i = i + 1
(11) if (i <= 10) goto (9)

Leaders: (1), (2), (4), (6), (8), (9), (12)

BASIC BLOCKS
B1: (1) i = 1
B2: (2) j = 1
    (3) a[i, j] = 0.0
B3: (4) j = j + 1
    (5) if (j <= 10) goto (4)
B4: (6) i = i + 1
    (7) if (i <= 10) goto (2)
B5: (8) i = 1
B6: (9) a[i, i] = 1
    (10) i = i + 1
    (11) if (i <= 10) goto (9)
Statement (12), which follows (11), starts the next block (not shown).

FLOW GRAPH
B1 → B2, B2 → B3, B3 → B3, B3 → B4, B4 → B2, B4 → B5, B5 → B6, B6 → B6, B6 → (12)
Optimization of Basic Blocks: DAG representation of basic blocks
• A Directed Acyclic Graph (DAG) is a graph whose nodes represent operations or values and whose directed edges (arrows) indicate dependencies between them.
• It is "directed" because each edge has a direction (from one node to another), and "acyclic" because it contains no cycles (loops).
• A DAG is used as an intermediate representation of the computations inside a basic block: it captures the data dependencies among them. Control flow between blocks is captured by the flow graph, not by the DAG.
• With a block represented as a DAG, the compiler can generate more efficient code by avoiding redundant computations, because a repeated subexpression maps to a single node.
• DAGs can also aid register allocation by identifying names that hold the same value and can therefore share a register, which reduces memory accesses.
DAG construction for a basic block

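• Initially, we assume there are no nodes. Suppose the "current" three-address statement is either (i) x := y op z, (ii) x := op y, or (iii) x := y.
(i) DAG for x := y op z: a node labeled op with children y and z; x is attached to the op node as an identifier.
(ii) DAG for x := op y: a node labeled op with the single child y; x is attached to the op node as an identifier.
(iii) DAG for x := y: no new node is created; x is attached to the node for y as an additional identifier.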
Construct DAG for the following
1) a := b + c
   t := c * d
   x := m + a

2) d := b * c
   e := a + b
   b := b * c
   a := e - d

3) a = b + c
   b = a - d
   c = b + c
   d = a - d
4) Construct DAG for the expression: a + a*(b-c) + (b-c)*d
First, write the three-address code based on operator precedence:

t1 = b - c
t2 = b - c
t3 = a * t1
t4 = t2 * d
t5 = a + t3
t6 = t5 + t4
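Construct DAG for the following

6) INPUT CODE          THREE ADDRESS CODE
   a = b + c + d       t1 = b + c
                       t2 = t1 + d
                       a = t2
   b = c + a           t3 = c + a
                       b = t3
   c = c + d           t4 = c + d
                       c = t4

7) INPUT CODE          THREE ADDRESS CODE
   a = b + c           t1 = b + c
                       a = t1
   b = a - d           t2 = a - d
                       b = t2
   c = b + c           t3 = b + c
                       c = t3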
5) Construct DAG for the following three address code

(1)t1:=4*i
(2)t2:=a[t1] // compute a[i]
(3)t3:=4*i
(4)t4:=b[t3] // compute b[i]
(5)t5:=t2*t4 // a[i]*b[i]
(6)t6:=prod+t5
(7)prod:=t6 // prod:=prod+a[i]*b[i]
(8) t7:=i+1
(9) i:=t7
(10) if i<=20 goto (1)
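6) Construct DAG for the following three address code:
(1) sqr := i * i
(2) temp := sum + sqr
(3) sum := temp
(4) i := i + 1
(5) if (i <= 15) goto (1)
[Figure: DAG for the code above, with leaves i0, sum0, 1 and 15, interior nodes for *, + and <=, and the identifiers sqr, temp, sum and i attached to the nodes that compute them]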
Applications of DAGs
• We can automatically detect common subexpressions
• We can determine which identifiers have their values
used in the block.
• We can determine which statements compute values
that could be used outside the block
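Automatic detection of common subexpressions can be sketched by building the DAG with node reuse: before creating a node for y op z, look for an existing node with the same operator and the same children. This is a minimal sketch that handles only binary statements; the function name `build_dag` and the tuple encoding are assumptions, and the input is the three-address code for a + a*(b-c) + (b-c)*d.

```python
def build_dag(statements):
    """statements: list of (dest, left, op, right) for dest = left op right.

    Returns (nodes, current): nodes is a list of node descriptions and
    current maps each name to the index of the node holding its value.
    """
    nodes = []      # node index -> ("leaf", name) or (op, left_index, right_index)
    index_of = {}   # node description -> node index, used to reuse nodes
    current = {}    # name -> index of the node holding its current value

    def node_for(description):
        if description not in index_of:
            index_of[description] = len(nodes)
            nodes.append(description)
        return index_of[description]

    def operand(name):
        # A name with no node yet denotes its initial value: create a leaf.
        if name not in current:
            current[name] = node_for(("leaf", name))
        return current[name]

    for dest, left, op, right in statements:
        description = (op, operand(left), operand(right))
        current[dest] = node_for(description)   # reuses an identical node
    return nodes, current

statements = [
    ("t1", "b", "-", "c"),
    ("t2", "b", "-", "c"),
    ("t3", "a", "*", "t1"),
    ("t4", "t2", "*", "d"),
    ("t5", "a", "+", "t3"),
    ("t6", "t5", "+", "t4"),
]
nodes, current = build_dag(statements)
print(current["t1"] == current["t2"])  # True: b - c is computed once
print(len(nodes))                      # 9: four leaves and five operators
```

Six statements produce only five operator nodes because t1 and t2 share one node, which is exactly the common subexpression b - c.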
SIMPLE CODE GENERATOR
• The code generation strategy here generates target code for three-address statements, taking into account the current location of each operand.
• Assumptions:
  • ADD Rj, Ri → Cost = 1 (register operation)
  • ADD c, Ri → Cost = 2 (memory operation; c is a memory location)
• Register and address descriptors:
  • The register descriptor keeps track of what is currently in each register. It is consulted when a new register is needed.
  • The address descriptor keeps track of the location where the current value of a name can be found at run time. The location can be a register, a stack location or a memory address.
• Format of instruction → Instruction-Name Source, Destination
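The role of the two descriptors can be sketched with a very small generator. This is a simplified sketch, not the full algorithm: it assumes binary results are temporaries that reach memory only through copy statements, it never spills (it raises an error when registers run out), and it frees a register as soon as the value in it has no later use. The names `generate`, `register_descriptor` and `address_descriptor` are assumptions of the sketch.

```python
OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL"}

def generate(statements, num_registers=3):
    """statements: (dest, left, op, right) for binary, (dest, src) for copy."""
    last_use = {}
    for index, stmt in enumerate(statements):
        operands = (stmt[1], stmt[3]) if len(stmt) == 4 else (stmt[1],)
        for name in operands:
            last_use[name] = index

    register_descriptor = {f"R{i}": None for i in range(num_registers)}
    address_descriptor = {}   # name -> register currently holding its value
    code = []

    def location(name):
        # Prefer the register copy; otherwise the name lives in memory.
        return address_descriptor.get(name, name)

    def free_dead(index):
        for reg, name in register_descriptor.items():
            if name is not None and last_use.get(name, -1) <= index:
                register_descriptor[reg] = None
                address_descriptor.pop(name, None)

    for index, stmt in enumerate(statements):
        if len(stmt) == 2:                        # copy: dest := src
            dest, src = stmt
            opcode = "STORE" if src in address_descriptor else "MOV"
            code.append(f"{opcode} {location(src)}, {dest}")
        else:
            dest, left, op, right = stmt
            right_location = location(right)
            if left in address_descriptor and last_use[left] <= index:
                reg = address_descriptor.pop(left)   # reuse: left dies here
            else:
                free = [r for r, held in register_descriptor.items() if held is None]
                if not free:
                    raise RuntimeError("no free register; spilling is not modeled")
                reg = free[0]
                code.append(f"MOV {location(left)}, {reg}")
            code.append(f"{OPCODES[op]} {right_location}, {reg}")
            register_descriptor[reg] = dest
            address_descriptor[dest] = reg
        free_dead(index)
    return code

# d := (a-b) + (a-c) + (a-c)
tac = [("t1", "a", "-", "b"), ("t2", "a", "-", "c"),
       ("t3", "t1", "+", "t2"), ("t4", "t3", "+", "t2"), ("d", "t4")]
for line in generate(tac):
    print(line)
```

Because t1 has no use after t3 is computed, its register R0 is reused as the destination, so the two additions need no extra MOV instructions.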
CODE GENERATOR INSTRUCTIONS

STATEMENT      INSTRUCTIONS
a := b[i]      MOV i, R0        // R0 = i
               MOV b(R0), a     // a = b[i]
a[i] := b      MOV i, R0        // R0 = i
               MOV b, a(R0)     // a[i] = b
c = a + b      MOV a, R0        // R0 = a
               ADD b, R0        // R0 = R0 + b
               STORE R0, c      // c = R0
t = p * q      MOV p, R0        // R0 = p
               MUL q, R0        // R0 = R0 * q
               STORE R0, t      // t = R0
s = i - j      MOV i, R0        // R0 = i
               SUB j, R0        // R0 = R0 - j
               STORE R0, s      // s = R0

Instruction Format → InstructionName Source, Destination
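The cost assumptions (1 for a register-only instruction, 2 when a memory location is involved) can be applied to a whole instruction sequence. The sketch below generalizes them to "1 plus the number of memory operands", which is an assumption of this example, as are the function names and the rule that a register is written as R followed by digits.

```python
import re

def is_register(operand):
    """Assumed convention: registers are written R0, R1, R2, ..."""
    return re.fullmatch(r"R\d+", operand) is not None

def instruction_cost(instruction):
    """Cost = 1 for the instruction plus 1 per operand that is not a register."""
    _, operand_text = instruction.split(maxsplit=1)
    operands = [operand.strip() for operand in operand_text.split(",")]
    return 1 + sum(not is_register(operand) for operand in operands)

# Target code for d := (a-b) + (a-c) + (a-c)
sequence = [
    "MOV a, R0", "SUB b, R0",
    "MOV a, R1", "SUB c, R1",
    "ADD R1, R0", "ADD R1, R0",
    "STORE R0, d",
]

print(instruction_cost("ADD R1, R0"))                   # 1
print(instruction_cost("ADD c, R0"))                    # 2
print(sum(instruction_cost(line) for line in sequence)) # 12
```

Keeping a - c in R1 makes both additions register-only, which is where the saving over recomputing from memory comes from.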


CODE GENERATION PROBLEMS
1) Generate target code for the following expression:
d := (a-b) + (a-c) + (a-c)
Ans: First convert to three-address code, then to assembly code:

THREE ADDRESS CODE    TARGET CODE
t1 = a - b            MOV a, R0      // R0 = a
                      SUB b, R0      // R0 = R0 - b = a - b
t2 = a - c            MOV a, R1      // R1 = a
                      SUB c, R1      // R1 = R1 - c = a - c
t3 = t1 + t2          ADD R1, R0     // R0 = (a-b) + (a-c)
t4 = t3 + t2          ADD R1, R0     // R0 = (a-b) + (a-c) + (a-c)
d := t4               STORE R0, d    // d = (a-b) + (a-c) + (a-c)
2) Generate target code for the expression: (a * b) + (c + d) - (a + b + c + d)
Ans:

THREE ADDRESS CODE                TARGET CODE
T1 = a * b                        MOV a, R0      // R0 = a
                                  MUL b, R0      // R0 = R0 * b = a*b
T2 = c + d                        MOV c, R1      // R1 = c
                                  ADD d, R1      // R1 = R1 + d = c+d
T3 = a + b                        MOV a, R2      // R2 = a
                                  ADD b, R2      // R2 = R2 + b = a+b
T4 = T3 + T2   // a+b+c+d         ADD R1, R2     // R2 = R2 + R1 = a+b+c+d
T5 = T1 + T2   // (a*b)+(c+d)     ADD R1, R0     // R0 = R0 + R1 = (a*b)+(c+d)
T6 = T5 - T4                      SUB R2, R0     // R0 = R0 - R2 = (a*b)+(c+d)-(a+b+c+d)
3) Generate target code for: (i*j) + (e+f) * (a*b+c)

THREE ADDRESS CODE                          TARGET CODE
T1 = i * j                                  MOV i, R0
                                            MUL j, R0      // R0 = i*j
T2 = e + f                                  MOV e, R1
                                            ADD f, R1      // R1 = e+f
T3 = a * b                                  MOV a, R2
                                            MUL b, R2      // R2 = a*b
T4 = T3 + c    // T4 = a*b+c                ADD c, R2      // R2 = a*b+c
T5 = T2 * T4   // T5 = (e+f)*(a*b+c)        MUL R2, R1     // R1 = R1*R2 = (e+f)*(a*b+c)
T6 = T1 + T5   // T6 = (i*j)+(e+f)*(a*b+c)  ADD R1, R0     // R0 = R0+R1 = (i*j)+(e+f)*(a*b+c)
4) Generate target code for: (q-j) * (a+b*c) + (a-b+c)

THREE ADDRESS CODE    TARGET CODE
t1 = q - j            MOV q, R0
                      SUB j, R0      // R0 = q-j
t2 = b * c            MOV b, R1
                      MUL c, R1      // R1 = b*c
t3 = a + t2           ADD a, R1      // R1 = a+b*c
t4 = a - b            MOV a, R2
                      SUB b, R2      // R2 = a-b
t5 = t4 + c           ADD c, R2      // R2 = a-b+c
t6 = t1 * t3          MUL R1, R0     // R0 = (q-j)*(a+b*c)
t7 = t6 + t5          ADD R2, R0     // R0 = (q-j)*(a+b*c)+(a-b+c)
