CS6109 Module 8

Module 8 of CS6109 focuses on three-address code, detailing its types including quadruples, triples, and indirect triples. It explains the structure and purpose of three-address code in compiler design, emphasizing its role in translating complex expressions and handling control flow. Additionally, the module covers backpatching techniques for managing jumps in boolean expressions and flow-of-control statements.

CS6109 – COMPILER DESIGN

Module – 8

Presented By
Dr. S. Muthurajkumar,
Assistant Professor,
Dept. of CT, MIT Campus,
Anna University, Chennai.
MODULE - 8
 Three address code
 Types of Three address code – Quadruples, Triples
 Three-address code for Declarations, Arrays, Loops, Backpatching

THREE ADDRESS CODE
 In three-address code, there is at most one operator on the right side of an
instruction; that is, no built-up arithmetic expressions are permitted.
 Thus a source-language expression like x+y*z might be translated into the
sequence of three-address instructions
 t1 = y * z
 t2 = x + t1
 where t1 and t2 are compiler-generated temporary names.

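The translation of x+y*z above can be sketched in a few lines of Python. This is only an illustrative sketch: the nested-tuple expression format and the function name `to_three_address` are assumptions made for this example, not part of the module.

```python
import itertools

def to_three_address(expr):
    """Translate a nested-tuple expression into three-address instructions.

    expr is either a name (str) or a tuple (op, left, right).
    Returns the address holding the result and the list of instructions.
    """
    counter = itertools.count(1)
    code = []

    def walk(node):
        if isinstance(node, str):        # a name is its own address
            return node
        op, left, right = node
        left_addr = walk(left)
        right_addr = walk(right)
        temp = f"t{next(counter)}"       # fresh compiler-generated temporary
        code.append(f"{temp} = {left_addr} {op} {right_addr}")
        return temp

    return walk(expr), code

addr, code = to_three_address(("+", "x", ("*", "y", "z")))
print("\n".join(code))
```

Each interior node of the expression gets exactly one instruction, so no instruction carries more than one operator on its right side.
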
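THREE ADDRESS CODE
 Three-address code is a linearized representation of a syntax tree or a DAG in
which explicit names correspond to the interior nodes of the graph.
 Example: for the expression a + a * (b-c) + (b-c) * d, the DAG has a single
node for the common subexpression b-c, so it is computed only once.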

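To illustrate how a DAG leads to shared temporaries, the sketch below linearizes a + a * (b-c) + (b-c) * d and reuses the temporary for b-c. The dictionary `value_numbers` and the function name are hypothetical helpers chosen for this example.

```python
def dag_to_three_address(expr):
    """Emit three-address code, creating one temporary per distinct DAG node."""
    code = []
    value_numbers = {}   # (op, left_addr, right_addr) -> temporary holding it

    def walk(node):
        if isinstance(node, str):
            return node
        op, left, right = node
        key = (op, walk(left), walk(right))
        if key not in value_numbers:          # first time this node is seen
            temp = f"t{len(value_numbers) + 1}"
            value_numbers[key] = temp
            code.append(f"{temp} = {key[1]} {op} {key[2]}")
        return value_numbers[key]

    walk(expr)
    return code

b_minus_c = ("-", "b", "c")
expr = ("+", ("+", "a", ("*", "a", b_minus_c)), ("*", b_minus_c, "d"))
dag_code = dag_to_three_address(expr)
print("\n".join(dag_code))
```

The subexpression b-c produces a single instruction, and both later uses refer to the same temporary.
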
THREE ADDRESS CODE
 Three-address code is built from two concepts: addresses and instructions.
 In object-oriented terms, these concepts correspond to classes, and the various
kinds of addresses and instructions correspond to appropriate subclasses.
 An address can be one of the following:
 A name. For convenience, we allow source-program names to appear as
addresses in three-address code. In an implementation, a source name is
replaced by a pointer to its symbol-table entry, where all information about the
name is kept.
 A constant. In practice, a compiler must deal with many different types of
constants and variables.

THREE ADDRESS CODE
 A compiler-generated temporary. It is useful, especially in optimizing
compilers, to create a distinct name each time a temporary is needed. These
temporaries can be combined, if possible, when registers are allocated to
variables.
 Here is a list of the common three-address instruction forms:
1. Assignment instructions of the form x = y op z, where op is a binary arithmetic
or logical operation, and x, y, and z are addresses.
2. Assignments of the form x = op y, where op is a unary operation. Essential
unary operations include unary minus, logical negation, and conversion
operators that, for example, convert an integer to a floating-point number.
3. Copy instructions of the form x = y, where x is assigned the value of y.

THREE ADDRESS CODE
4. An unconditional jump goto L. The three-address instruction with label L is
the next to be executed.
5. Conditional jumps of the form if x goto L and ifFalse x goto L. These
instructions execute the instruction with label L next if x is true and false,
respectively. Otherwise, the following three-address instruction in sequence
is executed next, as usual.
6. Conditional jumps such as if x relop y goto L, which apply a relational
operator (<, ==, >=, etc.) to x and y, and execute the instruction with label L
next if x stands in relation relop to y. If not, the three-address instruction
following if x relop y goto L is executed next, in sequence.

THREE ADDRESS CODE
7. Procedure calls and returns are implemented using the following instructions:
param x for parameters; call p, n and y = call p, n for procedure and function calls,
respectively; and return y, where y, representing a returned value, is optional. Their
typical use is as the sequence of three-address instructions
param x1
param x2
…
param xn
call p, n
generated as part of a call of the procedure p(x1, x2, …, xn). The integer n, indicating
the number of actual parameters in “call p, n”, is not redundant because calls can be
nested. That is, some of the first param statements could be parameters of a call that
comes after p returns its value; that value becomes another parameter of the later call.
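
The role of n in nested calls can be seen in a small simulation. The sketch below is an assumption-laden toy: the tuple encoding of instructions, the function `run_calls`, and the procedures f and g are all invented for illustration. It models f(1, g(2)), where the param for f's first argument is issued before the call to g.

```python
def run_calls(instructions, procedures):
    """Simulate param/call instructions; a call consumes its last n params."""
    pending = []   # parameters pushed by param and not yet consumed
    env = {}       # values of temporaries
    for ins in instructions:
        if ins[0] == "param":
            pending.append(env.get(ins[1], ins[1]))
        elif ins[0] == "call":
            _, result, name, n = ins
            split = len(pending) - n
            args = pending[split:]       # only the n most recent params
            del pending[split:]
            env[result] = procedures[name](*args)
    return env

procedures = {"g": lambda v: v * 10, "f": lambda a, b: a + b}
instructions = [
    ("param", 1),            # first argument of f, issued early
    ("param", 2),            # argument of g
    ("call", "t1", "g", 1),  # t1 = call g, 1
    ("param", "t1"),         # second argument of f
    ("call", "t2", "f", 2),  # t2 = call f, 2
]
env = run_calls(instructions, procedures)
print(env)
```

Without the count 1 on the call to g, the simulator could not tell that the first param belongs to the later call of f.
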
THREE ADDRESS CODE
8. Indexed copy instructions of the form x = y[i] and x[i]=y. The instruction x
= y[i] sets x to the value in the location i memory units beyond location y.
The instruction x[i]=y sets the contents of the location i units beyond x to the
value of y.
9. Address and pointer assignments of the form x = &y, x = *y, and *x = y.
The instruction x = &y sets the r-value of x to be the location (l-value) of y.
Presumably y is a name, perhaps a temporary, that denotes an expression with
an l-value such as A[i][j], and x is a pointer name or temporary. In the
instruction x = *y, presumably y is a pointer or a temporary whose r-value is
a location. The r-value of x is made equal to the contents of that location.
Finally, *x = y sets the r-value of the object pointed to by x to the r-value of
y.

THREE ADDRESS CODE
 Consider the statement: do i = i+1; while (a[i] < v);

Two ways of assigning labels to three-address statements
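
As a sketch of the two labeling styles for the do-while statement, the code below starts from instructions with a symbolic label L and rewrites them with position numbers. The element width of 8 bytes, the starting position 100, and the helper name `resolve_labels` are assumptions made for this example.

```python
def resolve_labels(instructions, start=100):
    """Replace symbolic jump targets with position numbers.

    instructions is a list of (label_or_None, text) pairs.
    """
    positions = {
        label: start + i
        for i, (label, _) in enumerate(instructions)
        if label
    }
    resolved = []
    for i, (_, text) in enumerate(instructions):
        head, sep, target = text.rpartition("goto ")
        if sep and target in positions:      # a jump to a known label
            text = f"{head}goto {positions[target]}"
        resolved.append(f"{start + i}: {text}")
    return resolved

# do i = i+1; while (a[i] < v);  with symbolic label L
symbolic = [
    ("L", "t1 = i + 1"),
    (None, "i = t1"),
    (None, "t2 = i * 8"),       # assumes each array element takes 8 units
    (None, "t3 = a[t2]"),
    (None, "if t3 < v goto L"),
]
numbered = resolve_labels(symbolic)
print("\n".join(numbered))
```
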

QUADRUPLES
 Three-address instructions can be implemented as objects or as records with
fields for the operator and the operands. Three such representations are called
“quadruples”, “triples”, and “indirect triples”.
 A quadruple (or just “quad”) has four fields, which we call op, arg1, arg2, and
result. The op field contains an internal code for the operator.
 For instance, the three-address instruction x = y + z is represented by placing +
in op, y in arg1, z in arg2, and x in result.
 The following are some exceptions to this rule:
§ Instructions with unary operators like x = minus y or x = y do not use arg2.
Note that for a copy statement like x = y, op is =, while for most other
operations, the assignment operator is implied.
§ Operators like param use neither arg2 nor result.
§ Conditional and unconditional jumps put the target label in result.
QUADRUPLES
 The assignment a = b * - c + b * - c;

Three-address code and its quadruple representation
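
A possible in-memory form of the quadruples for a = b * - c + b * - c is sketched below. The class name `Quad` and the helper `show` are assumptions for this example; an unused arg2 is stored as None.

```python
from typing import NamedTuple, Optional

class Quad(NamedTuple):
    op: str
    arg1: str
    arg2: Optional[str]
    result: str

quads = [
    Quad("minus", "c", None, "t1"),
    Quad("*", "b", "t1", "t2"),
    Quad("minus", "c", None, "t3"),
    Quad("*", "b", "t3", "t4"),
    Quad("+", "t2", "t4", "t5"),
    Quad("=", "t5", None, "a"),      # copy: op is =, arg2 unused
]

def show(q):
    """Render a quadruple back as a three-address instruction."""
    if q.op == "=":
        return f"{q.result} = {q.arg1}"
    if q.arg2 is None:
        return f"{q.result} = {q.op} {q.arg1}"
    return f"{q.result} = {q.arg1} {q.op} {q.arg2}"

quad_text = [show(q) for q in quads]
print("\n".join(quad_text))
```
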


TRIPLES
 A triple has only three fields, which we call op, arg1, and arg2.
 Using triples, we refer to the result of an operation x op y by its position, rather
than by an explicit temporary name.
 Thus, instead of the temporary t1, a triple representation would refer to position
(0).
 Parenthesized numbers represent pointers into the triple structure itself.
 In the DAG representation, such positions or pointers to positions are called
value numbers.

TRIPLES
 The assignment a = b * - c + b * - c;

Representations of a = b * - c + b * - c;
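
One way to see the relation between the two forms is to derive triples from quadruples by replacing each temporary with the position of the triple that computes it. This is a sketch under assumptions: the tuple encodings and the name `quads_to_triples` are illustrative, an integer argument stands for a position such as (0), and copies are assumed to target a program variable.

```python
def quads_to_triples(quads):
    """Convert (op, arg1, arg2, result) quadruples into (op, arg1, arg2) triples."""
    position_of = {}   # temporary name -> index of the triple computing it
    triples = []
    for op, arg1, arg2, result in quads:
        a1 = position_of.get(arg1, arg1)
        a2 = position_of.get(arg2, arg2)
        if op == "=":
            # a copy a = t becomes (=, a, position of t)
            triples.append(("=", result, a1))
        else:
            position_of[result] = len(triples)
            triples.append((op, a1, a2))
    return triples

quads = [
    ("minus", "c", None, "t1"),
    ("*", "b", "t1", "t2"),
    ("minus", "c", None, "t3"),
    ("*", "b", "t3", "t4"),
    ("+", "t2", "t4", "t5"),
    ("=", "t5", None, "a"),
]
triples = quads_to_triples(quads)
for index, triple in enumerate(triples):
    print(f"({index})", triple)
```
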

INDIRECT TRIPLES
 Indirect triples consist of a listing of pointers to triples, rather than a listing of
triples themselves.
 For example, let us use an array instruction to list pointers to triples in the
desired order.
 With indirect triples, an optimizing compiler can move an instruction by
reordering the instruction list, without affecting the triples themselves.

INDIRECT TRIPLES
 The assignment a = b * - c + b * - c;

Indirect triples representation of three-address code
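
The benefit of the extra level of indirection can be sketched as follows: the list `instruction` holds positions of triples, and reordering it leaves the triples, and the positions they refer to, unchanged. The tiny evaluator `execute`, the sample values for b and c, and the particular reordering are assumptions for this example.

```python
import operator

triples = [
    ("minus", "c", None),   # (0)
    ("*", "b", 0),          # (1)
    ("minus", "c", None),   # (2)
    ("*", "b", 2),          # (3)
    ("+", 1, 3),            # (4)
    ("=", "a", 4),          # (5)
]
instruction = [0, 1, 2, 3, 4, 5]   # pointers to triples, in execution order

def execute(instruction, triples, env):
    """Run the triples in the order given by the instruction list."""
    env = dict(env)
    results = {}   # position -> computed value
    binary = {"*": operator.mul, "+": operator.add}

    def value(arg):
        # In this sketch an int argument is a position, a str is a name.
        return results[arg] if isinstance(arg, int) else env[arg]

    for index in instruction:
        op, arg1, arg2 = triples[index]
        if op == "minus":
            results[index] = -value(arg1)
        elif op == "=":
            env[arg1] = value(arg2)
        else:
            results[index] = binary[op](value(arg1), value(arg2))
    return env

# Move triple (2) ahead of triple (1); the triples themselves are untouched.
reordered = [0, 2, 1, 3, 4, 5]
before = execute(instruction, triples, {"b": 3, "c": 2})
after = execute(reordered, triples, {"b": 3, "c": 2})
print(before["a"], after["a"])
```

With plain triples, moving an instruction would change its position and force every reference to it to be updated.
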

STATIC SINGLE-ASSIGNMENT FORM
 Static single-assignment form (SSA) is an intermediate representation that
facilitates certain code optimizations.
 Two distinctive aspects distinguish SSA from three-address code.
 The first is that all assignments in SSA are to variables with distinct names;
hence the term static single-assignment.
 Note that subscripts distinguish each definition of variables p and q in the SSA
representation.

STATIC SINGLE-ASSIGNMENT FORM
 Example program: p = a + b; q = p - c; p = q * d; p = e - p; q = p + q

Intermediate program in three-address code and SSA
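
For straight-line code, the renaming into SSA can be sketched as below. The five-instruction program over p and q follows the standard textbook example and is an assumption here, as are the tuple encoding and the function name `to_ssa`. Variables that are never assigned (a, b, c, d, e) keep their names.

```python
from collections import defaultdict

def to_ssa(instructions):
    """Rename straight-line code so every assignment has a distinct target."""
    version = defaultdict(int)   # variable -> number of definitions so far

    def use(name):
        # A use refers to the most recent definition, if there is one.
        return f"{name}{version[name]}" if version[name] else name

    out = []
    for target, left, op, right in instructions:
        left, right = use(left), use(right)   # rename uses before the new definition
        version[target] += 1
        out.append(f"{target}{version[target]} = {left} {op} {right}")
    return out

program = [
    ("p", "a", "+", "b"),
    ("q", "p", "-", "c"),
    ("p", "q", "*", "d"),
    ("p", "e", "-", "p"),
    ("q", "p", "+", "q"),
]
ssa = to_ssa(program)
print("\n".join(ssa))
```

This sketch does not handle branches; merging definitions from different control-flow paths needs the ϕ-function.
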

STATIC SINGLE-ASSIGNMENT FORM
 The same variable may be defined in two different control-flow paths in a
program.
 For example, the source program
if ( flag ) x = -1; else x = 1;
y = x * a;
 has two control-flow paths in which the variable x gets defined. If we use
different names for x in the true part and the false part of the conditional
statement, then which name should we use in the assignment y = x * a? Here is
where the second distinctive aspect of SSA comes into play. SSA uses a
notational convention called the ϕ-function to combine the two definitions of x:
if ( flag ) x1 = -1; else x2 = 1;
x3 = ϕ (x1, x2)
 Here ϕ (x1, x2) has the value x1 if control passes through the true part of the
conditional and x2 if it passes through the false part, so the final assignment
becomes y = x3 * a.
EXERCISES
 Translate the arithmetic expression a + - (b + c) into:
a) A syntax tree.
b) Quadruples.
c) Triples.
d) Indirect triples.
 Repeat the exercise for the following assignment statements:
i. a = b[i] + c[j].
ii. a[i] = b*c - b*d.
iii. x = f(y+1) + 2.
iv. x = *p + &y.

TRANSLATION OF EXPRESSIONS
 An expression with more than one operator, like a + b * c, will translate into
instructions with at most one operator per instruction.
 An array reference A[i][j] will expand into a sequence of three-address
instructions that calculate an address for the reference.
 Operations Within Expressions
 Incremental Translation
 Addressing Array Elements
 Translation of Array References

TRANSLATION OF EXPRESSIONS
 Operations Within Expressions
 The syntax-directed definition builds up the three-address code for an
assignment statement S using attribute code for S and attributes addr and code
for an expression E.
 Attributes S.code and E.code denote the three-address code for S and E,
respectively.
 Attribute E.addr denotes the address that will hold the value of E.

TRANSLATION OF EXPRESSIONS
 Operations Within Expressions

Three-address code for expressions


TRANSLATION OF EXPRESSIONS
 Incremental Translation
 In the incremental approach, gen not only constructs a three-address
instruction, it appends the instruction to the sequence of instructions generated
so far.
 The sequence may either be retained in memory for further processing, or it
may be output incrementally.

TRANSLATION OF EXPRESSIONS
 Incremental Translation

Generating three-address code for expressions incrementally
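
The incremental approach can be sketched with a small translator whose gen appends each instruction to a growing list, and whose expression method returns only E.addr. The class `Translator`, its method names, and the tuple encoding of expressions are assumptions made for this example.

```python
class Translator:
    """Incremental translation: gen appends to the instruction sequence."""

    def __init__(self):
        self.code = []
        self.temp_count = 0

    def new_temp(self):
        self.temp_count += 1
        return f"t{self.temp_count}"

    def gen(self, text):
        self.code.append(text)      # emit as a side effect, keep no code attribute

    def expr(self, node):
        """Translate an expression and return its address (E.addr)."""
        if isinstance(node, str):                 # E -> id
            return node
        if node[0] == "minus":                    # E -> - E1
            operand = self.expr(node[1])
            addr = self.new_temp()
            self.gen(f"{addr} = minus {operand}")
            return addr
        op, left, right = node                    # E -> E1 op E2
        left_addr = self.expr(left)
        right_addr = self.expr(right)
        addr = self.new_temp()
        self.gen(f"{addr} = {left_addr} {op} {right_addr}")
        return addr

    def assign(self, name, node):                 # S -> id = E
        self.gen(f"{name} = {self.expr(node)}")

translator = Translator()
translator.assign("a", ("+", "b", ("minus", "c")))
print("\n".join(translator.code))
```
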

TRANSLATION OF EXPRESSIONS
 Addressing Array Elements
 Array elements can be accessed quickly if they are stored in a block of
consecutive locations.
 Array elements are numbered 0, 1, …, n-1, for an array with n elements.
 If the width of each array element is w, then the i-th element of array A begins in
location
base + i × w
 where base is the relative address of the storage allocated for the array. That is,
base is the relative address of A[0].

TRANSLATION OF EXPRESSIONS
 Addressing Array Elements

Layouts for a two-dimensional array
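
The address formulas for the two layouts can be written out as a short sketch. The function names, the 0-based indexing in both dimensions, and the sample 2 × 3 array of 4-byte elements are assumptions for this example.

```python
def address_1d(base, i, w):
    """Address of A[i] for elements of width w, numbered from 0."""
    return base + i * w

def address_row_major(base, i1, i2, n2, w):
    """Address of A[i1][i2] when rows are stored one after another.

    n2 is the number of elements in each row.
    """
    return base + (i1 * n2 + i2) * w

def address_column_major(base, i1, i2, n1, w):
    """Address of A[i1][i2] when columns are stored one after another.

    n1 is the number of elements in each column.
    """
    return base + (i2 * n1 + i1) * w

# A 2 x 3 array of 4-byte elements starting at relative address 0.
row = address_row_major(0, 0, 1, n2=3, w=4)
col = address_column_major(0, 0, 1, n1=2, w=4)
print(address_1d(0, 5, 4), row, col)
```

In row-major order A[0][1] sits right after A[0][0], while in column-major order the whole first column comes before it.
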

TRANSLATION OF EXPRESSIONS
 Translation of Array References
 The chief problem in generating code for array references is to relate the
address-calculation formulas to a grammar for array references.
 Let nonterminal L generate an array name followed by a sequence of index
expressions:
L → L [ E ] | id [ E ]
 The lowest-numbered array element is 0.

TRANSLATION OF EXPRESSIONS
 Translation of Array References

Semantic actions for array references

TRANSLATION OF EXPRESSIONS
 Translation of Array References

Annotated parse tree for c + a[i][j]


TRANSLATION OF EXPRESSIONS
 Translation of Array References

Three-address code for expression c + a[i][j]
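
The code for c + a[i][j] can be generated by a sketch like the one below. It assumes that a is a 2 × 3 array of integers of width 4 stored in row-major order with indices starting at 0; the function name and parameters are hypothetical.

```python
import itertools

def translate_sum_with_array_ref(other, name, dims, elem_width, indices):
    """Emit three-address code for `other + name[indices...]`."""
    counter = itertools.count(1)
    code = []

    def new_temp():
        return f"t{next(counter)}"

    # widths[k] is the width of the element reached after k + 1 subscripts.
    widths = [elem_width]
    for n in reversed(dims[1:]):
        widths.append(widths[-1] * n)
    widths.reverse()

    addr = None
    for index, width in zip(indices, widths):
        t = new_temp()
        code.append(f"{t} = {index} * {width}")      # offset for this subscript
        if addr is None:
            addr = t
        else:
            total = new_temp()
            code.append(f"{total} = {addr} + {t}")   # accumulate the offset
            addr = total

    value = new_temp()
    code.append(f"{value} = {name}[{addr}]")         # indexed copy
    result = new_temp()
    code.append(f"{result} = {other} + {value}")
    return code

array_code = translate_sum_with_array_ref("c", "a", [2, 3], 4, ["i", "j"])
print("\n".join(array_code))
```

A row of a holds 3 integers, so the first subscript is scaled by 12 and the second by 4.
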

EXERCISES
 Translate the following assignments:
a) x = a[i] + b[j].
b) x = a[i][j] + b[i][j].
c) x = a[b[i][j]][c[k]].

BACKPATCHING
 A key problem when generating code for boolean expressions and flow-of-control
statements is that of matching a jump instruction with the target of the jump.
 For example, the translation of the boolean expression B in if (B ) S contains a
jump, for when B is false, to the instruction following the code for S.
 In a one-pass translation, B must be translated before S is examined.
 This section takes a complementary approach, called backpatching, in which lists
of jumps are passed as synthesized attributes.
 Specifically, when a jump is generated, the target of the jump is temporarily left
unspecified.
 Each such jump is put on a list of jumps whose labels are to be filled in when the
proper label can be determined.
 All of the jumps on a list have the same target label.

BACKPATCHING
 To manipulate lists of jumps, we use three functions:
1. makelist(i) creates a new list containing only i, an index into the array of
instructions; makelist returns a pointer to the newly created list.
2. merge(p1, p2) concatenates the lists pointed to by p1 and p2, and returns a
pointer to the concatenated list.
3. backpatch(p, i) inserts i as the target label for each of the instructions on the
list pointed to by p.
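
A minimal sketch of the three list functions, assuming jump lists are Python lists of instruction indices and each instruction is a dict with a `target` field. Unlike the two-argument form above, this `backpatch` also takes the instruction array explicitly; that signature is an assumption of the sketch.

```python
def makelist(i):
    """Create a new list containing only instruction index i."""
    return [i]

def merge(p1, p2):
    """Concatenate two lists of jumps."""
    return p1 + p2

def backpatch(instructions, p, i):
    """Insert i as the target of every instruction on list p."""
    for index in p:
        instructions[index]["target"] = i

instructions = [
    {"text": "if a < b goto", "target": None},   # 0
    {"text": "goto", "target": None},            # 1
    {"text": "if c < d goto", "target": None},   # 2
]
pending = merge(makelist(0), makelist(2))   # both jumps share one target
backpatch(instructions, pending, 7)
print(instructions)
```
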

BACKPATCHING FOR BOOLEAN
EXPRESSIONS
 The grammar is as follows:
B → B1 || M B2 | B1 && M B2 | ! B1 | ( B1 ) | E1 rel E2 | true | false
M → ϵ

Translation scheme for boolean expressions

BACKPATCHING FOR BOOLEAN
EXPRESSIONS
 Consider again the expression x < 100 || x > 200 && x != y
 In the annotated parse tree for this expression, attributes truelist,
falselist, and instr are represented by their initial letters for readability.
 The actions are performed during a depth-first traversal of the tree.
 Since all actions appear at the ends of right sides, they can be performed in
conjunction with reductions during a bottom-up parse.
 In response to the reduction of x < 100 to B by production (5), the two
instructions
100: if x < 100 goto _
101: goto _
 are generated, with the jump targets left unfilled. (We arbitrarily start
instruction numbers at 100.)

BACKPATCHING FOR BOOLEAN
EXPRESSIONS
 The marker nonterminal M in the production
B → B1 || M B2
 records the value of nextinstr, which at this time is 102. The reduction of x >
200 to B by production (5) generates the instructions
102: if x > 200 goto _
103: goto _
 The subexpression x > 200 corresponds to B1 in the production
B → B1 && M B2
 The marker nonterminal M records the current value of nextinstr, which is now
104. Reducing x != y into B by production (5) generates
104: if x != y goto _
105: goto _
BACKPATCHING FOR BOOLEAN
EXPRESSIONS
 We now reduce by B → B1 && M B2. The corresponding semantic action calls
backpatch(B1.truelist, M.instr) to bind the true exit of B1 to the first instruction
of B2. Since B1.truelist is {102} and M.instr is 104, this call to backpatch fills
in 104 in instruction 102.
 The semantic action associated with the final reduction by B → B1 || M B2 calls
backpatch({101},102).
 The entire expression is true if and only if the gotos of instructions 100 or 104
are reached, and is false if and only if the gotos of instructions 103 or 105 are
reached. These instructions will have their targets filled in later in the
compilation, when it is seen what must be done depending on the truth or
falsehood of the expression.

BACKPATCHING FOR BOOLEAN
EXPRESSIONS

Annotated parse tree for x < 100 || x > 200 && x != y


BACKPATCHING FOR BOOLEAN
EXPRESSIONS

Steps in the backpatch process
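
The whole walkthrough for x < 100 || x > 200 && x != y can be replayed with a short sketch. The helper names `emit`, `rel`, `and_`, and `or_`, the dict-based instruction store, and the representation of B as a (truelist, falselist) pair are assumptions for this example; the calls are made in the order of the reductions in a bottom-up parse.

```python
code = {}          # instruction index -> [text, target]
nextinstr = 100    # instruction numbers arbitrarily start at 100

def emit(text):
    global nextinstr
    code[nextinstr] = [text, None]   # target left unspecified for now
    nextinstr += 1

def makelist(i):
    return [i]

def merge(p1, p2):
    return p1 + p2

def backpatch(p, i):
    for index in p:
        code[index][1] = i

def rel(left, op, right):
    """B -> E1 rel E2: returns (truelist, falselist)."""
    truelist = makelist(nextinstr)
    falselist = makelist(nextinstr + 1)
    emit(f"if {left} {op} {right} goto")
    emit("goto")
    return truelist, falselist

def and_(b1, m, b2):
    """B -> B1 && M B2: the true exit of B1 goes to the start of B2."""
    backpatch(b1[0], m)
    return b2[0], merge(b1[1], b2[1])

def or_(b1, m, b2):
    """B -> B1 || M B2: the false exit of B1 goes to the start of B2."""
    backpatch(b1[1], m)
    return merge(b1[0], b2[0]), b2[1]

first = rel("x", "<", "100")
m1 = nextinstr                      # marker M records 102
second = rel("x", ">", "200")
m2 = nextinstr                      # marker M records 104
third = rel("x", "!=", "y")
conj = and_(second, m2, third)
truelist, falselist = or_(first, m1, conj)

for index, (text, target) in sorted(code.items()):
    print(f"{index}: {text} {target if target is not None else '_'}")
```

After these steps only the jumps at 100, 104, 103, and 105 remain unfilled; they are the true and false exits of the whole expression.
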

