CS6109 Module 8
Module – 8
Presented By
Dr. S. Muthurajkumar,
Assistant Professor,
Dept. of CT, MIT Campus,
Anna University, Chennai.
MODULE - 8
Three address code
Types of Three address code – Quadruples, Triples
Three-address code for Declarations, Arrays, Loops, Backpatching
THREE ADDRESS CODE
In three-address code, there is at most one operator on the right side of an
instruction; that is, no built-up arithmetic expressions are permitted.
Thus a source-language expression like x+y*z might be translated into the
sequence of three-address instructions
t1 = y * z
t2 = x + t1
where t1 and t2 are compiler-generated temporary names.
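The translation above can be sketched in a few lines of Python. This is only an illustrative sketch: the nested-tuple expression format and the function name translate are assumptions made for the example, not part of the module.

```python
import itertools

def translate(expr):
    """Return (address, instructions) for a nested-tuple expression.

    A leaf is a string name; an interior node is (op, left, right).
    """
    counter = itertools.count(1)
    code = []

    def walk(node):
        if isinstance(node, str):          # a name is its own address
            return node
        op, left, right = node
        l_addr = walk(left)
        r_addr = walk(right)
        temp = f"t{next(counter)}"         # fresh compiler-generated temporary
        code.append(f"{temp} = {l_addr} {op} {r_addr}")
        return temp

    return walk(expr), code

# x + y * z, with * binding tighter than +
addr, code = translate(("+", "x", ("*", "y", "z")))
print("\n".join(code))
```

Each interior node of the tree yields exactly one instruction, which is why no instruction has more than one operator on its right side.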
Three-address code is a linearized representation of a syntax tree or a DAG in
which explicit names correspond to the interior nodes of the graph.
For example, consider the expression a + a * (b - c) + (b - c) * d, in which the
subexpression b - c occurs twice.
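A minimal sketch of how a DAG for this expression could be linearized, assuming a nested-tuple input and a hypothetical helper dag_to_tac. Identical (op, left, right) nodes are given a single name, so the shared node b - c is computed only once.

```python
def dag_to_tac(expr):
    """Emit one three-address instruction per distinct interior node."""
    code, seen = [], {}

    def walk(node):
        if isinstance(node, str):               # leaf: a source name
            return node
        op, left, right = node
        key = (op, walk(left), walk(right))
        if key not in seen:                     # emit each DAG node only once
            seen[key] = f"t{len(seen) + 1}"
            code.append(f"{seen[key]} = {key[1]} {op} {key[2]}")
        return seen[key]

    walk(expr)
    return code

b_minus_c = ("-", "b", "c")
expr = ("+", ("+", "a", ("*", "a", b_minus_c)), ("*", b_minus_c, "d"))
code = dag_to_tac(expr)
print("\n".join(code))
```

The temporaries t1 to t5 play the role of the explicit names for the interior nodes of the graph.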
Three-address code is built from two concepts: addresses and instructions.
In object-oriented terms, these concepts correspond to classes, and the various
kinds of addresses and instructions correspond to appropriate subclasses.
An address can be one of the following:
§ A name. For convenience, we allow source-program names to appear as
addresses in three-address code. In an implementation, a source name is
replaced by a pointer to its symbol-table entry, where all information about the
name is kept.
§ A constant. In practice, a compiler must deal with many different types of
constants and variables.
§ A compiler-generated temporary. It is useful, especially in optimizing
compilers, to create a distinct name each time a temporary is needed. These
temporaries can be combined, if possible, when registers are allocated to
variables.
Here is a list of the common three-address instruction forms:
1. Assignment instructions of the form x = y op z, where op is a binary arithmetic
or logical operation, and x, y, and z are addresses.
2. Assignments of the form x = op y, where op is a unary operation. Essential
unary operations include unary minus, logical negation, and conversion
operators that, for example, convert an integer to a floating-point number.
3. Copy instructions of the form x = y, where x is assigned the value of y.
4. An unconditional jump goto L. The three-address instruction with label L is
the next to be executed.
5. Conditional jumps of the form if x goto L and ifFalse x goto L. These
instructions execute the instruction with label L next if x is true and false,
respectively. Otherwise, the following three-address instruction in sequence
is executed next, as usual.
6. Conditional jumps such as if x relop y goto L, which apply a relational
operator (<, ==, >=, etc.) to x and y, and execute the instruction with label L
next if x stands in relation relop to y. If not, the three-address instruction
following if x relop y goto L is executed next, in sequence.
7. Procedure calls and returns are implemented using the following instructions:
param x for parameters; call p, n and y = call p, n for procedure and function calls,
respectively; and return y, where y, representing a returned value, is optional. Their
typical use is as the sequence of three-address instructions
param x1
param x2
…
param xn
call p, n
generated as part of a call of the procedure p(x1, x2, …, xn). The integer n, indicating
the number of actual parameters in “call p, n”, is not redundant because calls can be
nested. That is, some of the first param statements could be parameters of a call that
comes after p returns its value; that value becomes another parameter of the later call.
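The role of n can be illustrated with a small simulation. This is a sketch under the assumption that param pushes values onto a stack and call p, n consumes the last n of them; the tuple encoding and the functions f and g are hypothetical. The instruction sequence corresponds to the nested call f(2, g(3)).

```python
def run(instrs, functions):
    """Execute param/call instructions with a parameter stack (assumed model)."""
    stack, env = [], {}
    for ins in instrs:
        if ins[0] == "param":
            x = ins[1]
            stack.append(env.get(x, x) if isinstance(x, str) else x)
        else:                                   # ("call", result, p, n)
            _, result, p, n = ins
            args = stack[len(stack) - n:]       # only the last n params belong to p
            del stack[len(stack) - n:]
            env[result] = functions[p](*args)
    return env

program = [
    ("param", 2),             # first parameter of the outer call to f
    ("param", 3),             # parameter of the inner call to g
    ("call", "t1", "g", 1),   # t1 = call g, 1
    ("param", "t1"),          # result of g is the second parameter of f
    ("call", "t2", "f", 2),   # t2 = call f, 2
]
functions = {"g": lambda b: b * 10, "f": lambda a, b: a + b}
env = run(program, functions)
print(env["t2"])
```

Without the count 1 in the call to g, there would be no way to tell that param 2 belongs to the later call to f.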
8. Indexed copy instructions of the form x = y[i] and x[i]=y. The instruction x
= y[i] sets x to the value in the location i memory units beyond location y.
The instruction x[i]=y sets the contents of the location i units beyond x to the
value of y.
9. Address and pointer assignments of the form x = &y, x = *y, and *x = y.
The instruction x = &y sets the r-value of x to be the location (l-value) of y.
Presumably y is a name, perhaps a temporary, that denotes an expression with
an l-value such as A[i][j], and x is a pointer name or temporary. In the
instruction x = *y, presumably y is a pointer or a temporary whose r-value is
a location. The r-value of x is made equal to the contents of that location.
Finally, *x = y sets the r-value of the object pointed to by x to the r-value of
y.
Consider the statement: do i = i+1; while (a[i] < v);
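One possible translation of this statement, written as tuples and executed by a tiny interpreter, is sketched below. The assumptions are that each array element occupies 8 memory units, that the label L is instruction 0, and that the tuple encoding and the run function are purely illustrative.

```python
WIDTH = 8  # assumed width of one array element in memory units

# Tuples of the form (op, arg1, arg2, result); instruction 0 carries label L.
program = [
    ("+",   "i",  1,     "t1"),   # L: t1 = i + 1
    ("=",   "t1", None,  "i"),    #    i = t1
    ("*",   "i",  WIDTH, "t2"),   #    t2 = i * 8
    ("=[]", "a",  "t2",  "t3"),   #    t3 = a[t2]
    ("if<", "t3", "v",   0),      #    if t3 < v goto L
]

def run(program, env, memory):
    def val(x):
        return env[x] if isinstance(x, str) else x
    pc = 0
    while pc < len(program):
        op, a1, a2, res = program[pc]
        pc += 1                               # default: next instruction in sequence
        if op == "+":
            env[res] = val(a1) + val(a2)
        elif op == "*":
            env[res] = val(a1) * val(a2)
        elif op == "=":
            env[res] = val(a1)
        elif op == "=[]":                     # indexed copy x = y[i]
            env[res] = memory[a1][val(a2)]
        elif op == "if<":                     # conditional jump
            if val(a1) < val(a2):
                pc = res
    return env

# Array a holds 3, 5, 9, 12 at offsets 0, 8, 16, 24.
memory = {"a": {k * WIDTH: x for k, x in enumerate([3, 5, 9, 12])}}
env = run(program, {"i": 0, "v": 9}, memory)
print(env["i"])
```

The loop stops at the first index i > 0 for which a[i] < v fails.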
QUADRUPLES
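The description of three-address instructions specifies the components of each
type of instruction, but not how the instructions are represented in a data
structure. Three such representations are called “quadruples”, “triples”, and
“indirect triples”.
A quadruple (or just “quad”) has four fields, which we call op, arg1, arg2, and
result. The op field contains an internal code for the operator.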
For instance, the three-address instruction x = y + z is represented by placing +
in op, y in arg1, z in arg2, and x in result.
The following are some exceptions to this rule:
§ Instructions with unary operators like x = minus y or x = y do not use arg2.
Note that for a copy statement like x = y, op is =, while for most other
operations, the assignment operator is implied.
§ Operators like param use neither arg2 nor result.
§ Conditional and unconditional jumps put the target label in result.
The assignment a = b * - c + b * - c;
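A sketch of the quadruples for this assignment, assuming the temporaries are named t1 to t5 and using a hypothetical Quad record with a show helper for printing. Unused fields are set to None.

```python
from typing import NamedTuple, Optional

class Quad(NamedTuple):
    op: str
    arg1: str
    arg2: Optional[str]
    result: str

quads = [
    Quad("minus", "c",  None, "t1"),
    Quad("*",     "b",  "t1", "t2"),
    Quad("minus", "c",  None, "t3"),
    Quad("*",     "b",  "t3", "t4"),
    Quad("+",     "t2", "t4", "t5"),
    Quad("=",     "t5", None, "a"),    # copy: op is =, arg2 unused
]

def show(q):
    """Render a quadruple as a three-address instruction."""
    if q.op == "=":
        return f"{q.result} = {q.arg1}"
    if q.arg2 is None:
        return f"{q.result} = {q.op} {q.arg1}"
    return f"{q.result} = {q.arg1} {q.op} {q.arg2}"

for q in quads:
    print(show(q))
```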
TRIPLES
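A triple has only three fields: op, arg1, and arg2. The result of an operation is
referred to by the position of its triple rather than by an explicit temporary name.
Consider again the assignment a = b * - c + b * - c;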
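The sketch below derives triples for this assignment from its quadruples by replacing each temporary with the position of the instruction that computes it. The function name quads_to_triples and the tuple layout are assumptions for illustration; integer arguments denote positions, strings denote names.

```python
quads = [
    ("minus", "c",  None, "t1"),
    ("*",     "b",  "t1", "t2"),
    ("minus", "c",  None, "t3"),
    ("*",     "b",  "t3", "t4"),
    ("+",     "t2", "t4", "t5"),
    ("=",     "t5", None, "a"),
]

def quads_to_triples(quads):
    position = {}      # temporary name -> index of the triple computing it
    triples = []
    for op, arg1, arg2, result in quads:
        a1 = position.get(arg1, arg1)
        a2 = position.get(arg2, arg2)
        if op == "=":
            triples.append(("=", result, a1))   # copy: target in arg1, value in arg2
        else:
            position[result] = len(triples)
            triples.append((op, a1, a2))
    return triples

triples = quads_to_triples(quads)
for i, t in enumerate(triples):
    print(i, t)
```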
INDIRECT TRIPLES
Indirect triples consist of a listing of pointers to triples, rather than a listing of
triples themselves.
For example, let us use an array instruction to list pointers to triples in the
desired order.
With indirect triples, an optimizing compiler can move an instruction by
reordering the instruction list, without affecting the triples themselves.
The assignment a = b * - c + b * - c;
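A small sketch of the idea, assuming triples are stored in a Python list and the array instruction holds indices into it (standing in for pointers). Reordering instruction changes the execution order while every positional reference inside the triples stays valid.

```python
# Triples for a = b * - c + b * - c; integers refer to positions in this list.
triples = [
    ("minus", "c", None),   # 0
    ("*", "b", 0),          # 1
    ("minus", "c", None),   # 2
    ("*", "b", 2),          # 3
    ("+", 1, 3),            # 4
    ("=", "a", 4),          # 5
]
original = list(triples)

# The instruction list: indices of triples in the desired execution order.
instruction = list(range(len(triples)))

# Move triple 2 ahead of triple 1 by touching only the instruction list.
instruction[1], instruction[2] = instruction[2], instruction[1]

for index in instruction:
    print(index, triples[index])
```

With plain triples, the same move would renumber positions and force every reference such as (2) to be rewritten.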
STATIC SINGLE-ASSIGNMENT FORM
Static single-assignment form (SSA) is an intermediate representation that
facilitates certain code optimizations.
Two distinctive aspects distinguish SSA from three-address code.
The first is that all assignments in SSA are to variables with distinct names;
hence the term static single-assignment.
Note that subscripts distinguish each definition of variables p and q in the SSA
representation.
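As an illustration, the following sketch renames a straight-line sequence of assignments to p and q into SSA form. The five-instruction input program and the helper to_ssa are assumptions chosen for the example; names that are never assigned (a, b, c, d, e) keep their original form.

```python
def to_ssa(instrs):
    """Rename straight-line code so every assignment defines a distinct name."""
    version = {}                                # variable -> latest subscript

    def use(name):
        return f"{name}{version[name]}" if name in version else name

    out = []
    for target, left, op, right in instrs:
        l, r = use(left), use(right)            # rename uses before the new definition
        version[target] = version.get(target, 0) + 1
        out.append(f"{target}{version[target]} = {l} {op} {r}")
    return out

program = [
    ("p", "a", "+", "b"),
    ("q", "p", "-", "c"),
    ("p", "q", "*", "d"),
    ("p", "e", "-", "p"),
    ("q", "p", "+", "q"),
]
ssa = to_ssa(program)
print("\n".join(ssa))
```

This sketch handles only straight-line code; merging definitions from different control-flow paths needs more than renaming.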
The assignment a = b * - c + b * - c;
The same variable may be defined in two different control-flow paths in a
program.
For example, the source program
if ( flag ) x = -1; else x = 1;
y = x * a;
has two control-flow paths in which the variable x gets defined. If we use
different names for x in the true part and the false part of the conditional
statement, then which name should we use in the assignment y = x * a? Here is
where the second distinctive aspect of SSA comes into play. SSA uses a
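notational convention called the ϕ-function to combine the two definitions of x:
if ( flag ) x1 = -1; else x2 = 1;
x3 = ϕ(x1, x2);
y = x3 * a;
Here, ϕ(x1, x2) has the value x1 if control flows through the true part of the
conditional and the value x2 if it flows through the false part.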
EXERCISES
Translate the arithmetic expression a + - (b + c) into:
a) A syntax tree.
b) Quadruples.
c) Triples.
d) Indirect triples.
Repeat the exercise for the following assignment statements:
i. a = b[i] + c[j].
ii. a[i] = b*c - b*d.
iii. x = f(y+1) + 2.
iv. x = *p + &y.
TRANSLATION OF EXPRESSIONS
An expression with more than one operator, like a + b * c, will translate into
instructions with at most one operator per instruction.
An array reference A[i][j] will expand into a sequence of three-address
instructions that calculate an address for the reference.
Operations Within Expressions
Incremental Translation
Addressing Array Elements
Translation of Array References
Operations Within Expressions
The syntax-directed definition builds up the three-address code for an
assignment statement S using attribute code for S and attributes addr and code
for an expression E.
Attributes S.code and E.code denote the three-address code for S and E,
respectively.
Attribute E.addr denotes the address that will hold the value of E.
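The attributes can be mimicked with functions that return pairs. The sketch below is one possible rendering, not the definition from the slides: expr returns (E.addr, E.code), assign returns S.code, and the tuple encoding of expressions and the helper new_temp are assumptions.

```python
import itertools

_temps = itertools.count(1)

def new_temp():
    """Return a fresh temporary name t1, t2, ..."""
    return f"t{next(_temps)}"

def expr(node):
    """Return (E.addr, E.code) for an expression node."""
    if isinstance(node, str):                     # E -> id
        return node, []
    if node[0] == "minus":                        # E -> - E1
        addr1, code1 = expr(node[1])
        addr = new_temp()
        return addr, code1 + [f"{addr} = minus {addr1}"]
    op, left, right = node                        # E -> E1 op E2
    addr1, code1 = expr(left)
    addr2, code2 = expr(right)
    addr = new_temp()
    return addr, code1 + code2 + [f"{addr} = {addr1} {op} {addr2}"]

def assign(name, node):
    """S -> id = E: S.code is E.code followed by the copy into id."""
    addr, code = expr(node)
    return code + [f"{name} = {addr}"]

s_code = assign("a", ("+", "b", ("minus", "c")))   # a = b + - c
print("\n".join(s_code))
```

Building code by concatenating lists keeps the sketch close to the attribute rules; emitting instructions into one shared list instead would avoid the repeated copying.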
Incremental Translation
Addressing Array Elements
Array elements can be accessed quickly if they are stored in a block of
consecutive locations.
Array elements are numbered 0, 1, …, n - 1, for an array with n elements.
If the width of each array element is w, then the ith element of array A begins in
location
base + i × w
where base is the relative address of the storage allocated for the array. That is,
base is the relative address of A[0].
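The formula is easy to check numerically. The sketch below implements it directly and adds a generalization to several dimensions, under the assumption of row-major layout with zero-based indices; both function names are hypothetical.

```python
def element_address(base, i, w):
    """Address of A[i] for a one-dimensional array: base + i * w."""
    return base + i * w

def row_major_address(base, indices, dims, w):
    """Address of A[i1][i2]... assuming row-major storage (an assumption here)."""
    offset = 0
    for i, n in zip(indices, dims):
        offset = offset * n + i          # fold in one dimension at a time
    return base + offset * w

print(element_address(100, 3, 4))                 # 4-unit elements, base 100
print(row_major_address(0, [1, 2], [2, 3], 4))    # last element of a 2 x 3 array
```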
Translation of Array References
The chief problem in generating code for array references is to relate the
address-calculation formulas to a grammar for array references.
Let nonterminal L generate an array name followed by a sequence of index
expressions:
L → L [ E ] | id [ E ]
The lowest-numbered array element is 0.
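A sketch of the code such a translation might produce for a reference a[i][j], assuming a is a 2 x 3 array of 4-unit elements stored row by row. The helper array_ref and its parameters are illustrative; each index is scaled by the width of the subarray it selects, and the partial offsets are added up.

```python
import itertools

def array_ref(indices, dims, width, temps):
    """Emit code computing the offset of an array reference.

    Returns (address holding the offset, list of instructions).
    """
    code, offset = [], None
    for k, index in enumerate(indices):
        sub_width = width                     # width of one element at dimension k
        for n in dims[k + 1:]:
            sub_width *= n
        t = f"t{next(temps)}"
        code.append(f"{t} = {index} * {sub_width}")
        if offset is not None:                # accumulate the running offset
            s = f"t{next(temps)}"
            code.append(f"{s} = {offset} + {t}")
            t = s
        offset = t
    return offset, code

temps = itertools.count(1)
offset, code = array_ref(["i", "j"], [2, 3], 4, temps)
value = f"t{next(temps)}"
code.append(f"{value} = a[{offset}]")         # indexed copy fetches the element
print("\n".join(code))
```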
EXERCISES
Translate the following assignments:
a) x = a[i] + b[j].
b) x = a[i][j] + b[i][j].
c) x = a[b[i][j]][c[k]].
BACKPATCHING
A key problem when generating code for boolean expressions and flow-of-control
statements is that of matching a jump instruction with the target of the jump.
For example, the translation of the boolean expression B in if (B ) S contains a
jump, for when B is false, to the instruction following the code for S.
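In a one-pass translation, B must be translated before S is examined, so the
target of the jump over S is not yet known when the jump is generated.
One solution passes labels as inherited attributes to where the jumps are
generated, but a separate pass is then needed to bind labels to addresses.
This section takes a complementary approach, called backpatching, in which lists
of jumps are passed as synthesized attributes.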
Specifically, when a jump is generated, the target of the jump is temporarily left
unspecified.
Each such jump is put on a list of jumps whose labels are to be filled in when the
proper label can be determined.
All of the jumps on a list have the same target label.
To manipulate lists of jumps, we use three functions:
1. makelist(i) creates a new list containing only i, an index into the array of
instructions; makelist returns a pointer to the newly created list.
2. merge(p1, p2) concatenates the lists pointed to by p1 and p2, and returns a
pointer to the concatenated list.
3. backpatch(p, i) inserts i as the target label for each of the instructions on the
list pointed to by p.
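The three functions can be sketched as follows, assuming Python lists stand in for the pointer-based lists and each instruction is stored as a [text, target] pair whose target is None until it is filled in. The sample instructions and the target 7 are made up for the example.

```python
# Each instruction is [text, target]; target stays None until backpatched.
instructions = [
    ["if a < b goto", None],   # index 0
    ["goto", None],            # index 1
    ["if c < d goto", None],   # index 2
]

def makelist(i):
    """Create a new list containing only instruction index i."""
    return [i]

def merge(p1, p2):
    """Concatenate two jump lists and return the result."""
    return p1 + p2

def backpatch(p, i):
    """Insert i as the target label of every instruction on list p."""
    for index in p:
        instructions[index][1] = i

pending = merge(makelist(0), makelist(2))   # both jumps share one target
backpatch(pending, 7)
print(instructions)
```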
BACKPATCHING FOR BOOLEAN EXPRESSIONS
The grammar is as follows:
B → B1 || M B2 | B1 && M B2 | ! B1 | ( B1 ) | E1 rel E2 | true | false
M → ϵ
Consider again the expression x < 100 || x > 200 && x != y
An annotated parse tree is shown in the figure; for readability, attributes truelist,
falselist, and instr are represented by their initial letters.
The actions are performed during a depth-first traversal of the tree.
Since all actions appear at the ends of right sides, they can be performed in
conjunction with reductions during a bottom-up parse.
In response to the reduction of x < 100 to B by production (5), the two
instructions
100: if x < 100 goto
101: goto
are generated. (We arbitrarily start instruction numbers at 100.)
The marker nonterminal M in the production
B → B1 || M B2
records the value of nextinstr, which at this time is 102. The reduction of x >
200 to B by production (5) generates the instructions
102: if x > 200 goto
103: goto
The subexpression x > 200 corresponds to B1 in the production
B → B1 && M B2
The marker nonterminal M records the current value of nextinstr, which is now
104. Reducing x != y into B by production (5) generates
104: if x != y goto
105: goto
We now reduce by B → B1 && M B2. The corresponding semantic action calls
backpatch(B1.truelist, M.instr) to bind the true exit of B1 to the first instruction
of B2. Since B1.truelist is {102} and M.instr is 104, this call to backpatch fills
in 104 in instruction 102.
The semantic action associated with the final reduction by B → B1 || M B2 calls
backpatch({101},102).
The entire expression is true if and only if the gotos of instructions 100 or 104
are reached, and is false if and only if the gotos of instructions 103 or 105 are
reached. These instructions will have their targets filled in later in the
compilation, when it is seen what must be done depending on the truth or
falsehood of the expression.
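The whole walk-through can be replayed in a short sketch. It assumes instructions are stored as [text, target] pairs numbered from 100, and it calls the semantic actions by hand in the order of the bottom-up reductions; the helper names rel, and_, or_, emit, and nextinstr are illustrative, not taken from the slides.

```python
START = 100
code = []                          # code[k] is instruction START + k

def nextinstr():
    return START + len(code)

def emit(text):
    code.append([text, None])      # target left unspecified for now

def makelist(i):
    return [i]

def merge(p1, p2):
    return p1 + p2

def backpatch(p, target):
    for i in p:
        code[i - START][1] = target

def rel(e1, op, e2):
    """B -> E1 rel E2: returns (truelist, falselist)."""
    truelist = makelist(nextinstr())
    falselist = makelist(nextinstr() + 1)
    emit(f"if {e1} {op} {e2} goto")
    emit("goto")
    return truelist, falselist

def or_(b1, m_instr, b2):
    """B -> B1 || M B2: a false B1 falls into B2."""
    backpatch(b1[1], m_instr)
    return merge(b1[0], b2[0]), b2[1]

def and_(b1, m_instr, b2):
    """B -> B1 && M B2: a true B1 falls into B2."""
    backpatch(b1[0], m_instr)
    return b2[0], merge(b1[1], b2[1])

# Reductions for x < 100 || x > 200 && x != y, in bottom-up order.
b_a = rel("x", "<", "100")
m1 = nextinstr()                   # marker M of the || production
b_b = rel("x", ">", "200")
m2 = nextinstr()                   # marker M of the && production
b_c = rel("x", "!=", "y")
b_and = and_(b_b, m2, b_c)
truelist, falselist = or_(b_a, m1, b_and)

for k, (text, target) in enumerate(code):
    print(f"{START + k}: {text} {target if target is not None else '_'}")
```

Only instructions 101 and 102 receive targets here; the jumps left on truelist and falselist stay open until the enclosing statement is translated.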