CD Unit-4 LM
Objectives
To gain knowledge of syntax-directed translation, symbol table organization for languages, and various
storage allocation strategies.
Syllabus
Learning Outcomes:
Learning Material
Semantic Analysis
Semantic analysis is the task of ensuring that the declarations and statements of a program are
semantically correct, i.e., that their meaning is clear and consistent with the way in which control
structures and data types are supposed to be used.
A compiler must check that the source program follows both the syntactic and semantic conventions
of the source language.
This checking, called static checking, ensures that certain kinds of programming errors are detected
and reported.
[Figure: CFG → evaluate using a tree traversal technique (preorder/postorder/inorder/depth-first search) → final result]
1. Semantic rules set up dependencies between attributes which can be represented by a dependency
graph.
2. This dependency graph determines the evaluation order of these semantic rules.
Example:
Production          Semantic Rule
L → E n             print(E.val)
E → E1 + T          E.val = E1.val + T.val
E → T               E.val = T.val
T → T1 * F          T.val = T1.val * F.val
T → F               T.val = F.val
F → ( E )           F.val = E.val
F → digit           F.val = digit.lexval
S-attributed definition
• A syntax-directed definition that uses synthesized attributes exclusively is said to be an S-attributed
definition.
• An SDD is S-attributed if every attribute is synthesized.
• A parse tree for an S-attributed definition can be annotated by evaluating the semantic rules for the
attributes at each node, bottom-up from the leaves to the root.
• A post-order traversal of the parse tree evaluates the attributes of an S-attributed definition:
postorder(N)
{
for (each child C of N, from the left) postorder(C);
evaluate the attributes associated with node N;
}
• S-attributed definitions can be implemented during bottom-up parsing without the need to explicitly
create parse trees.
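The post-order evaluation described above can be sketched in Python for the expression grammar of the earlier example. This is a minimal illustration, not a reference implementation: the `Node` class, the `leaf` and `digit` helpers, and the hand-built tree for 3*5+4 are assumptions introduced here.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Parse-tree node (hypothetical representation)."""
    symbol: str
    children: List["Node"] = field(default_factory=list)
    lexval: Optional[int] = None   # supplied by the lexer on digit leaves
    val: Optional[int] = None      # synthesized attribute


def evaluate(n: Node) -> None:
    """Apply the semantic rule of the production used at node n."""
    kids = [c.symbol for c in n.children]
    if n.symbol == "digit":
        n.val = n.lexval                                  # digit.lexval
    elif kids == ["E", "+", "T"]:
        n.val = n.children[0].val + n.children[2].val     # E.val = E1.val + T.val
    elif kids == ["T", "*", "F"]:
        n.val = n.children[0].val * n.children[2].val     # T.val = T1.val * F.val
    elif kids == ["(", "E", ")"]:
        n.val = n.children[1].val                         # F.val = E.val
    elif len(kids) == 1:
        n.val = n.children[0].val                         # E → T, T → F, F → digit


def postorder(n: Node) -> None:
    """Visit children left to right, then evaluate the node itself."""
    for c in n.children:
        postorder(c)
    evaluate(n)


def leaf(symbol: str, lexval: Optional[int] = None) -> Node:
    return Node(symbol, [], lexval)


def digit(v: int) -> Node:
    return Node("F", [leaf("digit", v)])


# Hand-built parse tree for 3 * 5 + 4
tree = Node("E", [
    Node("E", [Node("T", [Node("T", [digit(3)]), leaf("*"), digit(5)])]),
    leaf("+"),
    Node("T", [digit(4)]),
])
postorder(tree)
print(tree.val)  # 19
```

Because every attribute here is synthesized, a single bottom-up pass is enough; no node ever needs a value from its parent or siblings.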
Annotated Parse Tree
1. A parse tree showing the values of attributes at each node is called an annotated parse tree.
2. The values of attributes at the nodes of an annotated parse tree are either
– initialized to constant values or supplied by the lexical analyzer, or
– determined by the semantic rules.
3. The process of computing the attribute values at the nodes is called annotating (or decorating)
the parse tree.
4. The order of these computations depends on the dependency graph induced by the semantic rules.
Example
– Put each semantic rule into the form b = f(c1, …, ck) by introducing a dummy synthesized attribute
b for every semantic rule that consists of a procedure call.
– The graph has a node for each attribute and an edge to the node for b from the node for c if
attribute b depends on attribute c.
– If a semantic rule defines the value of synthesized attribute A.b in terms of the value of X.c,
then the dependency graph has an edge from X.c to A.b.
– If a semantic rule defines the value of inherited attribute B.c in terms of the value of X.a, then
the dependency graph has an edge from X.a to B.c.
• If an attribute b at a node depends on an attribute c, then the semantic rule for b at that node must be
evaluated after the semantic rule that defines c.
Dependency Graph Construction
Inherited attributes
• An inherited attribute at a node in a parse tree is defined in terms of attributes at the parent and/or
siblings of the node.
• Inherited attributes are a convenient way to express the dependency of a programming language
construct on the context in which it appears.
• Example: an inherited attribute distributes type information to the various identifiers in a
declaration.
Syntax-Directed Definition – Inherited Attributes
• An SDD is L-attributed if the edges of its dependency graph go from left to right but not from right
to left.
• More precisely, each attribute must be either
– synthesized, or
– inherited, with this restriction: if there is a production A → X1 X2 … Xn and an inherited attribute Xi.a
computed by a rule associated with this production, then the rule may use only:
• inherited attributes associated with the head A;
• either inherited or synthesized attributes associated with the occurrences of symbols
X1, X2, …, Xi-1 located to the left of Xi;
• inherited or synthesized attributes associated with this occurrence of Xi itself, but only in
such a way that there is no cycle in the graph.
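Annotated parse tree
[Figure: annotated parse tree for the declaration real id1, id2, id3. The root D has children T (T.type = real, with leaf real) and L (in = real); the L nodes expand as L , id3, then L , id2, then id1.]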
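How an inherited attribute carries the type down to each identifier in the declaration real id1, id2, id3 can be sketched as follows. The grammar used (D → T L, T → real | int, L → L1 , id | id) is the usual textbook declaration grammar and is an assumption here, as are the `Node` class, the attribute name `inh` (standing for `in`, which is a Python keyword), and the `symtab` dictionary.

```python
class Node:
    """Parse-tree node (hypothetical representation)."""
    def __init__(self, symbol, children=(), name=None):
        self.symbol = symbol
        self.children = list(children)
        self.name = name     # lexeme, for id leaves only
        self.type = None     # synthesized attribute of T
        self.inh = None      # inherited attribute "in" of L


symtab = {}                  # identifier name -> type


def eval_D(d):
    """D → T L   with rule  L.in = T.type"""
    t, l = d.children
    t.type = t.children[0].symbol    # T → real | int : T.type comes from the keyword
    l.inh = t.type                   # inherited: taken from the left sibling
    eval_L(l)


def eval_L(l):
    if len(l.children) == 3:         # L → L1 , id   with rule  L1.in = L.in
        l1, _comma, ident = l.children
        l1.inh = l.inh               # inherited: taken from the parent
        eval_L(l1)
    else:                            # L → id
        (ident,) = l.children
    symtab[ident.name] = l.inh       # record the type for this identifier


def ident(name):
    return Node("id", name=name)


# Parse tree for: real id1, id2, id3
tree = Node("D", [
    Node("T", [Node("real")]),
    Node("L", [
        Node("L", [Node("L", [ident("id1")]), Node(","), ident("id2")]),
        Node(","),
        ident("id3"),
    ]),
])
eval_D(tree)
print(symtab)
```

Every rule uses only attributes of the head or of symbols to the left, so the definition is L-attributed and a single left-to-right depth-first walk suffices.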
Evaluation Order
A topological sort of a directed acyclic graph is any ordering m1, m2, …, mk of the nodes of the graph
such that edges go from nodes earlier in the ordering to later nodes,
i.e., if there is an edge from mi to mj, then mi appears before mj in the ordering.
Any topological sort of the dependency graph gives a valid order for evaluating the semantic rules
associated with the nodes of the parse tree.
– The attributes c1, c2, …, ck that b depends on in b = f(c1, c2, …, ck) must be available before f is
evaluated.
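One way to obtain such an evaluation order is Kahn's algorithm, sketched below. The choice of algorithm, the function name `topological_order`, and the attribute labels in the example graph (for the subtree deriving 3 * 5 with T → T1 * F) are assumptions made for illustration.

```python
from collections import deque


def topological_order(nodes, edges):
    """Return the nodes so that every edge (c, b), meaning b depends on c,
    has c before b. Raises ValueError if the graph has a cycle."""
    indegree = {n: 0 for n in nodes}
    successors = {n: [] for n in nodes}
    for c, b in edges:
        successors[c].append(b)
        indegree[b] += 1

    ready = deque(n for n in nodes if indegree[n] == 0)
    order = []
    while ready:
        n = ready.popleft()
        order.append(n)
        for m in successors[n]:
            indegree[m] -= 1
            if indegree[m] == 0:     # all attributes m depends on are now available
                ready.append(m)

    if len(order) != len(nodes):
        raise ValueError("dependency graph has a cycle; no evaluation order exists")
    return order


# Dependency graph for the attributes of a subtree deriving 3 * 5
nodes = ["digit1.lexval", "F1.val", "T1.val", "digit2.lexval", "F2.val", "T.val"]
edges = [
    ("digit1.lexval", "F1.val"),
    ("F1.val", "T1.val"),
    ("digit2.lexval", "F2.val"),
    ("T1.val", "T.val"),
    ("F2.val", "T.val"),
]
order = topological_order(nodes, edges)
print(order)
```

If the dependency graph contains a cycle, no topological sort exists and the attributes cannot be evaluated on that parse tree, which is why the sketch raises an error in that case.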
Translation specified by Syntax Directed Definition
Dependency Graph
Translation Schemes:
• A translation scheme is a context-free grammar in which:
– attributes are associated with the grammar symbols and
– semantic actions enclosed between braces {} are inserted within the right sides of
productions.
• Each semantic rule can use only the information computed by semantic rules that have already been executed.
• Ex: A → { ... } X { ... } Y { ... }
(the braces { ... } enclose the semantic actions)
• Translation schemes
– indicate the order of evaluation of the semantic actions associated with a production rule;
– in other words, they give some information about implementation details.
• Translation schemes describe the order and timing of attribute computation.
Translation Schemes for S-attributed Definitions
• A translation scheme is a useful notation for specifying translation during parsing.
• It can have both synthesized and inherited attributes.
• If the syntax-directed definition is S-attributed, constructing the corresponding translation
scheme is simple.
• Each semantic rule of an S-attributed syntax-directed definition is inserted as a
semantic action at the end of the right side of the associated production.
Production          Semantic Rule
E → E1 + T          E.val = E1.val + T.val
⇓
E → E1 + T { E.val = E1.val + T.val }    (the production of the corresponding translation scheme)
Example
• A simple translation scheme that converts infix expressions to the corresponding postfix expressions:
E → T R
R → + T { print(“+”) } R1
R → ε
T → id { print(id.name) }
a+b+c ⇒ ab+c+
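The infix-to-postfix scheme can be run as a small recursive-descent translator in which each semantic action executes at exactly the position where it appears in the production. This is a sketch under a few assumptions: tokens are single characters, any alphanumeric character counts as an id, and the output is collected in a list instead of being printed directly. The function name `to_postfix` is introduced here for illustration.

```python
def to_postfix(tokens):
    """Translate a token list for E → T R, R → + T {print("+")} R1 | ε,
    T → id {print(id.name)} into a postfix string."""
    out = []
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def T():
        nonlocal pos
        tok = peek()
        if tok is None or not tok.isalnum():
            raise SyntaxError(f"identifier expected at position {pos}")
        pos += 1
        out.append(tok)          # action { print(id.name) }

    def R():
        nonlocal pos
        if peek() == "+":        # R → + T { print("+") } R1
            pos += 1
            T()
            out.append("+")      # action runs after T and before R1
            R()
        # otherwise R → ε: nothing to do

    T()                          # E → T R
    R()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token at position {pos}")
    return "".join(out)


print(to_postfix(list("a+b+c")))  # ab+c+
```

Placing the `+` action between T and R1 is what makes the operator come out right after its second operand; moving it to the end of the production would emit all the operators last.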
Applications of syntax-directed translation:
• Type checking
• Intermediate code generation (chapter 6)
• Construction of syntax trees
– Leaf nodes: Leaf(op, val)
– Interior nodes: Node(op, c1, c2, …, ck)
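The two syntax-tree constructors can be sketched as small record types. The field names follow the notation Leaf(op, val) and Node(op, c1, …, ck) used above; the `node` and `show` helpers and the label "num" are assumptions introduced for this illustration.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Leaf:
    op: str                  # label of the leaf, e.g. "id" or "num"
    val: Union[str, int]     # lexical value

@dataclass(frozen=True)
class Node:
    op: str                  # operator label
    children: Tuple          # c1, ..., ck: Leaf or Node objects


def node(op, *children):
    """Build an interior node from an operator and its child subtrees."""
    return Node(op, tuple(children))


def show(t):
    """Render a syntax tree in prefix form, for inspection only."""
    if isinstance(t, Leaf):
        return str(t.val)
    return "(" + " ".join([t.op] + [show(c) for c in t.children]) + ")"


# Syntax tree for 2 + 3 * 4: the * node is built first, then used as a child of +
tree = node("+", Leaf("num", 2), node("*", Leaf("num", 3), Leaf("num", 4)))
print(show(tree))  # (+ 2 (* 3 4))
```

In an S-attributed definition, each production's semantic rule would call one of these constructors and pass the result upward as a synthesized attribute.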
Symbol-Table Contents
Not every compiler needs all of the following attributes.
1. Variable name
2. Object-code address
3. Type
4. Dimension or number of parameters for a procedure
5. Source line number at which the variable is declared
6. Source line numbers at which the variable is referenced
7. Link field for listing in alphabetical order
Row  Variable name  Address  Type  Dimension  Line declared  Lines referenced  Pointer
2    X3             4        1     0          3              12, 14            0
3    FORM1          8        3     2          4              36, 37, 38        6
4    B              48       1     0          5              10, 11, 13, 23    1
5    ANS            52       1     0          5              11, 23, 25        4
6    M              56       6     0          6              17, 21            2
A second approach is to place a string descriptor in the variable-name field of the table. The
descriptor contains pointer and length subfields. The pointer subfield indicates the position of the first
character of the variable name in a general string area, and the length subfield gives the number of
characters in the variable name.
Using a string descriptor to represent a variable name.
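A string descriptor can be sketched as a (pointer, length) pair into one shared string area. The class name `NameArea`, its methods, and the 0-based positions are assumptions made for this illustration; a real table may well count positions from 1.

```python
class NameArea:
    """General string area holding all variable names back to back."""

    def __init__(self):
        self.area = ""

    def add(self, name):
        """Append a name and return its descriptor (pointer, length)."""
        pointer = len(self.area)       # position of the first character (0-based here)
        self.area += name
        return (pointer, len(name))

    def name_of(self, descriptor):
        """Recover the variable name from its descriptor."""
        pointer, length = descriptor
        return self.area[pointer:pointer + length]


names = NameArea()
descriptors = [names.add(n) for n in ["X3", "FORM1", "B", "ANS", "M"]]
print(names.area)                      # X3FORM1BANSM
print(descriptors)                     # [(0, 2), (2, 5), (7, 1), (8, 3), (11, 1)]
print(names.name_of(descriptors[1]))   # FORM1
```

The table entry itself stays a fixed size no matter how long the name is, which is the point of the descriptor approach.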
Variable name  Type  Dimension
ANS            1     0
B              1     0
COMPANY#       2     1
FIRST          1     0
FORM1          3     2
M              6     0
X3             1     0
Row  Variable name  Type  Dimension  Left pointer  Right pointer
2    B              1     0          3             4
3    ANS            1     0          0             0
4    COMPANY#       2     1          0             0
5    M              6     0          6             7
6    FORM1          3     2          0             0
7    X3             1     0          0             0
A hashing function H takes as its argument a variable name and
produces a table address (i.e., location) at which the set of attributes for that variable is stored. The
function generates this address by performing some simple arithmetic or logical operations on the
name or some part of the name.
The most widely accepted hashing function is the division method, which is defined as H(x) = (x
mod m) + 1 for divisor m.
A second hashing function that performs reasonably well is the midsquare hashing-method. In this
method, a key is multiplied by itself and an address is obtained by truncating bits or digits at both
ends of the product until the number of bits or digits left is equal to the desired address length.
For the folding method, a key is partitioned into a number of parts, each of which has the same
length as the required address with the possible exception of the last part. The parts are then added
together, ignoring the final carry, to form an address. If the keys are in binary form, the exclusive-or
operation may be substituted for addition.
Another hashing technique, referred to as a length-dependent method, is commonly used in
symbol-table handling. In this approach, the length of the variable name is used in conjunction with
some subpart of the name to produce either a table address directly or, more commonly, an
intermediate key that is then used to produce the table address.
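The four hashing methods can be sketched as below, working on decimal digits for readability. Several details are assumptions made only for this illustration: how the midsquare method splits an odd number of surplus digits between the two ends, the way `length_dependent` combines the first character, the last character, and the length into an intermediate key, and the divisor 11.

```python
def division(x, m):
    """Division method: H(x) = (x mod m) + 1, giving addresses 1..m."""
    return x % m + 1


def midsquare(x, digits):
    """Square the key and keep the middle `digits` decimal digits."""
    s = str(x * x).zfill(digits)
    start = (len(s) - digits) // 2       # drop digits from both ends
    return int(s[start:start + digits])


def folding(x, digits):
    """Split the key into parts of `digits` digits (the last may be shorter),
    add the parts, and ignore the final carry."""
    s = str(x)
    parts = [int(s[i:i + digits]) for i in range(0, len(s), digits)]
    return sum(parts) % (10 ** digits)


def length_dependent(name, m):
    """Illustrative length-dependent method: build an intermediate key from
    the first and last characters plus the length, then apply division."""
    key = ord(name[0]) + ord(name[-1]) + (len(name) << 4)
    return division(key, m)


print(division(196, 11))            # 10
print(midsquare(1234, 3))           # 1234^2 = 1522756, middle digits 227
print(folding(187249653, 3))        # 187 + 249 + 653 = 1089, carry dropped: 89
print(length_dependent("ANS", 11))  # key 65 + 83 + 48 = 196, address 10
```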
Figure 8-24 Hashing using a hash table.
Two records cannot occupy the same location, and therefore some method must be used to resolve the
collisions that can result. There are basically two collision-resolution techniques: open addressing and
separate chaining.
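Both collision-resolution techniques can be sketched on a small table. The details here are assumptions made for illustration: names are turned into integer keys by summing character codes, slots are 0-based (so the address is key mod m, without the +1), and open addressing uses linear probing, which is only one of several possible probe sequences.

```python
def key_of(name):
    """Hypothetical preconditioning: sum of the character codes of the name."""
    return sum(ord(ch) for ch in name)


class ChainedTable:
    """Separate chaining: each slot holds a list of colliding records."""

    def __init__(self, m):
        self.m = m
        self.buckets = [[] for _ in range(m)]

    def insert(self, name, attrs):
        bucket = self.buckets[key_of(name) % self.m]
        for i, (n, _) in enumerate(bucket):
            if n == name:                    # already present: update in place
                bucket[i] = (name, attrs)
                return
        bucket.append((name, attrs))

    def lookup(self, name):
        for n, attrs in self.buckets[key_of(name) % self.m]:
            if n == name:
                return attrs
        return None


class OpenAddressTable:
    """Open addressing with linear probing: scan forward to the next free slot."""

    def __init__(self, m):
        self.m = m
        self.slots = [None] * m

    def insert(self, name, attrs):
        start = key_of(name) % self.m
        for step in range(self.m):
            i = (start + step) % self.m
            if self.slots[i] is None or self.slots[i][0] == name:
                self.slots[i] = (name, attrs)
                return i
        raise OverflowError("symbol table is full")

    def lookup(self, name):
        start = key_of(name) % self.m
        for step in range(self.m):
            i = (start + step) % self.m
            if self.slots[i] is None:
                return None                  # reached a free slot: name is absent
            if self.slots[i][0] == name:
                return self.slots[i][1]
        return None


# "AB" and "BA" have the same key, so they always collide.
chained = ChainedTable(7)
chained.insert("AB", {"type": 1})
chained.insert("BA", {"type": 2})

probed = OpenAddressTable(7)
slot_ab = probed.insert("AB", {"type": 1})
slot_ba = probed.insert("BA", {"type": 2})
print(slot_ab, slot_ba)
```

With chaining the table never fills up but lookups walk a list; with open addressing all records live in the table itself, so the table size bounds the number of names.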
TEST1, F, X, IND
The compiler creates and manages a run-time environment in which it assumes its target
programs are being executed.
This environment deals with a variety of issues such as the layout and allocation of storage
locations for the objects named in the source program, the mechanisms used by the target
program to access variables, the linkages between procedures, the mechanisms for passing
parameters, and the interfaces to the operating system, input/output devices, and other
programs.
Storage Organization
The executing target program runs in its own logical address space in which each program value has a
location. The management and organization of this logical address space is shared between the compiler,
operating system, and target machine. The operating system maps the logical addresses into physical
addresses, which are usually spread throughout memory.
An activation record may contain the following kinds of data:
1. Temporary values, such as those arising from the evaluation of expressions, in cases where those
temporaries cannot be held in registers.
2. Local data belonging to the procedure whose activation record this is.
3. A saved machine status, with information about the state of the machine just before the call to the
procedure. This information typically includes the return address (value of the program counter, to which the
called procedure must return) and the contents of registers that were used by the calling procedure and that
must be restored when the return occurs.
4. An "access link" may be needed to locate data needed by the called procedure but found elsewhere, e.g.,
in another activation record.
5. A control link, pointing to the activation record of the caller.
6. Space for the return value of the called function, if any. Again, not all called procedures return a value,
and if one does, we may prefer to place that value in a register for efficiency.
7. The actual parameters used by the calling procedure. Commonly, these values are not placed in the
activation record but rather in registers, when possible, for greater efficiency.
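The fields listed above can be pictured as one record per procedure call, pushed on a control stack at the call and popped at the return. The sketch below is only a model: the class names, the field names, and the use of a Python list as the stack are assumptions, and a real compiler lays these fields out as offsets within a stack frame.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class ActivationRecord:
    procedure: str
    actual_params: List[Any] = field(default_factory=list)
    return_value: Any = None
    control_link: Optional["ActivationRecord"] = None   # caller's record
    access_link: Optional["ActivationRecord"] = None    # record holding nonlocal data
    saved_status: Dict[str, Any] = field(default_factory=dict)
    local_data: Dict[str, Any] = field(default_factory=dict)
    temporaries: List[Any] = field(default_factory=list)


class ControlStack:
    def __init__(self):
        self.records: List[ActivationRecord] = []

    def call(self, procedure, params, return_address, access_link=None):
        """Push a record for the callee; the control link points to the caller."""
        caller = self.records[-1] if self.records else None
        record = ActivationRecord(
            procedure,
            list(params),
            control_link=caller,
            access_link=access_link,
            saved_status={"return_address": return_address},
        )
        self.records.append(record)
        return record

    def ret(self, value=None):
        """Pop the top record and hand back its return value slot."""
        record = self.records.pop()
        record.return_value = value
        return record


stack = ControlStack()
stack.call("main", [], return_address=0)
callee = stack.call("fact", [3], return_address=42)
print(callee.control_link.procedure)   # main
finished = stack.ret(6)
print(finished.return_value, stack.records[-1].procedure)
```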
Example:
Displays:
Faster access to nonlocals than with access links can be obtained using an array d of pointers to
activation records, called a display. We maintain the display so that storage for a nonlocal a at nesting depth
i is in the activation record pointed to by display element d[i].
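A display can be sketched as an array indexed by nesting depth. The maintenance rule used below (on a call at depth i, save the old d[i] in the new record and make d[i] point to it; on return, restore the saved value) is the usual textbook scheme and is an assumption here, as are the names `Frame` and `Runtime` and the nesting depths chosen for the example: main at depth 1, procedures A1 and A2 at depth 2, and A21, nested inside A2, at depth 3.

```python
class Frame:
    """Activation record reduced to what the display needs."""

    def __init__(self, name, depth):
        self.name = name
        self.depth = depth
        self.saved_display = None    # previous value of d[depth]
        self.locals = {}


class Runtime:
    def __init__(self, max_depth):
        self.d = [None] * (max_depth + 1)   # d[1..max_depth]; d[0] is unused
        self.stack = []

    def call(self, name, depth):
        frame = Frame(name, depth)
        frame.saved_display = self.d[depth]  # remember the entry being overwritten
        self.d[depth] = frame
        self.stack.append(frame)
        return frame

    def ret(self):
        frame = self.stack.pop()
        self.d[frame.depth] = frame.saved_display   # restore the old entry

    def frame_at(self, depth):
        """Record holding a nonlocal declared at the given nesting depth."""
        return self.d[depth]


rt = Runtime(max_depth=3)
for name, depth in [("main", 1), ("A1", 2), ("A2", 2), ("A21", 3), ("A1", 2)]:
    rt.call(name, depth)

before = rt.frame_at(2).name   # innermost depth-2 activation: the second A1
rt.ret()                       # second A1 returns
after = rt.frame_at(2).name    # display entry restored to A2
print(before, after)           # A1 A2
```

Reaching a nonlocal at depth i costs one array lookup, whereas access links must be followed one hop per level of nesting difference.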
a. 1 3 2 b. 2 2 3 c. 2 3 1 d. Syntax error
3. A->BC {B.s=A.s} [ ]
a. S-attributed b. L-attributed c. Both d. None
S → ER
R → *E {print(“*”);}R | ε
E→F+E {print(“+”);} | F
F → (S) | id {print(id.value);}
Here id is a token that represents an integer and id.value represents the corresponding integer value.
For an input ‘2 * 3 + 4’, this translation scheme prints [ ]
a. 2 * 3 + 4 b. 2 * +3 4 c. 2 3 * 4 + d. 2 3 4+*
8. Consider the following Syntax Directed Translation Scheme (SDTS), with non-terminals
{E, T, F} and terminals {2, 4} [ ]
E->E*T {E.VAL=E.VAL*T.VAL;}
E->T {E.VAL=T.VAL;}
T->F-T {T.VAL=F.VAL-T.VAL;}
T->F {T.VAL=F.VAL;}
F->2 {F.VAL=2;}
F->4 {F.VAL=4;}
Using the above SDTS, the total number of reductions done by a bottom-up parser
for the input 4-2-4*2 is
a. 10 b. 9 c. 11 d. 13
II) Problems
6. Construct the annotated parse tree according to the SDD for the input string 1011.
N->L {N.C=L.C}
L->L1 B {L.C=L1.C+B.C}
L->B {L.C=B.C}
B->0 {B.C=0}
B->1 {B.C=1}
7. Construct the annotated parse tree according to the SDD for the input string 2+3*4.
E->E1+T {E.nptr=mknode(E1.nptr,+,T.nptr);}
E->T {E.nptr=T.nptr;}
T->T1*F {T.nptr=mknode(T1.nptr,*,F.nptr);}
T->F {T.nptr=F.nptr;}
F->id {F.nptr=mknode(null,id.name,null);}
D. GATE/NET/SLET
1. Consider the grammar with the following translation rules and E as the start symbol.
E → E1 # T {E.value = E1.value * T.value}
| T {E.value = T.value}
T → T1 & F {T.value = T1.value + F.value}
| F {T.value = F.value}
F → num {F.value = num.value}
Compute E.value for the root of the parse tree for the expression: 2 # 3 & 5 # 6 & 4. [ ]
(GATE CS 2004)
a. 200
b. 180
c. 160
d. 40
2. Consider the program given below, in a block-structured pseudo-language with lexical scoping and
nesting of procedures permitted. [ ] (GATE 2012)
Program main;
  Var ...
  Procedure A1;
    Var ...
    Call A2;
  End A1
  Procedure A2;
    Var ...
    Procedure A21;
      Var ...
      Call A1;
    End A21
    Call A21;
  End A2
  Call A1;
End main.
Consider the calling chain: Main → A1 → A2 → A21 → A1
The correct set of activation records along with their access links is given by