CD - CH4 - Syntax Directed Translation

The document discusses syntax-directed analysis in compiler design, focusing on syntax-directed definitions, semantic rules, and evaluation orders for attributes in parse trees. It explains the concepts of synthesized and inherited attributes, types of attributed grammars (S-attributed and L-attributed), and the importance of type checking in static analysis. The document also outlines methods for evaluating semantic rules and constructing dependency graphs to determine evaluation orders.

Uploaded by

andualem.second
Copyright
© © All Rights Reserved

CSE 4310 – Compiler Design

CH4 – Syntax Directed Analysis


Outline
• Introduction
• Syntax Directed Translation
• Syntax Directed Definition
– Synthesized vs. Inherited Attributes
– Semantic Rules
• Evaluation Order
– Dependency graph
• Some Classes of Non-circular Attributed Grammars
– S-Attributed Grammars
– L-Attributed Grammars
• Type Checking
Introduction
• We associate information with a programming
construct by attaching attributes to the
grammar symbols representing the construct.
• Values for attributes are computed by
“Semantic Rules” associated with the grammar
productions.
• There are two notations for associating
semantic rules with productions:
1. Syntax directed definitions, and
2. Translation schemes.
Introduction
• Syntax-directed definitions are high-level specifications for
translations.
– They hide many implementation details and free the user from having
to specify explicitly the order in which translation takes place.
• Translation schemes indicate the order in which semantic rules are
to be evaluated.
– So they allow some implementation details to be shown.
• Both notations are used for specifying semantic checking, particularly
the determination of types(type checking), and for generating
intermediate code.
• Conceptually, with both syntax-directed definitions and translation
schemes:
– we parse the input token stream,
– build the parse tree, and
– then traverse the tree as needed to evaluate the semantic rules at the
parse-tree nodes.
Syntax-Directed Translation
• Parsing an input without doing anything with the result is useless.
• Various actions can be performed while parsing.
– These actions are carried out by semantic actions associated with the
different rules of the grammar.
• Evaluation of the semantic rules
– may generate code,
– save information in the symbol table,
– issue error messages, or
– perform any other activities.
• The translation of the token stream is the result obtained by
evaluating the semantic rules.
Syntax-Directed Definitions
• A syntax-directed definition is a generalization of
the CFG in which each grammar symbol has an
associated set of attributes (synthesized and
inherited).
• An attribute can represent anything we choose
– a string, a number, a type, a memory location, etc.
• The value of an attribute at a parse-tree node is
defined by a semantic rule associated with the
production used at that node.
Synthesized vs. Inherited Attributes
• The value of a synthesized attribute is computed from the
values of attributes at the children of that node in the parse
tree.
• The value of an inherited attribute is computed from the
values of attributes at the siblings and parent of that node
in the parse tree.
Semantic Rules
• Semantic rules calculate the values of attributes.
– Hence they set up dependencies between attributes that will be
represented by a dependency graph.
– The dependency graph makes it possible to find an evaluation order for the
semantic rules.
• A parse tree showing the values of the attributes is called an
annotated or decorated parse tree.
Semantic Rules
Example 1: Consider the following semantic rules for binary-to-decimal conversion:
– How many attributes are there?
– Which are synthesized?
– Which are inherited?

Syntax Rules        Semantic Rules
N ⟶ L1 ∙ L2         N.v = L1.v + L2.v / 2^(L2.l)
L1 ⟶ L2 B           L1.v = 2 ∗ L2.v + B.v
                    L1.l = L2.l + 1
L ⟶ B               L.v = B.v
                    L.l = 1
B ⟶ 0               B.v = 0
B ⟶ 1               B.v = 1

• In this example, everything is calculated from the leaves to the root, so all
attributes (i.e. v and l) are synthesized.
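The bottom-up evaluation in Example 1 can be sketched in Python as follows. This is only an illustration: the symbol names N, L, B and the attribute names v (value) and l (length) are assumed labels, and the input string 1011.01 is borrowed from the exercise that follows Example 2.

```python
# Minimal sketch: every attribute is synthesized, so each function
# returns the attributes of its node computed from its children.
# The names N, L, B, v and l are assumed labels for this illustration.

def bit(ch):
    """B -> 0 | 1 : B.v is the value of the digit."""
    return {"v": int(ch)}

def bit_list(bits):
    """L -> B | L B : returns the synthesized attributes v and l."""
    if len(bits) == 1:                      # L -> B
        return {"v": bit(bits)["v"], "l": 1}
    left = bit_list(bits[:-1])              # L1 -> L2 B
    b = bit(bits[-1])
    return {"v": 2 * left["v"] + b["v"], "l": left["l"] + 1}

def number(text):
    """N -> L1 . L2 : N.v = L1.v + L2.v / 2^(L2.l)."""
    whole, frac = text.split(".")
    l1, l2 = bit_list(whole), bit_list(frac)
    return l1["v"] + l2["v"] / 2 ** l2["l"]

print(number("1011.01"))  # 11.25
```

Each call only reads the attributes returned by its children, which is what makes a single leaves-to-root pass sufficient.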
Semantic Rules
Example 2: Consider another set of semantic rules for binary-to-decimal conversion:

Syntax Rules        Semantic Rules
N ⟶ L1 ∙ L2         N.v = L1.v + L2.v
                    L1.s = 0
                    L2.s = −L2.l
L1 ⟶ L2 B           L1.l = L2.l + 1
                    L2.s = L1.s + 1
                    B.s = L1.s
                    L1.v = L2.v + B.v
L ⟶ B               L.v = B.v
                    L.l = 1
                    B.s = L.s
B ⟶ 0               B.v = 0
B ⟶ 1               B.v = 1 ∗ 2^(B.s)

• Exercise: Draw the decorated parse tree for the input 1011.01
Formal Definition
• In a syntax-directed definition, each grammar production
A → α has associated with it a set of semantic rules of the
form:
b := f(c1, c2, …, ck) where
– f is a function and
– b, c1, c2, …, ck are attributes of A and the symbols on the right
side of the production.
• We say that:
– b is a synthesized attribute of A if c1, c2, …, ck are attributes
belonging to the grammar symbols of the production, and
– b is an inherited attribute of one of the grammar symbols on the
right side of the production if c1, c2, …, ck are attributes
belonging to the grammar symbols of the production.
– In either case, we say that attribute b depends on attributes
c1, c2, …, ck.
Example 1
• Consider the following syntax-directed definition for a desk calculator
program:
– The attributed grammar that calculates the value of the expression
Note: id.value is the attribute of id that gives its value.

PRODUCTION        SEMANTIC RULES
L → E n           print(E.val)
E → E1 + T        E.val := E1.val + T.val
E → T             E.val := T.val
T → T1 ∗ F        T.val := T1.val ∗ F.val
T → F             T.val := F.val
F → ( E )         F.val := E.val
F → digit         F.val := digit.lexval

– val is a synthesized attribute associated with each nonterminal.
– The token digit has a synthesized attribute lexval whose value is
assumed to be supplied by the lexical analyzer.
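As a rough illustration of how the val attribute is computed, the sketch below evaluates a hand-built tree in post-order. The tuple encoding of tree nodes and the sample input 3 * 5 + 4 are assumptions made for this example, and the unit productions E → T and T → F are folded away because they only copy val upward.

```python
# Minimal sketch: tree nodes are tuples whose first item names the production.
# The node encoding is hypothetical; it is not part of the slides.

def val(node):
    """Compute the synthesized attribute val of a tree node, children first."""
    kind = node[0]
    if kind == "digit":                     # F -> digit : F.val := digit.lexval
        return node[1]
    if kind == "paren":                     # F -> ( E ) : F.val := E.val
        return val(node[1])
    if kind == "add":                       # E -> E1 + T : E.val := E1.val + T.val
        return val(node[1]) + val(node[2])
    if kind == "mul":                       # T -> T1 * F : T.val := T1.val * F.val
        return val(node[1]) * val(node[2])
    raise ValueError(f"unknown node kind: {kind}")

# Tree for 3 * 5 + 4
tree = ("add", ("mul", ("digit", 3), ("digit", 5)), ("digit", 4))
print(val(tree))  # 19
```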
Synthesized Attributes
• A syntax-directed definition that uses synthesized attributes
exclusively is said to be an S-attributed definition.
– A parse-tree for an S-attributed definition can always be
annotated by evaluating the semantic rules for the attributes at
each node bottom up, from the leaves to the root.
• Example 3: Consider an annotated parse tree for the input
3 ∗ 5 + 4n.
Inherited Attributes
• An inherited attribute is one whose value at a
node in a parse-tree is defined in terms of
attributes at the parent and/or siblings of that
node.
– Inherited attributes are convenient for expressing the
dependence of a programming language construct on
the context in which it appears.

• Let us consider an example that uses an inherited
attribute to distribute type information to the
various identifiers in a declaration.
Inherited Attributes
• Example 4: Syntax-directed definition with inherited attribute L.in:

PRODUCTION        SEMANTIC RULES
D → T L           L.in := T.type
T → int           T.type := integer
T → real          T.type := real
L → L1 , id       L1.in := L.in
                  addtype(id.entry, L.in)
L → id            addtype(id.entry, L.in)

– Rules associated with the productions for L call procedure addtype to
add the type of each identifier to its entry in the symbol table (pointed
to by attribute entry).
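The way an inherited attribute carries the type down the list of identifiers can be sketched as below. This is a simplified illustration: the symbol table is assumed to be a plain dictionary, and the function names other than addtype are hypothetical.

```python
# Minimal sketch: the inherited attribute L.in is passed down as a parameter.
symbol_table = {}                           # assumed: identifier -> type

def addtype(entry, type_):
    """Record the type of an identifier in the symbol table."""
    symbol_table[entry] = type_

def decl_list(ids, inherited_type):
    """L -> L1 , id | id ; inherited_type plays the role of L.in."""
    if len(ids) > 1:
        decl_list(ids[:-1], inherited_type)     # L1.in := L.in
    addtype(ids[-1], inherited_type)            # addtype(id.entry, L.in)

def declaration(type_keyword, ids):
    """D -> T L : L.in := T.type."""
    t_type = {"int": "integer", "real": "real"}[type_keyword]
    decl_list(ids, t_type)

declaration("real", ["id1", "id2", "id3"])
print(symbol_table)  # {'id1': 'real', 'id2': 'real', 'id3': 'real'}
```

Passing the type as a function argument mirrors the top-down flow of an inherited attribute, in contrast to the return values used for synthesized attributes.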
Inherited Attributes
• Example 4: cont.…
– The following is the annotated parse tree for the sentence
real id1, id2, id3:

– Parse tree with inherited attribute in at each node labeled L

Evaluation Order
• The attributes should be evaluated in a given
order because they depend on one another.
• The dependency of the attributes is
represented by a dependency graph.
• The graph has
– a node for each attribute and
– an edge to the node for b from the node for c if
attribute b depends on attribute c.
• c(Y) −−f()−−> b(X) if and only if there exists a
semantic action such as b(X) := f(…, c(Y), …)
Evaluation Order
• The algorithm for the construction of the dependency
graph is as follows:

for each node n in the parse tree do
    for each attribute a of the grammar symbol at n do
        construct a node in the dependency graph for a;
for each node n in the parse tree do
    for each semantic rule b := f(c1, c2, …, ck)
            associated with the production used at n do
        for i := 1 to k do
            construct an edge from the node for ci
            to the node for b;
Evaluation Order
• For example, suppose A.a := f(X.x, Y.y) is a
semantic rule for the production A ⟶ XY.
– This rule defines a synthesized attribute A.a that depends
on the attributes X.x and Y.y.
– Then there will be three nodes A.a, X.x, and Y.y in the
dependency graph with an edge to A.a from X.x since A.a
depends on X.x, and an edge to A.a from Y.y since A.a
also depends on Y.y.
• If the production A ⟶ XY has the semantic rule X.i :=
g(A.a, Y.y) associated with it, then there will be an
edge to X.i from A.a and also an edge to X.i from Y.y,
since X.i depends on both A.a and Y.y.
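The edge-building step can be sketched in Python as shown below. This is a simplified version under stated assumptions: each semantic-rule instance is given as a pair (b, [c1, …, ck]), the attribute names A.a, X.x, Y.y and X.i are illustrative labels, and nodes are created lazily from the rules rather than in a separate first pass over the parse tree.

```python
def build_dependency_graph(rules):
    """Build a dependency graph from rule instances b := f(c1, ..., ck).

    rules: list of (b, [c1, ..., ck]) pairs.
    Returns a dict mapping each attribute to the set of attributes
    that depend on it (its outgoing edges).
    """
    graph = {}
    for b, cs in rules:
        graph.setdefault(b, set())              # node for b
        for c in cs:
            graph.setdefault(c, set()).add(b)   # edge from c to b
    return graph

# Two rule instances: A.a := f(X.x, Y.y) and X.i := g(A.a, Y.y)
rules = [("A.a", ["X.x", "Y.y"]), ("X.i", ["A.a", "Y.y"])]
graph = build_dependency_graph(rules)
print(sorted(graph["Y.y"]))  # ['A.a', 'X.i']
```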
Evaluation Order
Example 5: Dependency graph for the parse tree
in example 4.
Evaluation Order
• A topological sort of a directed acyclic graph is any
ordering m1, m2, ..., mk of the nodes of the
graph such that
– Edges go from nodes earlier in the ordering to later
nodes;
– That is, if mi ⟶ mj is an edge from mi to mj, then mi
appears before mj in the ordering.
• Any topological sort of a dependency graph gives a
valid order in which the semantic rules associated
with the nodes in a parse tree can be evaluated.
Evaluation Order
Example 6: The topological sort for the dependency graph
in example 5 is in order of node numbers, from which we
obtain the following program.
– We write a_n for the attribute associated with the node
numbered n in the dependency graph.
a4 := real;
a5 := a4;
addtype(id3.entry, a5);
a7 := a5;
addtype(id2.entry, a7);
a9 := a7;
addtype(id1.entry, a9);
• Evaluating these semantic rules stores the type real in
the symbol-table entry for each identifier.
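A possible way to obtain such an order mechanically is Python's standard `graphlib` module (available from Python 3.9). The sketch below is only an illustration: the node names a4, a5, a7, a9 and the pairing of identifiers with nodes are assumptions taken from the numbering used in this example.

```python
from graphlib import TopologicalSorter

# Each node maps to the attributes it depends on (its predecessors).
# The node names are assumed labels for this illustration.
depends_on = {"a4": [], "a5": ["a4"], "a7": ["a5"], "a9": ["a7"]}

# static_order() yields every node after all of its predecessors;
# it raises graphlib.CycleError if the graph is circular.
order = list(TopologicalSorter(depends_on).static_order())

values = {}
for node in order:
    preds = depends_on[node]
    # A node without predecessors holds T.type; the others are copy rules.
    values[node] = values[preds[0]] if preds else "real"

# Effect of the three addtype calls, assuming a dictionary symbol table.
symbol_table = {"id3": values["a5"], "id2": values["a7"], "id1": values["a9"]}
print(order)  # ['a4', 'a5', 'a7', 'a9']
```

Because this particular graph is a single chain, only one topological order exists; in general several valid orders are possible.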
Evaluation Order
• Several methods have been proposed for evaluating
semantic rules:
1. Parse-tree based methods: for each input, the compiler
finds an evaluation order from the dependency graph of that input's parse tree.
• These methods fail only if the dependency graph for that particular
parse tree has a cycle.
2. Rule based methods: the order in which the attributes
associated with a production are evaluated is
predetermined at compiler-construction time.
• For this method, the dependency graph need not be constructed.
3. Oblivious methods: The evaluation order is chosen without
considering the semantic rules.
• This restricts the class of syntax directed definition that can be used.
Some Classes of Non-circular Attributed
Grammars
S-Attributed Grammars
• An attributed grammar is S-Attributed when all of
its attributes are synthesized, i.e. it doesn't have
inherited attributes.
• Synthesized attributes can be evaluated by a
bottom-up parser as the input is being parsed.
• A new stack will be maintained to store the values
of the attributes as in the example below.
• Example:
E ⟶ E1 + E2 { E.v := E1.v + E2.v }
val(newtop) := val(oldtop) + val(oldtop – 2)
$$ = $1 + $3 (in Yacc)
• We assume that the synthesized attributes are evaluated
just before each reduction.
– Before the reduction, the attribute of E2 is in val(top) and the
attribute of E1 is in val(top – 2); the entry val(top – 1)
belongs to the + token.
– After the reduction, � is put at the top of the State stack and
its attribute values are put at the top of Value stack.
– The semantic actions that reference the attributes of the
grammar will in fact be translated by the Compiler generator
(such as Yacc) into codes that reference the value stack.
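The effect of such a reduction on the value stack can be sketched as follows. This is a simplified model, not generated parser code: the stack is assumed to be a Python list whose last item is the top, and the slot for the + token holds a placeholder.

```python
def reduce_add(val):
    """Reduce by E -> E1 + E2 on a value stack (a list; the top is the last item)."""
    top = len(val) - 1
    result = val[top - 2] + val[top]    # corresponds to $$ = $1 + $3
    del val[top - 2:]                   # pop the entries for E1, '+', E2
    val.append(result)                  # push the attribute of E
    return val

stack = [3, "+", 4]                     # E1.v, slot of the '+' token, E2.v
print(reduce_add(stack))  # [7]
```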
Some Classes of Non-circular Attributed
Grammars
L-Attributed grammars
• It is difficult to execute the tasks of the compiler just by
synthesized attributes.
• The L-attributed class of grammars allow a limited kind
of inherited attributes.
• Definition: A grammar is L-attributed if and only if for
each rule X0 ⟶ X1 X2 … Xi … XN, all inherited attributes
of Xi depend only on:
– Attributes of X1 X2 … Xi−1
– Inherited attributes of X0
• Of course all S-attributed grammars are L-attributed.
Some Classes of Non-circular Attributed
Grammars
L-Attributed grammars
• Example:
A → L M { L.h = f1(A.h)
          M.h = f2(L.s)
          A.s = f3(M.s) }

– Does this production contradict the rules?
– No. The corresponding grammar may be L-attributed
if all of the other productions follow the
rule of L-attributed grammars.
Some Classes of Non-circular Attributed
Grammars
L-Attributed grammars
• Example:
A → Q R { R.h := f4(A.h)
          Q.h := f5(R.s)
          A.s := f6(Q.s) }

– Does this production contradict the rules?
– Yes, since Q.h depends on R.s.
– The grammar containing this production is not
L-attributed.
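The L-attributed condition for a single production can be checked mechanically. The sketch below is one possible encoding and rests on several assumptions: attributes are written as (symbol, name, kind) triples with kind "inh" or "syn", the right-side symbols are distinct, and the symbol names A, L, M, Q, R are illustrative labels.

```python
def is_l_attributed(lhs, rhs, rules):
    """Check one production lhs -> rhs against the L-attributed condition.

    rules: list of (target, sources); each attribute is a
    (symbol, name, kind) triple with kind "inh" or "syn".
    Assumes the symbols in rhs are distinct.
    """
    for (t_sym, _t_name, t_kind), sources in rules:
        if t_kind != "inh" or t_sym not in rhs:
            continue                        # only inherited attributes of Xi are restricted
        i = rhs.index(t_sym)
        for s_sym, _s_name, s_kind in sources:
            if s_sym == lhs:
                if s_kind != "inh":         # only inherited attributes of X0 are allowed
                    return False
            elif s_sym not in rhs[:i]:      # must be a symbol to the left of Xi
                return False
    return True

ok = is_l_attributed("A", ["L", "M"], [
    (("L", "h", "inh"), [("A", "h", "inh")]),
    (("M", "h", "inh"), [("L", "s", "syn")]),
    (("A", "s", "syn"), [("M", "s", "syn")]),
])
bad = is_l_attributed("A", ["Q", "R"], [
    (("R", "h", "inh"), [("A", "h", "inh")]),
    (("Q", "h", "inh"), [("R", "s", "syn")]),   # depends on a symbol to its right
    (("A", "s", "syn"), [("Q", "s", "syn")]),
])
print(ok, bad)  # True False
```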
Type Checking

• Introduction
• Type Systems
• Type Conversions
Introduction – Static Checking
● The compiler must check if the source program
follows semantic conventions of the source
language.
● This is called static checking (to distinguish it
from dynamic checking executed during
execution of the program).
● Static checking ensures that certain kinds of
errors are detected and reported.
Introduction – Static Checking
● The following are examples of static checks:
– Type Checking: Incompatible operands,
– Flow control check:-
● A break instruction in C that is not in an enclosing statement,
● A return in Pascal that is not in a function's body.
– Uniqueness checks:- Redefined variable
– Name related checks:-
● In Ada loops can have names.
● However, the same name should be used to start and end the
loop.
Type Checking
● A type checker verifies that the type of a construct matches that expected by its
context.
– For example, a type checker should verify that the type of the value assigned to a variable is
compatible with the type of the variable.
– For instance, mod (%) expects two integer operands.
– If any errors are found, they will be reported by the type checker.
● Type information produced by the type checker may be needed when the code is
generated.
● In almost all languages, types are either basic or constructed. In C++, for example:
– int, float, double, char, ... are basic types
– array, struct, ... are constructed types
Type Systems
● The type of a language construct will be denoted by a type
expression.
● A type expression is either a basic type or formed by applying an
operator called type constructor to the type expression.
● The type expression may be obtained by using the following
definition.
– A basic type is a type expression, e.g., Integer, Boolean, char, ...
– A type name is a type expression
– A type constructor applied to a type expression is a type expression.
● The type checker uses two more basic types:
– Void → indicates absence of type error.
– Type_error → indicates the presence of a type error.
Type Systems
● The following are type constructors:
– Arrays: if I is an index set and T is a type expression,
then array(I, T) is a type expression.
● For example, array([1 .. 10], integer) is a type expression.
– Products: if T1 and T2 are type expressions, the
Cartesian product T1 × T2 is a type expression.
● × is left associative.
● For example, in function parameters �(����, 5) = ������� ×
���
– Records: record( (field1 × T1) × (field2 × T2) × … )
● For example, ������{ ��� ���; ����� ����; }
– ������( ( ��� × ������� ) × ( ���� × ����� ) )
Type Systems
● The following are type constructors:
– Pointers: if T is a Type Expression then Pointer(T) is a
Type expression
● For example, Pointer( Integer ).
– Functions: the type expression of a function has the
form D ⟶ R, where D is the type expression of the parameters
and R is the type expression of the returned value.
● For example, int ∗f(char a, char b) has the type expression
– char × char → pointer(integer)
Type Systems
DAG Representation:
● A convenient way to represent a type
expression is to use a graph (tree or DAG).
● For example, the type expression
corresponding to the above function
declaration is shown below:
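A graph of this kind can be sketched with small immutable node classes. This is only one possible encoding: the class names are hypothetical, and the example assumes a function taking two char parameters and returning a pointer to integer.

```python
from dataclasses import dataclass

# Hypothetical node classes for type expressions (not from the slides).
@dataclass(frozen=True)
class Basic:
    name: str

@dataclass(frozen=True)
class Product:
    left: object
    right: object

@dataclass(frozen=True)
class Pointer:
    to: object

@dataclass(frozen=True)
class Function:
    domain: object
    result: object

char, integer = Basic("char"), Basic("integer")

# char x char -> pointer(integer)
f_type = Function(Product(char, char), Pointer(integer))

# Both parameters refer to the same char node, so the structure is a DAG
# rather than a tree.
print(f_type.domain.left is f_type.domain.right)  # True
```

Frozen dataclasses compare field by field, so two separately built expressions with the same shape compare equal.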
Type Systems
● A type system is a collection of rules implemented by the type
checker for assigning type expressions to the various parts of a
program.
● Different type systems may be used by different compilers for
the same language.
– For example, some compilers implement stricter rules than others.
– Lint, for instance, has a much more detailed type system than the C
compiler itself.
● Errors: At the very least, the compiler must report the nature
and location of errors.
● Error Recovery:
– It is also desirable that the type checker recovers from errors and
continues parsing the rest of the input.
Specification of a Simple Type Checker

Declarations
● The purpose of the semantic actions is to determine the
type expression of a variable and add the type
expression in the symbol table.
P → D ; E
D → D ; D
D → id : T                        { addtype(id.lexeme, T.type) }
T → char                          { T.type := char }
T → integer                       { T.type := integer }
T → ^T1                           { T.type := pointer(T1.type) }
T → array [num1 .. num2] of T1    { T.type := array([num1.val .. num2.val], T1.type) }
Specification of a Simple Type Checker

Example: Consider the following declarations


A : array[1..10] of ^ array[1..5] of char;
B : char;
Draw the decorated parse tree for the given declaration.
Specification of a Simple Type Checker
Expressions

E → literal      { E.type := char }
E → num          { E.type := integer }
E → id           { E.type := lookup(id.lexeme) }
E → E1 mod E2    { E.type := if E1.type = integer and E2.type = integer
                             then integer
                             else Type_error }
E → E1[E2]       { E.type := if E2.type = integer and E1.type = array(s, t)
                             then t
                             else Type_error }
E → ^E1          { E.type := if E1.type = pointer(t)
                             then t
                             else Type_error }
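These rules translate fairly directly into a recursive function. The sketch below is an illustration under assumptions: expressions are encoded as tuples, constructed types as tuples such as ("array", s, t) and ("pointer", t), the symbol table is a dictionary, and an undeclared identifier is treated as a type error.

```python
TYPE_ERROR = "type_error"

def check(expr, symtab):
    """Return the type of expr, a tuple whose first item names the production."""
    kind = expr[0]
    if kind == "literal":                       # E -> literal
        return "char"
    if kind == "num":                           # E -> num
        return "integer"
    if kind == "id":                            # E -> id (assumed dict lookup)
        return symtab.get(expr[1], TYPE_ERROR)
    if kind == "mod":                           # E -> E1 mod E2
        t1, t2 = check(expr[1], symtab), check(expr[2], symtab)
        return "integer" if t1 == "integer" and t2 == "integer" else TYPE_ERROR
    if kind == "index":                         # E -> E1[E2]
        t1, t2 = check(expr[1], symtab), check(expr[2], symtab)
        if t2 == "integer" and isinstance(t1, tuple) and t1[0] == "array":
            return t1[2]                        # array(s, t) yields t
        return TYPE_ERROR
    if kind == "deref":                         # E -> ^E1
        t1 = check(expr[1], symtab)
        if isinstance(t1, tuple) and t1[0] == "pointer":
            return t1[1]                        # pointer(t) yields t
        return TYPE_ERROR
    return TYPE_ERROR

symtab = {"A": ("array", (1, 10), ("pointer", "char")), "i": "integer"}
expr = ("deref", ("index", ("id", "A"), ("id", "i")))   # ^A[i]
print(check(expr, symtab))  # char
```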
Specification of a Simple Type Checker

Statements
S → id := E          { S.type := if id.type = E.type
                                 then void else Type_error }
S → if E then S1     { S.type := if E.type = Boolean
                                 then S1.type else Type_error }
S → while E do S1    { S.type := if E.type = Boolean
                                 then S1.type else Type_error }
S → S1 ; S2          { S.type := if S1.type = void and S2.type = void
                                 then void else Type_error }
Specification of a Simple Type Checker

● The following example gives a type checking
system for function calls:
E → E1(E2)    { E.type := if E2.type = s and E1.type = s → t
                          then t
                          else Type_error }
Equivalence of types
● So far we have compared type expressions using the equality
operator.
● However, such an operator is not defined except perhaps
between basic types.
● Instead, we should use a test for type equivalence,
which is more appropriate.
● A natural notion of equivalence is structural equivalence:
– two type expressions are structurally equivalent if and only if
they are the same basic type or are formed by applying the
same constructor to structurally equivalent types.
– For example,
● integer is equivalent only to integer
● pointer (integer) is structurally equivalent to pointer (integer).
Equivalence of types
● Some relaxing rules are very often added to this notion of
equivalence.
– For example, when arrays are passed as parameter, the array
boundaries of the effective parameters may not be the same
as those of the formal parameters.
● In some languages, types may be given names.
– For example, in Pascal we can define:
type
Ptr = ^Integer;
var A : Ptr; B : Ptr; C : ^Integer; D, E : ^Integer;
– Do the variables A, B, C, D, E all have the same type?
● Surprisingly, the answer varies from implementation to
implementation.
Equivalence of types
● When names are allowed in type expressions, a new notion of
equivalence is introduced: Name equivalence.
● We have name equivalence between two type expressions if and only
if they are identical.
● Under structural equivalence, names are replaced by type expressions
they define,
– so two types are structurally equivalent if they represent two structurally
equivalent type expressions when all names have been substituted.
– For example, ptr and pointer (integer) are not name equivalent but they
are structurally equivalent.
● Note: Confusion arises from the fact that many implementations
associate an implicit type name with each declared identifier.
– Thus, C and D of the above example may not be name equivalent.
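The difference between the two notions can be sketched as follows. This is a simplified illustration under assumptions: basic types and type names are strings, constructed types are tuples whose arguments are all type expressions, and type names are not recursive.

```python
def expand(t, names):
    """Replace type names by the type expressions they define."""
    if isinstance(t, str):
        return expand(names[t], names) if t in names else t
    # Constructed type: keep the constructor, expand each argument.
    return (t[0],) + tuple(expand(arg, names) for arg in t[1:])

def structurally_equivalent(s, t, names):
    """Equivalent once every name has been substituted."""
    return expand(s, names) == expand(t, names)

def name_equivalent(s, t):
    """Equivalent only if the expressions are identical as written."""
    return s == t

names = {"ptr": ("pointer", "integer")}     # ptr = ^integer
print(name_equivalent("ptr", ("pointer", "integer")))                 # False
print(structurally_equivalent("ptr", ("pointer", "integer"), names))  # True
```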
Type Conversions
● Consider expressions like x + i, where x is of type real
and i is of type integer.
● Of course, the machine cannot execute this operation as
it involves different types of values.
● However, most languages accept such expressions;
the compiler is in charge of converting one of
the operands into the type of the other.
● The type checker can be used to insert these conversion
operations into the intermediate representation of the
source program.
– For example, an operator inttoreal may be inserted
whenever an operand needs to be implicitly converted.
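A type checker that inserts such a conversion for addition might look like the sketch below. It is an illustration only: operands are assumed to be (code, type) pairs, the intermediate code is modeled as a string, and inttoreal is used as the name of the inserted conversion operator.

```python
def check_add(left, right):
    """Type-check left + right; operands and result are (code, type) pairs."""
    (lc, lt), (rc, rt) = left, right
    if lt == rt and lt in ("integer", "real"):
        return (f"({lc} + {rc})", lt)           # same type: no conversion needed
    if {lt, rt} == {"integer", "real"}:
        if lt == "integer":
            lc = f"inttoreal({lc})"             # convert the integer operand
        else:
            rc = f"inttoreal({rc})"
        return (f"({lc} + {rc})", "real")
    return (None, "type_error")

print(check_add(("x", "real"), ("i", "integer")))
# ('(x + inttoreal(i))', 'real')
```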
Type Conversions
● Type conversions may be implicit or explicit
– Explicit – done by the programmer
For example, float pi = 3.14;
int i = (int)pi;
– Implicit – done by the compiler
For example, int i = 10;
float f = i;
End of CH4
● Thank You!
