VMKV Engineering College Department of Computer Science & Engineering Principles of Compiler Design Unit I Part-A

The document discusses principles of compiler design. It provides definitions and explanations of key concepts in compiler design such as: 1) A compiler is a program that converts a program written in one language (the source language) into another language (the target language). 2) The main phases of a compiler are lexical analysis, syntax analysis, code generation, and code optimization. 3) A symbol table stores information about variables and other symbols in the program being compiled.

Uploaded by

Dular Chandran


VMKV ENGINEERING COLLEGE

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING

PRINCIPLES OF COMPILER DESIGN

Unit I

PART-A

1. What is a compiler?
A compiler is a program that translates a program written in a
source language (a programming language) into an equivalent program
in a target language, typically machine language (object code).

2. Name the different phases of a compiler?

Lexical analysis phase or scanning phase
Syntax analysis phase
Semantic analysis phase
Intermediate code generation
Code optimization
Code generation.

3. What is a symbol table?

Symbol table is a data structure that contains all variables in the
program and temporary storage and any information needed to
reference or allocate storage for them.

4. What is a token and give examples?

A token is a basic, grammatically indivisible unit of a language, such as a
keyword, operator or identifier. (e.g.) if, <=, num.

5. What is lexeme? Give an example?

A lexeme is a sequence of characters in the source program that is matched
by the pattern of a token. (e.g.) int, i, num, ans, choice.

6. Write short note on error handler?

Each phase can encounter errors. After detecting error in a
phase it allows further errors to be detected. In lexical analysis, it
finds errors where it cannot form a token. In syntax analyzer, it
finds where the token stream violates the structure rules. In
semantic phase, it tries to detect constructs that have the right
syntactic structure but no meaning to the operations involved.

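7. What is 3-address code and give example?

Three-address code is an intermediate code in which each statement
has at most three addresses: two for the operands and one for the
result. (e.g.) t1 := b * c followed by a := t1 + d.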

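8. Define translator
A translator is a program that converts a program written in one
language into an equivalent program in another language, for example
from a high-level language to a low-level language.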

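9. Differentiate star closure & positive closure?

Star (Kleene) closure L* denotes zero or more occurrences of the
strings of L, so it always contains the empty string. Positive
closure L+ denotes one or more occurrences, so it contains the empty
string only if L does.

10. Syntax analysis

Syntax analysis is also called parsing. In this phase the tokens
generated by the lexical analyser are grouped to form a hierarchical
structure. The syntax tree for the assignment of initial + rate * 60
to position is,

        :=
       /  \
position    +
           / \
     initial  *
             / \
          rate  60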

11.What are the issues of the lexical analyzer?

Lexical analysis is the first phase of a compiler. Lexical
analysis, also called scanning, scans a source program from left to
right character by character and groups the characters into tokens having a
collective meaning. Each token or basic syntactic element
represents a logically cohesive sequence of characters such as an
identifier (also called variable), a keyword (if, then, else, etc.), or a
multi-character operator (<=, etc.). The output of this phase goes to
the next phase, i.e., syntax analysis or parsing.
The second task performed during lexical analysis is to
make an entry for a token (such as an identifier) in the symbol table
if it is not already there.
Some other tasks performed during lexical analysis are:
- To remove all comments, tabs, blank spaces and machine
characters.
- To produce error messages (also called diagnostics) for errors that
occur in a source program.
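
The tasks listed above can be sketched as a small scanner. This is only an illustration under stated assumptions: the token classes, the patterns, the comment syntax /* ... */ and the names TOKEN_SPEC and tokenize are hypothetical and do not come from these notes.

```python
import re

# Hypothetical token classes for a tiny language (illustrative only).
TOKEN_SPEC = [
    ("COMMENT", r"/\*.*?\*/"),             # comments are discarded
    ("WS",      r"[ \t\n]+"),              # blanks, tabs, newlines are discarded
    ("KEYWORD", r"\b(?:if|then|else)\b"),
    ("ID",      r"[A-Za-z][A-Za-z0-9]*"),
    ("NUM",     r"[0-9]+"),
    ("OP",      r"<=|>=|:=|[-+*/<>=]"),    # multi-character operators first
]
MASTER = re.compile("|".join(f"(?P<{n}>{p})" for n, p in TOKEN_SPEC), re.DOTALL)


def tokenize(source):
    """Scan left to right; return (tokens, symbol_table)."""
    tokens, symbol_table, pos = [], {}, 0
    while pos < len(source):
        m = MASTER.match(source, pos)
        if m is None:
            # Lexical error: no token can be formed at this position.
            raise SyntaxError(f"cannot form a token at position {pos}: {source[pos]!r}")
        kind, lexeme = m.lastgroup, m.group()
        pos = m.end()
        if kind in ("WS", "COMMENT"):
            continue
        if kind == "ID":
            # Enter the identifier into the symbol table if it is not there yet.
            symbol_table.setdefault(lexeme, len(symbol_table))
        tokens.append((kind, lexeme))
    return tokens, symbol_table


if __name__ == "__main__":
    print(tokenize("if a <= b1 then a := 60 /* set a */"))
```

Each pair in the returned list is a (token, lexeme) pair; the symbol table here only maps an identifier to an index, whereas a real compiler would store its attributes.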
12. What is a regular expression?
We use regular expressions to describe tokens of a
programming language.
A regular expression is built up of simpler regular expressions
(using defining rules).
Each regular expression denotes a language.
A language denoted by a regular expression is called as a
regular set.

13. Define parse tree

Syntax analysis is the second phase of compilation process.
This process is also called parsing. It performs the following
operations:
1. Obtains a group of tokens from the lexical analyzer.
2. Determines whether a string of tokens can be generated by a
grammar of the language, i.e. it checks whether the
expression is syntactically correct or not.
3. Reports syntax error(s) if any.

14. What are the two parts of compilation process?

Analysis & synthesis are the two parts of compilation. The
analysis part is carried out in 3 sub parts they are lexical analysis,
syntax analysis, semantic analysis.

15. List out the compiler construction tools?

1. Data Flow Engines
2. Parser Generator:
3. Syntax Directed Translation Engine
4. Automatic code generators
5. Scanner Generator
16.What is an assembly code?
Some compilers produce assembly code, which is passed to an
assembler for further processing. Others produce relocatable
machine code that can be passed to the loader and link editor.
Assembly code is a mnemonic version of machine code, in which
names are used instead of binary codes for operations, and names
are also given to memory addresses.
17.What are the classifications of a compiler?
Compilers can be classified as single-pass, multi-pass, load-and-go,
debugging, or optimizing, depending upon how they have been
constructed or on what function they are supposed to perform.
18.What are the fronts and back ends of a compiler?
Front end
1. Lexical analysis
2. Syntax analysis
3. Semantic analysis
Back end
1. Code optimizer
2. Code generator.

19.What is meant by lexical analysis?

1. Explain specification of tokens in detail.

Token specification
Alphabet :

- a finite set of symbols (e.g. the ASCII characters)

String :

- a finite sequence of symbols from an alphabet
- "sentence" and "word" are also used to mean string
- ε is the empty string
- |s| is the length of string s.

Language:

- a set of strings over some fixed alphabet
- ∅, the empty set, is a language.
- {ε}, the set containing only the empty string, is a language
- The set of well-formed C programs is a language
- The set of all possible identifiers is a language.
Operators on Strings:

- Concatenation: xy represents the concatenation of strings x and y.
- sε = s and εs = s
- Exponentiation: s^n = s s s .. s (n times), s^0 = ε

Parts of string:

- prefix of s: a string obtained by removing zero or more trailing
symbols of the string s; e.g. Com is a prefix of Computer
- suffix of s: a string obtained by removing zero or more leading
symbols of the string s; e.g. puter is a suffix of Computer
- substring of s: a string obtained by deleting a prefix and a suffix
from the string s; e.g. put in Computer
- proper prefix, suffix or substring of s: any non-empty string x that
is, respectively, a prefix, suffix, or substring of s such that s ≠ x.

Operations on Languages

- Concatenation:
  L1L2 = { s1s2 | s1 ∈ L1 and s2 ∈ L2 }
- Union:
  L1 ∪ L2 = { s | s ∈ L1 or s ∈ L2 }
- Exponentiation:
  L^0 = {ε}, L^1 = L, L^2 = LL
- Kleene Closure:
  L* = zero or more occurrences of L (the union of L^i for all i ≥ 0)
- Positive Closure:
  L+ = one or more occurrences of L (the union of L^i for all i ≥ 1)

Example

- L1 = {a,b,c,d}  L2 = {1,2}
- L1L2 = {a1,a2,b1,b2,c1,c2,d1,d2}
- L1 ∪ L2 = {a,b,c,d,1,2}
- L1^3 = all strings of length three (using a,b,c,d)
- L1* = all strings using letters a,b,c,d, including the empty string
- L1+ = the same set without the empty string
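
The operations above can be tried out on finite languages represented as Python sets of strings. This is a minimal sketch: the function names are made up for illustration, and because L* and L+ are infinite the closures are only computed up to a chosen maximum power.

```python
def concat(l1, l2):
    """L1L2 = { s1s2 | s1 in L1 and s2 in L2 }."""
    return {s1 + s2 for s1 in l1 for s2 in l2}


def power(lang, n):
    """L^n, with L^0 = {ε} (the empty string is written "" here)."""
    result = {""}
    for _ in range(n):
        result = concat(result, lang)
    return result


def closure(lang, max_power, positive=False):
    """Finite approximation of L* (or L+ if positive) up to L^max_power."""
    start = 1 if positive else 0
    out = set()
    for n in range(start, max_power + 1):
        out |= power(lang, n)
    return out


if __name__ == "__main__":
    L1, L2 = {"a", "b", "c", "d"}, {"1", "2"}
    print(sorted(concat(L1, L2)))        # a1, a2, b1, ...
    print(sorted(L1 | L2))               # union is the built-in set union
    print(len(power(L1, 3)))             # number of strings of length three
    print("" in closure(L1, 2), "" in closure(L1, 2, positive=True))
```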

Regular Expressions

- We use regular expressions to describe tokens of a programming
language.
- A regular expression is built up of simpler regular expressions
(using defining rules).
- Each regular expression denotes a language.
- A language denoted by a regular expression is called a regular
set.
Regular Expressions (Rules)

- Regular expressions over alphabet Σ

Reg. Expr       Language it denotes
ε               {ε}
a ∈ Σ           {a}
(r1) | (r2)     L(r1) ∪ L(r2)
(r1) (r2)       L(r1) L(r2)
(r)*            (L(r))*
(r)             L(r)

- (r)+ = (r)(r)*
- (r)? = (r) | ε

- We may remove parentheses by using precedence rules.
  - * highest
  - concatenation next
  - | lowest
ab*|c means (a(b)*)|(c)

Examples:

- Σ = {0,1}
- 0|1 => {0,1}
- (0|1)(0|1) => {00,01,10,11}
- 0* => {ε,0,00,000,0000,....}
- (0|1)* => all strings of 0s and 1s, including the empty string

Regular Definitions

- Writing a regular expression for some languages can be difficult,
because the regular expressions can be quite complex. In those
cases, we may use regular definitions.
- We can give names to regular expressions, and we can use these
names as symbols to define other regular expressions.
- A regular definition is a sequence of definitions of the form:

d1 → r1
d2 → r2
...
dn → rn

where each di is a distinct name and each ri is a regular expression
over the symbols in Σ ∪ {d1,d2,...,di-1}, i.e. the basic symbols and
the previously defined names.

Ex: Identifiers in Pascal

letter → A | B | ... | Z | a | b | ... | z
digit → 0 | 1 | ... | 9
id → letter (letter | digit)*

- If we try to write the regular expression representing identifiers
without using regular definitions, that regular expression will be
complex.
  (A|...|Z|a|...|z) ( (A|...|Z|a|...|z) | (0|...|9) )*

Ex: Unsigned numbers in Pascal

digit → 0 | 1 | ... | 9
digits → digit+
opt-fraction → ( . digits )?
opt-exponent → ( E (+|-)? digits )?
unsigned-num → digits opt-fraction opt-exponent
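
The two regular definitions above can be mirrored in Python by building each named pattern from the ones defined before it. This is a sketch: the variable and function names are chosen for illustration, and Python's regex syntax stands in for the notation of the notes.

```python
import re

# Each name is defined only in terms of basic symbols and earlier names.
letter = r"[A-Za-z]"
digit = r"[0-9]"
identifier = re.compile(rf"{letter}(?:{letter}|{digit})*")

digits = rf"{digit}+"
opt_fraction = rf"(?:\.{digits})?"
opt_exponent = rf"(?:E[+-]?{digits})?"
unsigned_num = re.compile(rf"{digits}{opt_fraction}{opt_exponent}")


def is_identifier(text):
    return identifier.fullmatch(text) is not None


def is_unsigned_num(text):
    return unsigned_num.fullmatch(text) is not None


if __name__ == "__main__":
    for s in ["rate60", "60rate", "123", "3.14", "6.02E+23", ".5", "1."]:
        print(s, is_identifier(s), is_unsigned_num(s))
```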

1. Briefly explain about operator precedence parsing with example

Ambiguity – Operator Precedence
Ambiguous grammars (because of ambiguous operators) can be
disambiguated according to the precedence and associativity rules.

E → E+E | E*E | E^E | id | (E)

To disambiguate the grammar, use the precedence (highest first):

^ (right to left)
* (left to right)
+ (left to right)

E → E+T | T
T → T*F | F
F → G^F | G
G → id | (E)

Left Recursion
- A grammar is left recursive if it has a non-terminal A such that
there is a derivation
  A ⇒+ Aα for some string α
- Top-down parsing techniques cannot handle left-recursive
grammars.
- So, we have to convert our left-recursive grammar into an
equivalent grammar which is not left-recursive.
- The left recursion may appear in a single step of the derivation
(immediate left recursion), or may appear in more than one step of
the derivation.

Immediate Left-Recursion

A → Aα | β    where β does not start with A

⇓ eliminate immediate left recursion

A → βA’
A’ → αA’ | ε    an equivalent grammar

In general,

A → Aα1 | ... | Aαm | β1 | ... | βn    where β1 ... βn do not start with A

⇓ eliminate immediate left recursion

A → β1A’ | ... | βnA’
A’ → α1A’ | ... | αmA’ | ε    an equivalent grammar

Immediate Left-Recursion – Example

E → E+T | T
T → T*F | F
F → id | (E)

⇓ eliminate immediate left recursion

E → T E’
E’ → +T E’ | ε
T → F T’
T’ → *F T’ | ε
F → id | (E)
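
The rewriting rule for immediate left recursion can be sketched in Python. The representation is an assumption made for this example: a production is a list of symbols, the empty list stands for ε, and the function name eliminate_immediate is hypothetical.

```python
def eliminate_immediate(nt, productions):
    """Rewrite A -> Aα1|...|Aαm|β1|...|βn as A -> βiA', A' -> αiA' | ε.

    Returns a dict holding the new rules for nt and, if needed, nt'.
    """
    alphas = [p[1:] for p in productions if p and p[0] == nt]
    betas = [p for p in productions if not p or p[0] != nt]
    if not alphas:                       # no immediate left recursion
        return {nt: productions}
    new_nt = nt + "'"
    return {
        nt: [beta + [new_nt] for beta in betas],
        new_nt: [alpha + [new_nt] for alpha in alphas] + [[]],   # [] is ε
    }


def show(rules):
    return "\n".join(
        f"{nt} → " + " | ".join(" ".join(p) or "ε" for p in prods)
        for nt, prods in rules.items()
    )


if __name__ == "__main__":
    grammar = {
        "E": [["E", "+", "T"], ["T"]],
        "T": [["T", "*", "F"], ["F"]],
        "F": [["id"], ["(", "E", ")"]],
    }
    for nt, prods in grammar.items():
        print(show(eliminate_immediate(nt, prods)))
```

Run on the expression grammar, this prints the same E, E’, T, T’ and F rules as the example above.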
Left-Recursion – Problem

- A grammar may not be immediately left-recursive and still be
left-recursive.
- By eliminating only the immediate left recursion, we may not get
a grammar that is free of left recursion.

S → Aa | b
A → Sc | d    This grammar is not immediately left-recursive,
              but it is still left-recursive.

S ⇒ Aa ⇒ Sca   or
A ⇒ Sc ⇒ Aac   causes a left recursion

So, we have to eliminate all left recursion from our grammar.

Eliminate Left-Recursion – Algorithm

Arrange non-terminals in some order: A1 ... An

- for i from 1 to n do {
-   for j from 1 to i-1 do {
      replace each production
        Ai → Aj γ
      by
        Ai → α1 γ | ... | αk γ
      where Aj → α1 | ... | αk
    }
-   eliminate immediate left-recursions among Ai productions
  }

Eliminate Left-Recursion – Example

S → Aa | b
A → Ac | Sd | f

- Order of non-terminals: S, A
for S:
- we do not enter the inner loop.
- there is no immediate left recursion in S.
for A:
- Replace A → Sd with A → Aad | bd
  So, we will have A → Ac | Aad | bd | f
- Eliminate the immediate left-recursion in A
  A → bdA’ | fA’
  A’ → cA’ | adA’ | ε

So, the resulting equivalent grammar which is not left-recursive is:

S → Aa | b
A → bdA’ | fA’
A’ → cA’ | adA’ | ε

Eliminate Left-Recursion – Example2

S → Aa | b
A → Ac | Sd | f

- Order of non-terminals: A, S
for A:
- we do not enter the inner loop.
- Eliminate the immediate left-recursion in A
  A → SdA’ | fA’
  A’ → cA’ | ε
for S:
- Replace S → Aa with S → SdA’a | fA’a
  So, we will have S → SdA’a | fA’a | b
- Eliminate the immediate left-recursion in S
  S → fA’aS’ | bS’
  S’ → dA’aS’ | ε

So, the resulting equivalent grammar which is not left-recursive is:

S → fA’aS’ | bS’
S’ → dA’aS’ | ε
A → SdA’ | fA’
A’ → cA’ | ε
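
The general algorithm and both orderings of the example can be reproduced with a short Python sketch. The grammar encoding (a dict from non-terminal to a list of productions, each a list of symbols, with [] for ε) and the function names are assumptions made for this illustration. Like the algorithm in the notes, it assumes the input grammar has no cycles and no ε-productions.

```python
def eliminate_immediate(nt, productions):
    """A -> Aα | β  becomes  A -> βA', A' -> αA' | ε."""
    alphas = [p[1:] for p in productions if p and p[0] == nt]
    betas = [p for p in productions if not p or p[0] != nt]
    if not alphas:
        return {nt: productions}
    new_nt = nt + "'"
    return {
        nt: [beta + [new_nt] for beta in betas],
        new_nt: [alpha + [new_nt] for alpha in alphas] + [[]],
    }


def eliminate_left_recursion(grammar, order):
    g = {nt: [list(p) for p in prods] for nt, prods in grammar.items()}
    for i, ai in enumerate(order):
        for aj in order[:i]:
            expanded = []
            for p in g[ai]:
                if p and p[0] == aj:
                    # Replace Ai -> Aj γ by Ai -> α γ for every Aj -> α.
                    expanded.extend(alpha + p[1:] for alpha in g[aj])
                else:
                    expanded.append(p)
            g[ai] = expanded
        g.update(eliminate_immediate(ai, g[ai]))
    return g


if __name__ == "__main__":
    grammar = {"S": [["A", "a"], ["b"]],
               "A": [["A", "c"], ["S", "d"], ["f"]]}
    for order in (["S", "A"], ["A", "S"]):
        print(order, eliminate_left_recursion(grammar, order))
```

The two orderings give different but equivalent grammars, matching Example and Example2.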

2. Briefly explain about a predictive parsing with example

Predictive Parser

a grammar → eliminate left recursion → left factor → a grammar suitable
for predictive parsing (an LL(1) grammar); there is no 100% guarantee.

When rewriting a non-terminal in a derivation step, a predictive
parser can uniquely choose a production rule by just looking at the current
symbol in the input string.

A → α1 | ... | αn        input: ... a .......   (a is the current token)

stmt → if ...... |
       while ...... |
       begin ...... |
       for .....

- When we are trying to rewrite the non-terminal stmt, if the current
token is if we have to choose the first production rule.
- When we are trying to rewrite the non-terminal stmt, we can
uniquely choose the production rule by just looking at the current
token.
- We eliminate the left recursion in the grammar, and left factor it.
But the result may still not be suitable for predictive parsing (not an
LL(1) grammar).

Recursive Predictive Parsing

Each non-terminal corresponds to a procedure.

Ex: A  aBb (This is only the production rule for A)

proc A
{
- match the current token with a, and move to the next
token;
- call ‘B’;
- match the current token with b, and move to the next
token;
}

A  aBb | bAB
proc A {
case of the current token {
‘a’: - match the current token with a, and move to the next
token;
- call ‘B’;
- match the current token with b, and move to the next
token;
‘b’: - match the current token with b, and move to the next
token;
- call ‘A’;
- call ‘B’;
}
}

When to apply ε-productions.

A → aA | bB | ε

- If all other productions fail, we should apply an ε-production. For
example, if the current token is not a or b, we may apply the
ε-production.
- Most correct choice: We should apply an ε-production for a
non-terminal A when the current token is in the follow set of A (the
terminals that can follow A in the sentential forms).

Recursive Predictive Parsing (Example)

A → aBe | cBd | C
B → bB | ε
C → f

proc C
{
match the current token with f, and move to the next token;
}
proc A
{
case of the current token
{
a: - match the current token with a, and move to the next token;
- call B;
- match the current token with e, and move to the next token;
c: - match the current token with c, and move to the next token;
- call B;
- match the current token with d, and move to the next token;
f: - call C
}
}

proc B
{
case of the current token
{
b: - match the current token with b, and move to the next token;
- call B;
e,d: do nothing
}
}

f – first set of C
e,d – follow set of B
Non-Recursive Predictive Parsing -- LL(1) Parser

- Non-recursive predictive parsing is table-driven.
- It is a top-down parser.
- It is also known as LL(1) Parser.

input buffer → Non-recursive Predictive Parser → output
(the parser also uses a stack and a Parsing Table)

LL(1) Parser

input buffer
- our string to be parsed. We will assume that its end is marked with
a special symbol $.
output
- a production rule representing a step of the derivation sequence
(left-most derivation) of the string in the input buffer.
stack
- contains the grammar symbols
- at the bottom of the stack, there is a special end marker symbol $.
- initially the stack contains only the symbol $ and the starting
symbol S ($S is the initial stack).
- when the stack is emptied (i.e. only $ left in the stack), the parsing
is completed.
parsing table
- a two-dimensional array M[A,a]
- each row is a non-terminal symbol
- each column is a terminal symbol or the special symbol $
- each entry holds a production rule.

LL(1) Parser – Parser Actions

The symbol at the top of the stack (say X) and the current symbol in the
input string (say a) determine the parser action.

There are four possible parser actions.

1. If X and a are both $:
   the parser halts (successful completion).
2. If X and a are the same terminal symbol (different from $):
   the parser pops X from the stack, and moves to the next symbol in
   the input buffer.
3. If X is a non-terminal:
   the parser looks at the parsing table entry M[X,a]. If M[X,a]
   holds a production rule X → Y1Y2...Yk, it pops X from the
   stack and pushes Yk,Yk-1,...,Y1 onto the stack. The parser
   also outputs the production rule X → Y1Y2...Yk to represent a
   step of the derivation.
4. None of the above: error.
   - all empty entries in the parsing table are errors.
   - If X is a terminal symbol different from a, this is also an error case.

LL(1) Parser – Example1

S → aBa
B → bB | ε

LL(1) Parsing Table

      a          b          $
S     S → aBa
B     B → ε      B → bB

stack     input     output
$S        abba$     S → aBa
$aBa      abba$
$aB       bba$      B → bB
$aBb      bba$
$aB       ba$       B → bB
$aBb      ba$
$aB       a$        B → ε
$a        a$
$         $         accept, successful completion

Outputs: S → aBa, B → bB, B → bB, B → ε

Derivation (left-most): S ⇒ aBa ⇒ abBa ⇒ abbBa ⇒ abba
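
The four parser actions can be sketched as a table-driven loop for the grammar of Example1. The table encoding (a dict keyed by non-terminal and input symbol, with an empty list for ε) and the name ll1_parse are assumptions for this illustration.

```python
# Parsing table M[X, a] for S → aBa, B → bB | ε; missing entries are errors.
TABLE = {
    ("S", "a"): ["a", "B", "a"],
    ("B", "b"): ["b", "B"],
    ("B", "a"): [],                 # B → ε
}
NONTERMINALS = {"S", "B"}


def ll1_parse(text, start="S"):
    """Return the list of production rules output by the parser."""
    tokens = list(text) + ["$"]
    stack = ["$", start]            # top of the stack is the end of the list
    pos, output = 0, []
    while True:
        top, a = stack[-1], tokens[pos]
        if top == "$" and a == "$":
            return output           # action 1: successful completion
        if top in NONTERMINALS:
            rhs = TABLE.get((top, a))
            if rhs is None:
                raise SyntaxError(f"no entry M[{top},{a}]")
            stack.pop()
            stack.extend(reversed(rhs))          # push Yk ... Y1
            output.append(f"{top} → {' '.join(rhs) or 'ε'}")
        elif top == a:
            stack.pop()             # action 2: match a terminal
            pos += 1
        else:
            raise SyntaxError(f"expected {top!r}, found {a!r}")


if __name__ == "__main__":
    print(ll1_parse("abba"))
```

For the input abba the output list has the same four productions as the trace above.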
Unit II

PART-A

1. Define CFG

Context-Free Grammars

Inherently recursive structures of a programming language are
defined by a context-free grammar.
In a context-free grammar, we have:

- A finite set of terminals (in our case, this will be the set of tokens)
- A finite set of non-terminals (syntactic variables)
- A finite set of production rules in the following form:
  A → α    where A is a non-terminal and α is a string of
  terminals and non-terminals (including the empty string)
- A start symbol (one of the non-terminal symbols)

2. What are the advantages of grammar?

In a context-free grammar,
We have:

3. Define parse tree with example

Parser
- Parser works on a stream of tokens.
- The smallest item is a token.

source program → Lexical Analyzer → (token) → Parser → parse tree
                 Lexical Analyzer ← (get next token) ← Parser

4. Define ambiguous grammar.

Ambiguity
A grammar that produces more than one parse tree for some sentence is
called an ambiguous grammar.

5. Eliminate left recursion from the grammar

S → (L) | a
L → L,S | S

A → Aα | β    where β does not start with A

⇓ eliminate immediate left recursion

A → βA’
A’ → αA’ | ε    an equivalent grammar

Here A = L, α = ,S and β = S, so the grammar becomes:

S → (L) | a
L → SL’
L’ → ,SL’ | ε
6. What is the role of the error handler in a parser?

Each phase can encounter errors. After detecting error in a

phase it allows further errors to be detected. In lexical analysis, it
finds errors where it cannot form a token. In syntax analyzer, it
finds where the token stream violates the structure rules. In
semantic phase, it tries to detect constructs that have the right
syntactic structure but no meaning to the operations involved.

7. What are the possible action of a shift reduce parsing

The possible actions of a shift-reduce parser are shift, reduce,
accept and error. A shift-reduce parser tries to reduce the given
input string into the starting symbol.

a string  ⇒ (reduced to)  the starting symbol

At each reduction step, a substring of the input matching
the right side of a production rule is replaced by the non-terminal
at the left side of that production rule.
If the substring is chosen correctly, the rightmost derivation
of that string is created in the reverse order.

Rightmost Derivation: S ⇒* ω

Shift-Reduce Parser finds: ω ⇐ ... ⇐ S
2. Define bottom up parsing.

A bottom-up parser creates the parse tree of the given input

starting from leaves towards the root.

3. Define top down parsing.

The parse tree is created top to bottom.

1. Top-down parser
2. Recursive-Descent Parsing

- Backtracking is needed (if a choice of a production rule does
not work, we backtrack to try other alternatives).
- It is a general parsing technique, but not widely used.
- Not efficient

3. Predictive Parsing

- no backtracking
- efficient
- needs a special form of grammars (LL(1) grammars).
- Recursive Predictive Parsing is a special form of Recursive
Descent parsing without backtracking.
- Non-Recursive (Table Driven) Predictive Parser is also
known as LL(1) parser.

4. List out the error recovery strategies

1. panic mode
2. phrase level
3. error productions
4. global correction.
5. Draw the block diagram for syntax analysis

source program → Lexical Analyzer → (token) → Parser → parse tree
                 Lexical Analyzer ← (get next token) ← Parser

8. Construct parse tree for the given statement

i. E=E+E*10

    =
   / \
  E   +
     / \
    E   *
       / \
      E   10

15. What are the terminals? Non Terminals and start symbol for
the grammar
S → (L)|a
L → L,S|S

Terminals: ( ) , a
Non-terminals: S, L
Start symbol: S

The grammar is left-recursive in L. Using

A → Aα | β    ⇓    A → βA’,  A’ → αA’ | ε

with A = L, α = ,S and β = S gives the equivalent grammar

L → SL’
L’ → ,SL’ | ε
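9. What is the need of left factoring
A predictive parser (a top-down parser without backtracking)
insists that the grammar must be left-factored.

grammar → a new equivalent grammar suitable for predictive parsing

stmt → if expr then stmt else stmt |
       if expr then stmt

Both alternatives begin with if expr then stmt, so the parser cannot
choose between them by looking at the current token. Left factoring
gives:

stmt → if expr then stmt stmt’
stmt’ → else stmt | ε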

10. Define parser.

Parser works on a stream of tokens.
The smallest item is a token.

source program → Lexical Analyzer → (token) → Parser → parse tree
                 Lexical Analyzer ← (get next token) ← Parser

We categorize the parsers into two groups: top-down parsers and
bottom-up parsers.

11. What is terminal with example

L(G) is the language of G (the language generated by G)
which is a set of sentences.
A sentence of L(G) is a string of terminal symbols of G.

12. Define Handle

Informally, a handle of a string is a substring that matches
the right side of a production rule.
- But not every substring that matches the right side of a production
rule is a handle.

Unit III

PART-A

1. Write the advantage of generating an intermediate

representation

Advantage of machine independent intermediate form

- Retargeting is facilitated. [a compiler for a different machine can be
created by attaching a back end for the new machine to an existing
front end]
- A machine-independent code optimiser can be applied to the
intermediate representation

2. Define syntax Tree with example.

A syntax tree shows the hierarchical structure of a source program
A DAG gives the same information but in compact way because
common sub expression are identified.
Statements a:= b*-c +b*-c

3. What are the functions used to create the nodes of syntax
trees.

- mknode(op, left, right) creates an operator (interior) node with
pointers to its children.
- mkleaf(id, entry) creates a leaf node for an identifier, with a
pointer to its symbol table entry, e.g. mkleaf(id, id.place).

4. What are the three kinds of intermediate representation?

Types of intermediate representation

- Syntax trees
- Postfix notation
- Three-address code [the semantic rules for generating three-
address code from common programming languages]

5. Define quadruple with an example.

A quadruple is a record structure with four fields, which we call
op, arg1, arg2 and result.

6. Define symbol table

It is a data structure containing a record for each identifier, with fields
for the attributes of the identifier. The data structure allows us to find the
record for each identifier quickly and to store or retrieve data from that
record quickly.

7. What are the various data structure used for implementing
symbol table

1. Linear list,
2. search tree, and
3. hash table.

8. Draw the DAG for a := b*-c + b*-c

Code for the DAG (Statements a:= b*-c +b*-c)
1. t1:= -c
2. t2:= b * t1
3. t5 := t2 + t2
4. a:= t5

9. Translate the expression –(a+b)*(c+d)+(a+b+c) into
quadruples

     Op      Arg1  Arg2  Result
(0)  +       a     b     t1
(1)  uminus  t1          t2
(2)  +       c     d     t3
(3)  *       t2    t3    t4
(4)  +       a     b     t5
(5)  +       t5    c     t6
(6)  +       t4    t6    t7

10. Write short note for Triples

In triples the use of temporary names is avoided: a temporary value
is referred to by the position of the statement that computes it,
so the temporaries need not be entered into the symbol table.

11. Translate the arithmetic expression a*-(b+c)into syntax

tree

12. Write three address code for the expression a*-(b+c)

t1 := b + c
t2 := - t1
t3 := a * t2

As quadruples:

     Op      Arg1  Arg2  Result
(0)  +       b     c     t1
(1)  uminus  t1          t2
(2)  *       a     t2    t3
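
Three-address code for an expression such as a*-(b+c) can be generated by walking its syntax tree and making up a temporary name for every interior node. The sketch below assumes a tuple encoding of the tree and hypothetical names (Quad, three_address, show); it stores each statement as a quadruple.

```python
from dataclasses import dataclass
from itertools import count
from typing import Optional


@dataclass(frozen=True)
class Quad:
    op: str
    arg1: str
    arg2: Optional[str]      # None for unary operators
    result: str


def gen(node, quads, temps):
    """Emit quadruples for `node`; return the name holding its value."""
    if isinstance(node, str):            # a leaf: an identifier
        return node
    op, *children = node
    places = [gen(child, quads, temps) for child in children]
    result = f"t{next(temps)}"           # new temporary for this interior node
    arg2 = places[1] if len(places) > 1 else None
    quads.append(Quad(op, places[0], arg2, result))
    return result


def three_address(tree):
    quads = []
    gen(tree, quads, count(1))
    return quads


def show(q):
    if q.arg2 is None:
        return f"{q.result} := {q.op} {q.arg1}"
    return f"{q.result} := {q.arg1} {q.op} {q.arg2}"


if __name__ == "__main__":
    tree = ("*", "a", ("uminus", ("+", "b", "c")))   # a * -(b + c)
    for quad in three_address(tree):
        print(show(quad))
```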

13. Give the difference between syntax-directed definition
and translation schemes

- Retargeting is facilitated. [a compiler for a different machine can be
created by attaching a back end for the new machine to an existing
front end]
- A machine-independent code optimiser can be applied to the
intermediate representation

Translation:evaluate the expression

find which value in the list of case is the same as the value of the
expression

14. Give the form of a syntax - directed definition

When the three address code is generated, temporary names are
made up for the interior nodes of a syntax tree.

15. What is the purpose of DAG

A DAG gives the same information but in compact way because

common sub expression are identified.

16. Define back patching

Generating a series of branching statement for boolean expression
and flow of control statement with the targets of the jump temporarily
left un specified .
17. Define dependency graphs

18. Define procedure calls

Procedure or a function is an important programming constructs

which is used to obtain the modularity in the user program .

19. What is the use of symbol table

It is a data structure containing a record for each identifier, with fields
for the attributes of the identifier. The data structure allows us to find the
record for each identifier quickly and to store or retrieve data from that
record quickly.

20. Give the advantage and disadvantage of linear list
implementation of symbol table

Advantage: a linear list is the simplest structure to implement and
needs very little extra space.
Disadvantage: finding a record requires a linear search, so access
becomes slow when the table holds many identifiers.

1. Translate the arithmetic expression a*-(b+c) into

i. syntax tree
ii. postfix notation
iii. Three-address code

- Syntax trees
- Postfix notation
- Three-address code [the semantic rules for generating three-
address code from common programming languages]

Graphical representation

A syntax tree shows the hierarchical structure of a source program.
A DAG gives the same information but in a more compact way because
common subexpressions are identified.
Statement: a := b*-c + b*-c

Postfix notation

Linear representation of a syntax tree.
Postfix for the statement a := b*-c + b*-c:

a b c uminus * b c uminus * + assign

The syntax tree for the assignment statement is produced by the syntax
directed translation.
- Nonterminal S generates an assignment statement.
- +, * and unary – are operators of the typical languages.
- Operator associativity and precedence are the usual ones.

Syntax-directed definition for the statement a := b*-c + b*-c

Production      Semantic rule
S → id := E     S.nptr := mknode(‘assign’, mkleaf(id, id.place), E.nptr)
E → E1 + E2     E.nptr := mknode(‘+’, E1.nptr, E2.nptr)
E → E1 * E2     E.nptr := mknode(‘*’, E1.nptr, E2.nptr)
E → - E1        E.nptr := mknode(‘uminus’, E1.nptr)
E → ( E1 )      E.nptr := E1.nptr
E → id          E.nptr := mkleaf(id, id.place)

Representation of syntax tree

- Each node is represented as a record
  - Fields: operator, pointers to children
- Nodes are allocated from an array
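
As an illustration of mknode, mkleaf and postfix notation, the sketch below builds syntax trees by hand and linearises them. It simplifies the notes' functions: mkleaf here takes only a name instead of (id, id.place), and nodes are plain tuples rather than records in an array.

```python
def mkleaf(name):
    """Leaf node for an identifier (simplified: no symbol table pointer)."""
    return ("id", name)


def mknode(op, *children):
    """Interior node: an operator with pointers to its children."""
    return (op, *children)


def postfix(node):
    """List the nodes so that each operator follows its operands."""
    if node[0] == "id":
        return [node[1]]
    out = []
    for child in node[1:]:
        out.extend(postfix(child))
    out.append(node[0])
    return out


if __name__ == "__main__":
    # a := b * -c + b * -c
    term = mknode("*", mkleaf("b"), mknode("uminus", mkleaf("c")))
    stmt = mknode("assign", mkleaf("a"), mknode("+", term, term))
    print(" ".join(postfix(stmt)))

    # a * -(b + c), the expression asked for in the question
    expr = mknode("*", mkleaf("a"),
                  mknode("uminus", mknode("+", mkleaf("b"), mkleaf("c"))))
    print(" ".join(postfix(expr)))
```

Because term is reused for both operands of +, the first structure is in fact the DAG of the statement: the common subexpression b * -c is stored once.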
2. Briefly explain about intermediate code generation?

The front end translates a source program into an intermediate
representation from which the back end generates the target code.

Parser → Static checker → Intermediate code generator → (intermediate code) → Code generator

Position of intermediate code generator

Advantage of machine independent intermediate form

- Retargeting is facilitated. [a compiler for a different machine can be
created by attaching a back end for the new machine to an existing
front end]
- A machine-independent code optimiser can be applied to the
intermediate representation

Intermediate languages

Types of intermediate representation

- Syntax trees
- Postfix notation
- Three-address code [the semantic rules for generating three-
address code from common programming languages]

Unit IV

PART-A

1. What are the issues in the design of code generators?

The requirements on a code generator:

- The output code must be correct.
- The output code must be of high quality.
- It should make effective use of the resources of the target
machine.
- It should run efficiently.

2. Give the two standard storage allocation strategies.

1. Static allocation ,and
2. stack allocation.

3. Define static allocation

The position of an activation record in memory is fixed at compile
time.

4. Define stack allocation.

Static allocation can become stack allocation by using relative addresses
for storage in activation records.

5. Define code generation.

A code generator takes an intermediate representation of a
source program, and produces an equivalent program as an output.

6. Write short note for target program.

1.The output of the code generation is the target program.
2.The target program can be in one of the following form:
1. –absolute machine language
2. –relocatable machine language
3. –assembly language
4. –virtual machine codes(Java..)

7. Define flow graph

We can add the flow-of-control information to the set of basic
blocks making up a program by constructing a directed graph
called a flow graph.

8. Define basic block

A basic block is a sequence of consecutive statements (of
intermediate code, i.e. quadruples) in which flow of control enters at
the beginning and leaves at the end without halt or possibility of
branch (except at the end).

A basic block:
t1 := a * a
t2 := a * b
t3 := t1 – t2

9. Give the rules to partition a sequence of three-address
statements into basic blocks.
(i) The first statement is a leader.
(ii) Any statement that is the target of a conditional or
unconditional goto is a leader.
(iii) Any statement that immediately follows a goto or
conditional goto statement is a leader.
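
The leader rules can be sketched in Python. The input format is an assumption made for this example: statements are strings, and a jump ends in "goto N" where N is the 1-based number of the target statement. The function names are hypothetical.

```python
import re


def find_leaders(code):
    """Return the 0-based indices of the leaders, in order."""
    leaders = {0} if code else set()              # rule (i)
    for i, stmt in enumerate(code):
        m = re.search(r"goto (\d+)$", stmt)
        if m:
            leaders.add(int(m.group(1)) - 1)      # rule (ii): jump target
            if i + 1 < len(code):
                leaders.add(i + 1)                # rule (iii): after a jump
    return sorted(leaders)


def basic_blocks(code):
    """Each block runs from a leader up to, not including, the next leader."""
    bounds = find_leaders(code) + [len(code)]
    return [code[start:end] for start, end in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    program = [
        "i := 1",                 # 1
        "t1 := a * i",            # 2
        "i := i + 1",             # 3
        "if i <= 10 goto 2",      # 4
        "x := t1",                # 5
    ]
    for block in basic_blocks(program):
        print(block)
```

In the sample program statements 1, 2 and 5 are leaders, so it splits into three basic blocks.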

10. Give the types of transformation of basic blocks

1.Structure-preserving transformation,
2. Algebraic transformation.

11. Define DAG with example

The directed acyclic graph(DAGs) are useful data structure for

implementing transformations on basic blocks
12. What are the typical peephole optimization?

Peephole optimization is a method to improve the performance of the
target program by examining a short sequence of target instructions
(called the peephole), and replacing these instructions by a shorter or
faster sequence.

13. Give the representation of intermediate language.

The front end translates a source program into an intermediate
representation from which the back end generates the target code.

14. Position of intermediate code

Parser -> Static checker -> Intermediate code generator -> (intermediate code) -> Code generator

15. List out the primary structure-preserving transformations.

The primary structure-preserving transformations are:

o common sub-expression elimination
o dead-code elimination
o renaming of temporary variables
o interchange of two independent adjacent statements.

16. Write a short note on flow-of-control optimizations.

Flow-of-Control Optimizations
Unnecessary jumps to jumps are replaced by jumps directly to the final
target (before => after):

goto L1                  =>   goto L2
...                           ...
L1: goto L2                   L1: goto L2
----------------------------------------------
if a<b goto L1           =>   if a<b goto L2
...                           ...
L1: goto L2                   L1: goto L2
-----------------------------------------------
goto L1                  =>   if a<b goto L2
...                           goto L3
L1: if a<b goto L2            ...
L3:                           L3:

17. Give the different forms of addressing mode and associated costs.

Address Modes
• The source and destination fields are not long enough to hold
memory addresses. Certain bit patterns in these fields specify that
the words following the instruction (the instruction itself is one
word) contain operand addresses (or constants).
• Each memory address or constant held in an instruction therefore
adds to the cost of that instruction.
• Different addressing modes are used to obtain the addresses of the
source and destination.

18. What are the forms of the output of the code generator?

Target Programs (Output of Code Generation)

• The output of code generation is the target program.
• The target program can be in one of the following forms:
o absolute machine language
o relocatable machine language
o assembly language
o virtual machine code (Java, ...)

19. Write a short note on memory management.

How are static and stack allocation of data objects implemented?
How are the names in the intermediate code converted into
addresses in the target code?
The labels in the intermediate code must be converted into
the addresses of target machine instructions.
A quadruple may correspond to more than one machine
instruction. If that quadruple has a label, the label becomes the
address of the first machine instruction generated for that
quadruple.

20. Define relocatable machine language.

Producing a relocatable machine language program as output allows
subprograms to be compiled separately.

PART-B

1. What are the issues in the design of a code generator? Explain in
detail.

Code Generation

• A code generator takes an intermediate representation of a source
program and produces an equivalent target program as output.
• The requirements on a code generator:
o The output code must be correct.
o The output code must be of high quality.
o It should make effective use of the resources of the target
machine.
o The code generator itself should run efficiently.
• In theory, the problem of generating optimal code is undecidable.
• In practice, we use heuristic techniques to generate sub-optimal
(good, but not optimal) target code. The choice of heuristic is
important, since a carefully designed code generation algorithm can
produce much better code than a naive one.

Input to Code Generator

• The input to a code generator is the intermediate representation of
a source program, together with the information in the symbol
table needed to determine the addresses of the symbols.
• The intermediate representation can be:
o three-address code (quadruples)
o trees
o DAGs (directed acyclic graphs)
o or other representations.
• The code generator assumes that the intermediate representation is
free of semantic errors and that all necessary type-conversion
instructions have already been inserted.

Target Programs (Output of Code Generation)

• The output of code generation is the target program.
• The target program can be in one of the following forms:
o absolute machine language
o relocatable machine language
o assembly language
o virtual machine code (Java, ...)
• If absolute machine language is used, the target program can be
placed in a fixed location and immediately executed (WATFIV,
PL/C).
• If relocatable machine language is used, a linker and loader are
needed to combine the relocatable object files and load them. This
is a flexible approach (C language).
• If assembly language is used, an assembler is needed to translate
the output into machine code.

Memory Management

• How are static and stack allocation of data objects implemented?
• How are the names in the intermediate code converted into
addresses in the target code?
• The labels in the intermediate code must be converted into the
addresses of target machine instructions.
• A quadruple may correspond to more than one machine instruction. If
that quadruple has a label, the label becomes the address of the first
machine instruction generated for that quadruple.

Instruction Selection

• The structure of the instruction set of the target machine
determines the difficulty of instruction selection.
o The uniformity and completeness of the instruction set are
important factors.
• Instruction speeds are also important.
o If we do not care about speed, code generation is a straightforward
job: each quadruple can be mapped to a fixed set of machine instructions.
Naive code generation (for the quadruple x := y + z):
ADD y,z,x  =>  MOV y,R0
               ADD z,R0
               MOV R0,x
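
The naive mapping above can be sketched in a few lines of Python. This is only an illustration: the tuple layout `(op, y, z, x)`, the `OPCODES` table, and the function name `naive_codegen` are assumptions made for the sketch, not part of the notes.

```python
# Quadruple operators mapped to opcodes of the notes' target machine
# (the mapping table itself is an assumption made for this sketch).
OPCODES = {"+": "ADD", "-": "SUB"}

def naive_codegen(quads):
    """Translate quadruples (op, y, z, x), meaning x := y op z, one at a time."""
    code = []
    for op, y, z, x in quads:
        code.append(f"MOV {y},R0")             # load the first operand into R0
        code.append(f"{OPCODES[op]} {z},R0")   # R0 := R0 op z
        code.append(f"MOV R0,{x}")             # store the result in x
    return code

# a := b + c ; d := a + e
for line in naive_codegen([("+", "b", "c", "a"), ("+", "a", "e", "d")]):
    print(line)
```

For these two quadruples the sketch emits six instructions, and the fourth one (MOV a,R0) reloads a value that is already in R0. This is the kind of waste that makes naive code generation produce poor code.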

 The quality of the generated code is determined by its speed and

size.
 Instruction speeds are needed to design good code sequences.

 Instructions involving register operands are usually shorter and

faster than those involving operands in memory.
 The efficient utilization of registers is important in generating good
code sequence.
 The use of registers is divided into two sub-problems:
o Register allocation: we select the set of variables that will
reside in registers at a point in the program.
o Register assignment: we pick the specific register that a
variable will reside in.
• Finding an optimal assignment of registers is difficult.
o In theory, the problem is NP-complete.
o The problem is further complicated because some
architectures require certain register-usage conventions,
such as address vs. data registers, or even/odd register pairs
for certain instructions.

Choice of Evaluation Order

• The order of computations affects the efficiency of the target code.
• Some computation orders require fewer registers to hold
intermediate results.
• Picking the best computation order is another NP-complete
problem.
• We will try to use the order used in the intermediate code.
• The most important criterion for a code generator, however, is that
it produces correct code.
• We may use a less efficient code generator as long as it produces
correct code, but we cannot use a code generator that is efficient
but produces incorrect code.

Target Machine
• To design a code generator, we should be familiar with the
structure of the target machine and its instruction set.
• Instead of a specific architecture, we will design our own simple
target machine for code generation.
o We will decide the instruction set, but it will be close to
actual machine instructions.
o We will decide the sizes and speeds of the instructions, and we
will use them in the creation of good code generators.
o Although we do not use an actual target machine, our
discussion also applies to actual target machines.

Our Target Machine

• Our target machine is a byte-addressable machine (each word is
four bytes).
• It has n general-purpose registers: R0, R1, ..., Rn-1.
• It has two-address instructions of the form
      op source,destination
  where op is an op-code, and source and destination are data fields.

ADD   add source to destination
SUB   subtract source from destination
MOV   move source to destination

Our Target Machine – Address Modes

• The source and destination fields are not long enough to hold
memory addresses. Certain bit patterns in these fields specify that
the words following the instruction (the instruction itself is one
word) contain operand addresses (or constants).
• Each memory address or constant held in an instruction therefore
adds to the cost of that instruction.
• We will use different addressing modes to obtain the addresses of
the source and destination.
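
The notes do not list the individual modes or their costs. As a rough sketch only, the following Python assumes the cost table conventionally used with this kind of textbook machine: register (R) and indirect register (*R) operands add nothing, while absolute (M), literal (#c), indexed (c(R)) and indirect indexed (*c(R)) operands each add one extra word. The register names and the helper names `mode_cost` and `instruction_cost` are hypothetical.

```python
REGISTERS = {"R0", "R1", "R2", "R3", "SP"}   # assumed register names

def mode_cost(operand):
    """Added cost of one operand under the assumed cost table."""
    if operand in REGISTERS:                                   # register mode: R
        return 0
    if operand.startswith("*") and operand[1:] in REGISTERS:   # indirect register: *R
        return 0
    # absolute M, literal #c, indexed c(R), indirect indexed *c(R):
    # each needs one extra word following the instruction
    return 1

def instruction_cost(instruction):
    """Cost = 1 (the instruction word) + added costs of source and destination."""
    _, operands = instruction.split(None, 1)
    source, destination = (field.strip() for field in operands.split(","))
    return 1 + mode_cost(source) + mode_cost(destination)

for instr in ["MOV R0,R1", "MOV R0,M", "ADD #1,R3", "SUB 4(R0),*12(R1)"]:
    print(instr, instruction_cost(instr))
```

Under these assumptions the cost of an instruction equals its length in words, so preferring register operands makes code both shorter and faster.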

a. Discuss the run-time storage management of a code generator.

Run-Time Addresses
• Stack variables
o Stack variables are accessed using offsets from the beginning
of the activation record.

• local variable => *OFFSET(SP)

• non-local variable
o access links
o displays

Basic Blocks
A basic block is a sequence of consecutive statements (of intermediate
code, i.e. quadruples) in which flow of control enters at the beginning and
leaves at the end, without halting or branching except at the end.
A basic block:
t1 := a * a
t2 := a * b
t3 := t1 - t2

Partition into Basic Blocks

Input: A sequence of three-address statements.

Output: A list of basic blocks, with each three-address statement in
exactly one block.
Algorithm:

• 1. Determine the set of leaders, the first statements of basic blocks:
o The first statement is a leader.
o Any statement that is the target of a jump instruction
(conditional or unconditional) is a leader.
o Any statement immediately following a jump instruction
(conditional or unconditional) is a leader.
• 2. For each leader, its basic block consists of the leader and all
statements up to but not including the next leader or the end of the
program.

Example Pascal Program

begin
prod := 0;
i := 1;
do begin
prod := prod + a[i] * b[i];
i := i + 1;
end
while i <= 20
end

Corresponding Quadruples
1: prod := 0
2: i := 1
3: t1 := 4*i
4: t2 := a[t1]
5: t3 := 4*i
6: t4 := b[t3]
7: t5 := t2*t4
8: t6 := prod+t5
9: prod := t6
10: t7 := i+1
11: i := t7
12: if i<=20 goto 3
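
The leader rules can be applied mechanically to the twelve statements above. The following Python is a minimal sketch, assuming each statement is written as `number: text` and every jump has the form `goto N`; the helper names `parse`, `find_leaders` and `basic_blocks` are made up for the illustration.

```python
import re

QUADS = """\
1: prod := 0
2: i := 1
3: t1 := 4*i
4: t2 := a[t1]
5: t3 := 4*i
6: t4 := b[t3]
7: t5 := t2*t4
8: t6 := prod+t5
9: prod := t6
10: t7 := i+1
11: i := t7
12: if i<=20 goto 3"""

def parse(text):
    """Return a list of (statement number, statement text) pairs."""
    stmts = []
    for line in text.splitlines():
        num, body = line.split(":", 1)      # split at the colon after the number
        stmts.append((int(num), body.strip()))
    return stmts

def find_leaders(stmts):
    numbers = [n for n, _ in stmts]
    leaders = {numbers[0]}                       # rule 1: the first statement
    for idx, (_, body) in enumerate(stmts):
        jump = re.search(r"goto\s+(\d+)", body)
        if jump:
            leaders.add(int(jump.group(1)))      # rule 2: target of a jump
            if idx + 1 < len(stmts):
                leaders.add(numbers[idx + 1])    # rule 3: statement after a jump
    return sorted(leaders)

def basic_blocks(stmts):
    """Group statement numbers into blocks, starting a new block at each leader."""
    leaders = set(find_leaders(stmts))
    blocks = []
    for n, _ in stmts:
        if n in leaders:
            blocks.append([])
        blocks[-1].append(n)
    return blocks

stmts = parse(QUADS)
print(find_leaders(stmts))   # [1, 3]
print(basic_blocks(stmts))
```

The leaders are statements 1 and 3, so the program splits into two basic blocks: statements 1 to 2 and statements 3 to 12.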

2. Explain briefly about DAG representation of basic blocks.

DAG Representation of Basic Blocks

• Directed acyclic graphs (DAGs) are useful data structures for
implementing transformations on basic blocks.
• Using a DAG
o we can easily determine common sub-expressions;
o we can determine which names are evaluated outside the
block but used in the block.
• First, we construct a DAG for a basic block.
• Then, we apply transformations to this DAG.
• Finally, we produce target code from the DAG.

A DAG for a Basic Block

• In a DAG for a basic block:
o Leaves are labeled with unique identifiers (names or
constants). If the value of a variable changes within the
block, subscripts distinguish the different values of
that name.
o Interior nodes are labeled with an operator symbol.
o Interior nodes may optionally also carry a sequence of
names as labels.
• So, for each basic block we can create a DAG.

Three-Address Codes for A Basic Block

1: t1 := 4*i
2: t2 := a[t1]
3: t3 := 4*i
4: t4 := b[t3]
5: t5 := t2*t4
6: t6 := prod+t5
7: prod := t6
8: t7 := i+1
9: i := t7
10: if i<=20 goto 1

[Figure: DAG corresponding to the basic block above]

Construction of DAGs
• We can systematically create the DAG for a given basic block.
• Each name is associated with a node of the DAG. Initially, all
names are undefined (i.e. they are not associated with any node of
the DAG).
• For each three-address statement x := y op z:
o Find node(y). If node(y) is undefined, create a leaf node
labeled y and let node(y) be this node.
o Find node(z). If node(z) is undefined, create a leaf node
labeled z and let node(z) be this node.
o If there is already a node with operator op, node(y) as its left
child, and node(z) as its right child, that node is also treated as
node(x).
o Otherwise, create node(x) with operator op, node(y) as its left child,
and node(z) as its right child.
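
The construction steps above can be sketched in Python as follows. This is a simplified illustration, not a complete algorithm: statements are assumed to be tuples `(x, op, y, z)`, array indexing is treated as an ordinary binary operator `[]`, copies `x := y` are written with the made-up operator name `copy`, and the invalidation of nodes by array assignments is ignored.

```python
def build_dag(code):
    """Build a DAG for a basic block given as (x, op, y, z) tuples (x := y op z)."""
    nodes = []      # nodes[i] is ("leaf", name) or (op, left_id, right_id)
    index = {}      # node contents -> node id, used to find existing nodes
    node_of = {}    # name -> id of the node holding its current value

    def get(key):
        """Return the id of the node with these contents, creating it if needed."""
        if key not in index:
            index[key] = len(nodes)
            nodes.append(key)
        return index[key]

    def node(name):
        """node(name): a leaf is created the first time an undefined name is used."""
        if name not in node_of:
            node_of[name] = get(("leaf", name))
        return node_of[name]

    for x, op, y, z in code:
        if op == "copy":                    # x := y, so x labels the node of y
            node_of[x] = node(y)
        else:                               # reuse an identical (op, left, right) node
            node_of[x] = get((op, node(y), node(z)))
    return nodes, node_of

BLOCK = [
    ("t1", "*", "4", "i"), ("t2", "[]", "a", "t1"),
    ("t3", "*", "4", "i"), ("t4", "[]", "b", "t3"),
    ("t5", "*", "t2", "t4"), ("t6", "+", "prod", "t5"),
    ("prod", "copy", "t6", None),
    ("t7", "+", "i", "1"), ("i", "copy", "t7", None),
]
nodes, node_of = build_dag(BLOCK)
print(node_of["t1"] == node_of["t3"])   # True: 4*i is shared
print(len(nodes))
```

Because t1 and t3 end up attached to the same node, the common sub-expression 4*i is detected without any extra analysis.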

Applications of DAGs
• We automatically detect common sub-expressions.
• We can determine which identifiers have their values used in the
block (the identifiers at the leaves).
• We can create simplified quadruples for a block from its DAG,
o taking advantage of common sub-expressions, and
o without performing unnecessary move instructions.
• In general, the interior nodes of the DAG can be evaluated in any
order that is a topological sort of the DAG.
o In such an order, a node is not evaluated until all of its
children have been evaluated.
o So, a different evaluation order may correspond to a better
code sequence.

3. Write short notes on

i. Flow-of-control optimizations
ii. Algebraic simplification
iii. Redundant instruction elimination
iv. Reduction in strength

(i) Flow-of-Control Optimizations (before => after):
goto L1                  =>   goto L2
...                           ...
L1: goto L2                   L1: goto L2
----------------------------------------------
if a<b goto L1           =>   if a<b goto L2
...                           ...
L1: goto L2                   L1: goto L2
-----------------------------------------------
goto L1                  =>   if a<b goto L2
...                           goto L3
L1: if a<b goto L2            ...
L3:                           L3:

(ii) Algebraic Transformations:

x := x+0   =>  eliminate this statement
x := y+0   =>  x := y
x := x+1   =>  INC ,,x
x := y**2  =>  x := y*y

(iii) Redundant-Instruction Elimination:

If we see the instruction sequence
MOV R,a
MOV a,R
we can delete the second instruction, provided it is unlabeled,
because the first instruction ensures that the value of a is
already in register R.

(iv) Reduction in Strength:
It replaces expensive operations by equivalent cheaper ones on the
target machine, for example x := y**2 by x := y*y.

6. Briefly explain the storage allocation strategies.

Run-Time Storage Organization

• Static allocation: static allocation can be performed simply by
reserving enough memory space for the static data objects.
o Static variables can be accessed using absolute
memory addresses.

• Stack allocation: the code generator must produce machine
code that allocates the activation records (corresponding to the
intermediate code).
o Normally a specific register points to the beginning of the
current activation record, and this register is used to access
the variables residing in that activation record.
o The actual addresses of these stack variables cannot be known
until run time.

Stack Allocation – Activation Record

SP -> Return address
      Return value
      Actual parameters
      Other stuff
      Local variables
      Temporaries

• SP points to the beginning of the activation record, so all values in
the record can be accessed from SP by a positive offset.
• All these offsets are calculated at compile time.

Possible Procedure Invocation

ADD #caller.recordsize,SP
MOV PARAM1,*8(SP) // save parameters
MOV PARAM2,*12(SP)
.
MOV PARAMn,*4+4n(SP)
. // saving other stuff
MOV #here+16,*SP // save return address
GOTO callee.codearea // jump to procedure
SUB #caller.recordsize,SP // return address

Possible Return from A Procedure Call

MOV RETVAL,*4(SP) // save the return value
GOTO *SP // return to caller

Run-Time Addresses
• Static variables:
o static[12] => staticaddressblock+12

• If the beginning of the static address block is 100:

o MOV #0,,static[12] => MOV #0,112

• So static variables have absolute addresses, and these absolute
addresses are evaluated at compile time (or load time).

• Stack variables:
o Stack variables are accessed using offsets from the beginning
of the activation record.

• local variable => *OFFSET(SP)

• non-local variable
o access links
o displays
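
Since the offsets of stack variables are fixed at compile time, a compiler can compute them with a simple running sum. The Python below is a minimal sketch under stated assumptions: every field occupies one four-byte word, the field order follows the activation record shown earlier, the "other stuff" area is left out, and the function name `layout_activation_record` is hypothetical.

```python
WORD = 4   # assumed size of every field, matching the machine's four-byte words

def layout_activation_record(params, local_vars, temps):
    """Return ({field: offset from SP}, record size in bytes)."""
    offsets = {"return_address": 0, "return_value": WORD}
    offset = 2 * WORD                      # parameters start after the two fixed fields
    for name in list(params) + list(local_vars) + list(temps):
        offsets[name] = offset             # this name is addressed as *offset(SP)
        offset += WORD
    return offsets, offset

offsets, size = layout_activation_record(["x", "y"], ["i"], ["t1"])
for name, off in offsets.items():
    print(f"{name:15s} *{off}(SP)")
print("record size:", size)
```

With two parameters, the first lands at offset 8 and the second at offset 12, which agrees with the MOV PARAM1,*8(SP) and MOV PARAM2,*12(SP) instructions in the invocation sequence.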

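7. Write short notes on

i) Register allocation
ii) Memory management
iii) Input to the code generator
iv) Instruction selection

i) Register Allocation

• Instructions involving register operands are usually shorter and
faster than those involving operands in memory.

ii) Memory Management

• How are static and stack allocation of data objects implemented?

iii) Input to Code Generator

• The input to a code generator is the intermediate representation of
a source program (together with the information in the symbol
table needed to determine the addresses of the symbols).
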
iv) Instruction Selection
• The structure of the instruction set of the target machine
determines the difficulty of instruction selection.
o The uniformity and completeness of the instruction set are
important factors.
• Instruction speeds are also important.
o If we do not care about speed, code generation is a straightforward
job: each quadruple can be mapped to a fixed set of machine instructions.
Naive code generation (for the quadruple x := y + z):
ADD y,z,x  =>  MOV y,R0
               ADD z,R0
               MOV R0,x

• The quality of the generated code is determined by its speed and
size.
• Instruction speeds are needed to design good code sequences.

3. Explain peephole optimization.

Peephole Optimization

• Peephole optimization is a method of improving the performance of
the target program by examining a short sequence of target
instructions (called the peephole) and replacing these instructions
by a shorter or faster sequence.
o Peephole optimization can be applied to both
intermediate code and target code.
o The peephole is usually within a basic block (but can
sometimes span blocks).
o Multiple passes may be needed to get the best improvement in
the target code.
o We will look at certain program transformations that can
be seen as peephole optimizations.

Redundant Instruction Elimination

MOV R0,a   =>   MOV R0,a
MOV a,R0
• We can eliminate the second instruction, provided no jump instruction
targets it.

Unreachable Code
We may remove unreachable code.
#define debug 0
...
if (debug==1) { print debugging info }

Since debug is always 0, the body of the if statement can never be
executed. It is an unreachable code sequence, so we can eliminate it.

Flow-of-Control Optimizations (before => after)
goto L1                  =>   goto L2
...                           ...
L1: goto L2                   L1: goto L2
----------------------------------------------
if a<b goto L1           =>   if a<b goto L2
...                           ...
L1: goto L2                   L1: goto L2
-----------------------------------------------
goto L1                  =>   if a<b goto L2
...                           goto L3
L1: if a<b goto L2            ...
L3:                           L3:

Other Peephole Optimizations

• Algebraic simplifications: statements such as the following can be
eliminated
o x := x+0
o x := x*1
o ... and others
• Reduction in strength
o x := y**2 => x := y*y
o x := y*2 => x := lshift(y,1)
• Specific machine instructions
o The target machine may have specific instructions to implement
certain operations efficiently.
o auto increment, auto decrement, ...
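
Two of the peephole rules from this section, redundant load/store elimination and the algebraic simplification of adding zero, can be sketched as a small pass over the notes' two-address instructions. The tuple layout `(label, op, source, destination)` and the function names `peephole` and `optimize` are assumptions made for the illustration, and side effects such as condition codes are ignored.

```python
def peephole(code):
    """One pass over a list of (label, op, source, destination) instructions."""
    out = []
    for label, op, src, dst in code:
        # Algebraic simplification: adding the literal 0 changes nothing.
        if op == "ADD" and src == "#0" and label is None:
            continue
        # Redundant instruction: MOV b,a directly after MOV a,b copies a value
        # that is already in place. A labeled instruction is kept, because a
        # jump may reach it without executing the instruction before it.
        if (op == "MOV" and label is None and out
                and out[-1][1] == "MOV"
                and out[-1][2] == dst and out[-1][3] == src):
            continue
        out.append((label, op, src, dst))
    return out

def optimize(code):
    """Repeat the pass until nothing changes (multiple passes may be needed)."""
    while True:
        improved = peephole(code)
        if improved == code:
            return improved
        code = improved

program = [
    (None, "MOV", "R0", "a"),
    (None, "MOV", "a", "R0"),    # redundant: the value of a is already in R0
    (None, "ADD", "#0", "R0"),   # no effect
    ("L1", "MOV", "a", "R0"),    # labeled, so it must stay
]
for instruction in optimize(program):
    print(instruction)
```

Only the first and the last instruction survive. The labeled MOV is kept even though it looks redundant, which mirrors the condition in the text that the deleted instruction must not be the target of a jump.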
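
Unit V

PART-A

1. Define activation tree.

Each execution of a procedure is referred to as an activation of the
procedure. If the procedure is recursive, several of its activations may be
alive at the same time. An activation tree depicts the way control enters
and leaves activations.

2. What is the use of a control stack?

The flow of control in a program corresponds to a depth-first
traversal of the activation tree that starts at the root, visits a node before
its children, and recursively visits the children of each node in
left-to-right order. A control stack keeps track of the live procedure
activations: the node for an activation is pushed when the activation
begins and popped when it ends.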

3. Define scope of declarations.

A declaration in a language is a syntactic construct that associates
information with a name. The scope of a declaration is the portion of the
program to which the declaration applies.

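4. What are the strategies of storage allocation?

There are three storage allocation strategies:
1. Static allocation
2. Stack allocation
3. Heap allocation

5. Define static allocation.

The position of an activation record in memory is fixed at
compile time.

6. What are the limitations of static memory allocation?

Static allocation is performed by reserving enough memory space for
the static data objects at compile time, and static variables are
accessed using absolute memory addresses. As a consequence:
o the size of every data object must be known at compile time;
o recursive procedures are restricted, because all activations of a
procedure share the same storage for local names;
o data structures cannot be created dynamically at run time.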

7. What are the two approaches to implementing dynamic scope?

1. Deep access

2. Shallow access

8. What is meant by code optimization?

Code optimization is the application of code-improving transformations
that make the generated code run faster or take less space without
changing its meaning.

9. Define optimizing compilers.

Compilers that apply code-improving transformations are called
optimizing compilers.

10. Give the criteria for code-improving transformations.

A transformation of a program is called local if it can be performed
by looking only at the statements in a basic block; otherwise it is
called global.

11. What are the two levels of code optimization technique?

o there are no common sub-expressions

o there are no algebraic properties of operators which affect
the timing.

12. What are the phases of code optimization technique?

codegen produces optimal code for our target machine if we assume
that
1. there are no common sub-expressions, and
2. there are no algebraic properties of operators which affect
the timing.
Algebraic properties of operators (such as commutativity and
associativity) may affect the generated code.
When there are common sub-expressions, the DAG is no longer
a tree. In this case, we cannot apply the algorithm directly.

13. Define function-preserving transformation.

1. Structure-preserving transformations
2. Algebraic transformations

14. What is heap allocation?

Stack allocation cannot be used if either of the following is
possible:
i) The value of a local name must be retained when an
activation ends.
ii) A called activation outlives the caller.
Heap allocation parcels out pieces of contiguous storage as
needed for activation records or other objects.

15. Define dead code elimination?

i)We say that x is dead at a certain point, if it is not used after that
point in the block (or in the following blocks).
ii)If x is dead at the point of the statement x := y op z, this
statement can be safely eliminated without changing the meaning
of the block

16. What is an access link?

A direct implementation of lexical scope for nested procedures is
obtained by adding a pointer called an access link to each activation
record.

17. Define deep access and shallow access.

Deep:

A simple implementation is to dispense with access links and use the
control link to search into the stack for the first activation record
containing storage for the non-local name.

Shallow:

The current value of each name is held in statically allocated storage.

18. List out three loop optimization techniques.

The running time of a program can be improved by optimizing its loops.
Three techniques are:
I) Code motion
II) Induction variable elimination
III) Reduction in strength.

19. List the principal sources of code optimization.

• 1. Labeling: label each node of the tree (bottom-up) with an
integer that denotes the fewest number of registers required to
evaluate the tree with no stores of intermediate results.
• 2. Code generation from the labeled tree: we traverse the
tree guided by the computed labels and emit the target
code during that traversal.
o For a binary operator, we evaluate the harder operand (the one with
the larger label) first, then we evaluate the other operand.
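
The labeling step can be sketched as a short recursive function. The rule used here is the usual textbook one for a machine whose instructions may take the right operand directly from memory, and it is an assumption as far as these notes are concerned: a leaf that is a left child gets label 1, any other leaf gets 0, and an interior node gets the larger of its children's labels, or that label plus one when they are equal. Trees are assumed to be nested tuples `(op, left, right)` with strings as leaves.

```python
def label(tree, is_left_child=True):
    """Fewest registers needed to evaluate `tree` without storing intermediates."""
    if isinstance(tree, str):              # leaf: a name or constant
        return 1 if is_left_child else 0   # a left operand must first be loaded
    _, left, right = tree
    l = label(left, True)
    r = label(right, False)
    # Equal needs: one register must hold the first result while the other
    # sub-tree is evaluated. Unequal needs: evaluate the harder side first.
    return l + 1 if l == r else max(l, r)

# (a + b) - (e - (c + d))
tree = ("-", ("+", "a", "b"), ("-", "e", ("+", "c", "d")))
print(label(tree))   # 2
```

For this expression the label of the root is 2, so two registers are enough provided the right sub-tree, which has the larger label, is evaluated first.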

20. Define code generation.

• This code generation algorithm takes a basic block of three-address
code and produces machine code for our target architecture.
• For each three-address statement we perform certain actions.
• We assume that a getreg routine is available to determine the location
of the result of the operation.


1. What are the different storage organization strategies? Explain.

Storage for Temporaries

If two temporaries are not live at the same time, we can pack these
temporaries into the same location.
We can use the next-use information to pack temporaries.
t1 := a*a        t1 := a*a
t2 := a*b        t2 := a*b
t3 := 2*t2   =>  t2 := 2*t2
t4 := t1+t3      t1 := t1+t2
t5 := b*b        t2 := b*b
t6 := t4+t5      t1 := t1+t2
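
The packing in this example can be reproduced with a small greedy allocator driven by last-use information. The sketch below is an illustration under assumptions that are not in the notes: a statement is a pair `(destination, operands)`, temporaries are recognised by a name starting with `t`, each temporary is defined once, and no temporary is used after the block.

```python
def pack_temporaries(block, is_temp=lambda name: name.startswith("t")):
    """Map each temporary to a slot number, reusing slots of dead temporaries."""
    last_use = {}
    for i, (_, operands) in enumerate(block):
        for name in operands:
            last_use[name] = i                 # later uses overwrite earlier ones

    slot_of, free, next_slot = {}, [], 0
    for i, (dest, operands) in enumerate(block):
        # A temporary used here for the last time releases its slot first,
        # so the destination of this very statement may reuse it.
        for name in dict.fromkeys(operands):   # each operand once, in order
            if name in slot_of and last_use[name] == i:
                free.append(slot_of[name])
        if is_temp(dest) and dest not in slot_of:
            if free:
                free.sort()
                slot_of[dest] = free.pop(0)    # lowest-numbered free slot
            else:
                slot_of[dest] = next_slot      # no free slot: open a new one
                next_slot += 1
    return slot_of

BLOCK = [
    ("t1", ["a", "a"]), ("t2", ["a", "b"]), ("t3", ["2", "t2"]),
    ("t4", ["t1", "t3"]), ("t5", ["b", "b"]), ("t6", ["t4", "t5"]),
]
print(pack_temporaries(BLOCK))
```

The six temporaries fit into two slots, with t1, t4 and t6 sharing one and t2, t3 and t5 sharing the other, which is the same packing as the right-hand column above.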

Simple Code Generator

• For simplicity, we assume that for each intermediate code
operator there is a corresponding target code operator.
• We also assume that computed results can be left in registers
as long as possible.
o If the register is needed for another computation, the value
in the register must be stored.
o Before we leave a basic block, everything must be stored in
memory locations.
• We will try to produce reasonable code for a given basic block.
• The code-generation algorithm uses descriptors to keep track of
register contents and addresses for names.

Register Descriptors
• A register descriptor keeps track of what is currently in each
register.
• It is consulted whenever the code-generation algorithm needs a
new register.
• We assume that all registers are empty when we enter a basic
block. This is not true if registers are assigned across blocks.
• At any given time, each register descriptor holds zero or more
names.

              R1 is empty
MOV a,R1      R1 holds a
MOV R1,b      R1 holds both a and b

Address Descriptors
• An address descriptor keeps track of the locations where the
current value of a name can be found at run time.
• The location can be a register, a stack location, or a memory
location (in the static area), or a set of these.
• This information can be stored in the symbol table.

              a is in memory
MOV a,R1      a is in R1 and in memory
MOV R1,b      b is in R1 and in memory
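
The two kinds of descriptors can be modelled with a pair of dictionaries that are updated as each MOV is emitted. This Python sketch is illustrative only: the class name `Descriptors`, its `load` and `store` methods, and the location name `"memory"` are assumptions, and only the two MOV forms from the examples are handled.

```python
from collections import defaultdict

class Descriptors:
    def __init__(self, names):
        self.reg = defaultdict(set)                   # register -> names it holds
        self.addr = {n: {"memory"} for n in names}    # name -> where its value is

    def load(self, name, register):
        """Effect of MOV name,register: the register now holds only `name`."""
        for old in self.reg[register]:
            self.addr[old].discard(register)          # old contents are overwritten
        self.reg[register] = {name}
        self.addr[name].add(register)

    def store(self, register, name):
        """Effect of MOV register,name: `name` gets the value in the register."""
        for held in self.reg.values():
            held.discard(name)                        # copies of the old value are stale
        self.reg[register].add(name)
        self.addr[name] = {register, "memory"}

d = Descriptors(["a", "b"])
d.load("a", "R1")            # MOV a,R1
d.store("R1", "b")           # MOV R1,b
print(sorted(d.reg["R1"]))   # ['a', 'b']
print(sorted(d.addr["b"]))   # ['R1', 'memory']
```

After the two moves the register descriptor of R1 lists both a and b, and the address descriptor of b lists both R1 and memory, matching the traces shown above.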

A Code Generation Algorithm

• This code generation algorithm takes a basic block of three-address
code and produces machine code for our target architecture.

Each execution of a procedure is referred to as an activation of the
procedure. If the procedure is recursive, several of its activations may be
alive at the same time.
The lifetime of an activation of a procedure P is the sequence of
steps between the first and the last step in the execution of the procedure
body, including time spent executing procedures called by P.
If a and b are procedure activations, their lifetimes are either
non-overlapping or nested. A procedure is recursive if a new activation
can begin before an earlier activation of the same procedure has ended.
An activation tree depicts the way control enters and leaves activations.
In an activation tree:
i) Each node represents an activation of a
procedure.
ii) The root represents the activation of the main
program.
iii) The node for a is the parent of the node for b
if and only if control flows from activation a to b.
iv) The node for a is to the left of the node for b if and
only if the lifetime of a occurs before the lifetime
of b.
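
An activation tree can be recorded by keeping an explicit control stack while a program runs. The Python below is a small sketch in which the `Node` class, the `activation` decorator and the recursive `fib` example are all made up for the illustration: each call pushes a node on entry and pops it on exit, and the node below it on the stack becomes its parent.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.children = []

root = Node("main")
control_stack = [root]          # live activations, innermost last

def activation(fn):
    """Decorator that records every call of `fn` as a node of the activation tree."""
    def wrapper(*args):
        node = Node(f"{fn.__name__}({', '.join(map(str, args))})")
        control_stack[-1].children.append(node)   # the caller is the parent
        control_stack.append(node)                # push when the activation begins
        try:
            return fn(*args)
        finally:
            control_stack.pop()                   # pop when the activation ends
    return wrapper

@activation
def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)

def preorder(node):
    """Depth-first, left-to-right traversal: the order in which control flows."""
    yield node.name
    for child in node.children:
        yield from preorder(child)

fib(3)
print(list(preorder(root)))
```

The traversal lists main, fib(3), fib(2), fib(1), fib(0), fib(1), which is the order in which the activations began. While fib(0) is running, the control stack holds main, fib(3), fib(2) and fib(0), the path from the root to that node.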


3. Write short notes on

a) copy propagation
b) dead code elimination
c) constant folding

Dead-Code Elimination

• We say that x is dead at a certain point if it is not used after that
point in the block (or in the following blocks).
• If x is dead at the point of the statement x := y op z, this statement
can be safely eliminated without changing the meaning of the
block.
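
Within a single basic block, the definition above leads to a backward scan that tracks which names are still needed. The following Python is a minimal sketch: it assumes statements of the form `(x, y, op, z)` with no side effects, treats operands made only of digits as constants, and takes the set of names used after the block as an input (`live_out`). The function name is hypothetical.

```python
def eliminate_dead_code(block, live_out):
    """Remove statements x := y op z whose result x is dead where it is computed."""
    live = set(live_out)             # names whose values are still needed
    kept = []
    for x, y, op, z in reversed(block):
        if x not in live:
            continue                 # x is dead here: drop the statement
        live.discard(x)              # this statement defines x ...
        live.update(v for v in (y, z) if not v.isdigit())   # ... and uses y and z
        kept.append((x, y, op, z))
    kept.reverse()
    return kept

block = [
    ("t1", "a", "+", "b"),
    ("t2", "a", "*", "b"),    # t2 is never used afterwards
    ("x", "t1", "+", "1"),
]
print(eliminate_dead_code(block, live_out={"x"}))
```

Only the statements defining t1 and x remain, because t2 is dead immediately after its definition.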

Constant folding:

• Constant folding evaluates at compile time any expression whose
operands are all constants and replaces it by its value; for example,
x := 2*3 becomes x := 6.

• A contiguous evaluation of a tree is one that

o first evaluates the left sub-tree, then the right sub-tree, and
finally the root; or
o first evaluates the right sub-tree, then the left
sub-tree, and finally the root.
• In non-contiguous evaluations, we may mix the evaluations of the
sub-trees.
• For any given machine-language program P (for register machines)
that evaluates an expression tree T, we can find an equivalent
program Q such that:
• 1. Q does not have higher cost than P,
• 2. Q uses no more registers than P, and
• 3. Q evaluates the tree in a contiguous fashion.
• This means that every expression tree can be evaluated optimally
by a contiguous program.

Example
• Assume that we have the following machine instructions, and that the
cost of each of them is one unit.
o mov M,Ri
o mov Ri,M
o mov Ri,Rj
o OP M,Ri
o OP Rj,Ri
• Assume that we have only two registers, R0 and R1.
• First, we have to evaluate cost arrays for the tree.

Lisp Interpreter in Rust
From Everand
Lisp Interpreter in Rust
Vishal Patil
1/5 (1)
ME2L Step-By-step GUIDE - Display A List of POs
0% (1)
ME2L Step-By-step GUIDE - Display A List of POs
17 pages
Scrip Cryptotab 2019 Full 1329
67% (9)
Scrip Cryptotab 2019 Full 1329
4 pages
Session 5 - PNP ACG - Understanding Digital Forensics
No ratings yet
Session 5 - PNP ACG - Understanding Digital Forensics
79 pages
Compiler
No ratings yet
Compiler
31 pages
12 Mark Questions With Answer-1
No ratings yet
12 Mark Questions With Answer-1
21 pages
CD Important Questions With Answers
No ratings yet
CD Important Questions With Answers
34 pages
CD Question Bank
No ratings yet
CD Question Bank
56 pages
CS3501 CD QB-UNIT 1
No ratings yet
CS3501 CD QB-UNIT 1
6 pages
Compiler Design Assignment
100% (1)
Compiler Design Assignment
16 pages
Principles of Compiler Design
No ratings yet
Principles of Compiler Design
36 pages
CD-30 Questions With Solution
No ratings yet
CD-30 Questions With Solution
43 pages
CS3501 Compiler Design
No ratings yet
CS3501 Compiler Design
31 pages
1 Principles of Compiler Design
No ratings yet
1 Principles of Compiler Design
89 pages
Unit-I - CD R2021
No ratings yet
Unit-I - CD R2021
60 pages
CD Unit-2
No ratings yet
CD Unit-2
64 pages
CD_UNIT-2
No ratings yet
CD_UNIT-2
64 pages
CD - 2 Marks Questions With Answers
No ratings yet
CD - 2 Marks Questions With Answers
21 pages
CD Unit 1
No ratings yet
CD Unit 1
54 pages
Module 5 Lexical Analyser
No ratings yet
Module 5 Lexical Analyser
10 pages
Chapter 7 Lexical Analysis
No ratings yet
Chapter 7 Lexical Analysis
61 pages
COMPILER lab VIVA
No ratings yet
COMPILER lab VIVA
11 pages
V - Cse - CS3501 - CD - QB - Unit 1
No ratings yet
V - Cse - CS3501 - CD - QB - Unit 1
5 pages
CD Unit 1
No ratings yet
CD Unit 1
35 pages
Notes - IAE-1-CD
No ratings yet
Notes - IAE-1-CD
14 pages
CC Questions
No ratings yet
CC Questions
9 pages
PCD - Theory - Paper Solution - Nov - Dec - 2017
No ratings yet
PCD - Theory - Paper Solution - Nov - Dec - 2017
27 pages
Unit - 1 University Questions
No ratings yet
Unit - 1 University Questions
12 pages
Principles of Compiler Design
100% (2)
Principles of Compiler Design
35 pages
CAT Short Key Material
No ratings yet
CAT Short Key Material
38 pages
Chapter 2 - Lexical Analysis_Regular Expressions(1)
No ratings yet
Chapter 2 - Lexical Analysis_Regular Expressions(1)
27 pages
CSC 318 Class Notes
No ratings yet
CSC 318 Class Notes
21 pages
Overview of Compiler Environment Pass and Phase Phases of Compiler Regular Expression Lexical Analyzer LEX Tool Bootstrapping
No ratings yet
Overview of Compiler Environment Pass and Phase Phases of Compiler Regular Expression Lexical Analyzer LEX Tool Bootstrapping
35 pages
CD ppt1
No ratings yet
CD ppt1
62 pages
CS8602 Compiler Design Two Marks Questions 1
No ratings yet
CS8602 Compiler Design Two Marks Questions 1
22 pages
cd1
No ratings yet
cd1
92 pages
CD QB
No ratings yet
CD QB
9 pages
Unity University: Department of Computer Sciences
No ratings yet
Unity University: Department of Computer Sciences
4 pages
Compiler Design Two Marks
50% (2)
Compiler Design Two Marks
17 pages
SSCD Chapter3
No ratings yet
SSCD Chapter3
97 pages
Question bank part A, part B&C
No ratings yet
Question bank part A, part B&C
15 pages
Cs1352 Principles of Compiler Design
No ratings yet
Cs1352 Principles of Compiler Design
33 pages
PCD
No ratings yet
PCD
14 pages
Compiler Design


VMKV ENGINEERING COLLEGE

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING

PRINCIPLES OF COMPILER DESIGN

2. Name the different phases of a compiler?

3. What is a symbol table?

4. What is a token and give examples?

5. What is a lexeme? Give an example.

6. Write a short note on the error handler.

7. What is three-address code? Give an example.

9. Differentiate between star closure and positive closure.

Star closure Positive closure

The syntax tree for the above statement is,

It is also called parsing. In this phase, the tokens generated by the

11. What are the issues of the lexical analyzer?

13. Define parse tree

14. What are the two parts of compilation process?

15. List the compiler construction tools.

19. What is meant by lexical analysis?

1. Explain specification of tokens in detail.

• A finite set of symbols (ASCII characters)

• A finite sequence of symbols over an alphabet

• Sets of strings over some fixed alphabet

• Concatenation: xy represents the concatenation of strings x and y.

• Prefix of s: a string obtained by removing zero or more trailing

o L+ = one or more occurrences
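
The difference between star closure (zero or more occurrences) and positive closure (one or more occurrences) can be illustrated with a small sketch. Both closures are infinite for any language with a non-empty string, so this hypothetical helper `closure` only enumerates strings up to an assumed length bound `max_len`; the function name and parameters are illustrative and not part of the original notes.

```python
def closure(language, max_len, positive=False):
    """Return the strings of L* (or L+ if positive=True) of length <= max_len.

    `language` is a finite set of strings. The length bound is an assumption
    made only so that the result is finite and can be printed.
    """
    words = {w for w in language if len(w) <= max_len}
    # L* always contains the empty string; L+ contains it only if L does.
    result = set(words) if positive else {""} | words
    frontier = set(words)
    while frontier:
        next_frontier = set()
        for prefix in frontier:
            for w in words:
                s = prefix + w
                if len(s) <= max_len and s not in result:
                    next_frontier.add(s)
        result |= next_frontier
        frontier = next_frontier
    return result


if __name__ == "__main__":
    L = {"a", "b"}
    star = closure(L, 2)
    plus = closure(L, 2, positive=True)
    print(sorted(star))
    print(sorted(plus))
    print(star - plus)  # the only difference here is the empty string
```

For a language that does not itself contain the empty string, the two results differ only by the empty string, which is the distinction the closure definitions draw.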

• We use regular expressions to describe tokens of a programming

• Regular expressions over alphabet Σ

Reg. Expr      Language it denotes

(r1) | (r2)    L(r1) ∪ L(r2)

(r1) (r2)      L(r1) L(r2)

• We may remove parentheses by using precedence rules.

• Writing a regular expression for some languages can be difficult,

d2 → r2    ri is a regular expression over

• letter → A | B | ... | Z | a | b | ... | z

• If we try to write the regular expression representing identifiers

• Ex: Unsigned numbers in Pascal
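
Regular definitions such as `letter` can be composed into token patterns. The sketch below uses Python's `re` module and assumes the usual textbook definitions, which are not shown in full above: an identifier is a letter followed by letters or digits, and an unsigned number is a digit string with an optional fraction and an optional exponent. The names `ID`, `UNSIGNED`, `is_identifier` and `is_unsigned_number` are illustrative.

```python
import re

# Regular definitions, built up from smaller named pieces.
LETTER = r"[A-Za-z]"
DIGIT = r"[0-9]"
DIGITS = rf"{DIGIT}+"

# Assumed definition: id -> letter (letter | digit)*
ID = re.compile(rf"{LETTER}(?:{LETTER}|{DIGIT})*")

# Assumed definition: num -> digits (. digits)? (E (+|-)? digits)?
UNSIGNED = re.compile(rf"{DIGITS}(?:\.{DIGITS})?(?:E[+-]?{DIGITS})?")


def is_identifier(text):
    """True if the whole string matches the identifier pattern."""
    return ID.fullmatch(text) is not None


def is_unsigned_number(text):
    """True if the whole string matches the unsigned-number pattern."""
    return UNSIGNED.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ["count1", "1count", "42", "6.02E23", "6."]:
        print(sample, is_identifier(sample), is_unsigned_number(sample))
```

Naming the pieces first keeps the final patterns readable, which is the motivation for regular definitions in the first place.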

1. Briefly explain operator precedence parsing with an example.

E → E+E | E*E | E^E | id | (E)

precedence: ^ (right to left)

o A ⇒+ Aα for some string α

• Top-down parsing techniques cannot handle left-recursive

A → Aα | β    where β does not start with A

• eliminate immediate left recursion

A → Aα1 | ... | Aαm | β1 | ... | βn    where β1 ... βn do not start with

• eliminate immediate left recursion

A’ → α1 A’ | ... | αm A’ | ε    an equivalent grammar
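
The transformation above can be sketched in a few lines of Python. This is a minimal illustration, assuming each right-hand side is represented as a list of symbols with the empty list standing for ε; the function name `eliminate_immediate_left_recursion` and the primed naming of the new non-terminal are illustrative choices. It does not handle a production of the form A → A.

```python
def eliminate_immediate_left_recursion(nonterminal, productions):
    """Rewrite A -> A a1 | ... | A am | b1 | ... | bn as
         A  -> b1 A' | ... | bn A'
         A' -> a1 A' | ... | am A' | epsilon

    `productions` is a list of right-hand sides, each a list of symbols;
    the empty list [] stands for epsilon.
    """
    # The alpha parts: what follows the leading A in each recursive alternative.
    alphas = [rhs[1:] for rhs in productions if rhs and rhs[0] == nonterminal]
    # The beta parts: alternatives that do not start with A.
    betas = [rhs for rhs in productions if not rhs or rhs[0] != nonterminal]
    if not alphas:
        return {nonterminal: productions}  # nothing to eliminate
    new_nt = nonterminal + "'"
    return {
        nonterminal: [beta + [new_nt] for beta in betas],
        new_nt: [alpha + [new_nt] for alpha in alphas] + [[]],
    }


if __name__ == "__main__":
    # E -> E + T | T
    grammar = eliminate_immediate_left_recursion("E", [["E", "+", "T"], ["T"]])
    for head, alternatives in grammar.items():
        for rhs in alternatives:
            print(head, "->", " ".join(rhs) if rhs else "epsilon")
```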

Immediate Left-Recursion – Example

• eliminate immediate left recursion

• A grammar may not be immediately left-recursive, but it still can be

A → Sc | d    This grammar is not immediately left-recursive,

A ⇒ Sc ⇒ Aac    causes a left recursion

Eliminate Left-Recursion – Example

- Replace A → Sd with A → Aad | bd

So, we will have A → Ac | Aad | bd | f

Eliminate Left-Recursion – Example2

- Replace S → Aa with S → SdA’a | fA’a

So, we will have S → SdA’a | fA’a | b

So, the resulting equivalent grammar which is not left-recursive is:

2. Briefly explain predictive parsing with an example.

a grammar → → a grammar suitable for

A → α1 | ... | αn    input: ... a .......

• When we are trying to write the non-terminal stmt, if the current

Recursive Predictive Parsing

Ex: A → aBb (This is only the production rule for A)

When to apply ε-productions.

• If all other productions fail, we should apply an ε-production. For

Recursive Predictive Parsing (Example)

Non-Recursive Predictive Parsing -- LL(1) Parser

[Figure: non-recursive predictive parser with its stack and output]

• a production rule representing a step of the derivation sequence

• at the bottom of the stack, there is a special end marker symbol $.

• a two-dimensional array M[A,a]

LL(1) Parser – Parser Actions

There are four possible parser actions.

4. none of the above
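
The parser actions can be sketched as a table-driven loop. This is a minimal illustration and makes several assumptions: the stack starts with $ and the start symbol, the parsing table M[A,a] is a Python dict keyed by (non-terminal, terminal) pairs, and the example grammar S → aBa, B → bB | ε with its table is chosen here for illustration only. The function name `ll1_parse` is hypothetical.

```python
def ll1_parse(table, start, tokens):
    """Table-driven LL(1) parse; returns the productions applied, in order.

    `table` maps (nonterminal, lookahead) to a right-hand side given as a
    list of symbols ([] means epsilon). Raises SyntaxError on bad input.
    """
    nonterminals = {nt for nt, _ in table}
    stack = ["$", start]          # end marker at the bottom of the stack
    tokens = list(tokens) + ["$"]  # end marker at the end of the input
    pos = 0
    output = []
    while True:
        top, look = stack[-1], tokens[pos]
        if top == "$" and look == "$":
            return output                    # action 1: accept
        if top not in nonterminals and top == look:
            stack.pop()                      # action 2: match a terminal
            pos += 1
        elif (top, look) in table:
            rhs = table[(top, look)]         # action 3: expand a non-terminal
            stack.pop()
            stack.extend(reversed(rhs))      # leftmost symbol ends up on top
            output.append((top, rhs))
        else:
            # action 4: none of the above, so report an error
            raise SyntaxError(f"unexpected {look!r} at position {pos}")


# Assumed example grammar: S -> aBa, B -> bB | epsilon
TABLE = {
    ("S", "a"): ["a", "B", "a"],
    ("B", "b"): ["b", "B"],
    ("B", "a"): [],
}

if __name__ == "__main__":
    for head, rhs in ll1_parse(TABLE, "S", "abba"):
        print(head, "->", " ".join(rhs) if rhs else "epsilon")
```

Pushing the right-hand side in reverse keeps its leftmost symbol on top of the stack, so the productions are emitted in the order of a leftmost derivation.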

LL(1) Parser – Example1

8. Draw the DAG for a:=b-c+b-c