Unit-2_25_09_2024
Engineering &Technology
Dr. M.Gangappa
Associate Professor
Email: gangappa_m@vnrvjiet
by Dr M.Gangappa , Department of Computer Science & Engineering, VNRVJIET, Hyderabad September 25, 2024 1
Overview of Compilation: Phases of Compilation – Lexical
Analysis, Pass and Phases of translation, interpretation,
bootstrapping, data structures in compilation.
Context-free Grammars and Parsing: Context free
grammars, derivation, parse trees, ambiguity, LL(K) grammars
and LL (1) parsing, bottom-up parsing, handle pruning, LR
Grammar Parsing, LALR parsing, YACC programming
specification.
1. Lexical analysis: This is the first phase of the compiler, which converts the high-level input
program into a sequence of tokens. A token is a sequence of characters that is treated as
a single unit in the grammar of the programming language. Lexical analysis can be implemented with a
deterministic finite automaton. The output is a sequence of tokens that is sent to the parser for syntax analysis.
The lexical analyzer is also known as the scanner. The process of lexical analysis is known as linear
analysis or scanning.
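As a rough illustration of scanning, the sketch below turns a character stream into (token class, lexeme) pairs using regular expressions. The token classes, the keyword list and the name `tokenize` are assumptions chosen for this example and are not part of the slides.

```python
import re

# Hypothetical token classes for a tiny C-like language (illustrative only).
# Order matters: keywords must be tried before identifiers.
TOKEN_SPEC = [
    ("KEYWORD",    r"\b(?:int|if|else|return)\b"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("NUMBER",     r"\d+"),
    ("OPERATOR",   r"[+\-*/=<>]"),
    ("PUNCT",      r"[(){},;]"),
    ("SKIP",       r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(source):
    """Return a list of (token_class, lexeme) pairs; raise on a lexical error."""
    tokens, pos = [], 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            # No pattern matches here: this is a lexical error.
            raise SyntaxError(f"lexical error at position {pos}: {source[pos]!r}")
        if match.lastgroup != "SKIP":          # white space produces no token
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(tokenize("int x = 10;"))
```

Each pattern corresponds to a regular language, which is why a finite automaton is enough for this phase.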
Fig 1: Lexical Analyzer
Solution: 10.
MCQS
The lexical analysis for any modern programming language such as Java needs the power
of which one of the following machine models in a necessary and sufficient sense?
(A) Finite state automata
(B) Non-deterministic automata
(C) Deterministic pushdown automata
(D) Turing machine
Solution: (A)
In the compilation process, syntax analysis is the second phase. It is also known as parsing or
hierarchical analysis. This phase analyses the syntactic structure of the token stream and
checks whether the given input is correct in terms of the syntax of the programming language.
The semantic analysis phase of a compiler checks the meaning (type information) of each
statement; its central task is type checking. Type checking can be done at compile
time or at run time. If type checking is done at compile time, it is called static type
checking; otherwise (type checking is done at run time) it is called dynamic type checking.
Once the parse tree has been semantically verified, the intermediate code generator produces
three-address code. The middle-level code that a compiler generates while translating
a source program into object code is known as intermediate code (or intermediate
text).
➢ Intermediate code is neither high-level code nor machine code, but a middle-level
code.
➢ It can be translated to machine code later.
➢ This stage serves as a bridge from analysis to synthesis.
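To make the idea of three-address code concrete, here is a minimal sketch that flattens an expression tree into instructions with at most one operator each. The nested-tuple tree encoding, the temporary names `t1`, `t2`, ... and the function name `three_address_code` are assumptions made for this example.

```python
from itertools import count


def three_address_code(expr):
    """Translate an expression tree into three-address instructions.

    expr is either a string (a name or a constant) or a tuple (op, left, right).
    Returns (list_of_instructions, name_holding_the_result).
    """
    code, temps = [], count(1)

    def emit(node):
        if isinstance(node, str):          # leaf: already an address
            return node
        op, left, right = node
        lhs, rhs = emit(left), emit(right)
        temp = f"t{next(temps)}"           # fresh temporary for this sub-result
        code.append(f"{temp} = {lhs} {op} {rhs}")
        return temp

    return code, emit(expr)


# a + b * c, with * binding tighter than +
code, result = three_address_code(("+", "a", ("*", "b", "c")))
print("\n".join(code))
```

For `a + b * c` the sketch emits `t1 = b * c` followed by `t2 = a + t1`.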
Code optimization is an optional phase. It is used to
improve the intermediate code so that the resulting program runs faster and
consumes less space. To improve the speed of the program, it eliminates unnecessary
lines of code and rearranges the sequence of statements.
The final phase of the compilation process is code generation.
It takes the optimized intermediate code as input and maps it to the
target machine code.
Roles and Responsibilities:
➢ Translate the intermediate code to target machine code.
➢ Select and allocate memory locations and registers.
What is a Symbol Table?
The symbol table is the main data structure of the compiler. It stores
identifiers together with their names and types, which makes searching for and
fetching them easy.
The symbol table interacts with all phases of the compiler and with the error handler,
which read and update it. It is also responsible for scope management.
It stores:
➢ Literal constants and strings.
➢ Function names.
➢ Variable names and constants.
➢ Labels in the source language.
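One common way to combine name lookup with scope management is a stack of dictionaries, one per open scope. The sketch below assumes that design; the class name `SymbolTable` and its method names are hypothetical and only meant to illustrate the idea.

```python
class SymbolTable:
    """A scoped symbol table: a stack of dictionaries, innermost scope last."""

    def __init__(self):
        self.scopes = [{}]                 # start with the global scope

    def enter_scope(self):
        self.scopes.append({})

    def exit_scope(self):
        self.scopes.pop()

    def declare(self, name, type_):
        scope = self.scopes[-1]
        if name in scope:
            raise KeyError(f"{name!r} is already declared in this scope")
        scope[name] = type_

    def lookup(self, name):
        # Search from the innermost scope outwards; None means "undeclared".
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        return None


table = SymbolTable()
table.declare("x", "int")
table.enter_scope()
table.declare("x", "float")                # shadows the outer x
print(table.lookup("x"))
table.exit_scope()
print(table.lookup("x"))
```

Dictionaries give average constant-time insertion and lookup, which is why hash tables are a frequent choice for this structure.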
▪ You will understand the procedure of lexical analysis (LA).
Tokens labelled in the figure, in order: keyword, identifier, comma operator, identifier, assignment operator, number, comma operator, identifier.
Primary task of LA :
▪ The task of the lexical analyzer is divided into two sub-tasks:
❑ Scanning (performed by the scanner)
❑ Lexical analysis proper (performed by the lexical analyzer)
Example
Lexeme and tokens
Lexeme     Token
int        Keyword
maximum    Identifier
(          Operator
int        Keyword
x          Identifier
,          Operator
int        Keyword
y          Identifier
)          Operator
{          Operator
if         Keyword
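Lexical Error
▪ The lexical analyzer uses regular expressions to
define the valid sequences of characters, which are called
lexemes. A lexical error occurs when a sequence of input
characters matches none of these patterns.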
Software that translates high-level instructions to machine-level language line by line,
as opposed to a compiler or an assembler, is known
as an INTERPRETER.
The interpreter checks the source code line by line; if an error is
found on any line, it stops execution until the error is resolved. Error correction is
fairly easy because the interpreter reports errors line by line, but the
program takes more time to execute. Some interpreters translate the source
code into an efficient intermediate representation and execute it immediately.
Context Free Grammar (CFG)
Grammars are used to describe the syntax of a programming language. A grammar specifies the
structure of expressions and statements.
CFG
A context-free grammar is also called a Type 2 grammar.
Definition
A context-free grammar G is defined by a 4-tuple G = (V, T, P, S)
where,
G – Grammar
V – Set of variables (non-terminals)
T – Set of terminals
P – Set of productions
S – Start symbol
A production is of the form LHS → RHS (or) head → body, where the head is a single
non-terminal and the body is a sequence of terminals and non-terminals.
Example
Let G be,
Derivation
❑A derivation is a sequence of sentential forms resulting from the application
of a sequence of productions
S→…→…
Derivation Example
▪ Grammar
E → E + E | E * E | (E) | int
▪ String
int * int + int
Derivation in Detail
Each step replaces one non-terminal; the parse tree grows by the same step.
E
→ E+E
→ E*E+E
→ int * E + E
→ int * int + E
→ int * int + int
Parse tree for this derivation:
            E
          / | \
         E  +  E
       / | \    |
      E  *  E  int
      |     |
     int   int
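Types of Derivations
❑ Types of derivations: leftmost (the leftmost non-terminal is replaced at each step) and rightmost (the rightmost non-terminal is replaced at each step).
E -> E + E | E * E | -E | (E) | id
Leftmost derivation for -(id+id):
E => -E => -(E) => -(E+E) => -(id+E) => -(id+id)
Rightmost derivation for -(id+id):
E => -E => -(E) => -(E+E) => -(E+id) => -(id+id)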
Parse trees
▪ -(id+id) //this is input string
▪ E => -E => -(E) => -(E+E) => -(id+E)=>-(id+id) //this is derivation
Ambiguous grammar
❑For some strings there exist more than one parse tree
❑Or more than one leftmost derivation
❑Or more than one rightmost derivation
❑Example: id+id*id
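Ambiguity can be checked for a concrete string by counting its parse trees. The sketch below does this for the grammar E → E + E | E * E | id only; the function name `count_parse_trees` and the token-list input format are assumptions made for this example, and the approach is not a general ambiguity test.

```python
from functools import lru_cache


def count_parse_trees(tokens):
    """Count the parse trees of `tokens` under E -> E + E | E * E | id."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        """Number of ways E derives tokens[i:j]."""
        if j - i == 1:
            return 1 if tokens[i] == "id" else 0
        total = 0
        for k in range(i + 1, j - 1):          # k = position of the root operator
            if tokens[k] in ("+", "*"):
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))


print(count_parse_trees(["id", "+", "id", "*", "id"]))
```

A count greater than one means the string has more than one parse tree, so the grammar is ambiguous; for id+id*id the two trees correspond to (id+id)*id and id+(id*id).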
Left Recursion
❑ Left recursion in a production may be removed by transforming the grammar in
the following way.
❑ Replace
A → Aα | β
with
A → βA'
A' → αA' | ε
Examples on Left Recursion
▪ Consider the left recursive grammar
E→E+T|T
T→T*F|F
F → (E) | id
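The transformation A → Aα | β into A → βA', A' → αA' | ε can be sketched as a small function and applied to the grammar above. The dictionary encoding of the grammar, the use of an empty list for ε and the name `eliminate_left_recursion` are assumptions of this example; it handles immediate left recursion only.

```python
def eliminate_left_recursion(grammar):
    """Remove immediate left recursion.

    grammar maps each non-terminal to a list of alternatives; every
    alternative is a list of symbols and [] stands for epsilon.
    The new non-terminal is named head + "'" (assumed not to clash).
    """
    result = {}
    for head, alts in grammar.items():
        alphas = [alt[1:] for alt in alts if alt and alt[0] == head]
        betas = [alt for alt in alts if not alt or alt[0] != head]
        if not alphas:                       # no left recursion for this head
            result[head] = alts
            continue
        new = head + "'"
        result[head] = [beta + [new] for beta in betas]           # A  -> beta A'
        result[new] = [alpha + [new] for alpha in alphas] + [[]]  # A' -> alpha A' | eps
    return result


grammar = {
    "E": [["E", "+", "T"], ["T"]],
    "T": [["T", "*", "F"], ["F"]],
    "F": [["(", "E", ")"], ["id"]],
}
for head, alts in eliminate_left_recursion(grammar).items():
    print(head, "->", " | ".join(" ".join(alt) or "ε" for alt in alts))
```

The result is the grammar E → TE', E' → +TE' | ε, T → FT', T' → *FT' | ε, F → (E) | id.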
Left factor
Left factoring is a process by which a grammar with common
prefixes is transformed to make it suitable for top-down parsers.
When two or more alternatives of a non-terminal begin with the same prefix, a top-down
parser cannot decide which production must be chosen to parse the
string in hand. To remove this confusion, we use left factoring.
Problem 1:
S → iEtS | iEtSeS | a
E → b
After left factoring:
S → iEtSS’ | a
S’ → eS | ε
E → b
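One round of left factoring can be sketched as follows: alternatives are grouped by their first symbol, and each group with more than one member is replaced by its longest common prefix followed by a new non-terminal. The function names, the list encoding of alternatives (with [] for ε) and the primed naming scheme are assumptions of this example.

```python
def common_prefix(alternatives):
    """Longest sequence of symbols shared by the start of every alternative."""
    prefix = []
    for symbols in zip(*alternatives):
        if len(set(symbols)) != 1:
            break
        prefix.append(symbols[0])
    return prefix


def left_factor(head, alts):
    """Apply one round of left factoring to the alternatives of `head`."""
    groups = {}
    for alt in alts:                          # group by first symbol
        groups.setdefault(alt[0] if alt else None, []).append(alt)

    result, suffix = {head: []}, ""
    for first, group in groups.items():
        if first is None or len(group) == 1:  # nothing to factor
            result[head].extend(group)
            continue
        suffix += "'"
        new = head + suffix                   # S', S'', ... (assumed unused)
        prefix = common_prefix(group)
        result[head].append(prefix + [new])
        result[new] = [alt[len(prefix):] for alt in group]
    return result


factored = left_factor("S", [list("iEtS"), list("iEtSeS"), ["a"]])
for head, alts in factored.items():
    print(head, "->", " | ".join("".join(alt) or "ε" for alt in alts))
```

For Problem 1 this yields S → iEtSS' | a and S' → ε | eS, the same grammar as above with the alternatives of S' listed in a different order.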
Problem 2:
First() definition:
➢First(α) is a set of terminal symbols that begin the strings derived from α.
Rules for First function
▪ To compute First(X) for all grammar symbols X, apply the following rules until no
more terminals or ɛ can be added to any First set:
1. If X is a terminal then First(X) = {X}.
2. If X is a nonterminal and X -> Y1Y2…Yk is a production for some k>=1, then
place a in First(X) if for some i, a is in First(Yi) and ɛ is in all of
First(Y1),…,First(Yi-1), that is, Y1…Yi-1 => ɛ. If ɛ is in First(Yj) for all j=1,…,k
then add ɛ to First(X).
3. If X -> ɛ is a production then add ɛ to First(X).
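The three rules can be applied repeatedly until nothing changes (a fixed-point iteration). The sketch below assumes a dictionary encoding of the grammar in which every key is a non-terminal, every other symbol is a terminal, and an empty list stands for ε; the name `first_sets` is hypothetical.

```python
EPS = "ε"


def first_sets(grammar):
    """Compute First(X) for every non-terminal X of `grammar`."""
    first = {nt: set() for nt in grammar}

    def first_of(symbol):
        # Rule 1: First of a terminal is the terminal itself.
        return first[symbol] if symbol in grammar else {symbol}

    changed = True
    while changed:                            # repeat until no set grows
        changed = False
        for head, alts in grammar.items():
            for alt in alts:
                add = set()
                for sym in alt:               # rule 2: scan Y1 Y2 ... Yk
                    f = first_of(sym)
                    add |= f - {EPS}
                    if EPS not in f:
                        break
                else:
                    add.add(EPS)              # all Yi derive ε, or alt is ε (rule 3)
                if not add <= first[head]:
                    first[head] |= add
                    changed = True
    return first


grammar = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["(", "E", ")"], ["id"]],
}
for nt, f in first_sets(grammar).items():
    print(nt, sorted(f))
```

The grammar used here is the usual non-left-recursive expression grammar; any grammar in the same encoding can be substituted.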
Find the first() and follow() for the following:
1. S → ABCDE
A → b | ε
B → c | ε
C → d
D → e | ε
E → g | f
2. S → Bb | Cd
B → aB | ε
C → cC | ε
3. S → ACB | CbB | Ba
A → ab | BC
B → g | ε
C → h | ε
Follow Function
The parser or syntactic analyzer obtains a sequence of tokens from the
lexical analyzer and verifies that the string can be generated by the
grammar for the source language. It reports any syntax errors in the
program. It also recovers from commonly occurring errors so that it can
continue processing its input.
1. It verifies the structure generated by the tokens based on the
grammar.
2. It constructs the parse tree.
3. It reports the errors.
4. It performs error recovery.
Parsing Techniques
A top-down parser builds the parse tree from the top (root) to the
bottom (leaves), while a bottom-up parser starts from the leaves and works
up to the root.
Top-Down parsing
Top-down parser attempts to construct the parse tree beginning from the
root node and working towards the leaves
Pre-processing Steps for predictive parser
Step 1 : Eliminate left recursion.
Step 2 : Make sure the grammar is unambiguous.
Step 3 : Left-factor the grammar if it has common
prefixes.
Predictive parsing Table construction
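Construct the predictive parsing Table for the grammar
E → TE’
E’ → +TE’ | ε
T → FT’
T’ → *FT’ | ε
F → ( E ) | id
Step1 : Compute First and Follow
Variable | First | Follow
E | { (, id } | { ), $ }
E’ | { +, ε } | { ), $ }
T | { (, id } | { +, ), $ }
T’ | { *, ε } | { +, ), $ }
F | { (, id } | { +, *, ), $ }
Step2 : Predictive parsing Table (an empty cell is an error entry)
Variable | id | + | * | ( | ) | $
E | E → TE’ | | | E → TE’ | |
E’ | | E’ → +TE’ | | | E’ → ε | E’ → ε
T | T → FT’ | | | T → FT’ | |
T’ | | T’ → ε | T’ → *FT’ | | T’ → ε | T’ → ε
F | F → id | | | F → ( E ) | |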
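The table can be filled mechanically: for each production A → α, place it in M[A, a] for every terminal a in First(α), and if ε is in First(α), also in M[A, b] for every b in Follow(A). The sketch below takes the First and Follow sets from Step 1 as given; the dictionary encoding, the ASCII apostrophe in `E'` and `T'`, and the names `first_of_string` and `build_table` are assumptions of this example.

```python
EPS = "ε"

GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],            # [] stands for ε
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["(", "E", ")"], ["id"]],
}
# First and Follow sets copied from Step 1.
FIRST = {"E": {"(", "id"}, "E'": {"+", EPS}, "T": {"(", "id"},
         "T'": {"*", EPS}, "F": {"(", "id"}}
FOLLOW = {"E": {")", "$"}, "E'": {")", "$"}, "T": {"+", ")", "$"},
          "T'": {"+", ")", "$"}, "F": {"+", "*", ")", "$"}}


def first_of_string(symbols):
    """First set of a sequence of grammar symbols."""
    out = set()
    for sym in symbols:
        f = FIRST.get(sym, {sym})            # a terminal is its own First set
        out |= f - {EPS}
        if EPS not in f:
            return out
    out.add(EPS)                             # every symbol can vanish
    return out


def build_table():
    """Return M as a dict {(non_terminal, terminal): production body}."""
    table = {}
    for head, alts in GRAMMAR.items():
        for alt in alts:
            f = first_of_string(alt)
            targets = (f - {EPS}) | (FOLLOW[head] if EPS in f else set())
            for terminal in targets:
                if (head, terminal) in table:
                    raise ValueError(f"conflict at M[{head}, {terminal}]: not LL(1)")
                table[head, terminal] = alt
    return table


for (head, terminal), body in sorted(build_table().items()):
    print(f"M[{head}, {terminal}] = {head} -> {' '.join(body) or EPS}")
```

A cell that would receive two productions raises an error, which is exactly the condition under which a grammar is not LL(1).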
Predictive parsing Table construction
S → iEtS | iEtSeS | a
E → b
Solution
Do left factor
S → iEtSS’ | a
S’ → eS | ε
E → b
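LL(K) grammar
“A context free grammar is called LL(k), for left-to-right scan
producing a leftmost derivation with k symbols of lookahead, if we
can always make a correct decision by checking at most the first
k symbols of the remaining input w.”
Consider the grammar with the following productions.
S → aaB | aaC
B → b
C → c
Which of the following options is true?
Both alternatives of S begin with aa, so the parser must see three symbols (aab or aac) to choose between them: the grammar is LL(3) but not LL(1) or LL(2).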
Predictive parser or LL(1)
• A predictive parser can be built by maintaining a stack explicitly.
• The predictive parser has an input buffer, a stack containing a sequence
of grammar symbols, a parsing table and an output stream.
• The input buffer contains the string to be parsed followed by $.
• Initially the stack contains $ at the bottom with the start symbol of the
grammar on top.
• The parsing table deterministically selects the correct production
to be used.
Working principle of predictive parser or LL(1) parser
▪ A stack of grammar symbols ($ at the bottom)
▪ A string of input tokens ($ at the end)
▪ A parsing table M[NT, T] of productions
▪ Algorithm (top = symbol on top of the stack, input = current input symbol):
1) if top == input == $ then accept
2) if top is a terminal and top == input then
pop top of the stack; advance to the next input symbol;
goto 1
3) if top is a non-terminal
if M[top, input] is a production then pop top and push the right-hand side of the production in reverse order (leftmost symbol on top);
goto 1
else parsing error
4) else parsing error
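The algorithm above can be sketched as a table-driven loop. The table below is written out by hand for the grammar E → TE', E' → +TE' | ε, T → FT', T' → *FT' | ε, F → (E) | id; the names `TABLE`, `NONTERMINALS` and `ll1_parse`, the ASCII apostrophes and the choice to return the list of productions used are assumptions of this example.

```python
# M[non-terminal, terminal] -> right-hand side ([] stands for ε).
TABLE = {
    ("E", "id"): ["T", "E'"], ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"], ("E'", ")"): [], ("E'", "$"): [],
    ("T", "id"): ["F", "T'"], ("T", "("): ["F", "T'"],
    ("T'", "+"): [], ("T'", "*"): ["*", "F", "T'"],
    ("T'", ")"): [], ("T'", "$"): [],
    ("F", "id"): ["id"], ("F", "("): ["(", "E", ")"],
}
NONTERMINALS = {"E", "E'", "T", "T'", "F"}


def ll1_parse(tokens, start="E"):
    """Return the productions used, in order, or raise SyntaxError."""
    stack = ["$", start]                     # $ at the bottom, start symbol on top
    tokens = list(tokens) + ["$"]
    pos, output = 0, []
    while True:
        top, look = stack[-1], tokens[pos]
        if top == "$" and look == "$":       # step 1: accept
            return output
        if top not in NONTERMINALS:          # step 2: match a terminal
            if top != look:
                raise SyntaxError(f"expected {top!r} but found {look!r}")
            stack.pop()
            pos += 1
        else:                                # step 3: expand a non-terminal
            body = TABLE.get((top, look))
            if body is None:
                raise SyntaxError(f"no entry M[{top}, {look}]")
            stack.pop()
            stack.extend(reversed(body))     # leftmost symbol ends up on top
            output.append((top, body))


for head, body in ll1_parse(["id", "+", "id"]):
    print(head, "->", " ".join(body) or "ε")
```

The productions are printed in the order of a leftmost derivation of the input.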
Example of Parsing the input string id + id
(1) E → TE’
(2) E’ → +TE’
(3) E’ → ε
(4) T → FT’
(5) T’ → *FT’
(6) T’ → ε
(7) F → (E)
(8) F → id
Stack | Input string | Output
$E | id + id$ |
$E’T | id + id$ | E → TE’
$E’T’F | id + id$ | T → FT’
$E’T’id | id + id$ | F → id
$E’T’ | + id$ | pop id
$E’ | + id$ | T’ → ε
$E’T+ | + id$ | E’ → +TE’
$E’T | id$ | pop +
$E’T’F | id$ | T → FT’
$E’T’id | id$ | F → id
$E’T’ | $ | pop id
$E’ | $ | T’ → ε
$ | $ | E’ → ε, then accept
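Check the grammar is LL(1) or not – example 2
S → iEtS | iEtSeS | a
E → b
Solution
Do left factor
S → iEtSS’ | a
S’ → eS | ε
E → b
First(S’) = { e, ε } and Follow(S’) = { e, $ } both contain e, so M[S’, e] receives two entries (S’ → eS and S’ → ε) and the grammar is not LL(1).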
LL(1) grammar – example 2
Checking LL(1) grammar
Bottom-Up parser
❑ Bottom-Up Parser : Constructs a parse tree for an input string beginning at the
leaves(the bottom) and working up towards the root(the top).
❑ Bottom-up parsing is also known as shift-reduce parsing because its two main actions
are shift and reduce.
❑ “handle” is a substring that matches the right side of the production, and whose
reduction to non-terminal on the left side of the production represents one step along
the reverse of a rightmost derivation.
Handle pruning
❑ Handle pruning is the process of detecting handle and applying reductions.
A Shift-Reduce Parser
❑ There are four possible actions of a shift-reduce parser:
1. Shift: The next input symbol is shifted onto the top of the stack.
2. Reduce: Replace the handle on the top of the stack by the non-terminal.
3. Accept: Parser announces successful completion of parsing.
4. Error: Parser discovers a syntax error, and calls an error recovery routine.
A Stack Implementation of A Shift-Reduce Parser
Bottom up parser can be constructed using one of the two
methods:
1. Operator precedence parser
2. LR parser
LR Parsers
▪ LR parsers
❑ cover a wide range of grammars.
❑ LR parsing methods
❑ SLR – simple LR parser
❑ CLR – canonical LR parser
❑ LALR – lookahead LR parser
❑ SLR, CLR and LALR parsers work the same way (they use the same
parsing algorithm); only their parsing tables are different.
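Block diagram of LR Parser
[Figure: the input buffer a1 ... ai ... an $ feeds the LR parsing algorithm, which produces the output. The stack holds S0 X1 S1 ... Xm-1 Sm-1 Xm Sm with Sm on top. The algorithm consults two tables indexed by state: the Action table (columns: terminals and $; entries: four different actions) and the Goto table (columns: non-terminals; each entry is a state number).]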
A Configuration of LR Parsing Algorithm
▪ A configuration of an LR parser is: ( So X1 S1 ... Xm Sm, ai ai+1 ... an $ ), that is, the stack contents followed by the remaining input.
▪ Sm and ai decide the parser action by consulting the parsing action table. (Initially the stack contains just So.)
1. Shift s -- shift the next input symbol and the state s onto the stack
2. Reduce A → β -- replace the handle β on top of the stack by A
3. Accept -- parsing completed successfully
4. Error -- Parser detected an error (an empty entry in the action table)
Reduce Action
▪ pop 2|β| items (where r = |β|) from the stack; let us assume that β = Y1Y2...Yr
▪ then push A and s where s = goto[Sm-r, A]
( So X1 S1 ... Xm-r Sm-r Y1 Sm-r+1 ... Yr Sm, ai ai+1 ... an $ ) ➔ ( So X1 S1 ... Xm-r Sm-r A s, ai ... an $ )
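1) E → E+T
2) E → T
3) T → T*F
4) T → F
5) F → (E)
6) F → id
Action table columns: id + * ( ) $; Goto table columns: E T F (an empty cell is an error entry)
state | id | + | * | ( | ) | $ | E | T | F
0 | s5 | | | s4 | | | 1 | 2 | 3
1 | | s6 | | | | acc | | |
2 | | r2 | s7 | | r2 | r2 | | |
3 | | r4 | r4 | | r4 | r4 | | |
4 | s5 | | | s4 | | | 8 | 2 | 3
5 | | r6 | r6 | | r6 | r6 | | |
6 | s5 | | | s4 | | | | 9 | 3
7 | s5 | | | s4 | | | | | 10
8 | | s6 | | | s11 | | | |
9 | | r1 | s7 | | r1 | r1 | | |
10 | | r3 | r3 | | r3 | r3 | | |
11 | | r5 | r5 | | r5 | r5 | | |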
Parse the input string id*id+id by shift reduce (bottom up) parser
stack | input | action | output
0 | id*id+id$ | shift 5 |
0id5 | *id+id$ | reduce by F→id | F→id
0F3 | *id+id$ | reduce by T→F | T→F
0T2 | *id+id$ | shift 7 |
0T2*7 | id+id$ | shift 5 |
0T2*7id5 | +id$ | reduce by F→id | F→id
0T2*7F10 | +id$ | reduce by T→T*F | T→T*F
0T2 | +id$ | reduce by E→T | E→T
0E1 | +id$ | shift 6 |
0E1+6 | id$ | shift 5 |
0E1+6id5 | $ | reduce by F→id | F→id
0E1+6F3 | $ | reduce by T→F | T→F
0E1+6T9 | $ | reduce by E→E+T | E→E+T
0E1 | $ | accept |
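The trace above can be reproduced with a short table-driven driver. In this sketch the action and goto tables are transcribed from the SLR table for the expression grammar; the names `ACTION`, `GOTO`, `PRODUCTIONS` and `lr_parse` are assumptions of this example. As a simplification, the stack keeps only state numbers (the grammar symbols are implied by the states), so a reduction pops |β| entries instead of 2|β|.

```python
# Production number -> (head, length of the right-hand side).
PRODUCTIONS = {1: ("E", 3), 2: ("E", 1), 3: ("T", 3),
               4: ("T", 1), 5: ("F", 3), 6: ("F", 1)}

ACTION = {
    0: {"id": "s5", "(": "s4"},
    1: {"+": "s6", "$": "acc"},
    2: {"+": "r2", "*": "s7", ")": "r2", "$": "r2"},
    3: {"+": "r4", "*": "r4", ")": "r4", "$": "r4"},
    4: {"id": "s5", "(": "s4"},
    5: {"+": "r6", "*": "r6", ")": "r6", "$": "r6"},
    6: {"id": "s5", "(": "s4"},
    7: {"id": "s5", "(": "s4"},
    8: {"+": "s6", ")": "s11"},
    9: {"+": "r1", "*": "s7", ")": "r1", "$": "r1"},
    10: {"+": "r3", "*": "r3", ")": "r3", "$": "r3"},
    11: {"+": "r5", "*": "r5", ")": "r5", "$": "r5"},
}
GOTO = {0: {"E": 1, "T": 2, "F": 3}, 4: {"E": 8, "T": 2, "F": 3},
        6: {"T": 9, "F": 3}, 7: {"F": 10}}


def lr_parse(tokens):
    """Return the production numbers in order of reduction, or raise SyntaxError."""
    states = [0]                              # stack of state numbers only
    tokens = list(tokens) + ["$"]
    pos, output = 0, []
    while True:
        action = ACTION[states[-1]].get(tokens[pos])
        if action is None:                    # empty entry: error
            raise SyntaxError(f"unexpected {tokens[pos]!r} in state {states[-1]}")
        if action == "acc":
            return output
        if action[0] == "s":                  # shift: push the new state
            states.append(int(action[1:]))
            pos += 1
        else:                                 # reduce by production n
            n = int(action[1:])
            head, length = PRODUCTIONS[n]
            del states[-length:]              # pop one state per symbol of the body
            states.append(GOTO[states[-1]][head])
            output.append(n)


print(lr_parse(["id", "*", "id", "+", "id"]))
```

Read backwards, the sequence of reductions is a rightmost derivation of the input.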
Constructing SLR Parsing Tables – LR(0) Item
▪ An LR(0) item of a grammar G is a production of G with a dot at some position of the right side.
▪ Ex: A → aBb Possible LR(0) items (four different possibilities):
A → .aBb
A → a.Bb
A → aB.b
A → aBb.
▪ Sets of LR(0) items will be the states of the action and goto tables of the SLR parser.
▪ A collection of sets of LR(0) items (the canonical LR(0) collection) is the basis for constructing SLR
parsers.
▪ Augmented Grammar:
G’ is G with a new production rule S’→S where S’ is the new starting symbol.
The Closure Operation
▪ If I is a set of LR(0) items for a grammar G, then closure(I) is the set of LR(0) items constructed from I
by the two rules:
1. Initially, every LR(0) item in I is added to closure(I).
2. If A → α.Bβ is in closure(I) and B → γ is a production rule of G, then B → .γ will be in
closure(I). We apply this rule until no more new LR(0) items can be added to closure(I).
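The two closure rules translate directly into a worklist loop. In this sketch an LR(0) item is a tuple (head, body, dot position); that encoding, the augmented expression grammar used as test data and the function name `closure` are assumptions made for illustration.

```python
# Augmented expression grammar; every key is a non-terminal.
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}


def closure(items):
    """Return closure(I) for a set of items (head, body, dot)."""
    result = set(items)                       # rule 1: start with I itself
    work = list(items)
    while work:                               # rule 2: repeat until nothing is new
        head, body, dot = work.pop()
        if dot < len(body) and body[dot] in GRAMMAR:
            symbol = body[dot]                # the non-terminal right after the dot
            for production in GRAMMAR[symbol]:
                item = (symbol, production, 0)
                if item not in result:
                    result.add(item)
                    work.append(item)
    return frozenset(result)


for head, body, dot in sorted(closure({("E'", ("E",), 0)})):
    print(head, "->", " ".join(body[:dot]) + "." + " ".join(body[dot:]))
```

Applied to the single item E' → .E, the function returns the seven items of the initial state of the canonical collection.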
Goto Operation
▪ If I is a set of LR(0) items and X is a grammar symbol (terminal or non-terminal), then goto(I,X) is defined as
follows:
▪ If A → α.Xβ is in I then every item in closure({A → αX.β}) will be in goto(I,X).
▪ If I is the set of items that are valid for some viable prefix γ, then goto(I,X) is the set of items that are valid
for the viable prefix γX.
Example:
I = { E’ → .E, E → .E+T, E → .T,
T → .T*F, T → .F,
F → .(E), F → .id }
goto(I,E) = { E’ → E., E → E.+T }
goto(I,T) = { E → T., T → T.*F }
goto(I,F) = { T → F. }
goto(I,() = { F → (.E), E → .E+T, E → .T, T → .T*F, T → .F,
F → .(E), F → .id }
goto(I,id) = { F → id. }
The Canonical LR(0) Collection -- Example
I0: E’ → .E
E → .E+T
E → .T
T → .T*F
T → .F
F → .(E)
F → .id
I1: E’ → E.
E → E.+T
I2: E → T.
T → T.*F
I3: T → F.
I4: F → (.E)
E → .E+T
E → .T
T → .T*F
T → .F
F → .(E)
F → .id
I5: F → id.
I6: E → E+.T
T → .T*F
T → .F
F → .(E)
F → .id
I7: T → T*.F
F → .(E)
F → .id
I8: F → (E.)
E → E.+T
I9: E → E+T.
T → T.*F
I10: T → T*F.
I11: F → (E).
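The collection above can be generated by starting from closure({E' → .E}) and applying goto for every grammar symbol until no new set appears. The sketch below assumes items encoded as (head, body, dot) tuples; the function names are hypothetical, and the states come out in discovery order, which need not match the numbering I0 to I11 used above.

```python
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}
SYMBOLS = ["E", "T", "F", "id", "+", "*", "(", ")"]


def closure(items):
    """closure(I): add B -> .gamma for every non-terminal B after a dot."""
    result, work = set(items), list(items)
    while work:
        head, body, dot = work.pop()
        if dot < len(body) and body[dot] in GRAMMAR:
            for production in GRAMMAR[body[dot]]:
                item = (body[dot], production, 0)
                if item not in result:
                    result.add(item)
                    work.append(item)
    return frozenset(result)


def goto(items, symbol):
    """goto(I, X): move the dot over X wherever possible, then close."""
    moved = {(h, b, d + 1) for h, b, d in items if d < len(b) and b[d] == symbol}
    return closure(moved)


def canonical_collection():
    """Return the list of LR(0) item sets, starting with the initial state."""
    start = closure({("E'", ("E",), 0)})
    states, work = [start], [start]
    while work:
        state = work.pop(0)
        for symbol in SYMBOLS:
            target = goto(state, symbol)
            if target and target not in states:   # skip empty and known sets
                states.append(target)
                work.append(target)
    return states


print(len(canonical_collection()), "states")
```

For the expression grammar the sketch finds twelve item sets, matching the twelve states listed above.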
SLR parser
Steps:
1. Augment the given grammar.
2. Construct the canonical collection of LR(0) items.
a) apply closure function
b) apply GOTO function
3. Draw the GOTO graph (DFA)
4. Create the parsing table
5. Stack implementation
6. Draw parse tree
Constructing SLR Parsing Table
(of an augmented grammar G’)
1. Construct the canonical collection of sets of LR(0) items for G’: C = {I0,...,In}
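SLR Parsing Tables of Expression Grammar
Action Table columns: id + * ( ) $; Goto Table columns: E T F (an empty cell is an error entry)
state | id | + | * | ( | ) | $ | E | T | F
0 | s5 | | | s4 | | | 1 | 2 | 3
1 | | s6 | | | | acc | | |
2 | | r2 | s7 | | r2 | r2 | | |
3 | | r4 | r4 | | r4 | r4 | | |
4 | s5 | | | s4 | | | 8 | 2 | 3
5 | | r6 | r6 | | r6 | r6 | | |
6 | s5 | | | s4 | | | | 9 | 3
7 | s5 | | | s4 | | | | | 10
8 | | s6 | | | s11 | | | |
9 | | r1 | s7 | | r1 | r1 | | |
10 | | r3 | r3 | | r3 | r3 | | |
11 | | r5 | r5 | | r5 | r5 | | |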
The next PDF includes the rest of the second unit topics.
References
▪ Compilers: Principles, Techniques, and Tools by
Aho, Sethi, and Ullman, Chapters 1, 2, 3
THANK YOU