CD Unit-3 (1) (R20)
UNIT III
Bottom Up Parsing
Introduction
Bottom-up parsing is applied in the syntax analysis phase of the compiler. A bottom-up parser parses the stream of tokens produced by the lexical analyzer and, if there is no error in the input string, constructs a parse tree as output.
The bottom-up parser builds the parse tree starting from the leaf nodes, i.e., from the bottom of the tree, and gradually proceeds upwards to the root node. In this section, we will discuss bottom-up parsing along with its types.
Bottom-up parsers can be built for the large class of LR grammars, since bottom-up parsing corresponds to the process of reducing the input string to the start symbol of the grammar.
Steps of Reducing a String:
The reduction process is just the reverse of the derivation we have seen in top-down parsing. Thus, bottom-up parsing derives the input string in reverse.
Top-down vs Bottom-up:
1. Top-down constructs the tree from the root to the leaves; bottom-up constructs the tree from the leaves to the root.
2. Top-down "guesses" which RHS to substitute for a nonterminal; bottom-up "guesses" which rule to "reduce" by.
3. Top-down produces a leftmost derivation; bottom-up produces a reverse rightmost derivation.
4. Top-down: recursive descent, LL parsers; bottom-up: shift-reduce, LR, LALR, etc.
5. Top-down is easy for humans; bottom-up is "harder" for humans.
Bottom-up parsing is classified into the following types:
1. Shift-Reduce Parsing
2. Operator Precedence Parsing
3. Table Driven LR Parsing
LL vs LR:
1. An LL parser starts with the root nonterminal on the stack; an LR parser ends with the root nonterminal on the stack.
2. LL uses the stack for designating what is still to be expected; LR uses the stack for designating what has already been seen.
3. LL builds the parse tree top-down; LR builds the parse tree bottom-up.
4. LL continuously pops a nonterminal off the stack and pushes the corresponding right-hand side; LR tries to recognize a right-hand side on the stack, pops it, and pushes the corresponding nonterminal.
5. LL reads the terminals when it pops one off the stack; LR reads the terminals while it pushes them on the stack.
6. LL performs a pre-order traversal of the parse tree; LR performs a post-order traversal.
Types of LR Parsers
LR parsing is also a category of Bottom-up parsing. It is generally used to parse the class of
grammars whose size is large. In the LR parsing, "L" stands for scanning of the input left-to-right,
and "R" stands for constructing a rightmost derivation in a reverse way.
"K" stands for the count of input symbols of the look-ahead that are used to make many parsing
decisions.
Similar to predictive parsing the end of the input buffer and end of stack has $.
The input buffer has the input string that has to be parsed.
The stack maintains the sequence of grammar symbols while parsing the input string.
The parsing table is a two-dimensional array that has two entries ‘Go To’ and ‘Action’.
LR parsers have several advantages: they can be built for virtually all programming-language constructs, they detect a syntax error as soon as it is possible to do so on a left-to-right scan, and the class of grammars they handle is a proper superset of the class handled by predictive (LL) parsers.
The LR algorithm requires input, output, a stack, and a parsing table. In all types of LR parsing, the input, output, and stack are the same, but the parsing table differs. The input buffer holds the string to be parsed, followed by the end marker $. The stack holds the sequence of grammar symbols, with the symbol $ at the bottom of the stack.
A parsing table is a two-dimensional array. It contains two parts: the action part and the goto part.
LR (1) Parsing
The various steps involved in the LR (1) Parsing are as follows:
Augment Grammar
An augmented grammar is generated by adding one more production, S' → S, to the given grammar G. It helps the parser identify when to stop parsing and announce acceptance of the input.
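As a small sketch (the list-of-pairs grammar representation here is an assumption of the sketch), augmenting simply prepends the one production S' → S:

```python
def augment(grammar, start):
    # Prepend S' -> S; the parser accepts exactly when it reduces by
    # this production with the input exhausted.
    new_start = start + "'"
    return [(new_start, (start,))] + grammar

g = [("S", ("A", "A")), ("A", ("a", "A")), ("A", ("b",))]
print(augment(g, "S")[0])   # ("S'", ("S",))
```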
Shift reduce parsing is the most general form of bottom-up parsing. Here we have an input
buffer that holds the input string that is scanned by the parser from left to right. There is also a stack that is
used to hold the grammar symbols during parsing.
The bottom of the stack and the right end of the input buffer are marked with $. Initially, before the parsing starts:
The input buffer holds the input string provided by the lexical analyzer.
The stack is empty.
As the parser parses the string from left to right then it shifts zero or more input symbols onto the
stack.
The parser continues to shift input symbols onto the stack until the top of the stack holds a substring that matches the body of a production in the grammar. The substring is then replaced, or reduced, by the corresponding nonterminal.
The parser continues shift-reducing until one of the following conditions occurs:
It identifies an error, or
the stack contains the start symbol of the grammar and the input buffer becomes empty.
1. Shift: This action shifts the next input symbol present on the input buffer onto the top of the
stack.
2. Reduce: This action is performed when the top of the stack holds the right end of a handle and the left end of the handle lies within the stack. The reduce action replaces the entire handle with the nonterminal whose production body matches the replaced substring.
3. Accept: When, at the end of parsing, the input buffer is empty and the stack is left with only the start symbol of the grammar, the parser announces the successful completion of parsing.
4. Error: This action identifies the error and performs an error recovery routine.
Let us take an example of the shift-reduce parser. Consider the string id * id + id, where the grammar for the input string is:
E ->E + T | T
T -> T * F | F
F -> ( E ) | id
Note: In shift-reduce parsing a handle always appears on top of the stack. The handle is a substring that matches the body of a production in the grammar. The handle never appears inside the stack.
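The moves on id * id + id can be checked mechanically. The sketch below replays a hand-chosen sequence of shift/reduce actions (the sequence itself is supplied as an oracle; a real parser would choose the actions from a table) and verifies that every handle is on top of the stack and that the stack ends holding only E:

```python
# Replay shift/reduce moves for id * id + id with the grammar
# E -> E+T | T,   T -> T*F | F,   F -> (E) | id
tokens = ["id", "*", "id", "+", "id"]

# Hand-chosen action sequence; "r LHS RHS..." reduces the top symbols.
moves = [
    "shift", "r F id", "r T F", "shift", "shift", "r F id",
    "r T T * F", "r E T", "shift", "shift", "r F id", "r T F",
    "r E E + T",
]

stack, i = [], 0
for m in moves:
    if m == "shift":
        stack.append(tokens[i])
        i += 1
    else:
        lhs, *rhs = m.split()[1:]
        # the handle must sit on top of the stack, never inside it
        assert stack[-len(rhs):] == rhs, f"no handle {rhs} on {stack}"
        del stack[-len(rhs):]
        stack.append(lhs)

print(stack)   # ['E']  -- reduced to the start symbol: accept
```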
Shift-reduce parsing is a process of reducing a string to the start symbol of its grammar. It uses a stack to hold the grammar symbols and an input tape to hold the string. Shift-reduce parsing performs the two primary actions: shift and reduce.
Example
Grammar:
1. S → S+S
2. S → S-S
3. S → (S)
4. S → a
Input string: a1-(a2+a3)
Parsing table
Stack        Input          Action
$            a1-(a2+a3)$    shift a1
$a1          -(a2+a3)$      reduce by S → a
$S           -(a2+a3)$      shift -
$S-          (a2+a3)$       shift (
$S-(         a2+a3)$        shift a2
$S-(a2       +a3)$          reduce by S → a
$S-(S        +a3)$          shift +
$S-(S+       a3)$           shift a3
$S-(S+a3     )$             reduce by S → a
$S-(S+S      )$             reduce by S → S+S
$S-(S        )$             shift )
$S-(S)       $              reduce by S → (S)
$S-S         $              reduce by S → S-S
$S           $              accept
Parsing Action
The sequence in which parsing actions are performed in operator precedence parsing is:
At first, the $ symbol is added to both ends of the string.
Now we scan the input string from left to right until the first ⋗ is encountered.
Then we scan leftwards over all the equal-precedence (≐) relations until the leftmost ⋖ is encountered.
Everything between the leftmost ⋖ and the rightmost ⋗ is a handle.
If the relation is $ on $, parsing is successful.
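These steps can be sketched with a terminals-only stack and a table of ⋖ / ≐ / ⋗ relations. The grammar E → E+E | E*E | (E) | id and the relation table below (with * binding tighter than +) are illustrative assumptions of this sketch:

```python
# Operator-precedence sketch: '<' is the <. relation, '=' is =., '>' is .>
REL = {
    ("id","+"):">", ("id","*"):">", ("id",")"):">", ("id","$"):">",
    ("+","id"):"<", ("+","+"):">", ("+","*"):"<", ("+","("):"<",
    ("+",")"):">", ("+","$"):">",
    ("*","id"):"<", ("*","+"):">", ("*","*"):">", ("*","("):"<",
    ("*",")"):">", ("*","$"):">",
    ("(","id"):"<", ("(","+"):"<", ("(","*"):"<", ("(","("):"<",
    ("(",")"):"=",
    (")","+"):">", (")","*"):">", (")",")"):">", (")","$"):">",
    ("$","id"):"<", ("$","+"):"<", ("$","*"):"<", ("$","("):"<",
}

def op_precedence_parse(tokens):
    stack, inp, i = ["$"], tokens + ["$"], 0   # terminals only on the stack
    reductions = 0
    while True:
        a, b = stack[-1], inp[i]
        if a == "$" and b == "$":
            return reductions                  # $ on $: success
        rel = REL.get((a, b))
        if rel in ("<", "="):                  # shift the input terminal
            stack.append(b)
            i += 1
        elif rel == ">":                       # a handle ends here: pop it
            last = stack.pop()
            while REL.get((stack[-1], last)) != "<":
                last = stack.pop()             # scan leftwards to the <.
            reductions += 1
        else:
            raise SyntaxError(f"no relation between {a!r} and {b!r}")

print(op_precedence_parse("id + id * id".split()))   # 5 reductions
```

The five reductions for id + id * id are the three E → id reductions, then E → E*E, then E → E+E, matching the precedence of the operators.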
Table-driven LR Parsing
Types of LR Parsers:
1. LR( 1 )
2. SLR( 1 )
3. CLR ( 1 )
4. LALR( 1 )
SLR Parsers
SLR means simple LR. It handles the smallest class of LR grammars and produces tables with few states. An SLR parser is very easy to construct and is similar to LR parsing. The difference between the SLR parser and the LR(0) parser is that the LR(0) parsing table can contain shift-reduce conflicts, because a 'reduce' entry is placed under all terminals in every final state. We solve this problem by entering 'reduce' only under the terminals in FOLLOW of the LHS of the production in the final state. This is called the SLR(1) collection of items.
Construction of SLR Parsing Tables
Steps for constructing the SLR parsing table :
1. Write the augmented grammar
2. Find the LR(0) collection of items
3. Find FOLLOW of the LHS of each production
4. Define two functions, action[list of terminals] and goto[list of non-terminals], in the parsing table
EXAMPLE – Construct the SLR parsing table for the given context-free grammar
S -> AA
A -> aA | b
Solution:
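The collection is normally built by hand; as a cross-check, the LR(0) canonical collection for the augmented grammar S' -> S, S -> AA, A -> aA | b can also be computed with a short Python sketch (the (production index, dot position) item encoding is this sketch's assumption):

```python
GRAMMAR = [
    ("S'", ("S",)),        # 0: augmented production
    ("S",  ("A", "A")),    # 1
    ("A",  ("a", "A")),    # 2
    ("A",  ("b",)),        # 3
]
NONTERMS = {"S'", "S", "A"}

def closure(items):
    # LR(0) closure: while the dot precedes a nonterminal, add its productions
    items = set(items)
    changed = True
    while changed:
        changed = False
        for (pi, dot) in list(items):
            rhs = GRAMMAR[pi][1]
            if dot < len(rhs) and rhs[dot] in NONTERMS:
                for qi, (lhs, _) in enumerate(GRAMMAR):
                    if lhs == rhs[dot] and (qi, 0) not in items:
                        items.add((qi, 0))
                        changed = True
    return frozenset(items)

def goto(items, X):
    # move the dot over X in every item where X follows the dot
    moved = {(pi, dot + 1) for (pi, dot) in items
             if dot < len(GRAMMAR[pi][1]) and GRAMMAR[pi][1][dot] == X}
    return closure(moved) if moved else None

def canonical_collection():
    start = closure({(0, 0)})          # closure of S' -> .S
    states, work = [start], [start]
    while work:
        I = work.pop()
        for X in {GRAMMAR[pi][1][dot] for (pi, dot) in I
                  if dot < len(GRAMMAR[pi][1])}:
            J = goto(I, X)
            if J is not None and J not in states:
                states.append(J)
                work.append(J)
    return states

print(len(canonical_collection()))   # 7 states: I0 .. I6
```

For this grammar the collection has seven states I0–I6, which is what the hand construction yields as well.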
The SLR parser discussed above has certain flaws:
1. On a single input, a state may include both a final item and a non-final item. This may result in a shift-reduce conflict.
2. A state may include two different final items. This might result in a reduce-reduce conflict.
3. An SLR(1) parser reduces only when the next token is in FOLLOW of the left-hand side of the production.
4. SLR(1) can resolve some shift-reduce conflicts but not reduce-reduce conflicts.
CLR refers to canonical LR with lookahead. CLR parsing uses the canonical collection of LR(1) items to build the CLR(1) parsing table. The CLR(1) parsing table produces more states as compared to SLR(1) parsing.
In CLR(1), we place the reduce entries only under the lookahead symbols.
LR (1) item
The lookahead of the augmented production is always $.
Example
CLR ( 1 ) Grammar
1. S → AA
2. A → aA
3. A → b
Add the augmented production, insert the '•' symbol at the first position of every production in G, and also add the lookahead.
1) S` → •S, $
2) S → •AA, $
3) A → •aA, a/b
4) A → •b, a/b
I0 State:
Add all productions starting with S into the I0 state, because the "•" is followed by the non-terminal S. So, the I0 state becomes
I0 = S` → •S, $
S → •AA, $
Add all productions starting with A into the modified I0 state, because the "•" is followed by the non-terminal A. So, the I0 state becomes
I0= S` → •S, $
S → •AA, $
A → •aA, a/b
A → •b, a/b
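The closure computation that produces this I0 can be sketched in Python. Items carry a lookahead, and the lookaheads of the added items come from FIRST of what follows the nonterminal, i.e., FIRST(βa) for an item [A → α•Bβ, a]. The (production index, dot, lookahead) encoding is this sketch's assumption:

```python
GRAMMAR = [
    ("S'", ("S",)),        # 0: augmented production
    ("S",  ("A", "A")),    # 1
    ("A",  ("a", "A")),    # 2
    ("A",  ("b",)),        # 3
]
NONTERMS = {"S'", "S", "A"}

# FIRST sets by fixpoint (this grammar has no epsilon productions)
FIRST = {n: set() for n in NONTERMS}
changed = True
while changed:
    changed = False
    for lhs, rhs in GRAMMAR:
        f = FIRST[rhs[0]] if rhs[0] in NONTERMS else {rhs[0]}
        if not f <= FIRST[lhs]:
            FIRST[lhs] |= f
            changed = True

def first_of(seq):
    # FIRST of a non-empty symbol sequence
    return FIRST[seq[0]] if seq[0] in NONTERMS else {seq[0]}

def closure(items):
    # items are (production index, dot position, lookahead) triples
    items = set(items)
    changed = True
    while changed:
        changed = False
        for (pi, dot, la) in list(items):
            rhs = GRAMMAR[pi][1]
            if dot < len(rhs) and rhs[dot] in NONTERMS:
                lookaheads = first_of(rhs[dot + 1:] + (la,))  # FIRST(beta a)
                for qi, (lhs, _) in enumerate(GRAMMAR):
                    if lhs == rhs[dot]:
                        for b in lookaheads:
                            if (qi, 0, b) not in items:
                                items.add((qi, 0, b))
                                changed = True
    return items

I0 = closure({(0, 0, "$")})
print(len(I0))   # 6 items, matching the I0 listed above
```

The six items are exactly [S' → •S, $], [S → •AA, $], and [A → •aA, •b] with lookaheads a and b.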
Add all productions starting with A into the I2 state, because the "•" is followed by the non-terminal A. So, the I2 state becomes
I2 = S → A•A, $
A → •aA, $
A → •b, $
Add all productions starting with A into the I3 state, because the "•" is followed by the non-terminal A. So, the I3 state becomes
I3 = A → a•A, a/b
A → •aA, a/b
A → •b, a/b
Add all productions starting with A into the I6 state, because the "•" is followed by the non-terminal A. So, the I6 state becomes
I6 = A → a•A, $
A → •aA, $
A → •b, $
Drawing DFA:
1. S → AA ... (1)
2. A → aA ....(2)
3. A → b ... (3)
The placement of the shift entries in the CLR(1) parsing table is the same as in the SLR(1) parsing table. The only difference is in the placement of the reduce entries.
I4 contains the final item (A → b•, a/b), so action[I4, a] = R3 and action[I4, b] = R3.
I5 contains the final item (S → AA•, $), so action[I5, $] = R1.
I7 contains the final item (A → b•, $), so action[I7, $] = R3.
I8 contains the final item (A → aA•, a/b), so action[I8, a] = R2 and action[I8, b] = R2.
I9 contains the final item (A → aA•, $), so action[I9, $] = R2.
The method for building the collection of sets of valid LR(1) items is essentially the same as the one
for building the canonical collection of sets of LR(0) items. We need only to modify the two
procedures CLOSURE and GOTO.
We now introduce our last parser construction method, the LALR (lookahead-LR) technique. This method is often used in practice, because the tables obtained by it are considerably smaller than the canonical LR tables, yet most common syntactic constructs of programming languages can be expressed conveniently by an LALR grammar. The same is almost true for SLR grammars, but there are a few constructs that cannot be conveniently handled by SLR techniques.
For a comparison of parser sizes, the SLR and LALR tables for a grammar always have the same number of states, and this number is typically several hundred for a language like C. The canonical LR table would typically have several thousand states for the same-size language. Thus, it is much easier and more economical to construct SLR and LALR tables than the canonical LR tables.
By way of introduction, let us again consider grammar (4.55), whose sets of LR(1) items were shown in Fig. 4.41. Take a pair of similar-looking states, such as I4 and I7; each has the same core of items and differs only in the lookaheads. Merging states with common cores can, however, create conflicts. Consider the grammar S → aAd | bBd | aBe | bAe, A → c, B → c, which generates the four strings acd, ace, bcd, and bce. The reader can check that the grammar is LR(1) by constructing the sets of items. Upon doing so, we find that merging the states containing A → c• and B → c•
generates a reduce/reduce conflict, since reductions by both A → c and B → c are called for on inputs d and e.
We are now prepared to give the first of two LALR table-construction algorithms. The general idea is to construct the sets of LR(1) items, and if no conflicts arise, merge sets with common cores. We then construct the parsing table from the collection of merged sets of items. The method we are about to describe serves primarily as a definition of LALR(1) grammars. Constructing the entire collection of LR(1) sets of items requires too much space and time to be useful in practice.
INPUT : An augmented grammar G'.
OUTPUT : The LALR parsing-table functions ACTION and GOTO for G'.
METHOD :
The table produced by Algorithm 4.59 is called the LALR parsing table for G. If there are no parsing-action conflicts, then the given grammar is said to be an LALR(1) grammar. The collection of sets of items constructed in step (3) is called the LALR(1) collection.
Example
To see how the GOTOs are computed, consider GOTO(I36, C). In the original sets of LR(1) items, GOTO(I3, C) = I8, and I8 is now part of I89, so we make GOTO(I36, C) be I89. We could have arrived at the same conclusion if we considered I6, the other part of I36. That is, GOTO(I6, C) = I9, and I9 is now part of I89. For another example, consider GOTO(I2, c), an entry that is exercised after the shift action of I2 on input c. In the original sets of LR(1) items, GOTO(I2, c) = I6. Since I6 is now part of I36, GOTO(I2, c) becomes I36. Thus, the entry in Fig. 4.43 for state 2 and input c is made s36, meaning shift and push state 36 onto the stack.
When presented with a string from the language c*dc*d, both the LR parser of Fig. 4.42 and the
LALR parser of Fig. 4.43 make exactly the same sequence of shifts and reductions, although the
names of the states on the stack may differ. For instance, if the LR parser puts I3 or I6 on the stack,
the LALR parser will put I36 on the stack. This relationship holds in general for an LALR grammar.
The LR and LALR parsers will mimic one another on correct inputs.
If we have the LALR(1) kernels, we can generate the LALR(1) parsing table by closing each kernel, using the function CLOSURE of Fig. 4.40, and then computing table entries by Algorithm 4.56, as if the LALR(1) sets of items were canonical LR(1) sets of items.
Example 4.61: We shall use as an example of the efficient LALR(1) table-construction method the non-SLR grammar from Example 4.48, which we reproduce below in its augmented form:
Example
Let us construct the kernels of the LALR(1) items for the grammar of Example 4.61. The kernels of the LR(0) items were shown in Fig. 4.44. When we apply Algorithm 4.62 to the kernel of the set of items I0, we first compute CLOSURE({[S' → •S, #]}), which is
In Fig. 4.47, we show steps (3) and (4) of Algorithm 4.63. The column labeled INIT shows the spontaneously generated lookaheads for each kernel item. These are only the two occurrences of = discussed earlier, and the spontaneous lookahead $ for the initial item S' → •S.
On the first pass, the lookahead $ propagates from S' → •S in I0 to the six items listed in Fig. 4.46. The lookahead = propagates from L → *•R in I4 to the items L → *R• in I7 and R → L• in I8. It also propagates to itself and to L → id• in I5, but these lookaheads are already present.
In the second and third passes, the only new lookahead propagated is $, discovered for the successors of I2 and I4 on pass 2 and for the successor of I6 on pass 3. No new lookaheads are propagated on pass 4, so the final set of lookaheads is shown in the rightmost column of Fig. 4.47.
Note that the shift/reduce conflict found in Example 4.48 using the SLR method has disappeared with the LALR technique. The reason is that only lookahead $ is associated with R → L• in I2, so there is no conflict with the parsing action of shift on = generated by the item S → L•=R in I2.
Compilers and interpreters use grammar to build the data structure in order to process the programs.
So ideally one program should have one derivation tree. A parse tree or a derivation tree is a
graphical representation that shows how strings of the grammar are derived using production rules.
But there exist some strings which are ambiguous.
A grammar is said to be ambiguous if there exists more than one leftmost derivation, more than one rightmost derivation, or more than one parse tree for some input string. An ambiguous grammar or string can have multiple meanings. Ambiguity is often treated as a grammar bug; in programming languages it is mostly unintended.
Dangling-else ambiguity
The dangling-else problem is a syntactic ambiguity. It occurs when we use nested if statements. When there are multiple “if” statements, it is not clear with which “if” the “else” part should combine.
For example:
if (condition)
    if (condition 1)
        statement 1;
else
    statement 2;
Here it is ambiguous whether the “else” belongs to the inner “if (condition 1)” or to the outer “if (condition)”.
To solve the issue, programming languages like C, C++, and Java combine the “else” part with the innermost “if” statement. But sometimes we want the outermost “if” statement to be combined with the “else” part.
Secondly, we can resolve the dangling-else problems in programming languages by using braces
and indentation.
For example:
if (condition) {
if (condition 1) {
if (condition 2) {}
}
}
else {
}
In the above example, we are using braces and indentation so as to avoid confusion.
Third, we can also use the “if – else if – else” format so as to specifically indicate which “else”
belongs to which “if”.
if(condition) {
}
else if(condition-1) {
}
else if(condition-2){
}
else{
}
LR(0) parser
SLR parser
LALR parser
CLR parser
LR parsers report an error when there is no valid continuation for the input scanned thus far. A canonical (CLR) parser never performs even a single reduction before announcing an error; an SLR or LALR parser may perform several reductions, but it will never shift an erroneous input symbol onto the stack.
In LR parsing, an error is detected when the parser consults the table and finds the relevant action entry empty. GOTO entries are never used to detect errors.
Which programmer mistakes invoke which error procedures in the parser table is determined based on the language, by creating error procedures that can alter the top of the stack and/or certain symbols on the input in a way that is acceptable for the table's error entries.
Errors in structure
Missing operator
Misspelled keywords
Unbalanced parenthesis
This approach involves discarding input symbols one by one until one of a designated set of synchronizing tokens is found. Delimiters such as semicolons or closing braces are typical synchronizing tokens. The benefit is that it is simple to implement and guarantees not to fall into an infinite loop. The drawback is that a significant quantity of input is skipped without being checked for additional errors.
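A minimal sketch of the skipping step, with semicolon and closing brace assumed as the synchronizing set for illustration:

```python
SYNC = {";", "}"}   # assumed synchronizing tokens for this sketch

def panic_skip(tokens, i):
    # Discard input symbols one by one until a synchronizing token
    # (or end of input) is reached; parsing resumes from there.
    while i < len(tokens) and tokens[i] not in SYNC:
        i += 1
    return i

print(panic_skip(["x", "+", "*", ";", "y"], 1))   # 3
```

Starting at index 1, the bad tokens "+" and "*" are discarded and the parser resumes at the ";" in position 3.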
Scan the stack until a state ‘a’ with a goto on a certain non-terminal ‘B’ is found (by removing states from the stack).
Zero or more input symbols are then discarded until a symbol ‘b’ that can follow ‘B’ is identified.
E→E+E
E→E*E
E→(E)
E → id
Step 1: First, construct the parsing table for the given grammar:
Parsing Table
String: id+)$
STACK            INPUT    ACTION/REMARK
0                id+)$    shift id
0 id 3           +)$      reduce by E → id
0 E 1            +)$      shift +
0 E 1 + 4        )$       error: the erroneous “)” is discarded from the input
0 E 1 + 4        $        error: missing operand; an imaginary id is inserted
0 E 1 + 4 id 3   $        reduce by E → id
0 E 1 + 4 E 7    $        reduce by E → E + E
0 E 1            $        accept
Example:
Let's take the following ambiguous grammar:
E -> E+E
E -> E*E
E -> id
Let's assume the precedence and associativity of the operators (+ and *) of the grammar are as follows: both + and * are left associative, and * has higher precedence than +.
From the LR(1) item DFA we can see that there are shift/reduce conflicts in the state I5 and I6. So
the parsing table is as follows:
There are both shift and reduce moves in I5 and I6 on “+” and “*”. To resolve this conflict, that is, to determine which move to keep and which to discard from the table, we shall use the precedence and associativity of the operators.
Consider the input string:
id + id + id
Let's look at the parser's moves up to the conflict state according to the above parsing table.
If we take the reduce move of I5 state on symbol “+” as in parser 1, then the left “+” of the input
string is reduced before the right “+”, which makes “+” left associative.
If we take the shift move of I5 state on symbol “+” as in parser 2, then the right “+” of the input
string is reduced before the left “+”, which makes “+” right associative.
Similarly, Taking shift move of I5 state on symbol “*” will give “*” higher precedence over “+”, as
“*” will be reduced before “+”. Taking reduce move of I5 state on symbol “*” will give “+” higher
precedence over “*”, as “+” will be reduced before “*”. Similar to I5, conflicts from I6 can also be
resolved.
According to the precedence and associativity of our example, the conflict is resolved as follows,
The shift/reduce conflict at I5 on “+” is resolved by keeping the reduce move and discarding the
shift move, which makes “+” left associative.
The shift/reduce conflict at I5 on “*” is resolved by keeping the shift move and discarding the
reduce move, which will give “*” higher precedence over “+”.
The shift/reduce conflict at I6 on “+” is resolved by keeping the reduce move and discarding the
shift move, which will give “*” higher precedence over “+”.
The shift/reduce conflict at I6 on “*” is resolved by keeping the reduce move and discarding the
shift move, which makes “*” left associative.
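The resolution rule used above can be written down directly: compare the precedence of the operator already on the stack with the lookahead operator, and use associativity to break ties. A small Python sketch (the precedence and associativity tables are this example's assumptions):

```python
PREC  = {"+": 1, "*": 2}            # "*" binds tighter than "+"
ASSOC = {"+": "left", "*": "left"}

def resolve(stack_op, lookahead):
    """Decide shift vs reduce for a configuration  E stack_op E . lookahead"""
    if PREC[stack_op] > PREC[lookahead]:
        return "reduce"              # stack operator binds tighter
    if PREC[stack_op] < PREC[lookahead]:
        return "shift"               # lookahead binds tighter
    # equal precedence: left associativity reduces, right associativity shifts
    return "reduce" if ASSOC[stack_op] == "left" else "shift"

print(resolve("+", "*"))  # shift  ("*" has higher precedence than "+")
print(resolve("+", "+"))  # reduce ("+" is left associative)
print(resolve("*", "+"))  # reduce ("*" has higher precedence than "+")
```

These three answers reproduce the resolutions chosen for I5 and I6 above.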
Generally, the parser generator tool YACC resolves conflicts due to ambiguous grammars as follows:
A shift/reduce conflict in the parsing table is resolved by giving priority to the shift move: the reduce action is discarded from the conflicting entry.
A reduce/reduce conflict in the parsing table is resolved by giving priority to the reduction by the production that appears earlier in the grammar: the later reduce action is discarded.