03 Parsing

The document discusses parsing methods, specifically top-down and bottom-up parsing, detailing their strategies, attempts, and objectives. It explains the construction of parsing tables using FIRST and FOLLOW sets, and introduces LL(1) grammar and LR parsers, including their transition diagrams and item sets. Additionally, it covers shift-reduce parsing and the process of reducing token strings to grammar symbols.

Uploaded by

126003020
Copyright
© © All Rights Reserved

Parser

• Parsing methods:
– Top-down parsing
– Bottom-up parsing
Top-down parsing vs. bottom-up parsing:

1. Strategy
– Top-down: starts evaluating the parse tree from the top and moves downwards to parse the other nodes.
– Bottom-up: starts evaluating the parse tree from the lowest level of the tree and moves upwards to parse the nodes.
2. Attempt
– Top-down: attempts to find the leftmost derivation for a given string.
– Bottom-up: attempts to reduce the input string to the start symbol of the grammar.
3. Derivation type
– Top-down: uses the leftmost derivation.
– Bottom-up: uses the rightmost derivation (in reverse).
4. Objective
– Top-down: searches for a production rule to be used to construct a string.
– Bottom-up: searches for a production rule to be used to reduce a string to the start symbol of the grammar.
TOP-DOWN PARSING
• Non-recursive predictive parsing
– A predictive parser can be implemented by recursive-descent parsing
(the grammar may need to be manipulated first, e.g. by eliminating left
recursion and left factoring).
• Requirement: by looking at the first terminal symbol that a
nonterminal symbol can derive, we should be able to choose the right
production to expand the nonterminal symbol.
– If the requirement is met, the parser can easily be implemented using a
non-recursive scheme by building a parsing table.
• A parsing table example

(1) E->TE’
(2) E’->+TE’
(3) E’->e
(4) T->FT’
(5) T’->*FT’
(6) T’->e
(7) F->(E)
(8) F->id

      id    +     *     (     )     $
E     (1)               (1)
E’          (2)               (3)   (3)
T     (4)               (4)
T’          (6)   (5)         (6)   (6)
F     (8)               (7)
• Using the parsing table, the predictive parsing program works like this:
– A stack of grammar symbols ($ on the bottom)
– A string of input tokens ($ at the end)
– A parsing table, M[NT, T], of productions
– Algorithm:
– put ‘$ Start’ on the stack ($ marks the bottom of the stack and the end of the
input string).
1) if top == input == $ then accept
2) if top == input then
pop top of the stack; advance to next
input symbol; goto 1;
3) if top is a nonterminal
if M[top, input] is a production then
replace top with the right-hand side of the production
(leftmost symbol on top); goto 1
else error
4) else error
– Example:

(1) E->TE’
(2) E’->+TE’
(3) E’->e
(4) T->FT’
(5) T’->*FT’
(6) T’->e
(7) F->(E)
(8) F->id

      id    +     *     (     )     $
E     (1)               (1)
E’          (2)               (3)   (3)
T     (4)               (4)
T’          (6)   (5)         (6)   (6)
F     (8)               (7)

Stack       Input        Production
$E          id+id*id$
$E’T        id+id*id$    E->TE’
$E’T’F      id+id*id$    T->FT’
$E’T’id     id+id*id$    F->id
$E’T’       +id*id$
…...

This produces the leftmost derivation:

E=>TE’=>FT’E’=>idT’E’=>….=>id+id*id
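The table-driven algorithm can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the names `PRODUCTIONS`, `TABLE` and `ll1_parse` are chosen here for the example, the table entries are copied from the parsing table in the slides, and the empty string e is represented by an empty right-hand side.

```python
# Productions of the expression grammar, numbered as on the slide.
# An empty list stands for the empty string e.
PRODUCTIONS = {
    1: ("E", ["T", "E'"]),
    2: ("E'", ["+", "T", "E'"]),
    3: ("E'", []),
    4: ("T", ["F", "T'"]),
    5: ("T'", ["*", "F", "T'"]),
    6: ("T'", []),
    7: ("F", ["(", "E", ")"]),
    8: ("F", ["id"]),
}

# Parsing table M[nonterminal][terminal] -> production number.
# Missing entries are errors.
TABLE = {
    "E": {"id": 1, "(": 1},
    "E'": {"+": 2, ")": 3, "$": 3},
    "T": {"id": 4, "(": 4},
    "T'": {"+": 6, "*": 5, ")": 6, "$": 6},
    "F": {"id": 8, "(": 7},
}


def ll1_parse(tokens, start="E"):
    """Return the production numbers used, in order, or raise SyntaxError."""
    stack = ["$", start]            # $ marks the bottom of the stack
    tokens = list(tokens) + ["$"]   # $ marks the end of the input
    pos = 0
    used = []
    while True:
        top, look = stack[-1], tokens[pos]
        if top == "$" and look == "$":
            return used                      # step 1: accept
        if top == look:
            stack.pop()                      # step 2: match a terminal
            pos += 1
        elif top in TABLE:
            prod = TABLE[top].get(look)      # step 3: expand a nonterminal
            if prod is None:
                raise SyntaxError(f"no entry M[{top}, {look}]")
            stack.pop()
            # Push the right-hand side reversed so its leftmost symbol is on top.
            stack.extend(reversed(PRODUCTIONS[prod][1]))
            used.append(prod)
        else:
            raise SyntaxError(f"expected {top}, got {look}")  # step 4


print(ll1_parse(["id", "+", "id", "*", "id"]))
```

The returned list of production numbers is the leftmost derivation in order of application, so for id+id*id it starts with productions (1), (4), (8), matching the first rows of the trace.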
• How to construct the parsing table?
– First(a): Here, a is a string of grammar symbols. First(a) is the set of
terminals that begin the strings derived from a. If a is the
empty string or can derive the empty string, then the empty
string is also in First(a).
– Follow(A): Here, A is a nonterminal symbol. Follow(A)
is the set of terminals that can immediately follow A in
a sentential form.
– Example:
S->iEtS | iEtSeS|a
E->b
First(a) = ?, First(iEtS) = ?, First(S) = ?
Follow(E) = ? Follow(S) = ?
• How to construct the parsing table?
– With first(a) and follow(A), we can build the parsing
table. For each production A->a:
• Add A->a to M[A, t] for each t in First(a).
• If First(a) contains empty string
– Add A->a to M[A, t] for each t in Follow(A)
– if $ is in Follow(A), add A->a to M[A, $]
• Make each undefined entry of M error.
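The construction rules can be sketched in Python as below. This is an illustrative sketch under stated assumptions: the FIRST and FOLLOW sets are hard-coded from the values the slides give for the expression grammar, and the names `first_of_string` and `build_table` are made up for this example. Entries are kept as lists so that a multiply-defined entry would show up as a list with more than one production.

```python
EPS = "e"  # the slides write the empty string as e

# (number, left-hand side, right-hand side); [] is the empty string.
PRODUCTIONS = [
    (1, "E", ["T", "E'"]),
    (2, "E'", ["+", "T", "E'"]),
    (3, "E'", []),
    (4, "T", ["F", "T'"]),
    (5, "T'", ["*", "F", "T'"]),
    (6, "T'", []),
    (7, "F", ["(", "E", ")"]),
    (8, "F", ["id"]),
]

# FIRST and FOLLOW sets as listed in the slides (assumed given here).
FIRST = {
    "E": {"(", "id"}, "E'": {"+", EPS}, "T": {"(", "id"},
    "T'": {"*", EPS}, "F": {"(", "id"},
}
FOLLOW = {
    "E": {")", "$"}, "E'": {")", "$"}, "T": {"+", ")", "$"},
    "T'": {"+", ")", "$"}, "F": {"*", "+", ")", "$"},
}


def first_of_string(symbols):
    """FIRST of a string of grammar symbols."""
    result = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})  # FIRST of a terminal is itself
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)  # every symbol can derive e (or the string is empty)
    return result


def build_table():
    """Map (nonterminal, terminal) to the list of applicable productions."""
    table = {}
    for num, lhs, rhs in PRODUCTIONS:
        first = first_of_string(rhs)
        targets = first - {EPS}
        if EPS in first:
            targets |= FOLLOW[lhs]  # FOLLOW already contains $ where needed
        for terminal in targets:
            table.setdefault((lhs, terminal), []).append(num)
    return table


table = build_table()
print(table[("E'", "+")], table[("T'", ")")])
```

Any key of `table` that is absent corresponds to an error entry of M.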
FIRST Set
FOLLOW Set
• Compute FIRST(X)
– If X is a terminal, then FIRST(X) = {X}
– If X->e, add e to FIRST(X)
– If X->Y1 Y2 … Yk and Y1 Y2 … Yi-1 =>* e, where i <= k, add
every non-e symbol in FIRST(Yi) to FIRST(X). If Y1 … Yk =>* e, add e to
FIRST(X).

– FIRST(Y1 Y2 … Yk): similar to the third step.

E->TE’ FIRST(E) = {(, id}
E’->+TE’|e FIRST(E’) = {+, e}
T->FT’ FIRST(T) = {(, id}
T’->*FT’ | e FIRST(T’) = {*, e}
F->(E) | id FIRST(F) = {(, id}
• Compute Follow(A).
– If S is the start symbol, add $ to Follow(S).
– If A->aBb, add First(b)-{e} to Follow(B).
– If A->aB, or A->aBb and b =>* e, add Follow(A) to Follow(B).

E->TE’ First(E) = {(, id}, Follow(E) = {), $}
E’->+TE’|e First(E’) = {+, e}, Follow(E’) = {), $}
T->FT’ First(T) = {(, id}, Follow(T) = {+, ), $}
T’->*FT’ | e First(T’) = {*, e}, Follow(T’) = {+, ), $}
F->(E) | id First(F) = {(, id}, Follow(F) = {*, +, ), $}
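The FIRST and FOLLOW rules can be applied mechanically by iterating until nothing changes. The following Python sketch does this for the expression grammar; the function names and the dictionary encoding of the grammar are assumptions made for this example, and the empty string e is encoded as an empty right-hand side.

```python
EPS = "e"

# Nonterminal -> list of alternatives; [] is the empty string.
GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["(", "E", ")"], ["id"]],
}


def first_of_string(symbols, first):
    """FIRST of a symbol string, given the FIRST sets computed so far."""
    out = set()
    for sym in symbols:
        sym_first = first[sym] if sym in first else {sym}  # terminal: itself
        out |= sym_first - {EPS}
        if EPS not in sym_first:
            return out
    out.add(EPS)
    return out


def compute_first(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:  # repeat until no set grows
        changed = False
        for nt, alternatives in grammar.items():
            for rhs in alternatives:
                new = first_of_string(rhs, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


def compute_follow(grammar, first, start):
    follow = {nt: set() for nt in grammar}
    follow[start].add("$")  # rule 1
    changed = True
    while changed:
        changed = False
        for lhs, alternatives in grammar.items():
            for rhs in alternatives:
                for i, sym in enumerate(rhs):
                    if sym not in grammar:
                        continue  # FOLLOW is defined for nonterminals only
                    rest = first_of_string(rhs[i + 1:], first)
                    new = rest - {EPS}        # rule 2
                    if EPS in rest:
                        new |= follow[lhs]    # rule 3
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow


first = compute_first(GRAMMAR)
follow = compute_follow(GRAMMAR, first, "E")
print(first["E'"], follow["F"])
```

The computed sets agree with the ones listed above, for example Follow(F) = {*, +, ), $}.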
• LL(1) grammar:
– First L: scans input from left to right
– Second L: produces a leftmost derivation
– 1: uses one input symbol of lookahead at each step to make a
parsing decision.

– A grammar whose parsing table has no multiply-defined entries is
an LL(1) grammar.

– No ambiguous or left-recursive grammar can be LL(1).

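– A grammar is LL(1) iff for each set of A-productions
\( A \rightarrow \alpha_1 \mid \alpha_2 \mid \dots \mid \alpha_n \)
the following conditions hold:
\[ \mathrm{First}(\alpha_i) \cap \mathrm{First}(\alpha_j) = \emptyset \quad \text{for } 1 \le i \le n,\ 1 \le j \le n,\ i \ne j \]
If \( \alpha_i \Rightarrow^{*} \varepsilon \), then
(a) no \( \alpha_j \Rightarrow^{*} \varepsilon \) for \( i \ne j \);
(b) \( \mathrm{First}(\alpha_j) \cap \mathrm{Follow}(A) = \emptyset \) for \( i \ne j \).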
• Example, build LL(1) parsing table for the
following grammar:

S-> i E t S e S | i E t S | a
E -> b
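As a starting point for this exercise, the first LL(1) condition (pairwise disjoint First sets of the alternatives) can be checked with a few lines of Python. The sketch below is deliberately simplified: it assumes the grammar has no e-productions and no left recursion, which holds for the grammar exactly as written above, and the names `first_of_alternative` and `ll1_conflicts` are made up for this example.

```python
from itertools import combinations

# The grammar as written in the exercise; every symbol is one character.
GRAMMAR = {"S": ["iEtSeS", "iEtS", "a"], "E": ["b"]}


def first_of_alternative(alt, grammar):
    """FIRST of one alternative.

    Simplifying assumption: no e-productions and no left recursion,
    so FIRST is decided by the first symbol alone.
    """
    head = alt[0]
    if head not in grammar:
        return {head}  # a terminal
    result = set()
    for sub in grammar[head]:
        result |= first_of_alternative(sub, grammar)
    return result


def ll1_conflicts(grammar):
    """List (nonterminal, alt1, alt2, shared terminals) for overlapping alternatives."""
    conflicts = []
    for nt, alts in grammar.items():
        for a, b in combinations(alts, 2):
            common = first_of_alternative(a, grammar) & first_of_alternative(b, grammar)
            if common:
                conflicts.append((nt, a, b, common))
    return conflicts


print(ll1_conflicts(GRAMMAR))
```

The check reports that the two alternatives starting with i overlap, so M[S, i] would be multiply defined for the grammar as written.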
Bottom-up Parsing
• Bottom-up parsing corresponds to the
construction of a parse tree for an input token string,
beginning at the leaves (the bottom) and
working up towards the root (the top).

• An example follows.
Bottom-up Parsing (Cont.)
• Given the grammar:
– E→T
– T→T*F
– T→F
– F → id

Reduction
• Bottom-up parsing can be viewed as the process of
“reducing” a token string to the start symbol of
the grammar.

• At each reduction, the token string matching
the RHS of a production is replaced by the
LHS non-terminal of that production.
Reduction (Cont.)
• The key decisions during bottom-up parsing
are about when to reduce and about what
production to apply.

Shift-reduce Parsing
• Shift-reduce parsing is a form of bottom-up parsing in
which a stack holds grammar symbols and an input
buffer holds the rest of the tokens to be parsed.

• We use $ to mark the bottom of the stack and also the
end of the input.

• During a left-to-right scan of the input tokens, the
parser shifts zero or more input tokens onto the stack,
until it is ready to reduce a string β of grammar
symbols on top of the stack.
A Shift-reduce Example

Shift-reduce Parsing (Cont.)
• Shift: shift the next input token onto the top of the
stack.
• Reduce: the right end of the string to be reduced must
be at the top of the stack. Locate the left end of the
string within the stack and decide which non-terminal
replaces that string.
• Accept: announce successful completion of parsing.
• Error: discover a syntax error and call an error
recovery routine.
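The four actions can be illustrated on the grammar E → T, T → T * F | F, F → id given earlier. The Python sketch below is not a general parser: the decisions about when to shift and when to reduce are hand-coded for this one grammar (a real LR parser reads them from tables), and the function name `shift_reduce` is an assumption of this example.

```python
def shift_reduce(tokens):
    """Return the list of actions taken while parsing tokens, or raise SyntaxError."""
    stack = ["$"]                 # $ marks the bottom of the stack
    buf = list(tokens) + ["$"]    # $ marks the end of the input
    actions = []
    while True:
        look = buf[0]
        if stack[-1] == "id":
            stack[-1:] = ["F"]
            actions.append("reduce F -> id")
        elif stack[-3:] == ["T", "*", "F"]:
            stack[-3:] = ["T"]
            actions.append("reduce T -> T*F")
        elif stack[-1] == "F":
            stack[-1:] = ["T"]
            actions.append("reduce T -> F")
        elif stack[-1] == "T" and look == "$":
            stack[-1:] = ["E"]
            actions.append("reduce E -> T")
        elif stack == ["$", "E"] and look == "$":
            actions.append("accept")
            return actions
        elif look != "$":
            stack.append(buf.pop(0))   # shift the next token onto the stack
            actions.append("shift")
        else:
            raise SyntaxError(f"cannot continue with stack {stack}")


for step in shift_reduce(["id", "*", "id"]):
    print(step)
```

For id * id the reductions appear in the order F → id, T → F, F → id, T → T*F, E → T, which is a rightmost derivation read in reverse.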

LR Parsers
• Left-scan Rightmost derivation in reverse (LR)
parsers are characterized by
the number of look-ahead symbols that are
examined to determine parsing actions.

• We can make the look-ahead parameter
explicit and discuss LR(k) parsers,
where k is the look-ahead size.
LR(k) Parsers
• LR(k) parsers are of interest in that they are the most
powerful class of deterministic bottom-up parsers
using at most k look-ahead tokens.
• Deterministic parsers must uniquely determine the
correct parsing action at each step;
they cannot back up or retry parsing actions.

We will cover 4 LR(k) parsers here: LR(0), SLR(1), LR(1),
and LALR(1).
LR Parsers (cont.)
In building an LR Parser:

1) Create the Transition Diagram

2) Based on it, construct:
Go_to Table
Action Table
LR Parsers (cont.)
Go_to table defines the
next state after a shift.

Action table tells parser whether to:


• 1) shift (S),
• 2) reduce (R),
• 3) accept (A) the source code, or
• 4) signal a syntactic error (E).

Model of an LR parser

LR Parsers (Cont.)
• An LR parser makes shift-reduce decisions by
maintaining states to keep track of where we
are in a parse.

• States represent sets of items.

LR(0) Item
• LR(0) and all other LR-style parsing methods are based on the
idea of an item of the form:
A→X1…Xi‧Xi+1…Xj

• The dot symbol ‧ in an item
may appear anywhere
in the right-hand side of a production.

• It marks how much of the production
has already been matched.
LR (0) Item (Cont.)
• An LR(0) item (item for short) of a grammar G is a
production of G with a dot at some position of the RHS.

• The production A → XYZ yields the four items:


A → ‧XYZ
A → X ‧ YZ
A → XY ‧ Z
A → XYZ ‧

The production A → λ generates only one item, A → ‧.
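Enumerating the items of a production is a one-liner once an item is encoded as a triple. The encoding `(lhs, rhs, dot)` and the helper names below are assumptions of this sketch, not notation from the slides.

```python
def lr0_items(lhs, rhs):
    """All LR(0) items of lhs -> rhs, as (lhs, rhs, dot position) triples."""
    rhs = tuple(rhs)
    return [(lhs, rhs, dot) for dot in range(len(rhs) + 1)]


def show(item):
    """Render an item with the dot in place."""
    lhs, rhs, dot = item
    return f"{lhs} → {' '.join(rhs[:dot])}‧{' '.join(rhs[dot:])}"


for item in lr0_items("A", ["X", "Y", "Z"]):
    print(show(item))

# A production with an empty right-hand side yields the single item A → ‧
print(lr0_items("A", []))
```

A production with n symbols on its right-hand side therefore has n + 1 items.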

LR(0) Item Closure
• If I is a set of items for a grammar G, then
CLOSURE(I) is the set of items constructed
from I by the 2 rules:
1) Initially, add every item in I to CLOSURE(I)
2) If A → α‧B β is in CLOSURE(I)
and B → γ is a production, then add
B → ‧γ to CLOSURE(I),
if it is not already there.
Apply this until no more new items can be
added.
LR(0) Closure Example
E’ → E
E→E+T|T
T→T*F|F
F → (E) | id

I is the set of one item {E’→‧E}.

Find CLOSURE(I)

LR(0) Closure Example (Cont.)
First, E’ → ‧E is put in CLOSURE(I) by rule 1.

Then, rule 2 adds the E-productions with dots at the left end:
E → ‧E + T and E → ‧T.

Now, there is a T immediately to the right of a dot in
E → ‧T, so we add T → ‧T * F and T → ‧F.

Next, T → ‧F forces us to add:
F → ‧(E) and F → ‧id.
Another Closure Example
S→E $
E→E + T | T
T→ID | (E)

closure (S→‧E$) = {S→‧E$,
E→‧E+T,
E→‧T,
T→‧ID,
T→‧(E)}
The five items above form an item set
called state s0.
Closure (I)
SetOfItems Closure(I) {
J=I
repeat
for (each item A → α‧B β in J)
for (each production B → γ of G)
if (B → ‧ γ is not in J)
add B → ‧ γ to J;
until no more items are added to J;
return J;
} // end of Closure (I)
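A runnable counterpart of the pseudocode might look as follows in Python. It is a sketch under assumptions: items are encoded as `(lhs, rhs, dot)` triples, the grammar is the expression grammar from the closure example, and a worklist replaces the repeat-until loop (the resulting set is the same).

```python
# Grammar from the closure example; right-hand sides are tuples of symbols.
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}


def closure(items, grammar):
    """Return CLOSURE(items) as a frozenset of (lhs, rhs, dot) triples."""
    result = set(items)
    worklist = list(result)
    while worklist:
        lhs, rhs, dot = worklist.pop()
        # Only items with a nonterminal right after the dot add anything.
        if dot < len(rhs) and rhs[dot] in grammar:
            for gamma in grammar[rhs[dot]]:
                item = (rhs[dot], gamma, 0)   # B → ‧γ
                if item not in result:
                    result.add(item)
                    worklist.append(item)
    return frozenset(result)


start = {("E'", ("E",), 0)}   # the set {E’ → ‧E}
for lhs, rhs, dot in sorted(closure(start, GRAMMAR)):
    print(lhs, "→", " ".join(rhs[:dot]) + "‧" + " ".join(rhs[dot:]))
```

For the start set {E’ → ‧E} this yields the seven items derived step by step in the closure example.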
Goto Next State
• Given an item set (state) s,
we can compute its next state, s’,
under a symbol X,
that is, Go_to (s, X) = s’
Goto Next State (Cont.)
E’ → E
E→E+T|T
T→T*F|F
F → (E) | id

S is the item set (state):
E→E‧+T
Goto Next State (Cont.)
S’ is the next state that Goto(S, +) goes to:
E → E +‧T
T → ‧T * F (by closure)
T → ‧F (by closure)
F → ‧(E) (by closure)
F → ‧id (by closure)

We can build all the states of the Transition
Diagram this way.
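The goto step is "advance the dot over X, then take the closure". The Python sketch below reproduces the Goto(S, +) example; the `(lhs, rhs, dot)` item encoding and the function names are assumptions of this illustration, and a compact closure helper is included so the block runs on its own.

```python
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}


def closure(items, grammar):
    """CLOSURE of a set of (lhs, rhs, dot) items."""
    result = set(items)
    worklist = list(result)
    while worklist:
        _, rhs, dot = worklist.pop()
        if dot < len(rhs) and rhs[dot] in grammar:
            for gamma in grammar[rhs[dot]]:
                item = (rhs[dot], gamma, 0)
                if item not in result:
                    result.add(item)
                    worklist.append(item)
    return frozenset(result)


def goto(state, symbol, grammar):
    """Next state reached from `state` under `symbol` (empty if none)."""
    moved = {
        (lhs, rhs, dot + 1)                # move the dot over the symbol
        for lhs, rhs, dot in state
        if dot < len(rhs) and rhs[dot] == symbol
    }
    return closure(moved, grammar)


state = closure({("E", ("E", "+", "T"), 1)}, GRAMMAR)   # E → E‧+T
next_state = goto(state, "+", GRAMMAR)
print(len(next_state))
```

Starting from the initial state and applying `goto` for every symbol until no new item set appears would enumerate all states of the Transition Diagram.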
An LR(0) Complete Example
Grammar:

S’→ S $
S→ ID

LR(0) Transition Diagram

State 0: S’ → ‧S$, S → ‧id
State 1: S → id‧
State 2: S’ → S‧$
State 3: S’ → S$‧

Transitions:
State 0 --id--> State 1
State 0 --S--> State 2
State 2 --$--> State 3
LR(0) Transition Diagram (Cont.)
Each state in the Transition Diagram
either signals a shift
(‧ moves to the right of a terminal)
or signals a reduce
(reducing the RHS handle to the LHS).
LR(0) Go_to table
State  Symbol
       ID  $   S
0      1       2
1
2          3
3

The blanks above indicate errors.
LR(0) Action table

State   0   1   2   3
Action  S   R2  S   A
• S for shift
• A for accept
• R2 for reduce by Rule 2
• Each state has only one action.
LR(0) Parsing
Stack           Input   Action
S0              id $    shift
S0 id S1        $       reduce r2
S0 S S2         $       shift
S0 S S2 $ S3            reduce r1
S0 S’                   accept
Another LR(0) Example
Grammar:
S→E $ r1
E→E+T r2
| T r3
T→ID r4
| (E) r5

LR(0) Transition Diagram
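State 0: S→‧E$, E→‧E+T, E→‧T, T→‧id, T→‧(E)
State 1: S→E‧$, E→E‧+T
State 2: S→E$‧
State 3: E→E+‧T, T→‧id, T→‧(E)
State 4: E→E+T‧
State 5: T→id‧
State 6: T→(‧E), E→‧E+T, E→‧T, T→‧id, T→‧(E)
State 7: T→(E‧), E→E‧+T
State 8: T→(E)‧
State 9: E→T‧

Transitions:
State 0 --E--> State 1, --T--> State 9, --(--> State 6, --id--> State 5
State 1 --+--> State 3, --$--> State 2
State 3 --T--> State 4, --(--> State 6, --id--> State 5
State 6 --E--> State 7, --T--> State 9, --(--> State 6, --id--> State 5
State 7 --+--> State 3, --)--> State 8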
LR(0) Go_to table
State  Symbol
       E   T   +   (   )   $   id
0      1   9       6           5
1              3           2
2
3          4       6           5
4
5
6      7   9       6           5
7              3       8
8
9
LR(0) Action table

State: 0 1 2 3 4 5 6 7 8 9
Action: S S A S R2 R4 S S R5 R3

LR(0) Parsing
Stack                   Input       Action
S0                      id + id $   shift
S0 id S5                + id $      reduce r4
S0 T S9                 + id $      reduce r3
S0 E S1                 + id $      shift
S0 E S1 + S3            id $        shift
S0 E S1 + S3 id S5      $           reduce r4
S0 E S1 + S3 T S4       $           reduce r2
S0 E S1                 $           shift
S0 E S1 $ S2                        reduce r1
S0 S                                accept
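The trace can be reproduced by a small table-driven driver. The Python sketch below is an illustration under assumptions: the Go_to entries are transcribed from the transition diagram of this example, the Action entries from the LR(0) Action table, only the state stack is kept (the slides also show the grammar symbols), and state 2 accepts directly because the Action table marks it A, so the final "reduce r1" step of the trace is folded into "accept". The names `RULES`, `ACTION`, `GOTO` and `lr0_parse` are made up for this example.

```python
# Rule number -> (left-hand side, length of right-hand side).
RULES = {1: ("S", 2), 2: ("E", 3), 3: ("E", 1), 4: ("T", 1), 5: ("T", 3)}

# One action per state, as in the LR(0) Action table.
ACTION = {0: "S", 1: "S", 2: "A", 3: "S", 4: "R2",
          5: "R4", 6: "S", 7: "S", 8: "R5", 9: "R3"}

# Go_to entries read off the transition diagram; missing entries are errors.
GOTO = {
    0: {"E": 1, "T": 9, "(": 6, "id": 5},
    1: {"+": 3, "$": 2},
    3: {"T": 4, "(": 6, "id": 5},
    6: {"E": 7, "T": 9, "(": 6, "id": 5},
    7: {"+": 3, ")": 8},
}


def lr0_parse(tokens):
    """Return the list of actions taken, or raise SyntaxError."""
    states = [0]
    buf = list(tokens) + ["$"]
    pos = 0
    trace = []
    while True:
        act = ACTION[states[-1]]
        if act == "A":
            trace.append("accept")
            return trace
        if act == "S":
            nxt = GOTO.get(states[-1], {}).get(buf[pos])
            if nxt is None:
                raise SyntaxError(f"unexpected {buf[pos]} in state {states[-1]}")
            states.append(nxt)       # shift: push the next state
            pos += 1
            trace.append("shift")
        else:
            rule = int(act[1:])
            lhs, length = RULES[rule]
            del states[-length:]     # pop one state per right-hand-side symbol
            states.append(GOTO[states[-1]][lhs])
            trace.append(f"reduce r{rule}")


print(lr0_parse(["id", "+", "id"]))
```

Because an LR(0) state has exactly one action, the driver never looks at the next token when deciding to reduce; the token is consulted only to pick the target state of a shift.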
Simple LR(1), SLR(1), Parsing

SLR(1) has the same Transition Diagram and
Goto table as LR(0),

BUT a different Action table,
because it looks ahead 1 token.
SLR(1) Look-ahead
• SLR(1) parsers are built by first
constructing the Transition Diagram, then
computing the Follow sets to use as SLR(1) look-aheads.

• The idea is:
A handle (RHS) should NOT be reduced to N
if the look-ahead token is NOT in follow(N).
SLR(1) Look-ahead (Cont.)
S→ E $ r1
E→ E + T r2
| T r3
T→ ID r4
T→ ( E ) r5
Follow (S) = { $}
Follow (E) = { ), +, $}
Follow (T) = { ), +, $}

Use the follow sets as look-aheads in reduction.

SLR(1) Transition Diagram
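State 0: S→‧E$, E→‧E+T, E→‧T, T→‧id, T→‧(E)
State 1: S→E‧$, E→E‧+T
State 2: S→E$‧ {$}
State 3: E→E+‧T, T→‧id, T→‧(E)
State 4: E→E+T‧ { ), +, $}
State 5: T→id‧ { ), +, $}
State 6: T→(‧E), E→‧E+T, E→‧T, T→‧id, T→‧(E)
State 7: T→(E‧), E→E‧+T
State 8: T→(E)‧ { ), +, $}
State 9: E→T‧ { ), +, $}

Transitions:
State 0 --E--> State 1, --T--> State 9, --(--> State 6, --id--> State 5
State 1 --+--> State 3, --$--> State 2
State 3 --T--> State 4, --(--> State 6, --id--> State 5
State 6 --E--> State 7, --T--> State 9, --(--> State 6, --id--> State 5
State 7 --+--> State 3, --)--> State 8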
SLR(1) Goto table
(states numbered as in the SLR(1) Transition Diagram)
State  ID  +   (   )   $   E   T
0      5       6           1   9
1          3           2
2
3      5       6               4
4
5
6      5       6           7   9
7          3       8
8
9
SLR(1) Action table,
which expands LR(0) Action table
State  ID  +   (   )   $
0      S       S
1          S           S
2                      R1
3      S       S
4          R2      R2  R2
5          R4      R4  R4
6      S       S
7          S       S
8          R5      R5  R5
9          R3      R3  R3
An SLR(1) Problem
• The grammar below causes a shift-reduce
conflict under SLR(1):
r1,2 S→A | xb
r3,4 A→ aAb | B
r5 B→ x
Use follow(S) = {$},
follow(A) = follow(B) = {b, $}
in the SLR(1) Transition Diagram next.
SLR(1) Transition Diagram
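State 0: S → ‧A, S → ‧xb, A → ‧aAb, A → ‧B, B → ‧x
State 1: S → A‧ {$}
State 2: S → x‧b, B → x‧ {b$} (shift-reduce conflict)
State 3: A → a‧Ab, A → ‧aAb, A → ‧B, B → ‧x
State 4: A → B‧ {b$}
State 5: S → xb‧ {$}
State 6: A → aA‧b
State 7: B → x‧ {b$}
State 8: A → aAb‧ {b$}

Transitions:
State 0 --A--> State 1, --x--> State 2, --a--> State 3, --B--> State 4
State 2 --b--> State 5
State 3 --a--> State 3, --A--> State 6, --B--> State 4, --x--> State 7
State 6 --b--> State 8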
SLR(1) Go_to table
Symbol  0   1   2   3   4   5   6   7   8
A       1           6
B       4           4
a       3           3
b               5               8
x       2           7
SLR(1) Action table
Token   0     1     2     3     4     5     6     7     8
b                   R5/S        R4          S     R5    R3
$             R1    R5          R4    R2          R5    R3
a       S                 S
x       S                 S

State 2 (R5/S) causes a shift-reduce conflict:
when handling ‘b’, the parser doesn’t know whether to reduce by
rule 5 (R5) or to shift (S).
Solution: Use the more powerful LR(1).
LR(1) Parsing
The reason why the FOLLOW set does not
work as well as one might wish is that
it replaces the look-ahead of a single item of a
rule N in a given LR state by
the whole FOLLOW set of N,
which is the union of all the look-aheads of all
alternatives of N in all states.

Solution: Use LR(1)
LR(1) Parsing
LR(1) item sets are more discriminating:
A look-ahead set is kept with each separate
item, to be used to resolve conflicts when a
reduce item has been reached.

This greatly increases the strength of the
parser, but also the size of its tables.
LR(1) item
An LR(1) item is of the form:
A→X1…Xi‧Xi+1…Xj, l
where l belongs to Vt U {λ}
l is the look-ahead
Vt is the vocabulary of terminals
λ is the look-ahead after the end marker $
LR(1) item look-ahead set
Rules for look-ahead sets:
1) initial item set: the look-ahead set of the initial item set S0
contains only one token, the end-of-file token ($), the only
token that follows the start symbol.

2) other item sets:
Given P → α‧Nβ {σ}, we have
N → ‧γ {FIRST(β{σ})} in the item set.
LR(1) look-ahead
The LR(1) look-ahead set FIRST(β{σ}) is:

If β can produce λ (β →* λ),
FIRST(β{σ}) is FIRST(β) plus the tokens in {σ}, excluding λ;
else
FIRST(β{σ}) just equals FIRST(β).
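The rule is small enough to state as a function. In the Python sketch below, `lookahead` and `LAMBDA` are names invented for this example, and FIRST(β) is passed in as an already-computed set that contains λ exactly when β can produce λ.

```python
LAMBDA = "λ"


def lookahead(first_beta, sigma):
    """FIRST(β{σ}) given FIRST(β) and the look-ahead set σ of the parent item."""
    if LAMBDA in first_beta:
        # β can vanish, so the parent's look-aheads show through.
        return (first_beta - {LAMBDA}) | sigma
    return set(first_beta)


# Closure of A → a‧Ab {$}: here β = b, so new items such as A → ‧aAb get look-ahead {b}.
print(lookahead({"b"}, {"$"}))

# Closure of S → ‧A {$}: β is empty, FIRST(β) = {λ}, so items such as A → ‧B inherit {$}.
print(lookahead({LAMBDA}, {"$"}))
```

These two cases correspond to the look-aheads {b} and {$} that appear in the LR(1) example for the grammar S → A | xb, A → aAb | B, B → x.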

An LR(1) Example
Given the grammar below,
create the LR(1) Transition Diagram.

r1,2 S→A | xb
r3,4 A→ aAb | B
r5 B→ x

LR(1) Transition Diagram
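State 0: S → ‧A {$}, S → ‧xb {$}, A → ‧aAb {$}, A → ‧B {$}, B → ‧x {$}
State 1: S → A‧ {$}
State 2: S → x‧b {$}, B → x‧ {$}
State 3: A → a‧Ab {$}, A → ‧aAb {b}, A → ‧B {b}, B → ‧x {b}
State 4: A → B‧ {$}
State 5: S → xb‧ {$}
State 6: A → aA‧b {$}
State 7: B → x‧ {b}
State 8: A → aAb‧ {$}
State 9: A → B‧ {b}
State 10: A → a‧Ab {b}, A → ‧aAb {b}, A → ‧B {b}, B → ‧x {b}
State 11: A → aA‧b {b}
State 12: A → aAb‧ {b}

Transitions:
State 0 --A--> State 1, --x--> State 2, --a--> State 3, --B--> State 4
State 2 --b--> State 5
State 3 --A--> State 6, --x--> State 7, --B--> State 9, --a--> State 10
State 6 --b--> State 8
State 10 --A--> State 11, --x--> State 7, --B--> State 9, --a--> State 10
State 11 --b--> State 12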
LR(1) Go_to table

Symbol  0   1   2   3   4   5   6   7   8   9   10  11  12
A       1           6                           11
B       4           9                           9
a       3           10                          10
b               5               8                   12
x       2           7                           7
LR(1) Action table
Token   0   1   2   3   4   5   6   7   8   9   10  11  12
$           R1  R5      R4  R2          R3
b               S               S   R5      R4      S   R3
a       S           S                           S
x       S           S                           S

The states are numbered from 0 to 12 and
the terminal symbols are $, b, a, x.
LR(1) Parsing
• LR(1)’s problem is that
the LR(1) Transition Diagram
contains so many states that
the Go_to and Action tables
become prohibitively large.

• Solution: Use LALR(1) (look-ahead LR(1)) to
reduce table sizes.
