Prof. Radu Prodan

SYNTACTIC ANALYSIS
TOP-DOWN PARSING
17.04.2024 R. Prodan, Compiler Construction, Summer Semester 2024 1
Overview
▪ Top-down parsing
– Starts with start symbol and follows leftmost derivation steps
– Traverses parse tree in pre-order from root to leaves

▪ Predictive parsers
– Choose next grammar rule using one or more lookahead tokens

▪ Backtracking parsers
– Try different grammar rule possibilities
– Back up in input if one possibility fails
– Slow and unsuitable for practical compilers



Predictive Top-Down Parsing
▪ Recursive-descent parsing
– Ad hoc, handwritten for each input grammar

▪ LL(1) parsing
– Automatically generated
– Processes input from Left to right, builds a Leftmost derivation,
and uses 1 lookahead symbol


Agenda
▪ Recursive-descent parsing

▪ LL(1) parsing


Recursive-Descent Parsing
▪ Each nonterminal is parsed by a separate procedure
– Calls other parsing procedures in the sequence given by
the body of its BNF definition

▪ Terminals are parsed by a match procedure
– Receives the expected token as a parameter
– Checks whether the next input token is identical to the expected token
and consumes it if so
– Reports an error if not

▪ One global lookahead variable holds the next input token

Arithmetic Expression Grammar
exp → exp addop term | term
addop → + | –
term → term mulop factor | factor
mulop → *
factor → ( exp ) | number

TokenType token ;

procedure factor () ;
begin
  case token of
    ( : match ( ( ) ;
        exp () ;
        match ( ) ) ;
    number : match (number) ;
    else error () ;
  end case ;
end factor ;

procedure match ( expectedToken ) ;
begin
  if token = expectedToken then
    getToken () ;
  else
    error () ;
  end if ;
end match ;


Arithmetic Expression Grammar (2)
▪ exp → exp addop term | term
– Calling exp first leads to an immediate infinite recursive loop
– exp and term can begin with the same tokens: number or (

▪ Translate grammar into EBNF
exp → term { addop term }
term → factor { mulop factor }

▪ Eliminate addop and mulop nonterminals that only match tokens (operators)

procedure exp ;
begin
  term () ;
  while token = + or token = – do
    match (token) ;
    term () ;
  end while ;
end exp ;

procedure term ;
begin
  factor () ;
  while token = * do
    match (token) ;
    factor () ;
  end while ;
end term ;


Arithmetic Expression Calculation
exp → term { addop term }
term → factor { mulop factor }

function exp : integer ;
var temp : integer ;
begin
  temp := term () ;
  while token = + or token = – do
    (* test the operator before consuming it *)
    case token of
      + : match (+) ;
          temp := temp + term () ;
      – : match (–) ;
          temp := temp – term () ;
    end case ;
  end while ;
  return temp ;
end exp ;

▪ The loop accumulates results from left to right, so the left associativity
implied by the EBNF definition is preserved

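The calculation scheme above can be tried out with a minimal Python sketch. It is an illustration under stated assumptions, not the course's reference implementation: the names `tokenize`, `Parser`, and `evaluate` are hypothetical, the ASCII `-` stands in for the slide's minus sign, and characters the tokenizer does not recognise are silently skipped.

```python
import re


def tokenize(text):
    """Split the input into number, operator, and parenthesis tokens; '$' marks the end."""
    return re.findall(r"\d+|[-+*()]", text) + ["$"]


class Parser:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    @property
    def token(self):
        # The single lookahead token.
        return self.tokens[self.pos]

    def match(self, expected):
        # Consume the lookahead token if it is the expected one.
        if self.token != expected:
            raise SyntaxError(f"expected {expected!r}, found {self.token!r}")
        self.pos += 1

    def exp(self):
        # exp -> term { addop term }
        value = self.term()
        while self.token in ("+", "-"):
            op = self.token          # remember the operator before consuming it
            self.match(op)
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term(self):
        # term -> factor { mulop factor }
        value = self.factor()
        while self.token == "*":
            self.match("*")
            value *= self.factor()
        return value

    def factor(self):
        # factor -> ( exp ) | number
        if self.token == "(":
            self.match("(")
            value = self.exp()
            self.match(")")
            return value
        if self.token.isdigit():
            value = int(self.token)
            self.pos += 1
            return value
        raise SyntaxError(f"unexpected token {self.token!r}")


def evaluate(text):
    parser = Parser(text)
    value = parser.exp()
    parser.match("$")                # the whole input must be consumed
    return value
```

Calling `evaluate("3 - 4 - 5")` groups the subtraction from the left, as (3 - 4) - 5, because each loop iteration folds the next term into the running value.
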
Syntax Tree for Arithmetic Expressions
exp → term { addop term }
term → factor { mulop factor }

function exp : syntaxTree ;
var temp, newtemp : syntaxTree ;
begin
  temp := term () ;
  while token = + or token = – do
    (* create the operator node before the operator token is consumed *)
    newtemp := makeOpNode (token) ;
    match (token) ;
    leftChild (newtemp) := temp ;
    rightChild (newtemp) := term () ;
    temp := newtemp ;
  end while ;
  return temp ;
end exp ;

▪ Resulting left-associative syntax tree with leaves 3, 4, and 5:
      +
     / \
    +   5
   / \
  3   4

If Statement Grammar
if-stmt → if ( exp ) statement
        | if ( exp ) statement else statement

▪ EBNF grammar
if-stmt → if ( exp ) statement [ else statement ]

▪ Parser uses the most closely nested disambiguating rule

procedure ifStmt ;
begin
  match (if) ;
  match ( ( ) ;
  exp () ;
  match ( ) ) ;
  statement () ;
  if token = else then
    match (else) ;
    statement () ;
  end if ;
end ifStmt ;


Syntax Tree for If Statement
if-stmt → if ( exp ) statement [ else statement ]

function ifStatement : syntaxTree ;
var temp : syntaxTree ;
begin
  match (if) ;
  match ( ( ) ;
  temp := makeStmtNode (if) ;
  testChild (temp) := exp () ;
  match ( ) ) ;
  thenChild (temp) := statement () ;
  if token = else then
    match (else) ;
    elseChild (temp) := statement () ;
  else
    elseChild (temp) := nil ;
  end if ;
  return temp ;
end ifStatement ;

▪ Resulting syntax tree:
          if
       /   |   \
    exp statement statement


Recursive-Descent Parsing Problems
▪ It may be difficult to convert a BNF grammar into EBNF
– Solution: left recursion removal

▪ A predictive parser must choose a rule using only one lookahead token
– Solution: left factoring

▪ Recursive-descent parsers are powerful but ad hoc and handwritten
– Solution: automatic LL parser generator


Agenda
▪ Recursive-descent parsing
– Left recursion removal
– Left factoring

▪ LL(1) parsing


Left Recursion Removal
▪ Immediate left recursion

▪ A → A α | β
– α and β are strings of terminals and nonterminals
– β does not begin with A
– L(G) = { β α^n | n ≥ 0 }

▪ Equivalent grammar that uses right recursion
A → β A′
A′ → α A′ | ε


Immediate Left Recursion Removal
▪ Left recursive grammar
A → A α1 | A α2 | ... | A αn | β1 | β2 | ... | βm
– β1, β2, ..., βm do not begin with A

▪ Removed left recursion
A → β1 A′ | β2 A′ | ... | βm A′
A′ → α1 A′ | α2 A′ | ... | αn A′ | ε
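The rewriting of A → A α1 | … | A αn | β1 | … | βm can be sketched in a few lines of Python. This is only an illustration: the function name `remove_immediate_left_recursion`, the representation of right-hand sides as tuples of symbols, and the use of the empty tuple for ε are assumptions made for the example.

```python
def remove_immediate_left_recursion(nonterminal, choices):
    """Rewrite the choices of one nonterminal so that none starts with it.

    choices: list of right-hand sides, each a tuple of symbols; () stands for ε.
    Returns a dict mapping A (and, if needed, the new A') to their choices.
    """
    new_nt = nonterminal + "'"
    # alphas: the tails of the left-recursive choices A -> A alpha
    alphas = [rhs[1:] for rhs in choices if rhs and rhs[0] == nonterminal]
    # betas: the choices that do not begin with A
    betas = [rhs for rhs in choices if not rhs or rhs[0] != nonterminal]
    if not alphas:
        return {nonterminal: list(choices)}      # nothing to do
    return {
        nonterminal: [beta + (new_nt,) for beta in betas],          # A  -> beta A'
        new_nt: [alpha + (new_nt,) for alpha in alphas] + [()],     # A' -> alpha A' | ε
    }
```

Applied to exp → exp + term | exp - term | term, the sketch yields exp → term exp' and exp' → + term exp' | - term exp' | ε.
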


Indirect Left Recursion Removal
A → B a | c
B → A b | d

▪ Transform all indirect left recursions into immediate left recursions

▪ Choose an arbitrary order of nonterminals A1, …, Am

▪ Eliminate all rules of the form Ai → Aj γ, with j ≤ i
– Replace Aj by its definition

Indirect left recursive:
A1 → A2 a | c
A2 → A1 b | d

Direct left recursive:
A1 → A2 a | c
A2 → A2 a b | c b | d

Right recursive:
A1 → A2 a | c
A2 → c b A2′ | d A2′
A2′ → a b A2′ | ε

Indirect Left Recursion Removal Algorithm
(* for all nonterminals in a well-defined ranking *)
for i := 1 to m do
  (* for all nonterminals with a smaller rank *)
  for j := 1 to i–1 do
    replace each grammar rule Ai → Aj β by the rule
      Ai → α1 β | α2 β | . . . | αk β,
      where Aj → α1 | α2 | . . . | αk
  eliminate direct left recursion of Ai

▪ Assumes a grammar with no cycles and no ε-productions

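The two nested loops can be sketched in Python as follows. The sketch is illustrative only: `remove_left_recursion`, the dict-of-tuples grammar encoding, the empty tuple for ε, and the priming of new nonterminal names with `'` are assumptions of this example, and it presumes (like the algorithm) a grammar without cycles or ε-productions.

```python
def remove_left_recursion(grammar, order):
    """grammar: dict nonterminal -> list of tuples (right-hand sides).
    order: the chosen ranking A1..Am of the nonterminals.
    Returns a new grammar dict without left recursion."""
    g = {nt: list(rhss) for nt, rhss in grammar.items()}
    for i, ai in enumerate(order):
        # Replace every rule Ai -> Aj beta (j < i) by Ai -> alpha beta
        # for each current choice Aj -> alpha.
        for aj in order[:i]:
            expanded = []
            for rhs in g[ai]:
                if rhs and rhs[0] == aj:
                    expanded += [alpha + rhs[1:] for alpha in g[aj]]
                else:
                    expanded.append(rhs)
            g[ai] = expanded
        # Eliminate the immediate left recursion that is left on Ai.
        alphas = [rhs[1:] for rhs in g[ai] if rhs and rhs[0] == ai]
        if alphas:
            betas = [rhs for rhs in g[ai] if not rhs or rhs[0] != ai]
            new = ai + "'"
            g[ai] = [beta + (new,) for beta in betas]
            g[new] = [alpha + (new,) for alpha in alphas] + [()]
    return g


# Example grammar: A1 -> A2 a | c,  A2 -> A1 b | d
EXAMPLE = {
    "A1": [("A2", "a"), ("c",)],
    "A2": [("A1", "b"), ("d",)],
}
RESULT = remove_left_recursion(EXAMPLE, ["A1", "A2"])
```

For the example grammar, `RESULT` leaves A1 unchanged and rewrites A2 into A2 → c b A2' | d A2' with A2' → a b A2' | ε.
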
Indirect Left Recursion Removal Example
A1 → A2 a | A1 a | c
A2 → A2 b | A1 b | d

▪ Left recursion removal does not change the language, but changes the grammar

▪ Changes parse trees and complicates the parser

Step 1: i = 1, inner loop does not execute; remove immediate left recursion on A1
A1 → A2 a A1′ | c A1′
A1′ → a A1′ | ε
A2 → A2 b | A1 b | d

Step 2: i = 2, j = 1; eliminate rule A2 → A1 b
A1 → A2 a A1′ | c A1′
A1′ → a A1′ | ε
A2 → A2 b | A2 a A1′ b | c A1′ b | d

Step 3: i = 2, inner loop done; remove left recursion on A2
A1 → A2 a A1′ | c A1′
A1′ → a A1′ | ε
A2 → c A1′ b A2′ | d A2′
A2′ → b A2′ | a A1′ b A2′ | ε

Arithmetic Expression Grammar
▪ Left recursive grammar
exp → exp + term | exp – term | term

▪ Right recursive grammar
exp → term exp′
exp′ → + term exp′ | – term exp′ | ε


Right Recursive Expression Parser
Left recursive grammar:
exp → exp addop term | term
addop → + | –
term → term mulop factor | factor
mulop → *
factor → ( exp ) | number

Equivalent right recursive grammar:
exp → term exp′
exp′ → addop term exp′ | ε
addop → + | –
term → factor term′
term′ → mulop factor term′ | ε
mulop → *
factor → ( exp ) | number

procedure exp ;
begin
  term () ;
  exp′ () ;
end exp ;

procedure exp′ ;
begin
  case token of
    + : match (+) ;
        term () ;
        exp′ () ;
    – : match (–) ;
        term () ;
        exp′ () ;
  end case ;
end exp′ ;


Loss of Left Associativity
▪ Parse tree for 3 – 4 – 5
exp
├─ term
│  ├─ factor
│  │  └─ number (3)
│  └─ term′
│     └─ ε
└─ exp′
   ├─ addop
   │  └─ –
   ├─ term
   │  ├─ factor
   │  │  └─ number (4)
   │  └─ term′
   │     └─ ε
   └─ exp′
      ├─ addop
      │  └─ –
      ├─ term
      │  ├─ factor
      │  │  └─ number (5)
      │  └─ term′
      │     └─ ε
      └─ exp′
         └─ ε

Expression Grammar
exp → term exp′
exp′ → addop term exp′ | ε
addop → + | –
term → factor term′
term′ → mulop factor term′ | ε
mulop → *
factor → ( exp ) | number

Left Recursive Parser
Expression Grammar
exp → term exp′
exp′ → addop term exp′ | ε
addop → + | –
term → factor term′
term′ → mulop factor term′ | ε
mulop → *
factor → ( exp ) | number

function exp : integer ;
var temp : integer ;
begin
  temp := term () ;
  return exp′ (temp) ;
end exp ;

function exp′ (valsofar : integer) : integer ;
begin
  if token = + or token = – then
    (* test the operator before consuming it *)
    case token of
      + : match (+) ;
          valsofar := valsofar + term () ;
      – : match (–) ;
          valsofar := valsofar – term () ;
    end case ;
    return exp′ (valsofar) ;
  else return valsofar ;
end exp′ ;

▪ 3 – 4 – 5 is evaluated left-associatively:
      –
     / \
    –   5
   / \
  3   4


Agenda
▪ Recursive-descent parsing
– Left recursion removal
– Left factoring

▪ LL(1) parsing


Left Factoring
▪ Two or more grammar rule choices share a common prefix string
– A → α β | α γ

▪ More than one lookahead token would be necessary to choose between them

▪ Rewrite the rule as two rules with α as common factor
– A → α A′
– A′ → β | γ

▪ α is the longest common prefix shared by the choices in the
nonterminal's definition

Arithmetic Expression Grammar
exp → term + exp | term

▪ After left factoring
exp → term exp′
exp′ → + exp | ε

▪ Replacing exp with term exp′ in the second rule gives the same
result as left recursion removal
exp → term exp′
exp′ → + term exp′ | ε


Grammar of If Statements
if-stmt → if ( exp ) statement
| if ( exp ) statement else statement

▪ After left factoring
if-stmt → if ( exp ) statement else-part
else-part → else statement | ε


Left Factoring Algorithm
while there are changes to the grammar do
  for ∀ A ∈ N ∧ A → α1 | α2 | … | αn ∈ P do
    let α be a prefix of maximum length that is
      shared by two or more production choices for A
    if α ≠ ε then
      suppose that α1, …, αk share α, so that
        A → α β1 | … | α βk | αk+1 | … | αn,
        the βj share no common prefix (j ∈ [1..k]),
        and αk+1, …, αn do not share α
      replace rule A → α1 | α2 | … | αn
      by rules:
        A → α A′ | αk+1 | … | αn
        A′ → β1 | … | βk
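A rough Python sketch of this loop is given here for illustration. The names `common_prefix` and `left_factor`, the dict-of-tuples grammar encoding, and the empty tuple for ε are assumptions of the example; it also assumes that no nonterminal lists the same right-hand side twice.

```python
def common_prefix(a, b):
    """Longest common prefix of two symbol tuples."""
    n = 0
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    return a[:n]


def left_factor(grammar):
    """grammar: dict nonterminal -> list of tuples; () stands for ε."""
    g = {nt: list(rhss) for nt, rhss in grammar.items()}
    changed = True
    while changed:                                   # repeat until nothing changes
        changed = False
        for nt in list(g):
            rhss = g[nt]
            # Find a prefix of maximum length shared by two or more choices.
            best = ()
            for i in range(len(rhss)):
                for j in range(i + 1, len(rhss)):
                    prefix = common_prefix(rhss[i], rhss[j])
                    if len(prefix) > len(best):
                        best = prefix
            if best:
                new = nt + "'"
                while new in g:                      # pick an unused name
                    new += "'"
                sharing = [r for r in rhss if r[:len(best)] == best]
                others = [r for r in rhss if r[:len(best)] != best]
                g[nt] = [best + (new,)] + others     # A  -> alpha A' | others
                g[new] = [r[len(best):] for r in sharing]   # A' -> beta1 | ... | betak
                changed = True
    return g
```

For the if-statement grammar, factoring out `if ( exp ) statement` leaves a new nonterminal whose choices are ε and `else statement`.
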


Grammar of Statement Sequences
▪ Right recursive form
stmt-sequence → stmt ; stmt-sequence | stmt
stmt → s

▪ After left factoring
stmt-sequence → stmt stmt-seq
stmt-seq → ; stmt-sequence | ε

▪ Left recursive form
stmt-sequence → stmt-sequence ; stmt | stmt
stmt → s

▪ After left recursion removal
stmt-sequence → stmt stmt-seq
stmt-seq → ; stmt stmt-seq | ε

Agenda
▪ Recursive-descent parsing
– Left recursion removal
– Left factoring

▪ LL(1) parsing


LL(1) Parsing Overview
▪ Requires a right recursive and left factored grammar

▪ Uses an explicit stack instead of recursive calls

▪ Marks the bottom of the stack with a dollar ($) character

▪ Match: compares a token on top of the stack with the next input token

▪ Generate: replaces a nonterminal A on top of the stack by the string α
using grammar rule A → α
– α is pushed onto the stack in reversed order of its symbols

Step | Parsing stack | Input | Action
1 | $ Start symbol | Input string |
. . . | . . . | . . . | . . .
n | $ | $ | accept

Balanced Parentheses Grammar
S → ( S ) S | ε

Step | Parsing stack | Input | Action
1 | $ S | ( ) $ | S → ( S ) S
2 | $ S ) S ( | ( ) $ | match
3 | $ S ) S | ) $ | S → ε
4 | $ S ) | ) $ | match
5 | $ S | $ | S → ε
6 | $ | $ | accept

M[N, T] | ( | ) | $
S | S → ( S ) S | S → ε | S → ε


If-Statement Grammar
statement → if-stmt | other
if-stmt → if ( exp ) statement else-part
else-part → else statement | ε
exp → 0 | 1

M[N, T] | if | other | else | 0 | 1 | $
statement | statement → if-stmt | statement → other | | | |
if-stmt | if-stmt → if ( exp ) statement else-part | | | | |
else-part | | | else-part → else statement, else-part → ε | | | else-part → ε
exp | | | | exp → 0 | exp → 1 |

LL(1) Parsing Actions for
Grammar of if-Statements
Parsing stack | Input | Action
$ statement | if(0) if(1) other else other $ | statement → if-stmt
$ if-stmt | if(0) if(1) other else other $ | if-stmt → if ( exp ) statement else-part
$ else-part statement ) exp ( if | if(0) if(1) other else other $ | match
$ else-part statement ) exp ( | (0) if(1) other else other $ | match
$ else-part statement ) exp | 0) if(1) other else other $ | exp → 0
$ else-part statement ) 0 | 0) if(1) other else other $ | match
$ else-part statement ) | ) if(1) other else other $ | match
$ else-part statement | if(1) other else other $ | statement → if-stmt
$ else-part if-stmt | if(1) other else other $ | if-stmt → if ( exp ) statement else-part
$ else-part else-part statement ) exp ( if | if(1) other else other $ | match
$ else-part else-part statement ) exp ( | (1) other else other $ | match
$ else-part else-part statement ) exp | 1) other else other $ | exp → 1
$ else-part else-part statement ) 1 | 1) other else other $ | match
$ else-part else-part statement ) | ) other else other $ | match
$ else-part else-part statement | other else other $ | statement → other
$ else-part else-part other | other else other $ | match
$ else-part else-part | else other $ | else-part → else statement
$ else-part statement else | else other $ | match
$ else-part statement | other $ | statement → other
$ else-part other | other $ | match
$ else-part | $ | else-part → ε
$ | $ | accept


LL(1) Parsing Algorithm
(* assumes $ marks bottom of stack and end of input *)
(* the loop runs until only $ is left on the stack, so that nonterminals *)
(* can still be replaced by ε when the input is already exhausted *)
while top(parsing stack) ≠ $ do
  if top(parsing stack) = a ∈ T ∧ token = a
  then (* match *)
    pop(a, parsing stack) ;
    token := getToken() ;
  else if top(parsing stack) = A ∈ N ∧ token = a ∈ T ∪ { $ } ∧
          A → X1 X2 … Xn ∈ M[A, a]
  then (* generate *)
    pop(A, parsing stack) ;
    for i := n downto 1 do
      push(Xi, parsing stack) ;
  else error ;
if top(parsing stack) = $ ∧ token = $
then accept
else error ;

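The match/generate loop can be sketched as a small table-driven parser in Python. This is a hedged illustration: `ll1_parse`, the `(nonterminal, token)`-keyed dict used as the table, and the `TABLE` constant for the balanced parentheses grammar are names assumed for this example.

```python
def ll1_parse(table, start, nonterminals, tokens):
    """Run the LL(1) algorithm and return the list of actions taken.

    table: dict (nonterminal, token) -> tuple of right-hand-side symbols.
    Raises SyntaxError on a mismatch or an empty table entry.
    """
    stack = ["$", start]                  # $ marks the bottom of the stack
    tokens = list(tokens) + ["$"]         # $ marks the end of the input
    pos = 0
    actions = []
    while stack[-1] != "$":
        top, token = stack[-1], tokens[pos]
        if top not in nonterminals:       # terminal on top: match
            if top != token:
                raise SyntaxError(f"expected {top!r}, found {token!r}")
            stack.pop()
            pos += 1
            actions.append("match")
        elif (top, token) in table:       # nonterminal on top: generate
            rhs = table[(top, token)]
            stack.pop()
            stack.extend(reversed(rhs))   # push symbols in reversed order
            actions.append(f"{top} -> {' '.join(rhs) or 'ε'}")
        else:
            raise SyntaxError(f"no rule for {top!r} on {token!r}")
    if tokens[pos] != "$":
        raise SyntaxError("input remains after the stack is empty")
    actions.append("accept")
    return actions


# Parsing table of the balanced parentheses grammar S -> ( S ) S | ε
TABLE = {
    ("S", "("): ("(", "S", ")", "S"),
    ("S", ")"): (),
    ("S", "$"): (),
}
TRACE = ll1_parse(TABLE, "S", {"S"}, ["(", ")"])
```

For the input `( )`, `TRACE` lists the same six actions as the balanced parentheses trace: generate, match, generate ε, match, generate ε, accept.
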
LL(1) Parsing Table
▪ Context-free grammar: G = (T, N, P, S)

▪ Parsing table indexed by nonterminals and terminals, which
contains the production rules to use when
– Nonterminal is on top of stack
– Terminal is next in input

▪ A production (A → α) ∈ M[A, a] in two cases:
– (α ⇒* a β) ∧ a ∈ T
• α starts with terminal a: a ∈ First(α)
– (α ⇒* ε) ∧ (S $ ⇒* β A a γ) ∧ a ∈ T ∪ { $ }
• A is followed by terminal a if it can disappear: a ∈ Follow(A)

S → ( S ) S | ε

M[N, T] | ( | ) | $
S | S → ( S ) S | S → ε | S → ε

LL(1) Parsing Table Construction Algorithm
for ∀ A ∈ N ∧ ∀ A → α ∈ P do
  for ∀ a ∈ First(α) – { ε } do
    M[A, a] := M[A, a] ∪ { A → α }
  if ε ∈ First(α) then
    for ∀ a ∈ Follow(A) do
      M[A, a] := M[A, a] ∪ { A → α }

▪ If (A → α ∈ P) ∧ (α ⇒* a β) ∧ (a ∈ T) ⇒ M[A, a] = M[A, a] ∪ { A → α }
– a ∈ First(α)

▪ If (A → α ∈ P) ∧ (α ⇒* ε) ∧ (S $ ⇒* β A a γ) ∧ (a ∈ T ∪ { $ }) ⇒
M[A, a] = M[A, a] ∪ { A → α }
– a ∈ Follow(A)

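As an illustration, the construction can be sketched in Python with the First and Follow sets supplied as inputs. The names `first_of_string` and `build_table`, the string `"ε"` as the empty-string marker, and the dict-based encoding are assumptions of this sketch.

```python
EPS = "ε"


def first_of_string(symbols, first):
    """First set of a symbol string; symbols missing from `first` are terminals."""
    result = set()
    for x in symbols:
        fx = first.get(x, {x})
        result |= fx - {EPS}
        if EPS not in fx:
            return result
    result.add(EPS)                  # every symbol can vanish (or the string is empty)
    return result


def build_table(grammar, first, follow):
    """grammar: dict nonterminal -> list of tuples; () stands for ε.
    Returns dict (nonterminal, terminal) -> list of right-hand sides."""
    table = {}
    for a, choices in grammar.items():
        for alpha in choices:
            f_alpha = first_of_string(alpha, first)
            targets = f_alpha - {EPS}            # a in First(alpha)
            if EPS in f_alpha:
                targets |= follow[a]             # a in Follow(A) when alpha =>* ε
            for terminal in targets:
                table.setdefault((a, terminal), []).append(alpha)
    return table
```

An entry holding more than one right-hand side signals a conflict, so checking `all(len(v) == 1 for v in table.values())` is one way to test whether the grammar is LL(1).
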
Agenda
▪ Recursive-descent parsing
– Left recursion removal
– Left factoring

▪ LL(1) parsing
– FIRST sets
– FOLLOW sets
– Parsing table
– LL(1) grammars

▪ Error recovery


First Sets
▪ G = (T, N, P, S)

▪ X ∈ T ∪ N ∪ { ε }

▪ Set First(X) ⊆ T ∪ { ε } is defined as follows:
– If X ∈ T ∪ { ε } ⇒ First(X) = { X }
– If X ∈ N, then ∀ X → X1 X2 … Xn ∈ P ⇒
• First(X1) – { ε } ⊆ First(X)
• If ε ∈ First(X1) ∧ … ∧ ε ∈ First(Xi) ∧ i < n ⇒ First(Xi+1) – { ε } ⊆ First(X)
• If ε ∈ First(X1) ∧ … ∧ ε ∈ First(Xn) ⇒ ε ∈ First(X)


Integer Expression Grammar:
First Sets Computation
exp → exp addop term | term
addop → + | –
term → term mulop factor | factor
mulop → *
factor → ( exp ) | number

Grammar rule | Iteration 1 | Iteration 2 | Iteration 3
exp → exp addop term | | |
exp → term | | | First(exp) = { (, number }
addop → + | First(addop) = { + } | |
addop → – | First(addop) = { +, – } | |
term → term mulop factor | | |
term → factor | | First(term) = { (, number } |
mulop → * | First(mulop) = { * } | |
factor → ( exp ) | First(factor) = { ( } | |
factor → number | First(factor) = { (, number } | |

Statement Sequence Grammar:
First Sets Computation
▪ Left recursive
stmt-sequence → stmt-sequence ; stmt | stmt
stmt → s

▪ Left factored right recursive
stmt-sequence → stmt stmt-seq
stmt-seq → ; stmt-sequence | ε
stmt → s

Grammar production | Iteration 1 | Iteration 2
stmt-sequence → stmt stmt-seq | | First(stmt-sequence) = { s }
stmt-seq → ; stmt-sequence | First(stmt-seq) = { ; } |
stmt-seq → ε | First(stmt-seq) = { ;, ε } |
stmt → s | First(stmt) = { s } |

If-Statement Grammar:
First Sets Computation
statement → if-stmt | other
if-stmt → if ( exp ) statement else-part
else-part → else statement | ε
exp → 0 | 1

Grammar rule | Iteration 1 | Iteration 2
statement → if-stmt | | First(statement) = { if, other }
statement → other | First(statement) = { other } |
if-stmt → if ( exp ) statement else-part | First(if-stmt) = { if } |
else-part → else statement | First(else-part) = { else } |
else-part → ε | First(else-part) = { else, ε } |
exp → 0 | First(exp) = { 0 } |
exp → 1 | First(exp) = { 0, 1 } |


First Set Computation Algorithm
for ∀ A ∈ N do
  First(A) := ∅ ;

while there are changes to any First(A) do
  for ∀ A → X1 X2 … Xn ∈ P do
    k := 1 ;
    continue := true ;
    while continue = true ∧ k ≤ n do
      First(A) := First(A) ∪ ( First(Xk) – { ε } ) ;
      if ε ∉ First(Xk) then
        continue := false ;
      k := k + 1 ;
    if continue = true then
      First(A) := First(A) ∪ { ε } ;
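The fixed-point iteration can be sketched in Python as follows. The sketch is illustrative: `first_sets`, the dict-of-tuples grammar encoding, the empty tuple for an ε-production, the string `"ε"`, and the `GRAMMAR` constant (the right-recursive expression grammar with `'` marking the primed nonterminals and ASCII `-` for minus) are assumptions of the example.

```python
EPS = "ε"


def first_sets(grammar):
    """grammar: dict nonterminal -> list of tuples; () is an ε-production.
    Symbols that are not keys of `grammar` are treated as terminals."""
    first = {a: set() for a in grammar}
    changed = True
    while changed:                               # iterate until no set grows
        changed = False
        for a, choices in grammar.items():
            for rhs in choices:
                before = len(first[a])
                all_nullable = True
                for x in rhs:
                    fx = first[x] if x in grammar else {x}
                    first[a] |= fx - {EPS}
                    if EPS not in fx:            # Xk cannot vanish: stop here
                        all_nullable = False
                        break
                if all_nullable:                 # X1 ... Xn can all vanish
                    first[a].add(EPS)
                if len(first[a]) != before:
                    changed = True
    return first


GRAMMAR = {
    "exp": [("term", "exp'")],
    "exp'": [("addop", "term", "exp'"), ()],
    "addop": [("+",), ("-",)],
    "term": [("factor", "term'")],
    "term'": [("mulop", "factor", "term'"), ()],
    "mulop": [("*",)],
    "factor": [("(", "exp", ")"), ("number",)],
}
FIRST = first_sets(GRAMMAR)
```

Because First(exp) depends on First(term), which depends on First(factor), the sets of the outer nonterminals only fill in after several passes of the outer loop.
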


Agenda
▪ Recursive-descent parsing
– Left recursion removal
– Left factoring

▪ LL(1) parsing
– FIRST sets
– FOLLOW sets
– Parsing table
– LL(1) grammars

▪ Error recovery


Follow Sets
▪ G = (T, N, P, S) and A ∈ N

▪ Set Follow(A) ⊆ T ∪ { $ } is defined as follows:
– If A = S ⇒ $ ∈ Follow(A)
– If (∃ B → α A γ ∈ P) ⇒ First(γ) – { ε } ⊆ Follow(A)
– If (∃ B → α A γ ∈ P) ∧ ε ∈ First(γ) ⇒ Follow(B) ⊆ Follow(A)
• B → α A is a common special case

▪ ε is never an element of a Follow set


Simple Expression Grammar:
Follow Sets Computation

exp → exp addop term
– Iteration 1: Follow(exp) = { $ } ∪ First(addop) = { $, +, – }
– Iteration 1: Follow(addop) = First(term) = { (, number }
– Iteration 1: Follow(term) = Follow(exp) = { $, +, – }
– Iteration 2: Follow(term) = Follow(exp) = { $, +, –, *, ) }

exp → term
– Iteration 1: Follow(term) = Follow(exp) = { $, +, – }
– Iteration 2: Follow(term) = Follow(exp) = { $, +, –, *, ) }

term → term mulop factor
– Iteration 1: Follow(term) = First(mulop) = { $, +, –, * }
– Iteration 1: Follow(mulop) = First(factor) = { (, number }
– Iteration 1: Follow(factor) = Follow(term) = { $, +, –, * }
– Iteration 2: Follow(factor) = Follow(term) = { $, +, –, *, ) }

term → factor
– Iteration 1: Follow(factor) = Follow(term) = { $, +, –, * }
– Iteration 2: Follow(factor) = Follow(term) = { $, +, –, *, ) }

factor → ( exp )
– Iteration 1: Follow(exp) = First( ) ) = { $, +, –, ) }


Statement Sequence Grammar:
Follow Sets Computation

stmt-sequence → stmt stmt-seq
– Iteration 1: Follow(stmt-sequence) = { $ }
– Iteration 1: Follow(stmt) = First(stmt-seq) – { ε } = { ; }
– Iteration 1: Follow(stmt) = Follow(stmt-sequence) = { ;, $ }
– Iteration 1: Follow(stmt-seq) = Follow(stmt-sequence) = { $ }

stmt-seq → ; stmt-sequence
– Iteration 1: Follow(stmt-sequence) = Follow(stmt-seq) = { $ }


If-Statement Grammar:
Follow Sets Computation

statement → if-stmt
– Iteration 1: Follow(statement) = { $ }
– Iteration 1: Follow(if-stmt) = Follow(statement) = { $ }
– Iteration 2: Follow(if-stmt) = Follow(statement) = { $, else }

if-stmt → if ( exp ) statement else-part
– Iteration 1: Follow(exp) = First( ) ) = { ) }
– Iteration 1: Follow(statement) = First(else-part) – { ε } = { $, else }
– Iteration 1: Follow(statement) = Follow(if-stmt) = { $, else }
– Iteration 1: Follow(else-part) = Follow(if-stmt) = { $ }
– Iteration 2: Follow(statement) = Follow(if-stmt) = { $, else }
– Iteration 2: Follow(else-part) = Follow(if-stmt) = { $, else }

else-part → else statement
– Iteration 1: Follow(statement) = Follow(else-part) = { $, else }
– Iteration 2: Follow(statement) = Follow(else-part) = { $, else }

exp → 0 | 1
– No Follow sets change


Follow Set Computation Algorithm
Follow(S) := { $ } ;
for ∀ A ∈ N – { S } do
  Follow(A) := ∅ ;

while there are changes to any Follow sets do
  for ∀ A → X1 X2 … Xn ∈ P do
    for ∀ i ∈ [1..n] with Xi ∈ N do
      Follow(Xi) := Follow(Xi) ∪ ( First(Xi+1 Xi+2 … Xn) – { ε } ) ;
      (* Note: if i = n, then Xi+1 Xi+2 … Xn = ε *)
      if ε ∈ First(Xi+1 Xi+2 … Xn) then
        Follow(Xi) := Follow(Xi) ∪ Follow(A) ;
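A Python sketch of this iteration is given here as an illustration, with the First sets passed in as an input. The names `first_of_string` and `follow_sets`, the string `"ε"`, and the dict-of-tuples grammar encoding are assumptions of the example.

```python
EPS = "ε"


def first_of_string(symbols, first):
    """First set of a symbol string; symbols missing from `first` are terminals."""
    result = set()
    for x in symbols:
        fx = first.get(x, {x})
        result |= fx - {EPS}
        if EPS not in fx:
            return result
    return result | {EPS}            # the empty string yields { ε }


def follow_sets(grammar, first, start):
    """grammar: dict nonterminal -> list of tuples; () is an ε-production."""
    follow = {a: set() for a in grammar}
    follow[start].add("$")
    changed = True
    while changed:                                   # iterate until no set grows
        changed = False
        for a, choices in grammar.items():
            for rhs in choices:
                for i, x in enumerate(rhs):
                    if x not in grammar:             # Follow is defined for nonterminals only
                        continue
                    rest = first_of_string(rhs[i + 1:], first)
                    new = rest - {EPS}
                    if EPS in rest:                  # the rest can vanish
                        new |= follow[a]
                    if not new <= follow[x]:
                        follow[x] |= new
                        changed = True
    return follow


# Left-factored statement sequence grammar and its First sets
GRAMMAR = {
    "stmt-sequence": [("stmt", "stmt-seq")],
    "stmt-seq": [(";", "stmt-sequence"), ()],
    "stmt": [("s",)],
}
FIRST = {"stmt-sequence": {"s"}, "stmt-seq": {";", EPS}, "stmt": {"s"}}
FOLLOW = follow_sets(GRAMMAR, FIRST, "stmt-sequence")
```

For the statement sequence grammar, `FOLLOW` agrees with the Follow sets computed by hand for that grammar.
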


Agenda
▪ Recursive-descent parsing
– Left recursion removal
– Left factoring

▪ LL(1) parsing
– FIRST sets
– FOLLOW sets
– Parsing table
– LL(1) grammars

▪ Error recovery


Simple Expression Grammar:
Parsing Table
Grammar rule                          First set                       Follow set
exp → exp addop term | term           First(exp) = { (, number }      Follow(exp) = { $, +, –, ) }
addop → + | –                         First(addop) = { +, – }         Follow(addop) = { (, number }
term → term mulop factor | factor     First(term) = { (, number }     Follow(term) = { $, +, –, *, ) }
mulop → *                             First(mulop) = { * }            Follow(mulop) = { (, number }
factor → ( exp ) | number             First(factor) = { (, number }   Follow(factor) = { $, +, –, *, ) }

M[N, T] | ( | number | ) | + | – | * | $
exp | exp → exp addop term, exp → term | exp → exp addop term, exp → term | | | | |
addop | | | | addop → + | addop → – | |
term | term → term mulop factor, term → factor | term → term mulop factor, term → factor | | | | |
mulop | | | | | | mulop → * |
factor | factor → ( exp ) | factor → number | | | | |


Statement Sequence Grammar:
Parsing Table

Grammar rule                          First set                        Follow set
stmt-sequence → stmt stmt-seq         First(stmt-sequence) = { s }     Follow(stmt-sequence) = { $ }
stmt-seq → ; stmt-sequence | ε        First(stmt-seq) = { ;, ε }       Follow(stmt-seq) = { $ }
stmt → s                              First(stmt) = { s }              Follow(stmt) = { ;, $ }

M[N, T] | s | ; | $
stmt-sequence | stmt-sequence → stmt stmt-seq | |
stmt-seq | | stmt-seq → ; stmt-sequence | stmt-seq → ε
stmt | stmt → s | |


If-Statement Grammar:
Parsing Table
Grammar rule                                 First set                         Follow set
statement → if-stmt | other                  First(statement) = { if, other }  Follow(statement) = { $, else }
if-stmt → if ( exp ) statement else-part     First(if-stmt) = { if }           Follow(if-stmt) = { $, else }
else-part → else statement | ε               First(else-part) = { else, ε }    Follow(else-part) = { $, else }
exp → 0 | 1                                  First(exp) = { 0, 1 }             Follow(exp) = { ) }

M[N, T] | if | other | else | 0 | 1 | $
statement | statement → if-stmt | statement → other | | | |
if-stmt | if-stmt → if ( exp ) statement else-part | | | | |
else-part | | | else-part → else statement, else-part → ε | | | else-part → ε
exp | | | | exp → 0 | exp → 1 |


Expression Grammar:
First Sets Computation
Grammar rule | Iteration 1 | Iteration 2 | Iteration 3
exp → term exp′ | | | First(exp) = First(term) = { (, number }
exp′ → addop term exp′ | | First(exp′) = First(addop) = { +, –, ε } |
exp′ → ε | First(exp′) = { ε } | |
addop → + | First(addop) = { + } | |
addop → – | First(addop) = { +, – } | |
term → factor term′ | | First(term) = First(factor) = { (, number } |
term′ → mulop factor term′ | | First(term′) = First(mulop) = { *, ε } |
term′ → ε | First(term′) = { ε } | |
mulop → * | First(mulop) = { * } | |
factor → ( exp ) | First(factor) = { ( } | |
factor → number | First(factor) = { (, number } | |


Expression Grammar:
Follow Sets Computation

exp → term exp′
– Iteration 1: Follow(exp) = { $ }
– Iteration 1: Follow(term) = First(exp′) = { +, – }
– Iteration 1: Follow(exp′) = Follow(exp) = { $ }
– Iteration 2: Follow(exp′) = Follow(exp) = { $, ) }

exp′ → addop term exp′
– Iteration 1: Follow(addop) = First(term) = { (, number }
– Iteration 1: Follow(term) = First(exp′) = { +, – }
– Iteration 1: Follow(term) = Follow(exp′) = { $, +, – }
– Iteration 2: Follow(term) = Follow(exp′) = { $, ), +, – }

term → factor term′
– Iteration 1: Follow(factor) = First(term′) = { * }
– Iteration 1: Follow(factor) = Follow(term) = { $, +, –, * }
– Iteration 1: Follow(term′) = Follow(term) = { $, +, – }
– Iteration 2: Follow(term′) = Follow(term) = { $, ), +, – }

term′ → mulop factor term′
– Iteration 1: Follow(mulop) = First(factor) = { (, number }
– Iteration 1: Follow(factor) = First(term′) = { $, +, –, * }
– Iteration 1: Follow(factor) = Follow(term′) = { $, +, –, * }
– Iteration 2: Follow(factor) = Follow(term′) = { $, ), +, –, * }

factor → ( exp )
– Iteration 1: Follow(exp) = First( ) ) = { $, ) }


Expression Grammar:
LL(1) Parsing Table
Grammar production               First set                       Follow set
exp → term exp′                  First(exp) = { (, number }      Follow(exp) = { $, ) }
exp′ → addop term exp′ | ε       First(exp′) = { +, –, ε }       Follow(exp′) = { $, ) }
addop → + | –                    First(addop) = { +, – }         Follow(addop) = { (, number }
term → factor term′              First(term) = { (, number }     Follow(term) = { $, ), +, – }
term′ → mulop factor term′ | ε   First(term′) = { *, ε }         Follow(term′) = { $, ), +, – }
mulop → *                        First(mulop) = { * }            Follow(mulop) = { (, number }
factor → ( exp ) | number        First(factor) = { (, number }   Follow(factor) = { $, ), +, –, * }

M[N, T] | ( | number | ) | + | – | * | $
exp | exp → term exp′ | exp → term exp′ | | | | |
exp′ | | | exp′ → ε | exp′ → addop term exp′ | exp′ → addop term exp′ | | exp′ → ε
addop | | | | addop → + | addop → – | |
term | term → factor term′ | term → factor term′ | | | | |
term′ | | | term′ → ε | term′ → ε | term′ → ε | term′ → mulop factor term′ | term′ → ε
mulop | | | | | | mulop → * |
factor | factor → ( exp ) | factor → number | | | | |

Agenda
▪ Recursive-descent parsing
– Left recursion removal
– Left factoring

▪ LL(1) parsing
– FIRST sets
– FOLLOW sets
– Parsing table
– LL(1) grammars

▪ Error recovery


LL(1) Grammar
▪ A grammar is LL(1) if the associated LL(1) parsing table has at
most one production in each table entry
– An LL(1) grammar cannot be ambiguous

▪ G = (T, N, P, S) is LL(1) if the following conditions are satisfied
– First(αi) ∩ First(αj) = ∅, ∀ A → α1 | α2 | … | αn ∧ ∀ i, j ∈ [1..n] ∧ i ≠ j
– First(A) ∩ Follow(A) = ∅, ∀ A ∈ N ∧ ε ∈ First(A)


Non-LL(1) Programming Language
statement → assign-stmt | call-stmt | other
assign-stmt → identifier := exp
call-stmt → identifier ( exp-list )

▪ Replace assign-stmt and call-stmt by right-hand sides of


their defining productions
statement → identifier := exp
| identifier ( exp-list )
| other

▪ Left factoring
statement → identifier statement′ | other
statement′ → := exp | ( exp-list )


Expression Evaluation
in LL(1) Parsing
▪ Expression grammar
E → E + n | n

▪ Left recursion removal
E → n E′
E′ → + n E′ | ε

▪ Value stack
– Push number after each match
– Add operation indicated on parsing stack by a special pound symbol (#)
– Left associativity
E → n E′
E′ → + n # E′ | ε

Parsing stack | Input | Action | Value stack
$ E | 3 + 4 + 5 $ | E → n E′ | $
$ E′ n | 3 + 4 + 5 $ | match/push | $
$ E′ | + 4 + 5 $ | E′ → + n # E′ | 3 $
$ E′ # n + | + 4 + 5 $ | match | 3 $
$ E′ # n | 4 + 5 $ | match/push | 3 $
$ E′ # | + 5 $ | add stack | 4 3 $
$ E′ | + 5 $ | E′ → + n # E′ | 7 $
$ E′ # n + | + 5 $ | match | 7 $
$ E′ # n | 5 $ | match/push | 7 $
$ E′ # | $ | add stack | 5 7 $
$ E′ | $ | E′ → ε | 12 $
$ | $ | accept | 12 $


Agenda
▪ Recursive-descent parsing
– Left recursion removal
– Left factoring

▪ LL(1) parsing
– FIRST sets
– FOLLOW sets
– Algorithm

▪ Error recovery


Error Recovery
▪ Recogniser
– Determines if a program is syntactically correct
– Displays a helpful error message

▪ Goals
– Find error as soon as possible
– Find a good place to resume parsing
– Find as many real errors as possible
– Avoid error cascade
– Avoid infinite loops on errors

▪ Error correction or error repair


– Find a correct program closest to wrong one

Panic Mode Error Recovery
▪ Recursive-descent parsers

▪ Synchronising tokens for each recursive procedure
– Scan ahead (ignore input tokens) on errors until reaching a
synchronising token
– First sets and Follow sets serve as synchronising tokens
– First sets allow the parser to detect errors early in the parse

▪ In a recursive procedure parsing nonterminal N
– Check if the next token is in First(N)
– If an error happens, ignore tokens until one in First(N) ∪ Follow(N) is found

Recursive-Descent Error
Recovery in Simple Expression Grammar
procedure checkinput ( firstset, followset ) ;
begin
  if ( token ∉ firstset ) then
    error ;
    scanto ( firstset ∪ followset ) ;
  end if ;
end checkinput ;

procedure scanto ( syncset ) ;
begin
  while token ∉ syncset ∪ { $ } do
    token := getToken () ;
  end while ;
end scanto ;

procedure exp ( syncset ) ;
begin
  checkinput ( { (, number }, syncset ) ;
  if ( token ∈ { (, number } ) then
    term ( syncset ∪ { +, – } ) ;
    while token = + or token = – do
      match ( token ) ;
      term ( syncset ∪ { +, – } ) ;
    end while ;
    checkinput ( syncset, { (, number } ) ;
  end if ;
end exp ;

procedure factor ( syncset ) ;
begin
  checkinput ( { (, number }, syncset ) ;
  if ( token ∈ { (, number } ) then
    case token of
      ( : match ( ( ) ;
          exp ( { ) } ) ;
          match ( ) ) ;
      number : match ( number ) ;
      else error ;
    end case ;
    checkinput ( syncset, { (, number } ) ;
  end if ;
end factor ;
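The procedures above can be mirrored in a small Python sketch to observe panic mode at work. It is an illustration under assumptions: the class name `Parser`, the `errors` list, the `term` method (written by analogy with `exp`), and the use of token kinds such as `"number"` as input are choices made for this example, and ASCII `-` stands in for the minus sign.

```python
class Parser:
    FIRST = {"(", "number"}              # First(exp) = First(term) = First(factor)

    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]
        self.pos = 0
        self.errors = []                 # collected error messages

    @property
    def token(self):
        return self.tokens[self.pos]

    def error(self, message):
        self.errors.append(f"{message} at position {self.pos}")

    def scanto(self, syncset):
        # Skip input tokens until a synchronising token (or $) is reached.
        while self.token not in syncset | {"$"}:
            self.pos += 1

    def checkinput(self, firstset, followset):
        if self.token not in firstset:
            self.error(f"unexpected {self.token!r}")
            self.scanto(firstset | followset)

    def match(self, expected):
        if self.token == expected:
            self.pos += 1
        else:
            self.error(f"expected {expected!r}")

    def exp(self, syncset):
        self.checkinput(self.FIRST, syncset)
        if self.token in self.FIRST:
            self.term(syncset | {"+", "-"})
            while self.token in ("+", "-"):
                self.match(self.token)
                self.term(syncset | {"+", "-"})
            self.checkinput(syncset, self.FIRST)

    def term(self, syncset):
        self.checkinput(self.FIRST, syncset)
        if self.token in self.FIRST:
            self.factor(syncset | {"*"})
            while self.token == "*":
                self.match("*")
                self.factor(syncset | {"*"})
            self.checkinput(syncset, self.FIRST)

    def factor(self, syncset):
        self.checkinput(self.FIRST, syncset)
        if self.token in self.FIRST:
            if self.token == "(":
                self.match("(")
                self.exp({")"})
                self.match(")")
            else:
                self.match("number")
            self.checkinput(syncset, self.FIRST)
```

Parsing the token kinds of `( 2 + * )` with `exp({"$"})` reports a single error at `*`, skips it, and resynchronises on `)`, so the rest of the input is still checked.
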


Error Recovery in LL(1) Parsers
▪ Nonterminal A on top of stack
▪ Input token T

▪ If M[A, T] = ∅ ⇒ Error
– T ∉ First(A)
– T ∉ Follow(A), if ε ∈ First(A)

▪ T = $ ∨ T ∈ Follow(A)
– Pop A from stack

▪ T ≠ $ ∧ T ∉ First(A) ∪ Follow(A)
– Scan input until T ∈ First(A) ∪ Follow(A)

▪ Push a new nonterminal onto stack


Parsing Table with
Error Recovery in Expression Grammar
Grammar production               First set                       Follow set
exp → term exp′                  First(exp) = { (, number }      Follow(exp) = { $, ) }
exp′ → addop term exp′ | ε       First(exp′) = { +, –, ε }       Follow(exp′) = { $, ) }
addop → + | –                    First(addop) = { +, – }         Follow(addop) = { (, number }
term → factor term′              First(term) = { (, number }     Follow(term) = { $, ), +, – }
term′ → mulop factor term′ | ε   First(term′) = { *, ε }         Follow(term′) = { $, ), +, – }
mulop → *                        First(mulop) = { * }            Follow(mulop) = { (, number }
factor → ( exp ) | number        First(factor) = { (, number }   Follow(factor) = { $, ), +, –, * }

M[N, T] | ( | number | ) | + | – | * | $
exp | exp → term exp′ | exp → term exp′ | pop | scan | scan | scan | pop
exp′ | scan | scan | exp′ → ε | exp′ → addop term exp′ | exp′ → addop term exp′ | scan | exp′ → ε
addop | pop | pop | scan | addop → + | addop → – | scan | pop
term | term → factor term′ | term → factor term′ | pop | pop | pop | scan | pop
term′ | scan | scan | term′ → ε | term′ → ε | term′ → ε | term′ → mulop factor term′ | term′ → ε
mulop | pop | pop | scan | scan | scan | mulop → * | pop
factor | factor → ( exp ) | factor → number | pop | pop | pop | pop | pop

Error Recovery in
LL(1) Expression Grammar
Parsing stack | Input | Action
$ exp | ( 2 + * ) $ | exp → term exp′
$ exp′ term | ( 2 + * ) $ | term → factor term′
$ exp′ term′ factor | ( 2 + * ) $ | factor → ( exp )
$ exp′ term′ ) exp ( | ( 2 + * ) $ | match
$ exp′ term′ ) exp | 2 + * ) $ | exp → term exp′
$ exp′ term′ ) exp′ term | 2 + * ) $ | term → factor term′
$ exp′ term′ ) exp′ term′ factor | 2 + * ) $ | factor → 2
$ exp′ term′ ) exp′ term′ 2 | 2 + * ) $ | match
$ exp′ term′ ) exp′ term′ | + * ) $ | term′ → ε
$ exp′ term′ ) exp′ | + * ) $ | exp′ → addop term exp′
$ exp′ term′ ) exp′ term addop | + * ) $ | addop → +
$ exp′ term′ ) exp′ term + | + * ) $ | match
$ exp′ term′ ) exp′ term | * ) $ | scan
$ exp′ term′ ) exp′ term | ) $ | pop
$ exp′ term′ ) exp′ | ) $ | exp′ → ε
$ exp′ term′ ) | ) $ | match
$ exp′ term′ | $ | term′ → ε
$ exp′ | $ | exp′ → ε
$ | $ | accept


Conclusions
▪ Top-down parsing

▪ Recursive-descent parsing

▪ LL(1) parser generation algorithm

▪ Right recursive and left-factored grammars

▪ Panic mode error recovery
