Lecture 6

This document discusses parsing in compilers. It explains how a lexer generates tokens that are input to a parser. The parser then builds a parse tree from these tokens according to the rules of a context-free grammar. Context-free grammars and parse trees are described in detail.

Uploaded by

Vedang Chavan
CS327 - Compilers

Parsing

Abhishek Bichhawat 02/02/2024


Parsing
● Lexer generates sequence of tokens that is input to parser
● Output is parse tree
○ Example: if x == 0 then y = 1 else z = 2
INPUT: IF ID RELOP INT THEN ID = INT ELSE ID = INT
OUTPUT:
IF-THEN-ELSE
├── RELOP
│   ├── ID
│   └── INT
├── THEN
│   └── =
│       ├── ID
│       └── INT
└── ELSE
    └── =
        ├── ID
        └── INT
Parsing
● Not all token streams are valid programs
● The parser needs to distinguish between valid and invalid programs
● Need a language for describing valid sequences of tokens
○ Regular languages are the weakest of the commonly used formal languages
○ Many languages are not regular
■ Nested balanced parentheses {(ⁱ)ⁱ | i ≥ 0}
■ Arithmetic expressions
○ A finite automaton (FA) has only finitely many states, so it cannot count how many times it has passed through a state
○ Need a language that can refer to its own constructs recursively
Context Free Grammars
● Use CFGs for parsing
● A CFG consists of:
○ A set of terminals T
○ A set of non-terminals N
○ A start symbol S ∈ N
○ A set of productions of the form:
X → α₁ α₂ α₃ … αₙ
where X ∈ N and αᵢ ∈ T ∪ N ∪ {ε}
● Context-free means that a non-terminal can be replaced by the
right-hand side of any of its productions regardless of the symbols
around it, so the order of replacements does not change the result.
Context Free Grammars
String of balanced parentheses using CFG:

P → ( P )
P → ε
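
The two productions above can be read directly as a recursive membership test. The following is a minimal sketch, not part of the original slides; the function name derives_P is an assumption made for illustration.

```python
def derives_P(s: str) -> bool:
    """Return True if s can be derived from P using P -> ( P ) | epsilon."""
    # P -> epsilon: the empty string is in the language.
    if s == "":
        return True
    # P -> ( P ): strip one matching outer pair and recurse on the middle.
    if len(s) >= 2 and s[0] == "(" and s[-1] == ")":
        return derives_P(s[1:-1])
    return False


print(derives_P("((()))"))  # True
print(derives_P("(()"))     # False
print(derives_P("()()"))    # False: this grammar only nests, it never concatenates
```

Note that the recursion mirrors the grammar one production per branch, which is exactly the kind of self-reference a finite automaton cannot express.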
Context Free Grammars
● Begin with a string consisting of the start symbol “S”
● Replace any non-terminal X in the string by the right-hand side
of some production:
X → α₁ … αₙ
● Repeat the above step until there are no non-terminals in the
string
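
The procedure above can be sketched as a short loop. This is an illustrative sketch only: the grammar is the balanced-parentheses one (P → ( P ) | ε), and the names PRODUCTIONS, derive, and max_steps are assumptions, not something defined in the lecture.

```python
import random

# Grammar assumed for the sketch: P -> ( P ) | epsilon.
# Each right-hand side is a list of symbols; [] stands for epsilon.
PRODUCTIONS = {"P": [["(", "P", ")"], []]}


def derive(start, productions, rng, max_steps=50):
    """Rewrite non-terminals until only terminals remain.

    Returns the list of all intermediate strings, starting with [start].
    """
    current = [start]
    history = [current]
    for _ in range(max_steps):
        positions = [i for i, sym in enumerate(current) if sym in productions]
        if not positions:                # no non-terminals left: finished
            return history
        i = rng.choice(positions)        # replace any non-terminal ...
        rhs = rng.choice(productions[current[i]])  # ... by some right-hand side
        current = current[:i] + rhs + current[i + 1:]
        history.append(current)
    raise RuntimeError("derivation did not finish within max_steps")


for form in derive("P", PRODUCTIONS, random.Random(0)):
    print(" ".join(form) or "ε")
```

The max_steps guard is there because a random choice of productions is not guaranteed to terminate in general.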
Context Free Grammars
Formally, one derivation step replaces a single non-terminal:
X₁ … Xᵢ₋₁ Xᵢ Xᵢ₊₁ … Xₙ → X₁ … Xᵢ₋₁ α₁ … αₘ Xᵢ₊₁ … Xₙ
if there is a production
Xᵢ → α₁ … αₘ
Context Free Grammars
Formally,
X₁ … Xₙ →* α₁ … αₘ
if α₁ … αₘ can be reached from X₁ … Xₙ in zero or more derivation steps:
X₁ … Xₙ → … → α₁ … αₘ
Language of Context Free Grammars
The language of a context-free grammar G with start symbol S is:
L(G) = {α₁ … αₙ | S →* α₁ … αₙ and every αᵢ (1 ≤ i ≤ n) is a terminal}

Terminals cannot be replaced once they appear in the string

In the context of compilers, the terminals are the tokens of the language


CFG for Arithmetic Expressions
E → n
E → id
E → E + E
E → E - E
E → E * E
E → (E)
CFG for Arithmetic Expressions
E → n
| id
| E + E
| E - E
| E * E
| (E)
CFG for a Language with Conditionals
R → == | > | < | >= | <=

E → id | n | E + E | E R E

S → id = E | if E then S else S
| while E do S
Parse Trees
● A parse tree represents a derivation: it shows the productions
applied to get from the start symbol to a string of terminals.
● The start symbol is the root of the tree
● For every production applied, add an edge from the left-hand
non-terminal (X) to each symbol (αᵢ) on the right-hand side; these
symbols become the children of X, in order
● Terminals form the leaf nodes of the tree
● Reading the leaf nodes from left to right gives the input
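
As a small illustration of the last point, a parse tree can be stored as nested (label, children) pairs and its leaves collected from left to right. This is a sketch under assumptions: the helper names leaf, node, and frontier are hypothetical, and the tree shown is the one for id * (id + id) under the arithmetic grammar.

```python
# A tree is a pair (label, children); a terminal leaf has no children.
# (A non-terminal expanded with an epsilon production would need a marker
# leaf such as "ε"; that case is not handled in this sketch.)
def leaf(token):
    return (token, [])


def node(label, *children):
    return (label, list(children))


def frontier(tree):
    """Collect the leaf labels of the tree from left to right."""
    label, children = tree
    if not children:
        return [label]
    leaves = []
    for child in children:
        leaves.extend(frontier(child))
    return leaves


# Parse tree for: id * ( id + id )
inner = node("E", node("E", leaf("id")), leaf("+"), node("E", leaf("id")))
tree = node("E",
            node("E", leaf("id")),
            leaf("*"),
            node("E", leaf("("), inner, leaf(")")))

print(" ".join(frontier(tree)))  # id * ( id + id )
```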
Parse Trees - Example
CFG : E → n | id | E + E | E - E | E * E | (E)

String : id * (id + id)

Derivation: E → E * E → id * E → id * (E) → id * (E + E)
→ id * (id + E) → id * (id + id)
Parse Trees - Example
CFG : E → n | id | E + E | E - E | E * E | (E)
Str : id * (id + id)

Derivation:
E → E * E
→ id * E
→ id * (E)
→ id * (E + E)
→ id * (id + E)
→ id * (id + id)

Parse tree:
E
├── E
│   └── id
├── *
└── E
    ├── (
    ├── E
    │   ├── E
    │   │   └── id
    │   ├── +
    │   └── E
    │       └── id
    └── )
Left- and Right- Derivations
● The previous derivation was a left-most derivation
○ At each step, the left-most non-terminal is replaced
● A similar right-most derivation is possible
○ E → E * E
→ E * (E)
→ E * (E + E)
→ E * (E + id)
→ E * (id + id)
→ id * (id + id)
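
Both derivation orders can be read off a single parse tree by choosing which non-terminal to expand next. The sketch below assumes a tree stored as (label, children) pairs; the names leaf, node, and derivation are hypothetical and not from the lecture.

```python
def leaf(token):
    return (token, [])


def node(label, *children):
    return (label, list(children))


def derivation(tree, leftmost=True):
    """List the sentential forms obtained by always expanding the
    left-most (or right-most) non-terminal of the given parse tree."""
    form = [tree]
    steps = [" ".join(n[0] for n in form)]
    while True:
        # Nodes that still have children are unexpanded non-terminals.
        pending = [i for i, n in enumerate(form) if n[1]]
        if not pending:
            return steps
        i = pending[0] if leftmost else pending[-1]
        form = form[:i] + form[i][1] + form[i + 1:]
        steps.append(" ".join(n[0] for n in form))


# Parse tree for: id * ( id + id )
inner = node("E", node("E", leaf("id")), leaf("+"), node("E", leaf("id")))
tree = node("E", node("E", leaf("id")), leaf("*"),
            node("E", leaf("("), inner, leaf(")")))

print(" → ".join(derivation(tree, leftmost=True)))
print(" → ".join(derivation(tree, leftmost=False)))
```

The two printed sequences differ only in the order of the steps; the tree, and therefore the set of productions used, is the same.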
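Parse Trees - Example
CFG : E → n | id | E + E | E - E | E * E | (E)
Str : id * (id + id)

Right-most derivation:
E → E * E
→ E * (E)
→ E * (E + E)
→ E * (E + id)
→ E * (id + id)
→ id * (id + id)

Parse tree:
E
├── E
│   └── id
├── *
└── E
    ├── (
    ├── E
    │   ├── E
    │   │   └── id
    │   ├── +
    │   └── E
    │       └── id
    └── )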
CFG for Conditionals in Python
expr ::= ...
stmt ::= ... | cond | ...
stmts ::= stmt | stmts stmt
bstmt ::= INDENT stmts DEDENT
cond ::= "if" expr ":" bstmt elifs else
elifs ::= ε | "elif" expr ":" bstmt elifs
else ::= ε | "else" ":" bstmt
Question
Which of these strings are in the language given by the CFG:
S → aXa
X → ε | bY
Y → ε | cXc

1. abcba
2. acca
3. aba
4. abcbcba
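
One way to check an answer is to turn each non-terminal of this grammar into a recursive predicate. The following is a sketch written for this exercise only; the function names derives_S, derives_X, and derives_Y are assumptions.

```python
def derives_X(s: str) -> bool:
    # X -> epsilon | b Y
    return s == "" or (s[0] == "b" and derives_Y(s[1:]))


def derives_Y(s: str) -> bool:
    # Y -> epsilon | c X c
    return s == "" or (
        len(s) >= 2 and s[0] == "c" and s[-1] == "c" and derives_X(s[1:-1])
    )


def derives_S(s: str) -> bool:
    # S -> a X a
    return len(s) >= 2 and s[0] == "a" and s[-1] == "a" and derives_X(s[1:-1])


for word in ["abcba", "acca", "aba", "abcbcba"]:
    print(word, derives_S(word))
```

This works without backtracking because every production of this particular grammar fixes the first (and, for S and Y, the last) character of the string it derives.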
Question
Which of the following are valid derivations for the CFG:
S → aXa
X → ε | bY
Y → ε | cXc

1. S → aXa → abYa → acXca → acca
2. S → aXa → aa
3. S → aXa → abYa → abcXca → abcbYca → abcbca
4. S → aXa → abYa → abcXcba → abccba
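
A derivation is valid when every step replaces exactly one non-terminal by the right-hand side of one of its productions. The sketch below checks that condition mechanically; PRODUCTIONS, is_step, and is_valid_derivation are hypothetical names, and ε is represented by the empty string.

```python
# Grammar of the question; "" stands for epsilon.
PRODUCTIONS = {"S": ["aXa"], "X": ["", "bY"], "Y": ["", "cXc"]}


def is_step(u: str, v: str) -> bool:
    """True if v results from u by rewriting exactly one non-terminal."""
    for i, symbol in enumerate(u):
        for rhs in PRODUCTIONS.get(symbol, []):
            if u[:i] + rhs + u[i + 1:] == v:
                return True
    return False


def is_valid_derivation(forms) -> bool:
    """True if forms starts at S and every consecutive pair is one step."""
    return forms[0] == "S" and all(
        is_step(u, v) for u, v in zip(forms, forms[1:])
    )


candidates = [
    ["S", "aXa", "abYa", "acXca", "acca"],
    ["S", "aXa", "aa"],
    ["S", "aXa", "abYa", "abcXca", "abcbYca", "abcbca"],
    ["S", "aXa", "abYa", "abcXcba", "abccba"],
]
for number, forms in enumerate(candidates, start=1):
    print(number, is_valid_derivation(forms))
```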
Parse Trees - Example (Left)
CFG : E → n | id | E + E | E - E | E * E
Str : id * id + id

Left-most derivation:
E → E * E
→ id * E
→ id * E + E
→ id * id + E
→ id * id + id

Parse tree:
E
├── E
│   └── id
├── *
└── E
    ├── E
    │   └── id
    ├── +
    └── E
        └── id
Parse Trees - Example (Right)
CFG : E → n | id | E + E | E - E | E * E
Str : id * id + id

Right-most derivation:
E → E + E
→ E + id
→ E * E + id
→ E * id + id
→ id * id + id

Parse tree:
E
├── E
│   ├── E
│   │   └── id
│   ├── *
│   └── E
│       └── id
├── +
└── E
    └── id
Multiple Parse Trees
Str : id * id + id

Left-most derivation example:
E
├── E
│   └── id
├── *
└── E
    ├── E
    │   └── id
    ├── +
    └── E
        └── id

Right-most derivation example:
E
├── E
│   ├── E
│   │   └── id
│   ├── *
│   └── E
│       └── id
├── +
└── E
    └── id

For comparison, the tree for id * (id + id):
E
├── E
│   └── id
├── *
└── E
    ├── (
    ├── E
    │   ├── E
    │   │   └── id
    │   ├── +
    │   └── E
    │       └── id
    └── )
Multiple Parse Trees
● A grammar is ambiguous if some string has more than one parse
tree, i.e., more than one left-most derivation (equivalently, more
than one right-most derivation)
● Ambiguity is bad for compilation
○ The meaning of some programs is ill-defined
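
Ambiguity can be made concrete by counting parse trees. The sketch below assumes the grammar E → n | id | E + E | E - E | E * E (no parentheses) and counts the trees for a token list by trying every operator as the root; the name count_parse_trees is an assumption made for this example.

```python
from functools import lru_cache


def count_parse_trees(tokens):
    """Count parse trees of tokens under E -> n | id | E+E | E-E | E*E."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of parse trees that derive tokens[i:j] from E.
        if j - i == 1:
            return 1 if tokens[i] in ("id", "n") else 0
        total = 0
        for k in range(i + 1, j - 1):
            if tokens[k] in ("+", "-", "*"):
                # tokens[k] is the operator at the root of the tree.
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))


print(count_parse_trees("id * id + id".split()))  # 2
print(count_parse_trees("id + id".split()))       # 1
```

A count above 1 for any string is enough to show that the grammar is ambiguous.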
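Ambiguity
S → if E then S
| if E then S else S

Str: if E1 then if E2 then S1 else S2

Parse 1 (else belongs to the outer if):
if
├── E1
├── if
│   ├── E2
│   └── S1
└── S2

Parse 2 (else belongs to the inner if):
if
├── E1
└── if
    ├── E2
    ├── S1
    └── S2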
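
Both readings of the string can be enumerated with a small parser that returns every possible parse instead of just one. This is a sketch under simplifying assumptions: conditions and plain statements are single tokens (E1, S1, ...), and the names parse_S and all_parses are hypothetical.

```python
def parse_S(tokens, i):
    """Return every (tree, next_index) for a statement starting at tokens[i]."""
    results = []
    if i + 2 < len(tokens) and tokens[i] == "if" and tokens[i + 2] == "then":
        cond = tokens[i + 1]
        for then_tree, j in parse_S(tokens, i + 3):
            # S -> if E then S
            results.append((("if", cond, then_tree), j))
            # S -> if E then S else S
            if j < len(tokens) and tokens[j] == "else":
                for else_tree, k in parse_S(tokens, j + 1):
                    results.append((("if", cond, then_tree, else_tree), k))
    elif i < len(tokens) and tokens[i] not in ("if", "then", "else"):
        # A plain statement, treated as a single token in this sketch.
        results.append((tokens[i], i + 1))
    return results


def all_parses(tokens):
    """Keep only the parses that consume the whole token list."""
    return [tree for tree, end in parse_S(tokens, 0) if end == len(tokens)]


tokens = "if E1 then if E2 then S1 else S2".split()
for tree in all_parses(tokens):
    print(tree)
```

The two printed tuples correspond to attaching the else to the outer if and to the inner if, respectively.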
