Lecture 9

Uploaded by

Moumer Zaryab

Context-Free Languages

Context-free grammar

• This is a different model for describing languages


• The language is specified by productions (substitution rules) that tell how
strings can be obtained, e.g.

A → 0A1
A → B
B → #

A, B are variables; 0, 1, # are terminals; A is the start variable

• Using these rules, we can derive strings like this:

A ⇒ 0A1 ⇒ 00A11 ⇒ 000A111 ⇒ 000B111 ⇒ 000#111
Programming languages
• Context-free grammars are also used to describe (parts of)
programming languages
• For instance, expressions like (2 + 3) * 5 or
3 + 8 + 2 * 7 can be described by the CFG

<expr> → <expr> + <expr>
<expr> → <expr> * <expr>
<expr> → (<expr>)
<expr> → 0
<expr> → 1
…
<expr> → 9

Variables: <expr>
Terminals: +, *, (, ), 0, 1, …, 9
Motivation for studying CFGs

• Context-free grammars are essential for understanding the meaning of computer programs

code: (2 + 3) * 5
meaning: “add 2 and 3, and then multiply by 5”

• They are used in compilers


Definition of context-free grammar

• A context-free grammar (CFG) is a 4-tuple (V, T, P, S) where
  • V is a finite set of variables or non-terminals
  • T is a finite set of terminals (V ∩ T = ∅)
  • P is a finite set of productions or substitution rules of the form A → α, where A is a symbol in V and α is a string over V ∪ T
  • S is a variable in V called the start variable
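The 4-tuple can be written down directly as a data structure. This is one possible encoding, not a standard API: the class name `CFG`, its field names, and the choice to store each production as a `(head, body)` pair with the body as a tuple of symbols are all assumptions made for the sketch. The constructor checks the side conditions from the definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CFG:
    variables: frozenset   # V
    terminals: frozenset   # T
    productions: tuple     # P, as (head, body) pairs; body is a tuple of symbols
    start: str             # S

    def __post_init__(self):
        # V and T must be disjoint.
        if self.variables & self.terminals:
            raise ValueError("V and T must be disjoint")
        # S must be a variable.
        if self.start not in self.variables:
            raise ValueError("start variable must be in V")
        # Every production is A -> alpha with A in V and alpha over V ∪ T.
        symbols = self.variables | self.terminals
        for head, body in self.productions:
            if head not in self.variables:
                raise ValueError(f"head {head!r} is not a variable")
            if any(symbol not in symbols for symbol in body):
                raise ValueError(f"body {body!r} uses an unknown symbol")


# The grammar A -> 0A1 | B, B -> # from the first slide.
grammar = CFG(
    variables=frozenset({"A", "B"}),
    terminals=frozenset({"0", "1", "#"}),
    productions=(("A", ("0", "A", "1")), ("A", ("B",)), ("B", ("#",))),
    start="A",
)
```

An empty body `()` would stand for a production whose right-hand side is the empty string.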
Shorthand notation for productions

• When we have multiple productions with the same variable on the left like

E → E + E     N → 0N
E → E * E     N → 1N
E → (E)       N → 0
E → N         N → 1

Variables: E, N
Terminals: +, *, (, ), 0, 1
Start variable: E

we can write this in shorthand as

E → E + E | E * E | (E) | N
N → 0N | 1N | 0 | 1
Derivation
• A derivation is a sequential application of productions:

E ⇒ E * E
  ⇒ (E) * E
  ⇒ (E) * N
  ⇒ (E + E) * N
  ⇒ (E + E) * 1
  ⇒ (E + N) * 1
  ⇒ (N + N) * 1
  ⇒ (N + 1N) * 1
  ⇒ (N + 10) * 1
  ⇒ (1 + 10) * 1

• α ⇒ β means β can be obtained from α with one production
• α ⇒* β means β can be obtained from α after zero or more productions
Example 1

A → 0A1 | B
B → #

variables: A, B
terminals: 0, 1, #
start variable: A

• Is the string 00#11 in L?


• How about 00#111, 00#0#1#11?

• What is the language of this CFG?

L = {0^n#1^n : n ≥ 0}
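The membership questions on this slide can be checked with a short recognizer that mirrors the grammar. This is a sketch, with `in_language` as an illustrative name: it peels off a matching outer 0 … 1 pair for each use of A → 0A1 and accepts when only # remains.

```python
def in_language(s: str) -> bool:
    """Recognize strings derived from A -> 0A1 | B, B -> #."""
    # A -> 0A1: strip one 0 from the front and one 1 from the back.
    if len(s) >= 2 and s[0] == "0" and s[-1] == "1":
        return in_language(s[1:-1])
    # A -> B -> #: what remains must be exactly '#'.
    return s == "#"


for candidate in ["00#11", "00#111", "00#0#1#11"]:
    print(candidate, in_language(candidate))
# 00#11 True
# 00#111 False
# 00#0#1#11 False
```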
Example 2

S → SS | (S) | ε

convention: variables in uppercase, terminals in lowercase, start variable first

• Give derivations of (), (()())

S ⇒ (S)   (rule 2)
  ⇒ ()    (rule 3)

S ⇒ (S)       (rule 2)
  ⇒ (SS)      (rule 1)
  ⇒ ((S)S)    (rule 2)
  ⇒ ((S)(S))  (rule 2)
  ⇒ (()(S))   (rule 3)
  ⇒ (()())    (rule 3)

• How about ())?
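This grammar generates exactly the strings of balanced parentheses, so membership can be tested without building a derivation. The sketch below uses a depth counter rather than the grammar itself; `balanced` is an illustrative name.

```python
def balanced(s: str) -> bool:
    """True if s is a string of balanced parentheses."""
    depth = 0
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:      # a ')' with no matching '(' before it
                return False
        else:
            return False       # only parentheses are terminals
    return depth == 0          # every '(' must have been closed


print(balanced("()"), balanced("(()())"), balanced("())"))
# True True False
```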
Examples: Designing CFGs

• Write a CFG for the following languages

• The language L = {a^n b^n c^m d^m | n ≥ 0, m ≥ 0}

• The language L = {a^n b^m c^m d^n | n ≥ 0, m ≥ 0}
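One way to test a candidate grammar is to enumerate every string it generates up to some length and compare with the intended language. The two grammars below are possible answers, not the only ones, and the helper `generate` is a sketch written for this purpose. It assumes single-character symbols, expands the leftmost variable only, and is meant for grammars like these in which the number of variables in a sentential form stays bounded.

```python
from collections import deque

# Candidate for {a^n b^n c^m d^m}: two independent matched blocks side by side.
RULES_1 = {"S": ["XY"], "X": ["aXb", ""], "Y": ["cYd", ""]}
# Candidate for {a^n b^m c^m d^n}: one matched block nested inside the other.
RULES_2 = {"S": ["aSd", "T"], "T": ["bTc", ""]}


def generate(rules: dict, start: str, max_len: int) -> set:
    """All terminal strings of length <= max_len derivable from `start`."""
    seen, found = {start}, set()
    queue = deque([start])
    while queue:
        form = queue.popleft()
        # Position of the leftmost variable, if any.
        pos = next((i for i, c in enumerate(form) if c in rules), None)
        if pos is None:
            found.add(form)
            continue
        for body in rules[form[pos]]:
            new = form[:pos] + body + form[pos + 1:]
            # Prune forms whose terminals alone already exceed the bound.
            terminals = sum(c not in rules for c in new)
            if terminals <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return found


print(sorted(generate(RULES_2, "S", 4), key=lambda w: (len(w), w)))
# ['', 'ad', 'bc', 'aadd', 'abcd', 'bbcc']
```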


Context-free versus regular
• Write a CFG for the language (0 + 1)*111

S → A111
A → ε | 0A | 1A

• Can you do so for every regular language?

Every regular language is context-free


From regular to context-free
regular expression      CFG

∅                       grammar with no rules
ε                       S → ε
a (alphabet symbol)     S → a
E1 + E2                 S → S1 | S2
E1 E2                   S → S1S2
E1*                     S → SS1 | ε

Here S1 and S2 are the start variables of the grammars for E1 and E2.
In all cases, S becomes the new start symbol
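The table translates directly into a recursive procedure. The sketch below is one way to write it; the tuple encoding of regular expressions and the generated variable names `S0`, `S1`, … are assumptions of this example, chosen so that variable names cannot clash with single-character alphabet symbols.

```python
import itertools


def regex_to_cfg(regex):
    """Translate a regex AST into (start_variable, productions).

    AST nodes (a hypothetical encoding chosen for this sketch):
      ("empty",)  ("eps",)  ("sym", a)
      ("union", r1, r2)  ("concat", r1, r2)  ("star", r1)
    `productions` maps each variable to a list of bodies (lists of symbols).
    """
    counter = itertools.count()
    productions = {}

    def build(node):
        s = f"S{next(counter)}"               # fresh start variable
        kind = node[0]
        if kind == "empty":
            productions[s] = []               # grammar with no rules
        elif kind == "eps":
            productions[s] = [[]]             # S -> ε
        elif kind == "sym":
            productions[s] = [[node[1]]]      # S -> a
        elif kind == "union":
            s1, s2 = build(node[1]), build(node[2])
            productions[s] = [[s1], [s2]]     # S -> S1 | S2
        elif kind == "concat":
            s1, s2 = build(node[1]), build(node[2])
            productions[s] = [[s1, s2]]       # S -> S1 S2
        elif kind == "star":
            s1 = build(node[1])
            productions[s] = [[s, s1], []]    # S -> S S1 | ε
        else:
            raise ValueError(f"unknown node {kind!r}")
        return s

    return build(regex), productions


# (0 + 1)*111
zero_or_one = ("union", ("sym", "0"), ("sym", "1"))
regex = ("concat", ("star", zero_or_one),
         ("concat", ("sym", "1"), ("concat", ("sym", "1"), ("sym", "1"))))
start, rules = regex_to_cfg(regex)
for head, bodies in rules.items():
    print(head, "->", " | ".join(" ".join(b) or "ε" for b in bodies))
```

The resulting grammar is larger than the hand-written S → A111, A → ε | 0A | 1A, since the construction introduces one variable per subexpression.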


Context-free versus regular

• Is every context-free language regular?


• No! We already saw some examples:
A → 0A1 | B
B→#
L = {0^n#1^n : n ≥ 0}

• This language is context-free but not regular


Parse tree
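• Derivations can also be represented using parse trees

E → E + E | E - E | (E) | V
V → x | y | z

E ⇒ E + E
  ⇒ V + E
  ⇒ x + E
  ⇒ x + (E)
  ⇒ x + (E - E)
  ⇒ x + (V - E)
  ⇒ x + (y - E)
  ⇒ x + (y - V)
  ⇒ x + (y - z)

Parse tree:

E
├── E
│   └── V
│       └── x
├── +
└── E
    ├── (
    ├── E
    │   ├── E
    │   │   └── V
    │   │       └── y
    │   ├── -
    │   └── E
    │       └── V
    │           └── z
    └── )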
Definition of parse tree

• A parse tree for a CFG G is an ordered tree with labels on the nodes such that
  • Every internal node is labeled by a variable
  • Every leaf is labeled by a terminal or ε
  • Leaves labeled by ε have no siblings
  • If a node is labeled A and has children A1, …, Ak from left to right, then the rule A → A1…Ak is a production in G.
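The conditions above can be checked recursively. In this sketch a tree node is a pair `(label, children)`, a leaf has an empty child list, and ε is written as the empty string; these encodings and the helper names are assumptions of the example. The grammar is the E/V grammar from the parse tree slide.

```python
GRAMMAR = {
    "E": [["E", "+", "E"], ["E", "-", "E"], ["(", "E", ")"], ["V"]],
    "V": [["x"], ["y"], ["z"]],
}


def is_parse_tree(node) -> bool:
    """Check the parse tree conditions for `node` = (label, children)."""
    label, children = node
    if not children:
        return label not in GRAMMAR           # leaves: terminals or ε ("")
    if label not in GRAMMAR:
        return False                          # internal nodes: variables
    child_labels = [child[0] for child in children]
    # A lone ε leaf stands for the empty body; ε may not have siblings.
    body = [] if child_labels == [""] else child_labels
    if "" in body:
        return False
    return body in GRAMMAR[label] and all(is_parse_tree(c) for c in children)


def yield_of(node) -> str:
    """Concatenate the leaf labels from left to right."""
    label, children = node
    return label if not children else "".join(yield_of(c) for c in children)


def leaf(symbol):
    return (symbol, [])


def var(name):
    return ("E", [("V", [leaf(name)])])       # E -> V -> name


tree = ("E", [var("x"), leaf("+"),
              ("E", [leaf("("),
                     ("E", [var("y"), leaf("-"), var("z")]),
                     leaf(")")])])
print(is_parse_tree(tree), yield_of(tree))    # True x+(y-z)
```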
Left derivation
• Always derive the leftmost variable first:

E ⇒ E + E
  ⇒ V + E
  ⇒ x + E
  ⇒ x + (E)
  ⇒ x + (E - E)
  ⇒ x + (V - E)
  ⇒ x + (y - E)
  ⇒ x + (y - V)
  ⇒ x + (y - z)

[Figure: parse tree of x + (y - z), with root E and leaves x, +, (, y, -, z, ) from left to right]

• Corresponds to a left-to-right traversal of parse tree
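The correspondence can be made concrete by reading the left derivation off a parse tree: keep a list of subtrees, and repeatedly replace the leftmost unexpanded variable by its children. This is a sketch; the `(label, children)` encoding and the function name are assumptions of the example.

```python
def leftmost_derivation(tree, variables):
    """Return the sentential forms of the left derivation encoded by `tree`.

    `tree` is (label, children); leaves have an empty children list.
    """
    form = [tree]                             # current form, as a list of subtrees

    def text():
        return " ".join(node[0] for node in form)

    steps = [text()]
    while True:
        # Leftmost node in the form that is still a variable.
        pos = next((i for i, n in enumerate(form) if n[0] in variables), None)
        if pos is None:
            return steps
        form[pos:pos + 1] = form[pos][1]      # replace it by its children
        steps.append(text())


def leaf(symbol):
    return (symbol, [])


def var(name):
    return ("E", [("V", [leaf(name)])])       # E -> V -> name


# Parse tree of x + (y - z)
tree = ("E", [var("x"), leaf("+"),
              ("E", [leaf("("),
                     ("E", [var("y"), leaf("-"), var("z")]),
                     leaf(")")])])

steps = leftmost_derivation(tree, {"E", "V"})
print("\n".join(steps))
```

The printed forms run from E through E + E, V + E, x + E and so on down to x + ( y - z ), matching the derivation on the slide.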


Ambiguity
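• A grammar is ambiguous if some strings have more than one parse tree

• Example: E → E + E | E - E | (E) | V
           V → x | y | z

The string x + y + z has two parse trees:

E
├── E
│   └── V
│       └── x
├── +
└── E
    ├── E
    │   └── V
    │       └── y
    ├── +
    └── E
        └── V
            └── z

E
├── E
│   ├── E
│   │   └── V
│   │       └── x
│   ├── +
│   └── E
│       └── V
│           └── y
├── +
└── E
    └── V
        └── z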
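The number of parse trees of a string can be counted by trying every way to split it according to the productions. The sketch below does this for the example grammar with memoization; it is specific to this grammar, assumes the input is written without spaces, and `count_parse_trees` is an illustrative name.

```python
from functools import lru_cache


def count_parse_trees(s: str) -> int:
    """Number of parse trees of s under E -> E+E | E-E | (E) | V, V -> x|y|z."""

    @lru_cache(maxsize=None)
    def count_e(i: int, j: int) -> int:
        # Number of parse trees with root E whose leaves spell s[i:j].
        if j <= i:
            return 0
        total = 0
        if j - i == 1 and s[i] in "xyz":
            total += 1                                   # E -> V -> x | y | z
        if s[i] == "(" and s[j - 1] == ")":
            total += count_e(i + 1, j - 1)               # E -> (E)
        for k in range(i + 1, j - 1):
            if s[k] in "+-":
                total += count_e(i, k) * count_e(k + 1, j)   # E -> E+E | E-E
        return total

    return count_e(0, len(s))


print(count_parse_trees("x+y+z"))     # 2
print(count_parse_trees("(x+y)+z"))   # 1
```

A count greater than 1 for any string shows that the grammar is ambiguous.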
Why ambiguity matters
• The parse tree represents the intended meaning:

x + y + z

Tree in which the right E expands to y + z: “first add y and z, and then add this to x”
Tree in which the left E expands to x + y: “first add x and y, and then add z to this”
Why ambiguity matters
• Suppose we also had multiplication:

E → E + E | E - E | E * E | (E) | V
V → x | y | z

x * y + z

Tree with root E → E * E, whose right E expands to y + z: “first y + z, then x *”
Tree with root E → E + E, whose left E expands to x * y: “first x * y, then + z”
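With multiplication the two readings give different results. The sketch below evaluates both groupings of x * y + z; the nested-tuple encoding of the trees and the values assigned to x, y, z are made up for illustration.

```python
import operator

OPS = {"+": operator.add, "*": operator.mul}


def evaluate(tree, env):
    """Evaluate a tree given as a variable name or a (left, op, right) tuple."""
    if isinstance(tree, str):
        return env[tree]
    left, op, right = tree
    return OPS[op](evaluate(left, env), evaluate(right, env))


env = {"x": 2, "y": 3, "z": 4}                 # example values, chosen arbitrarily

tree_a = ("x", "*", ("y", "+", "z"))           # "first y + z, then x *"
tree_b = (("x", "*", "y"), "+", "z")           # "first x * y, then + z"

print(evaluate(tree_a, env))   # 14
print(evaluate(tree_b, env))   # 10
```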


Disambiguation
• Sometimes we can rewrite the grammar to remove the ambiguity

E → E + E | E - E | E * E | (E) | V
V → x | y | z

• Rewrite grammar so * cannot be broken by +:

E → T | E + T | E - T
T → F | T * F
F → (E) | V
V → x | y | z

T stands for term: x * (y + z)
F stands for factor: x, (y + z)
A term always splits into factors
A factor is either a variable or a parenthesized expression
Disambiguation
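• Example

E → T | E + T | E - T
T → F | T * F
F → (E) | V
V → x | y | z

Parse tree of x * y + z:

E
├── E
│   └── T
│       ├── T
│       │   └── F
│       │       └── V
│       │           └── x
│       ├── *
│       └── F
│           └── V
│               └── y
├── +
└── T
    └── F
        └── V
            └── z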
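The rewritten grammar can be parsed with a small recursive-descent parser, one function per variable. This is a sketch, not the lecture's method: because E → E + T and T → T * F are left-recursive, the code replaces the recursion by loops, which accepts the same strings and keeps the left-to-right grouping. The result is a nested tuple that shows the grouping rather than the full parse tree.

```python
def parse(text: str):
    """Parse with E -> T | E+T | E-T, T -> F | T*F, F -> (E) | V, V -> x|y|z."""
    tokens = [c for c in text if not c.isspace()]
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1

    def expr():
        # E: a term followed by any number of (+ or -) term
        node = term()
        while peek() in ("+", "-"):
            op = peek()
            advance()
            node = (node, op, term())
        return node

    def term():
        # T: a factor followed by any number of * factor
        node = factor()
        while peek() == "*":
            advance()
            node = (node, "*", factor())
        return node

    def factor():
        # F: a parenthesized expression or a variable
        tok = peek()
        if tok == "(":
            advance()
            node = expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return node
        if tok in ("x", "y", "z"):
            advance()
            return tok
        raise SyntaxError(f"unexpected token {tok!r}")

    result = expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {peek()!r}")
    return result


print(parse("x * y + z"))     # (('x', '*', 'y'), '+', 'z')
print(parse("x + y * z"))     # ('x', '+', ('y', '*', 'z'))
```

In both outputs the multiplication ends up grouped before the addition, so each string has a single reading.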
