Conversion of A Chomsky Normal Form Grammar To Greibach Normal Form
Derivation of aababa
S => aSX
  => aaSXX
  => aabXX
  => aabXbX
  => aababX
  => aababa
Left recursion
A -> Aa | b
Left Recursion Removal
Solution for A -> Aa | b:
A -> bZ | b
Z -> aZ | a
Solution for A -> Aa | Ab | b | c:
A -> bZ | cZ | b | c
Z -> aZ | bZ | a | b
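Definition
• A CFG is in Greibach normal form if each
rule has one of these forms:
i. S -> aA1A2…An
ii. A -> aA1A2…An
iii. A -> a
where a is a terminal and A1, …, An are variables.
The rule S -> λ is additionally allowed when the
language contains the empty string.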
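The definition can be turned into a mechanical check. The following is a minimal sketch, not part of the original slides: the name `is_gnf` and the grammar encoding (a dict from variable to a list of tuples, with `()` standing for λ) are assumptions made for illustration, and any symbol that is not a key of the dict is treated as a terminal.

```python
def is_gnf(rules, start="S"):
    """Return True if every rule is A -> a A1 ... An, A -> a, or start -> λ.

    rules maps each variable to a list of right-hand sides. A right-hand
    side is a tuple of symbols; () stands for λ. Symbols that are not keys
    of rules are treated as terminals (an assumption of this sketch).
    """
    variables = set(rules)
    for head, bodies in rules.items():
        for body in bodies:
            if not body:
                # λ is only tolerated for the start symbol.
                if head != start:
                    return False
                continue
            if body[0] in variables:
                return False  # must start with a terminal
            if any(sym not in variables for sym in body[1:]):
                return False  # everything after it must be a variable
    return True


left_recursive = {"A": [("A", "a"), ("b",)]}
right_recursive = {"A": [("b", "Z"), ("b",)], "Z": [("a", "Z"), ("a",)]}
print(is_gnf(left_recursive, start="A"))   # False
print(is_gnf(right_recursive, start="A"))  # True
```

The two sample grammars are the left-recursive grammar A -> Aa | b and its right-recursive replacement used in these slides.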
Conversion
• Convert from Chomsky to Greibach in
two steps:
1. From Chomsky to intermediate grammar
a. Eliminate direct left recursion
b. Use the A -> uBv substitution rule so that the
leading variable of each rule has a higher number (explained later)
2. From intermediate grammar into
Greibach
Eliminate direct left recursion
• Before
A -> Aa | b
• After
A -> bZ | b
Z -> aZ | a
• Remove the rule with direct left recursion,
and create a new one with recursion on
the right
Eliminate direct left recursion
• Before
A -> Aa | Ab | b | c
• After
A -> bZ | cZ | b | c
Z -> aZ | bZ | a | b
• Remove the rules with direct left
recursion, and create new ones with
recursion on the right
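The transformation above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `remove_direct_left_recursion` and the tuple encoding of right-hand sides are assumptions, and the code assumes that no right-hand side is empty and that the chain rule A -> A does not occur.

```python
def remove_direct_left_recursion(head, bodies, new_var):
    """Split the rules of head into non-recursive ones and a new right-recursive variable.

    bodies is a list of non-empty tuples of symbols. Rules head -> head r
    are removed; every remaining rule head -> b is kept and also copied as
    head -> b new_var, and new_var gets the rules r new_var and r.
    """
    recursive = [b[1:] for b in bodies if b[0] == head]
    others = [b for b in bodies if b[0] != head]
    if not recursive:
        return {head: list(bodies)}
    return {
        head: [b + (new_var,) for b in others] + others,
        new_var: [r + (new_var,) for r in recursive] + recursive,
    }


result = remove_direct_left_recursion(
    "A", [("A", "a"), ("A", "b"), ("b",), ("c",)], "Z"
)
for var, bodies in result.items():
    print(var, "->", " | ".join("".join(b) for b in bodies))
# A -> bZ | cZ | b | c
# Z -> aZ | bZ | a | b
```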
Substitution Rule
Transform A -> uBv rules
• Before
A -> uBv
B -> w1 | w2 |…| wn
• After
Add A -> uw1v | uw2v |…| uwnv
Delete A -> uBv
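As a rough illustration of the substitution rule, the sketch below replaces one occurrence of a variable inside a right-hand side by each of that variable's own right-hand sides. The function name `substitute` and the tuple encoding are assumptions made for this example; multi-character names such as `R1` are single symbols.

```python
def substitute(rules, head, body, pos):
    """Apply the substitution rule to the symbol body[pos] in head -> body.

    With body = u B v and B -> w1 | ... | wn, the rule head -> u B v is
    deleted and head -> u w1 v | ... | u wn v are added in its place.
    Returns the new list of right-hand sides for head.
    """
    u, target, v = body[:pos], body[pos], body[pos + 1:]
    expanded = [u + w + v for w in rules[target]]
    new_bodies = []
    for b in rules[head]:
        if b == body:
            new_bodies.extend(expanded)
        else:
            new_bodies.append(b)
    return new_bodies


rules = {
    "A": [("C", "B", "R1"), ("a", "R1"), ("C", "B"), ("a",)],
    "B": [("A", "B"), ("b",)],
}
new_b = substitute(rules, "B", ("A", "B"), 0)
print("B ->", " | ".join("".join(b) for b in new_b))
# B -> CBR1B | aR1B | CBB | aB | b
```

The sample input is the pair of A and B rule sets from the conversion example in these slides, with u empty and v = B.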
Conversion: Step 1
• Goal: construct intermediate grammar in
this format
i. A -> aw
ii. A -> Bw
iii. S -> λ
where w ∈ V* and B comes after A in the variable numbering
Conversion: Step 1
• Assign a number to every variable, starting
with S, which gets 1
• Transform the rules of each variable in the order
of the assigned numbers, from lowest to
highest
– Eliminate direct left recursion
– If the RHS of a rule starts with a lower-numbered
variable, apply the A -> uBv transformation to fix it,
then eliminate any direct left recursion this introduces
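The procedure of step 1 can be sketched as follows. This is a minimal sketch under stated assumptions, not the slides' own code: the name `to_intermediate`, the tuple encoding, and the naming of new variables as `R1`, `R2`, … are choices made for illustration. It assumes the input is in Chomsky normal form and that only a non-recursive start symbol may have a λ rule.

```python
def to_intermediate(rules, order):
    """Step 1: make every rule start with a terminal or a higher-numbered variable."""
    rules = {var: list(bodies) for var, bodies in rules.items()}
    rank = {var: i for i, var in enumerate(order)}
    fresh = 0

    def starts_lower(body, var):
        return bool(body) and body[0] in rank and rank[body[0]] < rank[var]

    for var in order:
        # Substitute until no rule of var starts with a lower-numbered variable.
        while any(starts_lower(b, var) for b in rules[var]):
            expanded = []
            for b in rules[var]:
                if starts_lower(b, var):
                    expanded.extend(w + b[1:] for w in rules[b[0]])
                else:
                    expanded.append(b)
            rules[var] = expanded
        # Eliminate direct left recursion with a new right-recursive variable.
        recursive = [b[1:] for b in rules[var] if b and b[0] == var]
        if recursive:
            fresh += 1
            new_var = f"R{fresh}"
            others = [b for b in rules[var] if not b or b[0] != var]
            rules[var] = [b + (new_var,) for b in others] + others
            rules[new_var] = [r + (new_var,) for r in recursive] + recursive
    return rules


cnf = {
    "S": [("A", "B"), ()],
    "A": [("A", "B"), ("C", "B"), ("a",)],
    "B": [("A", "B"), ("b",)],
    "C": [("A", "C"), ("c",)],
}
result = to_intermediate(cnf, ["S", "A", "B", "C"])
for var, bodies in result.items():
    print(var, "->", " | ".join("".join(b) or "λ" for b in bodies))
```

Run on the example grammar of these slides, the sketch prints the same six rule sets as the intermediate grammar derived by hand (S, A, B, C, R1, R2).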
Conversion: Step 2
• Goal: construct Greibach grammar out of
intermediate grammar from step 1
• Fix A -> Bw rules into A -> aw format
– After step 1, the last original variable has
all its rules starting with a terminal
– Working from bottom to top, fix all original
variables using the A -> uBv transformation
technique, so all rules become A -> aw
• Fix the rules of the introduced variables the same way
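Step 2 can be sketched as a bottom-up pass of leading-variable substitutions. The names `to_greibach` and `parse` and the grammar encoding are assumptions made for illustration. The sketch assumes its input already satisfies the step 1 format and that every rule of an introduced variable starts with an original variable, which holds when the starting grammar is in Chomsky normal form.

```python
def to_greibach(rules, order):
    """Step 2: back-substitute so that every rule starts with a terminal."""
    rules = {var: list(bodies) for var, bodies in rules.items()}
    original = set(order)

    def expand_leading(var):
        expanded = []
        for body in rules[var]:
            if body and body[0] in original:
                # The leading variable is already fixed, so one pass is enough.
                expanded.extend(w + body[1:] for w in rules[body[0]])
            else:
                expanded.append(body)
        rules[var] = expanded

    for var in reversed(order):       # original variables, bottom to top
        expand_leading(var)
    for var in rules:                 # then the introduced variables
        if var not in original:
            expand_leading(var)
    return rules


def parse(text):
    """Helper for this example: 'A B | a' -> [('A', 'B'), ('a',)]; an empty alternative is λ."""
    return [tuple(alt.split()) for alt in text.split("|")]


intermediate = {
    "S": parse("A B |"),
    "A": parse("C B R1 | a R1 | C B | a"),
    "B": parse("C B R1 B | a R1 B | C B B | a B | b"),
    "C": parse("a R1 C R2 | a C R2 | c R2 | a R1 C | a C | c"),
    "R1": parse("B R1 | B"),
    "R2": parse("B R1 C R2 | B C R2 | B R1 C | B C"),
}
gnf = to_greibach(intermediate, ["S", "A", "B", "C"])
print({var: len(bodies) for var, bodies in gnf.items()})
# {'S': 15, 'A': 14, 'B': 15, 'C': 6, 'R1': 30, 'R2': 60}
```

The input is the intermediate grammar of the worked example in these slides; the printed numbers are rule counts per variable, with S -> λ counted among the S rules.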
Conversion Example
• Convert the following grammar from
Chomsky normal form into Greibach
normal form
1. S -> AB | λ
2. A -> AB | CB | a
3. B -> AB | b
4. C -> AC | c
Conversion Strategy
• Goal: transform all rules whose RHS does
not start with a terminal
• Apply the two-step conversion
• Work through the rules in sequence, eliminating direct
left recursion and making each leading variable
refer to a higher-numbered variable
• Fix all original rules, then the new ones
Step 1: S rules
• Start with S since it has number 1
• S -> AB | λ
• The S rules comply with the two required
conditions
– There is no direct left recursion
– The referenced variables A and B have numbers
higher than 1: A corresponds to 2 and B to 3.
Step 1: A rules
• A -> AB | CB | a
• The direct left recursive rule A -> AB needs to
be fixed. The other A rules are fine
• Apply the direct left recursion transformation
A -> CBR1 | aR1 | CB | a
R1 -> BR1 | B
Step 1: B rules
• B -> AB | b
• The B -> AB rule needs to be fixed since B
corresponds to 3 and A to 2. The RHS of a B rule
may not start with a lower-numbered
variable. Use the A -> uBv
transformation technique
• B -> CBR1B | aR1B | CBB | aB | b
Step 1: C rules
• C -> AC | c
• The C -> AC rule needs to be fixed since C
corresponds to 4 and A to 2. Use the same
A -> uBv transformation technique
• C -> CBR1C | aR1C | CBC | aC | c
• Now the variable references are fine according
to the given numbers, but we introduced direct
left recursion in two rules (C -> CBR1C and C -> CBC)
Step 1: C rules
• C -> CBR1C | aR1C | CBC | aC | c
• Eliminate direct left recursion
C -> aR1CR2 | aCR2 | cR2 | aR1C | aC | c
R2 -> BR1CR2 | BCR2 | BR1C | BC
Step 1: Intermediate grammar
• S -> AB | λ
• A -> CBR1 | aR1 | CB | a
• B -> CBR1B | aR1B | CBB | aB | b
• C -> aR1CR2 | aCR2 | cR2 | aR1C | aC | c
• R1 -> BR1 | B
• R2 -> BR1CR2 | BCR2 | BR1C | BC
Step 2: Fix the leading symbol of each rule
• Rules S, A, B and C have no direct left
recursion, and each leading RHS variable has a higher
number
• All C rules start with a terminal symbol
• Proceed to fix rules B, A and S in bottom-up
order, so they start with a terminal symbol.
• Use the A -> uBv transformation technique
Step 2: Fixing B rules
• C -> aR1CR2 | aCR2 | cR2 | aR1C | aC | c
• Before
B -> CBR1B | aR1B | CBB | aB | b
• After
B -> aR1B | aB | b
B -> aR1CR2BR1B | aCR2BR1B | cR2BR1B |
aR1CBR1B | aCBR1B | cBR1B
B -> aR1CR2BB | aCR2BB | cR2BB | aR1CBB |
aCBB | cBB
Step 2: Fixing A rules
• C -> aR1CR2 | aCR2 | cR2 | aR1C | aC | c
• Before
A -> CBR1 | aR1 | CB | a
• After
A -> aR1 | a
A -> aR1CR2BR1 | aCR2BR1 | cR2BR1 | aR1CBR1 |
aCBR1 | cBR1
A -> aR1CR2B | aCR2B | cR2B | aR1CB | aCB | cB
Step 2: Fixing S rules
• Before
S -> AB | λ
• After
S -> λ
S -> aR1B | aB
S -> aR1CR2BR1B | aCR2BR1B | cR2BR1B | aR1CBR1B |
aCBR1B | cBR1B
S -> aR1CR2BB | aCR2BB | cR2BB | aR1CBB | aCBB |
cBB
Step 2: Complete conversion
• All original rules S, A, B and C are fully
converted now
• New recursive rules need to be converted
next
R1 -> BR1 | B
R2 -> BR1CR2 | BCR2 | BR1C | BC
• Use the same A -> uBv transformation
technique, replacing the leading variable B
with each of its right-hand sides
Conclusions
• After conversion, since B has 15 rules and each of
the two R1 rules starts with B, R1 ends with 30 rules
• Similarly, each of the four R2 rules starts with B.
Therefore, R2 ends with 60 rules
• All rules start with a terminal symbol (with the
exception of S -> λ)
• Top-down and bottom-up parsing algorithms are
guaranteed to terminate on a grammar converted to Greibach
normal form, because every rule application other than
S -> λ adds one terminal to the derived string
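The termination claim can be illustrated with a small top-down recognizer. This is a sketch under assumptions, not an algorithm taken from the slides: the name `derives` and the grammar encoding are hypothetical, the grammar must already be in Greibach normal form, and a λ rule is only honoured for the start symbol on empty input. Every recursive call consumes one input character, so the recursion depth is bounded by the length of the input, although the running time can still be exponential.

```python
def derives(rules, start, text):
    """Return True if the GNF grammar rules derives text from start."""

    def match(stack, pos):
        # stack holds the variables still to be expanded, leftmost first.
        if not stack:
            return pos == len(text)
        # Each variable yields at least one terminal, so prune long stacks.
        if len(stack) > len(text) - pos:
            return False
        head, rest = stack[0], stack[1:]
        for body in rules[head]:
            if body and body[0] == text[pos]:
                if match(body[1:] + rest, pos + 1):
                    return True
        return False

    if text == "":
        return () in rules[start]
    return match((start,), 0)


# GNF grammar from the left recursion example: A -> bZ | cZ | b | c, Z -> aZ | bZ | a | b
rules = {
    "A": [("b", "Z"), ("c", "Z"), ("b",), ("c",)],
    "Z": [("a", "Z"), ("b", "Z"), ("a",), ("b",)],
}
print(derives(rules, "A", "cab"))  # True
print(derives(rules, "A", "ab"))   # False
```

Running the same search on the left-recursive original A -> Aa | Ab | b | c would not be safe, since expanding A could recurse without consuming input.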
Comparison of Normal forms
S -> SaB | aB
B -> bB | λ
• Adding a non-recursive start symbol S’ and
removing λ-rules and chain rules yields
S’ -> SaB | Sa | aB | a
S -> SaB | Sa | aB | a
B -> bB | b
Theory of Automata 31
• Chomsky Normal form is obtained as:
S’ -> ST | SA | AB | a
S -> ST | SA | AB | a
B -> CB | b
T -> AB
A -> a
C -> b
• Greibach Normal form is obtained as:
S’ -> aBZT | aZT | aBT | aT | aBZA | aZA | aBA | aA | aB | a
S -> aBZ | aZ | aB | a
B -> bB | b
T -> aB
A -> a
C -> b
Z -> aBZ | aZ | aB | a
Derivations of abaaba
G                   CNF                GNF
S => SaB            S’ => SA           S’ => aBZA
  => SaBaB             => STA             => abZA
  => SaBaBaB           => SATA            => abaZA
  => aBaBaBaB          => ABATA           => abaaBA
  => abBaBaBaB         => aBATA           => abaabA
  => abaBaBaB          => abATA           => abaaba
  => abaaBaB           => abaTA
  => abaabBaB          => abaABA
  => abaabaB           => abaaBA
  => abaaba            => abaabA
                       => abaaba
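The lengths of the CNF and GNF derivations of abaaba agree with the general step counts for these normal forms. As a short worked check, for a string w of length n ≥ 1: a CNF derivation uses n − 1 rules of the form A -> BC and n rules of the form A -> a, while each GNF rule produces exactly one terminal.

```latex
\[
|w| = n \ge 1:\qquad
\text{CNF: } (n-1) + n = 2n-1 \text{ steps},\qquad
\text{GNF: } n \text{ steps}.
\]
\[
w = abaaba,\; n = 6:\qquad
2\cdot 6 - 1 = 11 \text{ steps (CNF)},\qquad
6 \text{ steps (GNF)}.
\]
```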
Examples
Convert to GNF
S->aSX|b
X->Xb|a
Nov 1,2012 Mehreen Alam - Fall 2012