Regular Expressions
Regular Language
• A language L is regular if and only if it is recognized by a finite accepter (FA).
• Equivalently, L is regular if and only if it is recognized by a DFA.
• Equivalently, L is regular if and only if it is recognized by an NFA.
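Identities for regular expressions r, s, t (∅ is the empty language, λ the empty string):
• r + ∅ = ∅ + r = r
• r · λ = λ · r = r
• λ* = λ
• (r + λ)+ = r*
• r + s = s + r
• r · (s + t) = r · s + r · t
• r · (s · t) = (r · s) · t
• r+ = r r*
• r* = r*(r + λ) = r* r* = (r*)*
• (r*s*)* = (r + s)*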
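The identities above can be spot-checked by computing languages as finite sets of strings. The following is a minimal sketch, not a proof: it compares languages only on strings up to MAX_LEN, and the helper names (trunc, union, concat, star, plus) and the sample languages r, s, t are assumptions chosen for illustration.

```python
MAX_LEN = 6  # languages are compared only on strings up to this length

EMPTY = frozenset()        # the empty language
LAMBDA = frozenset({""})   # the language containing only the empty string

def trunc(words):
    """Keep only the strings that are short enough to compare."""
    return frozenset(w for w in words if len(w) <= MAX_LEN)

def union(a, b):
    return trunc(a | b)

def concat(a, b):
    return trunc(x + y for x in a for y in b)

def star(a):
    """Kleene star, truncated at MAX_LEN so the loop terminates."""
    result = set(LAMBDA)
    frontier = set(LAMBDA)
    while frontier:
        frontier = set(concat(frontier, a)) - result
        result |= frontier
    return frozenset(result)

def plus(a):
    """r+ is defined here as r r*."""
    return concat(a, star(a))

# Sample languages (hypothetical): L(r) = {ab}, L(s) = {b}, L(t) = {a, ba}
r, s, t = frozenset({"ab"}), frozenset({"b"}), frozenset({"a", "ba"})

checks = {
    "r + empty = r": union(r, EMPTY) == r,
    "r . lambda = lambda . r = r": concat(r, LAMBDA) == r == concat(LAMBDA, r),
    "lambda* = lambda": star(LAMBDA) == LAMBDA,
    "(r + lambda)+ = r*": plus(union(r, LAMBDA)) == star(r),
    "r + s = s + r": union(r, s) == union(s, r),
    "r(s + t) = rs + rt": concat(r, union(s, t)) == union(concat(r, s), concat(r, t)),
    "r* = r* r* = (r*)*": star(r) == concat(star(r), star(r)) == star(star(r)),
    "(r* s*)* = (r + s)*": star(concat(star(r), star(s))) == star(union(r, s)),
}

for name, holds in checks.items():
    print(name, holds)
```

Truncation is safe for this comparison because every string of length at most MAX_LEN in a union, concatenation or star is built from pieces that are themselves no longer than MAX_LEN.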
Valid Regular Expressions: Example
• Let Σ ={a, b, c}
• ∅, λ, a, b, c
• a*, b*
• a.b, b.a, a+b, (a.b)*, (b.a)*, (a+b)*
• a + b is equivalent to b+a
• a.b is not equivalent to b.a
• (a + b.c)*
• (c + ∅)
Why ?
Invalid Regular Expressions: Example
• Let Σ ={a, b, c}
Why ?
Some Notations
• Parentheses in regular expressions can be omitted when the order
of evaluation is clear.
• ((0+1)*) = (0+1)* ≠ 0+1*
• ((0*)+(1*)) = 0* + 1*
• For concatenation, × can be omitted.
• r · r · r … r (n times) is denoted by r^n.
Simple Examples over Σ = {0,1}
• {α ∈ Σ* | α does not contain 1’s}
• 0*
• Σ*
• (0+1)*
• Note: 0* + 1* ≠ (0+1)*
Examples over Σ = {0,1}
• Strings of even length, L={00,01,10,11} *
• (00+01+10+11) * or
• ((0+1)(0+1))*
• (0 | 1)*011
• all strings ending with “011”
• 0*1*
• all strings with no “0” after “1”
• 00*11*
• all strings with at least one “0” and one “1”, and no “0” after “1”
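These examples can be checked mechanically. The sketch below translates each expression into Python's re syntax (union + becomes |, concatenation is juxtaposition) and compares it, on every binary string up to a chosen length, with a predicate stating the intended language. The names EXAMPLES, all_strings and mismatches are illustrative assumptions.

```python
import re
from itertools import product

def all_strings(alphabet, max_len):
    """Yield every string over `alphabet` of length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)

# (pattern in Python `re` syntax, predicate describing the intended language)
EXAMPLES = [
    ("0*",            lambda w: "1" not in w),                 # no 1's
    ("(0|1)*",        lambda w: True),                         # all strings
    ("((0|1)(0|1))*", lambda w: len(w) % 2 == 0),              # even length
    ("(0|1)*011",     lambda w: w.endswith("011")),            # ends with 011
    ("0*1*",          lambda w: "10" not in w),                # no 0 after a 1
    ("00*11*",        lambda w: "0" in w and "1" in w and "10" not in w),
]

def mismatches(max_len=8):
    """Return (pattern, word) pairs where pattern and predicate disagree."""
    bad = []
    for pattern, predicate in EXAMPLES:
        for w in all_strings("01", max_len):
            if bool(re.fullmatch(pattern, w)) != predicate(w):
                bad.append((pattern, w))
    return bad

print(mismatches())  # an empty list means every example agrees up to length 8
```

re.fullmatch is used rather than re.match because a regular expression in the formal sense must describe the whole string, not a prefix of it.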
Regular Expressions : Example
• If r1 + r2 is an R.E., then L(r1 + r2) = L(r1) ∪ L(r2)
L(r1) ∪ L(r2) = {w | w ∈ L(r1) or w ∈ L(r2)}
Regular Expression: (a + b) · a*
r = (a + b)*(a + bb)
L(r) = {a, bb, aa, abb, ba, bbb, ...}
r = (aa)*(bb)*b
L(r) = {a^(2n) b^(2m) b : n, m ≥ 0}
r = (0 + 1)* 00 (0 + 1)*
L(r ) = { all strings containing substring 00 }
RE & RL: Example
• λ* is RE, then the language
L(λ *) = {λ}* = {λ}
• (01*)(01)
• {001, 0101, 01101, 011101, …..}
• (0 | 1)*
• {λ, 0, 1, 00, 01, 10, 11, …..}
• i.e., all strings of 0 and 1
• (0 | 1)* 00 (0 | 1)*
• {00, 1001, …..}
• i.e., all 0 and 1 strings containing a “00”
EQUIVALENT REs
• Two regular expressions r and s are equivalent (r=s), if and only if r and s
represent/generate the same language.
• Example-:
• r = a|b, s = b|a ⇒ r = s. Why?
• Since L(r) = L(s) = {a, b}
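Primitive NFAs
• L(∅) = { } = ∅: start state q0 and final state qf with no transition between them.
• L(a) = {a}: start state q0 with a single a-transition to final state qf.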
Connection Between RE & RL: Union
If r1 and r2 are regular expressions, Mr1 and Mr2 are their NFAs.
Then r1 + r2 has an NFA.
• Convert Mr1 into an NFA with a single final state.
• Convert Mr2 into an NFA with a single final state.
[Figure: start → qi; λ-moves from qi into Mr1 and Mr2; λ-moves from Mr1 and Mr2 into qf]
L( (r1 + r2) ) = L(Mr1) ∪ L(Mr2)
where qi and qf are new initial / final states, and λ-moves are introduced from qi to the old
start states of Mr1 and Mr2 as well as from all of their final states to qf.
Connection Between RE & RL: Concatenation
If r1 and r2 are regular expressions, Mr1 and Mr2 are their NFAs.
Then r1.r2 has an NFA.
• Convert Mr1 into an NFA with a single final state.
• Convert Mr2 into an NFA with a single final state.
L( (r1.r2) ) = L(Mr1).L(Mr2)
[Figure: start → qi, λ-move into Mr1, λ-move from Mr1 to Mr2, λ-move from Mr2 to qf]
where qi is the new initial state in front of Mr1 and qf is the new final state after Mr2.
A λ-move is introduced from the final state of Mr1 to the initial state of Mr2.
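Connection Between RE & RL: Kleene Star
If r1 is a regular expression and Mr1 its NFA,
(r1)* (Kleene star) has an NFA:
L( (r1)* ) = ( L(r1) )*
[Figure: NFA for (r1)* with new states qi and qf joined to Mr1 by λ-moves]
[Figures: small NFAs for a, b and b.a assembled with λ-moves; diagrams not recoverable]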
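The three constructions (union, concatenation, star) can be sketched in code. This is a minimal illustration under stated assumptions: expressions are encoded as nested tuples, λ-moves are stored under the label None, and the class and function names (Thompson, closure, accepts) are hypothetical. The star wiring follows the usual textbook layout (skip edge from qi to qf, return edge from qf to qi), which may differ in detail from the figure on the slide.

```python
from collections import defaultdict
from itertools import count

LAMBDA = None  # label used for lambda-moves

class Thompson:
    """Builds an NFA with lambda-moves from a tuple-encoded regular expression."""

    def __init__(self):
        self._ids = count()
        self.delta = defaultdict(set)  # (state, label) -> set of next states

    def _new(self):
        return next(self._ids)

    def _edge(self, p, label, q):
        self.delta[(p, label)].add(q)

    def build(self, node):
        """Return (initial state, final state) of the NFA for `node`."""
        kind = node[0]
        if kind == "sym":                      # primitive NFA for one symbol
            qi, qf = self._new(), self._new()
            self._edge(qi, node[1], qf)
            return qi, qf
        if kind == "star":
            s1, f1 = self.build(node[1])
            qi, qf = self._new(), self._new()
            self._edge(qi, LAMBDA, s1)
            self._edge(f1, LAMBDA, qf)
            self._edge(qi, LAMBDA, qf)         # zero repetitions
            self._edge(qf, LAMBDA, qi)         # go round again
            return qi, qf
        s1, f1 = self.build(node[1])
        s2, f2 = self.build(node[2])
        qi, qf = self._new(), self._new()
        if kind == "union":                    # qi branches into both machines
            for s in (s1, s2):
                self._edge(qi, LAMBDA, s)
            for f in (f1, f2):
                self._edge(f, LAMBDA, qf)
        elif kind == "concat":                 # Mr1 runs first, then Mr2
            self._edge(qi, LAMBDA, s1)
            self._edge(f1, LAMBDA, s2)
            self._edge(f2, LAMBDA, qf)
        else:
            raise ValueError(kind)
        return qi, qf

def closure(delta, states):
    """All states reachable from `states` using lambda-moves only."""
    seen, stack = set(states), list(states)
    while stack:
        p = stack.pop()
        for q in delta.get((p, LAMBDA), ()):
            if q not in seen:
                seen.add(q)
                stack.append(q)
    return seen

def accepts(delta, start, final, word):
    current = closure(delta, {start})
    for ch in word:
        moved = set()
        for p in current:
            moved |= delta.get((p, ch), set())
        current = closure(delta, moved)
    return final in current

# (a|b)*ba encoded as nested tuples
a, b = ("sym", "a"), ("sym", "b")
expr = ("concat", ("concat", ("star", ("union", a, b)), b), a)
nfa = Thompson()
start, final = nfa.build(expr)
print(accepts(nfa.delta, start, final, "abba"))  # True
```

Each sub-machine has exactly one initial and one final state, which is what lets the constructions compose without inspecting the inside of Mr1 or Mr2.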
Example-1
• Build an NFA-ε (an NFA with λ-moves) that accepts (a|b)*ba
[Figure: NFA for (a|b)*, built from the NFAs for a and b with λ-moves]
[Figure: the (a|b)* NFA followed, through λ-moves, by transitions on b and then a]
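Example-2
R.E. a ( b | c )*
1. a, b, & c
[Figure: three two-state NFAs, each S0 → S1, labelled a, b and c]
2. b | c
[Figure: NFA with states S0 to S5; ε-moves join the b branch (S1 → S2) and the c branch (S3 → S4)]
3. ( b | c )*
[Figure: NFA with states S0 to S7, wrapping the b | c NFA in ε-moves]
4. a ( b | c )*
[Figure: NFA with states S0 to S9; S0 → S1 on a, then ε-moves into the ( b | c )* NFA]
[Figure: equivalent two-state automaton, S0 → S1 on a with a loop on S1 labelled b|c]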
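Example-3
NFA for : a | abb | a*b+
NFA’s :
• a: start state 1, a-transition to state 2
• abb: start state 3, transitions a, b, b through states 4, 5, 6
• a*b+: start state 7 with a loop on a, b-transition to state 8, loop on b at state 8
Combined NFA: new start state 0 with λ-moves to states 1, 3 and 7.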
Example-4
[Figure: NFA-ε with states 1 to 17 over {a, b, c}; diagram not recoverable]
Construct NFAs for RE over Σ = {0,1}
• Strings of even length, L={00,01,10,11} *
• (00+01+10+11) * or
• ((0+1)(0+1))*
[Figures: incomplete GTGs and the corresponding complete GTGs, in which every missing edge is added with label ∅]
GTG Reduction to Regular Expression
• A generalized transition graph (GTG) G can be reduced to a GTG G’ with just two states (the
initial and the final state).
• If we reduce an NFA-ε in this way, the arc labels of the reduced graph give the
regular expression for its language.
RE for GTG
• For a two-state complete GTG with labels r1 (q0 to q0), r2 (q0 to qf), r3 (qf to q0) and
r4 (qf to qf), the regular expression is
r = (r1)*r2 (r4 + r3(r1)*r2)*
• The regular expression r covers all possible paths from the initial state to the final
state.
• First reach qf: loop on q0 any number of times, then take the direct edge from q0 to qf.
• Then, any number of times: loop on qf, OR return to q0 and come back to qf
(qf to q0, loops on q0, q0 to qf).
Example
r = (a)*(a+b) ( c + ∅(a)*(a+b) )*
r = (a)*(a+b) ( c + ∅ )*
r = a*(a|b) c*
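The two-state formula and the simplification in this example can be reproduced with a few string helpers. This is a sketch under assumptions: labels are Python re patterns, None stands for the label ∅, and the helper names (union, concat, star, two_state_re) are hypothetical. The helpers apply the simplifications ∅ + r = r, ∅ · r = ∅ and ∅* = λ while building the pattern.

```python
import re

def union(x, y):
    """x + y, where None is the empty-set label."""
    if x is None:
        return y
    if y is None:
        return x
    return f"(?:{x}|{y})"

def concat(*parts):
    """Concatenation; anything concatenated with the empty set is the empty set."""
    if any(p is None for p in parts):
        return None
    return "".join(f"(?:{p})" for p in parts)

def star(x):
    """Kleene star; the star of the empty set is lambda (the empty pattern)."""
    return "" if x is None else f"(?:{x})*"

def two_state_re(r1, r2, r3, r4):
    """r = r1* r2 (r4 + r3 r1* r2)* for labels
    r1: q0->q0, r2: q0->qf, r3: qf->q0, r4: qf->qf."""
    back_and_forth = concat(r3, star(r1), r2)
    return concat(star(r1), r2, star(union(r4, back_and_forth)))

# The example above: r1 = a, r2 = a+b, r3 = empty set, r4 = c
pattern = two_state_re("a", "a|b", None, "c")
print(pattern)
print(bool(re.fullmatch(pattern, "aabcc")))  # True: a* = aa, (a|b) = b, c* = cc
```

Returning None for an empty language keeps the ∅ cases out of the generated pattern, since Python's re syntax has no direct symbol for the empty set.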
[Figure: GTG with states q0, q1, q2; transition labels are regular expressions, for example a + b]
RE for GTG: Example
• Convert to complete GTG
• Remove q1
• Simplify it
r = (bb*a)*(bb*(a+b)) ( b + ∅(bb*a)*(bb*(a+b)) )*
r = (bb*a)*(bb*(a|b)).b*
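NFA to RE :Procedure
• Step-1:
• Convert NFA N into NFA N’ with a single final state distinct from its initial state.
• Step-2:
• Convert NFA N’ into a complete GTG G.
• Let rij stand for the label of the edge from qi to qj.
• Step-3:
• If the GTG has only two states, with qi as its initial state and qj as its final state,
the associated regular expression is
r = (rii)* rij (rjj + rji (rii)* rij)*
• Step-4:
• Otherwise, remove a state qk other than the initial and final states: replace each label
rpq by rpq + rpk (rkk)* rkq for all remaining states qp, qq, and repeat until two states remain.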
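The procedure can be sketched as state elimination over a dictionary of edge labels. This is an illustration under assumptions, not the slides' own code: labels are Python re patterns, a missing edge stands for ∅, and the names (union, concat, star, nfa_to_re) are hypothetical. The sample graph is the three-state example q0, q1, q2 used in these slides, with its edges read off the result (bb*a)*(bb*(a+b))b* because the original diagram is lost.

```python
import re

def union(x, y):
    if x is None:
        return y
    if y is None:
        return x
    return f"(?:{x}|{y})"

def concat(*parts):
    if any(p is None for p in parts):
        return None          # concatenation with the empty set is empty
    return "".join(f"(?:{p})" for p in parts)

def star(x):
    return "" if x is None else f"(?:{x})*"

def nfa_to_re(states, edges, start, final):
    """edges maps (p, q) to a pattern; a missing key is the label for the empty set.
    Assumes a single final state that differs from the start state."""
    r = dict(edges)
    remaining = list(states)
    for k in states:
        if k in (start, final):
            continue
        remaining.remove(k)                      # eliminate state k
        for p in remaining:
            for q in remaining:
                detour = concat(r.get((p, k)), star(r.get((k, k))), r.get((k, q)))
                r[(p, q)] = union(r.get((p, q)), detour)
    r1, r2 = r.get((start, start)), r.get((start, final))
    r3, r4 = r.get((final, start)), r.get((final, final))
    # two-state formula: r1* r2 (r4 + r3 r1* r2)*
    return concat(star(r1), r2, star(union(r4, concat(r3, star(r1), r2))))

# Assumed edges: q0 -b-> q1, q1 -b-> q1, q1 -a-> q0, q1 -(a+b)-> q2, q2 -b-> q2
edges = {
    ("q0", "q1"): "b",
    ("q1", "q1"): "b",
    ("q1", "q0"): "a",
    ("q1", "q2"): "a|b",
    ("q2", "q2"): "b",
}
pattern = nfa_to_re(["q0", "q1", "q2"], edges, "q0", "q2")
print(bool(re.fullmatch(pattern, "babb")))  # True: (ba) then b(b) with an empty tail
```

Only labels that touch the eliminated state k are read while k is being removed, so updating r[(p, q)] in place does not disturb the rest of that elimination step.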
Example-1
Remove State OE
Paths to consider:
• EE to EE
• EE to OO
• OO to EE
• OO to OO
• EO to EO
Remove State OO
Paths to consider:
• EE to EE
• EE to EO
• EO to EE
• EO to EO
Resulting regular expression:
r = (aa+ab(bb)*ba)*(b+ab(bb)*a) (a(bb)*a + (b+a(bb)*ba)(aa+ab(bb)*ba)*(b+ab(bb)*a))*
Find RE for NFA: Example-2
[Figure: NFA with states q0, q1, q2 and transitions labelled b, b, a, (a, b) and b]
Convert transitions into RE
[Figure: the same graph with the label a, b rewritten as a + b]
• Convert to complete GTG
• Remove q1
• Simplify it
r = (bb*a)*(bb*(a+b)) ( b + ∅(bb*a)*(bb*(a+b)) )*
r = (bb*a)*(bb*(a|b)).b*
Find RE for NFAs
[Figure: exercise NFA over {0, 1} with states 1, 2, 3 and start state 3; diagram not recoverable]
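Theorem
Languages generated by regular expressions = regular languages
Theorem - Part 1
Languages generated by regular expressions ⊆ regular languages
Theorem - Part 2
Languages generated by regular expressions ⊇ regular languages
Proof - Part 1 (induction on the structure of the regular expression)
Induction basis: the primitive regular expressions ∅, λ, a have NFAs M1, M2, M3 with
L(M1) = ∅ = L(∅)
L(M2) = {λ} = L(λ)
L(M3) = {a} = L(a)
so these are regular languages.
Inductive Hypothesis
Assume that L(r1) and L(r2) are regular languages for regular expressions r1 and r2.
Inductive step: we show that
L(r1 + r2)
L(r1 · r2)
L(r1*)
L((r1))
are regular languages.
• By definition of regular expressions:
L(r1 + r2) = L(r1) ∪ L(r2)
L(r1 · r2) = L(r1) L(r2)
L(r1*) = (L(r1))*
L((r1)) = L(r1)
By the inductive hypothesis we know:
L(r1) and L(r2) are regular languages
We also know:
There exist NFAs for the following languages
Union L(r1) ∪ L(r2)
Concatenation L(r1) L(r2)
Star (L(r1))*
• Therefore the following are regular languages:
L(r1 + r2) = L(r1) ∪ L(r2)
L(r1 · r2) = L(r1) L(r2)
L(r1*) = (L(r1))*
• And trivially:
L((r1)) = L(r1) is a regular language.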
Example (linear grammars: at most one variable on the right side of each production):
S → aSb        S → Ab
S → λ          A → aAb
               A → λ
A Non-Linear Grammar
Grammar G:
S → SS
S → λ
S → aSb
S → bSa
L(G) = {w : n_a(w) = n_b(w)}
where n_a(w) is the number of a's in string w.
Another Linear Grammar
Grammar G:
S → A
A → aB | λ
B → Ab
L(G) = {a^n b^n : n ≥ 0}
Right-Linear Grammars
A grammar G = (V, T, S, P) is said to be right-linear if all productions are
of the form
A → xB
or
A → x
where
A, B ∈ V and x ∈ T*
Example
S → abS
S → a
Left-Linear Grammars
A grammar G = (V, T, S, P) is said to be left-linear if all productions are of
the form
A → Bx
or
A → x
where
A, B ∈ V and x ∈ T*
Example
S → Aab
A → Aab | B
B → a
Regular Grammars
A regular grammar is one that is either right-linear or left-linear.
Examples:
G1:
S → abS
S → a
G2:
S → Aab
A → Aab | B
B → a
Observation
Regular grammars generate regular languages.
L(G1) = L( (ab)*a )
L(G2) = L( aab(ab)* )
Example
G1: S → aaA | Abb, A → aB, B → b
    Linear only
G2: S → Aabc, A → abBc, B → c
    Linear only
G3: S → S1ab, S1 → S1ab | S2, S2 → c
    Linear, Left Linear, Regular
G4: S → A, A → aB | λ, B → Ab
    Linear only
G5: S → abS1, S1 → abS1 | S2, S2 → c
    Linear, Right Linear, Regular
G6: S → S1, S1 → S2, S2 → a | b | c
    Linear, Right Linear, Left Linear, Regular
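The classification in this example follows directly from the definitions and can be written as a short check. This is a minimal sketch: productions are (head, body) pairs with the body a tuple of symbols (an empty tuple for λ), variables are taken to be the production heads, and the function name classify is an assumption.

```python
def classify(productions):
    """Return the labels that apply to a grammar given as (head, body) pairs."""
    variables = {head for head, _ in productions}
    right = left = True
    for _head, body in productions:
        positions = [i for i, sym in enumerate(body) if sym in variables]
        if len(positions) > 1:
            return set()                       # two variables: not even linear
        if positions:
            if positions[0] != len(body) - 1:  # variable is not the last symbol
                right = False
            if positions[0] != 0:              # variable is not the first symbol
                left = False
    labels = {"linear"}
    if right:
        labels.add("right-linear")
    if left:
        labels.add("left-linear")
    if right or left:
        labels.add("regular")
    return labels

G1 = [("S", ("a", "a", "A")), ("S", ("A", "b", "b")), ("A", ("a", "B")), ("B", ("b",))]
G3 = [("S", ("S1", "a", "b")), ("S1", ("S1", "a", "b")), ("S1", ("S2",)), ("S2", ("c",))]
G4 = [("S", ("A",)), ("A", ("a", "B")), ("A", ()), ("B", ("A", "b"))]
G6 = [("S", ("S1",)), ("S1", ("S2",)), ("S2", ("a",)), ("S2", ("b",)), ("S2", ("c",))]

for name, g in [("G1", G1), ("G3", G3), ("G4", G4), ("G6", G6)]:
    print(name, sorted(classify(g)))
```

A unit production such as S1 → S2 has its variable in both the first and the last position, which is why G6 counts as right-linear and left-linear at the same time.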
Regular Grammars Generate Regular Languages
Theorem
Languages generated by regular grammars = regular languages
Theorem - Part 1
Languages generated by regular grammars ⊆ regular languages
Theorem - Part 2
Languages generated by regular grammars ⊇ regular languages
Proof idea (Part 1): for a regular grammar G, construct an NFA M such that
L(M) = L(G)
Right-Linear Grammar To NFA
Grammar G is right-linear.
G = (V, T, S, P) = ({S, A, B}, {a, b}, S, P)
Example:
S → aA | B
A → aaB
B → bB | a
Construct NFA M such that every state is a grammar variable, plus one special final state VF:
M = (Q, {a, b}, δ, S, {VF})
Q = {S, A, B, VF}
Add edges for each production:
• S → aA: a-transition from S to A
• S → B: λ-transition from S to B
• A → aaB: a-transition from A to a new intermediate state C, then a-transition from C to B,
so Q = {S, A, B, C, VF}
• B → bB: loop on B labelled b
• B → a: a-transition from B to VF
Resulting NFA:
M = (Q, Σ, δ, S, F)
Q = {S, A, B, C, VF}
Σ = {a, b}
F = {VF}
Right-Linear Grammar To NFA: tracing the string aaaba (q0 = S)
δ*(q0, aaaba)
= δ( δ*(q0, aaab), a)
= δ( δ( δ*(q0, aaa), b), a)
= δ( δ( δ( δ*(q0, aa), a), b), a)
= δ( δ( δ( δ( δ*(q0, a), a), a), b), a)
= δ( δ( δ( δ( δ( δ*(q0, λ), a), a), a), b), a)
= δ( δ( δ( δ( δ(q0, a), a), a), b), a)
= δ( δ( δ( δ(A, a), a), b), a)
= δ( δ( δ(C, a), b), a)
= δ( δ(B, b), a)
= δ(B, a) = VF
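The construction and the trace above can be reproduced in code. The sketch below is an illustration under assumptions: each production is a triple (head, terminals, next variable or None), λ-moves use the label None, intermediate states get generated names such as _1 (the slides call the one in this example C), and the function names are hypothetical.

```python
from itertools import count

def grammar_to_nfa(productions, final="VF"):
    """Build the transition map of an NFA from right-linear productions.
    Each production is (head, terminals, target); target None means 'no variable'."""
    delta = {}                      # (state, symbol or None) -> set of states
    fresh = count(1)

    def edge(p, sym, q):
        delta.setdefault((p, sym), set()).add(q)

    for head, terminals, target in productions:
        target = final if target is None else target
        if not terminals:           # A -> B or A -> lambda becomes a lambda-move
            edge(head, None, target)
            continue
        current = head
        for sym in terminals[:-1]:  # one intermediate state per extra terminal
            nxt = f"_{next(fresh)}"
            edge(current, sym, nxt)
            current = nxt
        edge(current, terminals[-1], target)
    return delta

def closure(delta, states):
    seen, stack = set(states), list(states)
    while stack:
        p = stack.pop()
        for q in delta.get((p, None), ()):
            if q not in seen:
                seen.add(q)
                stack.append(q)
    return seen

def accepts(delta, start, final, word):
    current = closure(delta, {start})
    for ch in word:
        moved = set()
        for p in current:
            moved |= delta.get((p, ch), set())
        current = closure(delta, moved)
    return final in current

# S -> aA | B,  A -> aaB,  B -> bB | a
P = [("S", "a", "A"), ("S", "", "B"), ("A", "aa", "B"), ("B", "b", "B"), ("B", "a", None)]
delta = grammar_to_nfa(P)
print(accepts(delta, "S", "VF", "aaaba"))  # True, matching the trace above
```

Because the machine is nondeterministic, the simulation tracks a set of current states; the trace on the slide follows just one successful path through that set.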
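Right-Linear Grammar To NFA: In General
• Each variable V0, V1, V2, … of the grammar becomes a state.
• Add one special state VF.
• V0 is the initial state and VF is the final state.
• For each production Vi → a1 a2 … am Vj, add a path from Vi to Vj labelled a1, a2, …, am
through new intermediate states.
• For each production Vi → a1 a2 … am, add such a path from Vi to VF.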
Exercise-3
S → ε      A → bA     C → aC
S → aB     A → cA     C → bC
S → aC     A → ε      C → ε
S → bA     B → aB
S → bC     B → cB
S → cA     B → ε
S → cB
The case of Left-Linear Grammars
Proof idea:
Let G be a left-linear grammar. We will construct a right-linear grammar G′ such that
L(G) = L(G′)^R
Since G is a left-linear grammar, its productions look like:
A → B a1 a2 … ak
A → a1 a2 … ak
The case of Left-Linear Grammars
Let G be left-linear grammar.
• Construct right-linear grammar G’ by reversing the production rules
• Construct NFA N for right-linear grammar G’
• Convert NFA N into NFA N’ with single final state
• Reverse direction of all edges (Now NFA N’’)
• Mark initial state as final state and final state as initial state.
The case of Left-Linear Grammars
Construct right-linear grammar G′ by reversing the production rules:
Left-linear G:    A → B a1 a2 … ak     (A → Bv)
Right-linear G′:  A → ak … a2 a1 B     (A → v^R B)
Left-linear G:    A → a1 a2 … ak       (A → v)
Right-linear G′:  A → ak … a2 a1       (A → v^R)
The case of Left-Linear Grammars
G Left-Linear        G′ Right-Linear
S → Aabc             S → cbaA
A → Ab | B           A → bA | B
B → Cac              B → caC
C → Cb | λ           C → bC | λ
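Reversing every right-hand side is a one-line transformation, and the claim L(G) = L(G′)^R can be spot-checked by enumerating short strings. The following is a sketch under assumptions: a grammar is a dictionary from variable to a list of bodies (tuples of symbols, the empty tuple for λ), the enumeration works only for linear grammars and only up to a length bound, and the names reverse_grammar and language are hypothetical.

```python
def reverse_grammar(productions):
    """Reverse the right-hand side of every production."""
    return {head: [tuple(reversed(body)) for body in bodies]
            for head, bodies in productions.items()}

def language(productions, start, max_len):
    """All terminal strings of length <= max_len derivable from `start`.
    Intended for linear grammars, whose sentential forms hold at most one variable."""
    variables = set(productions)
    done, seen, todo = set(), {(start,)}, [(start,)]
    while todo:
        form = todo.pop()
        idx = next((i for i, sym in enumerate(form) if sym in variables), None)
        if idx is None:                      # no variable left: a finished string
            done.add("".join(form))
            continue
        for body in productions[form[idx]]:
            new = form[:idx] + tuple(body) + form[idx + 1:]
            terminals = sum(sym not in variables for sym in new)
            if terminals <= max_len and new not in seen:
                seen.add(new)
                todo.append(new)
    return done

G = {
    "S": [("A", "a", "b", "c")],
    "A": [("A", "b"), ("B",)],
    "B": [("C", "a", "c")],
    "C": [("C", "b"), ()],
}
G_prime = reverse_grammar(G)
print(G_prime["S"])  # [('c', 'b', 'a', 'A')]

same = language(G, "S", 8) == {w[::-1] for w in language(G_prime, "S", 8)}
print(same)          # True
```

Pruning on the number of terminals is sound because a derivation step never removes a terminal, so a form that is already too long cannot lead to a short string.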
The case of Left-Linear Grammars
Construct NFA N for right-linear grammar G′:
S → cbaA
A → bA | B
B → caC
C → bC | λ
[Figure: NFA N with the path S -c→ D -b→ E -a→ A -λ→ B -c→ F -a→ C -λ→ VF and loops labelled b on A and on C]
N = (Q, Σ, δ, S, F)
Q = {S, A, B, C, D, E, F, VF}
Σ = {a, b, c}
F = {VF}
Reverse the direction of all edges and swap the initial and final states to obtain NFA M.
G Left-Linear:
S → Aabc
A → Ab | B
B → Cac
C → Cb | λ
L(G) = L(b*acb*abc)
L(M) = L(b*acb*abc)
The case of Left-Linear Grammars
It is easy to see that: L(G) = L(G′)^R
Since G′ is right-linear, we have:
L(G′) is a regular language ⇒ L(G′)^R is a regular language ⇒ L(G) is a regular language
Find NFA
Exercise-1
• Regular grammar: G = (V, T, S, P), where
• V = {S, B, C, D}, T = {a, b},
• P = {S → Sa, S → Sb, S → Ba, B → Ca, C → Da, D → a}
Exercise-3
S → ε      A → Ab     C → Ca
S → Ba     A → Ac     C → Cb
S → Ca     A → ε      C → ε
S → Ab     B → Ba
S → Cb     B → Bc
S → Ac     B → ε
S → Bc
Proof - Part 2
Languages generated by regular grammars ⊇ regular languages
Proof idea:
Let M be the NFA with L = L(M ).
Construct from M a regular grammar G
such that
L( M ) = L(G )
NFA-to-RG
Since L is regular
there is an NFA M such that L = L(M)
Example:
[Figure: NFA M with transitions q0 -a→ q1, q1 -b→ q1, q1 -a→ q2, q2 -b→ q3, q3 -λ→ q1; q3 is the final state]
L = L( ab*ab(b*ab)* )
L = L(M)
NFA-to-RG
Convert M to a right-linear grammar G, one production per transition:
q0 → aq1
q1 → bq1
q1 → aq2
q2 → bq3
q3 → q1
q3 → λ
L(G) = L(M) = L( ab*ab(b*ab)* )
NFA-to-RG: In General
• For any transition from q to p labelled a, add production: q → ap
• For any final state qf, add production: qf → λ
The resulting grammar G satisfies L(G) = L(M) = L
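The general rule translates almost literally into code. This is a minimal sketch: transitions are (q, symbol, p) triples with the empty string standing for a λ-move, and the function names nfa_to_right_linear and show are assumptions.

```python
def nfa_to_right_linear(transitions, finals):
    """One production q -> a p per transition, plus qf -> lambda per final state.
    Productions are (head, terminals, next variable or None)."""
    productions = [(q, symbol, p) for q, symbol, p in transitions]
    productions += [(qf, "", None) for qf in finals]
    return productions

def show(productions):
    """Render productions as text, writing λ for an empty right-hand side."""
    return [f"{h} -> {a}{t}" if t else f"{h} -> {a or 'λ'}"
            for h, a, t in productions]

# The example NFA M: q3 is final and has a lambda-move back to q1
transitions = [
    ("q0", "a", "q1"),
    ("q1", "b", "q1"),
    ("q1", "a", "q2"),
    ("q2", "b", "q3"),
    ("q3", "", "q1"),
]
grammar = nfa_to_right_linear(transitions, finals=["q3"])
for line in show(grammar):
    print(line)
```

The start variable of the grammar is the NFA's initial state, here q0.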
Find Regular Grammar(Right-Linear)
Any regular language L is generated
by some regular grammar G
NFA-to-RG (Left Linear Grammar)
Since L is regular
there is an NFA M such that L = L(M)
Example:
[Figure: NFA M with transitions q0 -a→ q1, q1 -b→ q1, q1 -a→ q2, q2 -b→ q3, q3 -λ→ q1; q3 is the final state]
L = L( ab*ab(b*ab)* )
L = L(M)
NFA-to-RG (Left Linear Grammar)
• Construct an NFA with a single final state.
• Reverse the direction of all edges.
• Mark the initial state as final state and the final state as initial state (now NFA M′).
[Figure: M′, the reversed NFA, with q3 as initial state and q0 as final state]
Find Right-Linear Grammar G′ from M′:
q3 → bq2
q2 → aq1
q1 → bq1
q1 → aq0
q1 → q3
q0 → λ
Find reverse grammar G (Left-Linear Grammar):
q3 → q2b
q2 → q1a
q1 → q1b
q1 → q0a
q1 → q3
q0 → λ
q3 is the start variable.
L(G) = L( ab*ab(b*ab)* )
NFA-to-RG (Left Linear Grammar):In-general
Let L be regular language.
L is recognized by NFA M.
Construct NFA with single final state
Reverse direction of all edges
Mark initial state as final state and final state as initial state (Now NFA M’)
Find Right-linear grammar G’
Find reverse grammar G
G is Left-linear grammar.
Find Regular Grammar(Left-Linear)
Right-Linear Grammar to Left Linear Grammar
Let G be right-linear.
Construct NFA N for grammar G
Convert NFA N into NFA with single final state
Reverse direction of all edges
Mark initial state as final state and final state as initial state (Now NFA N’)
Find Right-linear grammar G’ from NFA
Find reverse grammar G using G’
G is Left-linear grammar.
Right-Linear Grammar to Left Linear Grammar: Example
Right-linear grammar G:
S → aA | B
A → aaB
B → bB | a
• Construct NFA N for grammar G (it already has the single final state VF).
[Figure: NFA N with transitions S -a→ A, A -a→ C, C -a→ B, S -λ→ B, B -b→ B, B -a→ VF]
• Reverse the direction of all edges; mark the initial state as final state and the final state as initial state.
• Find right-linear grammar G′ from the reversed NFA (VF is the start variable):
VF → aB
B → bB
B → S
B → aC
C → aA
A → aS
S → λ
• Find the reverse grammar using G′. It is left-linear, and VF is its start variable:
VF → Ba
B → Bb
B → S
B → Ca
C → Aa
A → Sa
S → λ
Left-Linear Grammar to Right-Linear Grammar
Let G be left-linear.
Find reverse grammar G’
Construct NFA N for grammar G’
Convert NFA N into NFA with single final state
Reverse direction of all edges
Mark initial state as final state and final state as initial state (Now NFA
N’)
Find Right-linear grammar from NFA
Left-Linear Grammar to Right-Linear Grammar: Example
• Find reverse grammar G′
G Left-Linear        G′ Right-Linear
S → Aabc             S → cbaA
A → Ab | B           A → bA | B
B → Cac              B → caC
C → Cb | λ           C → bC | λ
• Construct NFA N for right-linear grammar G′
[Figure: NFA N with the path S -c→ D -b→ E -a→ A -λ→ B -c→ F -a→ C -λ→ VF and loops labelled b on A and on C]
• Reverse the direction of all edges; mark the initial state as final state and the final state as initial state (NFA M).
L(M) = L(b*acb*abc)
• Find the right-linear grammar from M (VF is the start variable):
VF → C
C → bC
C → aF
F → cB
B → A
A → bA
A → aE
E → bD
D → cS
S → λ
Standard Representation for Regular Languages