Closure CFL
• $R = R_1 \cup R_2 \cup \{S \to S_1 \mid S_2\}$
We need to show that $L(G) = L(G_1) \cup L(G_2)$. Consider $w \in L(G)$. That means there is a derivation $S \overset{*}{\Rightarrow}_{G} w$. Since the only rules involving $S$ are $S \to S_1$ and $S \to S_2$, this derivation is either of the form $S \Rightarrow_{G} S_1 \overset{*}{\Rightarrow}_{G} w$ or $S \Rightarrow_{G} S_2 \overset{*}{\Rightarrow}_{G} w$. Consider the first case. Since the only rules for variables in $V_1$ are those belonging to $R_1$ and since $S_1 \overset{*}{\Rightarrow}_{G} w$, we have $S_1 \overset{*}{\Rightarrow}_{G_1} w$, and so $w \in L_1 = L(G_1)$. If the derivation $S \overset{*}{\Rightarrow}_{G} w$ is of the form $S \Rightarrow_{G} S_2 \overset{*}{\Rightarrow}_{G} w$, then by similar reasoning we can conclude that $w \in L(G_2)$. Hence if $w \in L(G)$ then $w \in L(G_1) \cup L(G_2)$.
Conversely, consider $w \in L(G_1) \cup L(G_2)$. Suppose $w \in L(G_1)$; the case that $w \in L(G_2)$ is similar and skipped. That means that $S_1 \overset{*}{\Rightarrow}_{G_1} w$. Since $R_1 \subseteq R$, we have $S_1 \overset{*}{\Rightarrow}_{G} w$. Thus, we have $S \Rightarrow_{G} S_1 \overset{*}{\Rightarrow}_{G} w$, which means that $w \in L(G)$. This completes the proof.
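The union construction can be tried out on small grammars. The following is a minimal sketch, not part of the original notes: the tuple encoding of a grammar, the helper names `union_grammar` and `words_up_to`, and the two example grammars are all assumptions made for illustration. The enumeration uses a heuristic bound on the length of sentential forms, which is enough for these examples but is not a general membership test.

```python
from collections import deque

def union_grammar(g1, g2, start="S"):
    """Build G with rules R1 ∪ R2 ∪ {S -> S1 | S2}.

    A grammar is a tuple (variables, terminals, rules, start), where rules
    maps each variable to a list of right-hand sides (tuples of symbols).
    """
    v1, t1, r1, s1 = g1
    v2, t2, r2, s2 = g2
    assert not (v1 & v2), "V1 and V2 must be disjoint"
    assert start not in (v1 | v2), "S must be a fresh variable"
    rules = {**r1, **r2, start: [(s1,), (s2,)]}
    return (v1 | v2 | {start}, t1 | t2, rules, start)

def words_up_to(grammar, max_len):
    """Enumerate derivable words of length <= max_len via leftmost derivations.

    Heuristic: sentential forms longer than 2 * max_len + 2 are discarded.
    """
    variables, _, rules, start = grammar
    seen, found = {(start,)}, set()
    queue = deque([(start,)])
    while queue:
        form = queue.popleft()
        idx = next((i for i, s in enumerate(form) if s in variables), None)
        if idx is None:  # no variables left: the form is a terminal string
            found.add("".join(form))
            continue
        for rhs in rules[form[idx]]:
            new = form[:idx] + tuple(rhs) + form[idx + 1:]
            terminals = sum(1 for s in new if s not in variables)
            if terminals <= max_len and len(new) <= 2 * max_len + 2 and new not in seen:
                seen.add(new)
                queue.append(new)
    return found

# Hypothetical examples: L(G1) = {a^n b^n | n >= 0}, L(G2) = {c^n | n >= 1}.
G1 = ({"S1"}, {"a", "b"}, {"S1": [("a", "S1", "b"), ()]}, "S1")
G2 = ({"S2"}, {"c"}, {"S2": [("c", "S2"), ("c",)]}, "S2")
G = union_grammar(G1, G2)
```

On these examples, `words_up_to(G, 4)` coincides with `words_up_to(G1, 4) | words_up_to(G2, 4)`, which is the bounded version of $L(G) = L(G_1) \cup L(G_2)$.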
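Kleene Closure Let $G = (V = V_1 \cup \{S\}, \Sigma, R = R_1 \cup \{S \to SS_1 \mid \epsilon\}, S)$, where $S \notin V_1$. We will show that $L(G) = (L(G_1))^*$.
We first show that if $w \in L(G)$ then $w \in (L(G_1))^*$, by induction on the length of the leftmost derivation of $w$. For the base case, consider $w$ such that $S \Rightarrow_{G} w$ in one step. Since $S \to \epsilon$ is the only rule for $S$ whose right-hand side contains no variables, this means that $w = \epsilon$. Further, $\epsilon \in (L(G_1))^*$, which establishes the base case. The induction hypothesis assumes that for all strings $w$, if $S \overset{*}{\Rightarrow}^{G}_{lm} w$ in $< n$ steps then $w \in (L(G_1))^*$. Consider $w$ such that $S \overset{*}{\Rightarrow}^{G}_{lm} w$ in $n > 1$ steps. Any such leftmost derivation must begin with the rule $S \to SS_1$, and so has the following form: $S \Rightarrow_{G} SS_1 \overset{*}{\Rightarrow}^{G}_{lm} w_1 S_1 \overset{*}{\Rightarrow}^{G}_{lm} w_1 w_2 = w$. Now we have $S \overset{*}{\Rightarrow}^{G}_{lm} w_1$ in $< n$ steps (because $S_1 \overset{*}{\Rightarrow}^{G}_{lm} w_2$ takes at least one step), and $S_1 \overset{*}{\Rightarrow}^{G}_{lm} w_2$. This means that $w_1 \in (L(G_1))^*$ (by the induction hypothesis) and $w_2 \in L(G_1)$ (since the only rules in $R$ for variables in $V_1$ are those belonging to $R_1$). Thus, $w = w_1 w_2 \in (L(G_1))^*$.
For the converse, suppose $w \in (L(G_1))^*$. By definition, this means that there are $w_1, w_2, \ldots, w_n$ (for $n \ge 0$) such that $w = w_1 w_2 \cdots w_n$ and $w_i \in L(G_1)$ for all $i$. Now if $n = 0$ (i.e., $w = \epsilon$) then we have $S \Rightarrow_{G} w$ because $S \to \epsilon$ is a rule. Otherwise, since $w_i \in L(G_1)$, we have $S_1 \overset{*}{\Rightarrow}_{G_1} w_i$ for each $i$. Since $R_1 \subseteq R$, $S_1 \overset{*}{\Rightarrow}_{G} w_i$. Hence we have the following derivation:
\[ S \Rightarrow_{G} SS_1 \Rightarrow_{G} SS_1S_1 \Rightarrow_{G} \cdots \Rightarrow_{G} S(S_1)^n \Rightarrow_{G} (S_1)^n \overset{*}{\Rightarrow}_{G} w_1 (S_1)^{n-1} \overset{*}{\Rightarrow}_{G} \cdots \overset{*}{\Rightarrow}_{G} w_1 w_2 \cdots w_n = w \]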
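The Kleene closure construction can be checked in the same bounded way. This is a sketch under stated assumptions: the grammar encoding, the helper names `star_grammar` and `words_up_to`, and the example grammar for $\{ab, c\}$ are illustrative and do not come from the notes. The bound on sentential-form length is a heuristic that suffices here because every word of the example $L(G_1)$ is nonempty.

```python
from collections import deque

def star_grammar(g1, start="S"):
    """Build G with rules R1 ∪ {S -> S S1 | epsilon}, where S is fresh."""
    v1, t1, r1, s1 = g1
    assert start not in v1, "S must not belong to V1"
    rules = {**r1, start: [(start, s1), ()]}  # () encodes the epsilon rule
    return (v1 | {start}, t1, rules, start)

def words_up_to(grammar, max_len):
    """Enumerate derivable words of length <= max_len via leftmost derivations.

    Heuristic: sentential forms longer than 2 * max_len + 2 are discarded.
    """
    variables, _, rules, start = grammar
    seen, found = {(start,)}, set()
    queue = deque([(start,)])
    while queue:
        form = queue.popleft()
        idx = next((i for i, s in enumerate(form) if s in variables), None)
        if idx is None:  # no variables left: the form is a terminal string
            found.add("".join(form))
            continue
        for rhs in rules[form[idx]]:
            new = form[:idx] + tuple(rhs) + form[idx + 1:]
            terminals = sum(1 for s in new if s not in variables)
            if terminals <= max_len and len(new) <= 2 * max_len + 2 and new not in seen:
                seen.add(new)
                queue.append(new)
    return found

# Hypothetical example: L(G1) = {ab, c}.
G1 = ({"S1"}, {"a", "b", "c"}, {"S1": [("a", "b"), ("c",)]}, "S1")
G_STAR = star_grammar(G1)
```

Calling `words_up_to(G_STAR, 3)` yields exactly the concatenations of words from $\{ab, c\}$ of total length at most 3, including the empty string produced by $S \to \epsilon$.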
• $L_2 = \{a^i b^j c^j \mid i, j \ge 0\}$ is a CFL.
Proof. Let $P$ be the PDA that accepts $L$, and let $M$ be the DFA that accepts $R$. A new PDA $P'$ will simulate $P$ and $M$ simultaneously on the same input and accept if both accept. Then $P'$ accepts $L \cap R$.
• The stack of $P'$ is the stack of $P$.
• The final states of $P'$ are those in which both the state of $P$ and the state of $M$ are accepting.
More formally, let $M = (Q_1, \Sigma, \delta_1, q_1, F_1)$ be a DFA such that $L(M) = R$, and $P = (Q_2, \Sigma, \Gamma, \delta_2, q_2, F_2)$ be a PDA such that $L(P) = L$. Then consider $P' = (Q, \Sigma, \Gamma, \delta, q_0, F)$ such that
• $Q = Q_1 \times Q_2$
• $q_0 = (q_1, q_2)$
• $F = F_1 \times F_2$
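One can show by induction on the number of computation steps that for any $w \in \Sigma^*$,
\[ \langle q_0, \epsilon \rangle \xrightarrow{w}_{P'} \langle (p, q), \sigma \rangle \quad \text{iff} \quad q_1 \xrightarrow{w}_{M} p \ \text{ and } \ \langle q_2, \epsilon \rangle \xrightarrow{w}_{P} \langle q, \sigma \rangle \]
The proof of this statement is left as an exercise. Now as a consequence, we have $w \in L(P')$
iff $\langle q_0, \epsilon \rangle \xrightarrow{w}_{P'} \langle (p, q), \sigma \rangle$ such that $(p, q) \in F$ (by definition of PDA acceptance)
iff $\langle q_0, \epsilon \rangle \xrightarrow{w}_{P'} \langle (p, q), \sigma \rangle$ such that $p \in F_1$ and $q \in F_2$ (by definition of $F$)
iff $q_1 \xrightarrow{w}_{M} p$ and $\langle q_2, \epsilon \rangle \xrightarrow{w}_{P} \langle q, \sigma \rangle$ and $p \in F_1$ and $q \in F_2$ (by the statement to be proved as an exercise)
iff $w \in L(M)$ and $w \in L(P)$ (by definition of DFA acceptance and PDA acceptance).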
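The product machine can be sketched in code. The transition function of the product is not spelled out in this section, so the sketch below assumes the usual rule: on an input symbol both machines move, and on an $\epsilon$-move of the PDA the DFA state stays unchanged. The dictionary encoding, the function names `product_pda` and `accepts`, the stack bound `max_stack`, and the example machines are all assumptions made for illustration.

```python
def product_pda(dfa, pda):
    """Combine a total DFA (delta1, q1, F1) with a PDA (delta2, q2, F2).

    PDA transitions map (state, input or "", pop or "") to a set of
    (next state, push or ""); stack symbols are single characters.
    """
    d_delta, d_start, d_final = dfa
    p_delta, p_start, p_final = pda
    d_states = {d_start} | {s for (s, _) in d_delta} | set(d_delta.values())
    delta = {}
    for (q, x, a), moves in p_delta.items():
        for p in d_states:
            # Assumed rule: the DFA stays put on epsilon, otherwise reads x.
            p_next = p if x == "" else d_delta[(p, x)]
            delta[((p, q), x, a)] = {((p_next, q2), b) for (q2, b) in moves}
    final = {(p, q) for p in d_final for q in p_final}
    return delta, (d_start, p_start), final

def accepts(pda, word, max_stack):
    """Search configurations (state, position, stack); accept by final state.

    Stacks taller than max_stack are discarded (a bound chosen per example).
    """
    delta, start, final = pda
    seen, frontier = set(), [(start, 0, "")]
    while frontier:
        config = frontier.pop()
        if config in seen:
            continue
        seen.add(config)
        state, i, stack = config
        if i == len(word) and state in final:
            return True
        for (q, x, a), moves in delta.items():
            if q != state:
                continue
            if x and (i >= len(word) or word[i] != x):
                continue
            if a and not stack.endswith(a):
                continue
            rest = stack[:-len(a)] if a else stack
            for (q2, b) in moves:
                if len(rest) + len(b) <= max_stack:
                    frontier.append((q2, i + len(x), rest + b))
    return False

# Hypothetical PDA for {a^n b^n | n >= 0}; "$" marks the stack bottom.
PDA = ({
    (0, "", ""): {(1, "$")},
    (1, "a", ""): {(1, "A")},
    (1, "b", "A"): {(2, "")},
    (2, "b", "A"): {(2, "")},
    (1, "", "$"): {(3, "")},
    (2, "", "$"): {(3, "")},
}, 0, {3})

# Hypothetical DFA for strings with an even number of a's.
DFA = ({("e", "a"): "o", ("o", "a"): "e", ("e", "b"): "e", ("o", "b"): "o"}, "e", {"e"})

PRODUCT = product_pda(DFA, PDA)
```

With these machines, `accepts(PRODUCT, "aabb", 6)` holds while `accepts(PRODUCT, "ab", 4)` does not, since `"ab"` is in the language of the PDA but has an odd number of a's. The construction needs only one stack, which is the point the question below about two CFLs is probing.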
Why does this construction not work for intersection of two CFLs?
Complementation
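Proof. [Proof 1] Suppose CFLs were closed under complementation. Then for any two CFLs $L_1$, $L_2$, we have
• $\overline{L_1}$ and $\overline{L_2}$ are CFLs. Then, since CFLs are closed under union, $\overline{L_1} \cup \overline{L_2}$ is a CFL. Then, again by hypothesis, $\overline{\overline{L_1} \cup \overline{L_2}}$ is a CFL.
• i.e., $L_1 \cap L_2$ is a CFL (by De Morgan's law), which contradicts the fact that CFLs are not closed under intersection.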
Set Difference
Proof. CFLs are not closed under complementation, and complementation is a special case of set difference. (How?)
Proof. $L \setminus R = L \cap \overline{R}$, and $\overline{R}$ is regular whenever $R$ is.
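The identity for $L \setminus R$ relies on the complement of a regular language being regular. A minimal sketch of that step, assuming $R$ is given by a total DFA encoded as a dictionary (the encoding, the function names, and the example DFA are illustrative assumptions): flipping the accepting states gives a DFA for the complement, after which $L \setminus R$ reduces to intersecting a CFL with a regular language.

```python
def complement_dfa(dfa):
    """Flip accepting and non-accepting states of a total DFA."""
    delta, start, final = dfa
    states = {start} | {s for (s, _) in delta} | set(delta.values())
    return delta, start, states - set(final)

def dfa_accepts(dfa, word):
    """Run the DFA on word and report whether it ends in a final state."""
    delta, start, final = dfa
    state = start
    for ch in word:
        state = delta[(state, ch)]
    return state in final

# Hypothetical DFA for strings over {a, b} with an even number of a's.
EVEN_A = ({("e", "a"): "o", ("o", "a"): "e", ("e", "b"): "e", ("o", "b"): "o"}, "e", {"e"})
ODD_A = complement_dfa(EVEN_A)
```

Here `ODD_A` accepts exactly the strings that `EVEN_A` rejects. The flip is only correct for a total, deterministic machine, which is why the same trick does not carry over to PDAs.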
1.3 Homomorphisms
Homomorphism
1.4 Inverse Homomorphisms
Proof Idea
For a regular language $L$: the DFA for $h^{-1}(L)$, on reading a symbol $a$, simulates the DFA for $L$ on $h(a)$. Can we do the same with PDAs?
• Key idea: store h(a) in a “buffer” and process symbols from h(a) one at a time (according
to the transition function of the original PDA), and the next input symbol is processed only
after the “buffer” has been emptied.
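• $Q' = Q \times \Delta^{\le n}$, where $\Delta^{\le n}$ is the collection of all strings of length at most $n$ over $\Delta$.
• $q_0' = (q_0, \epsilon)$
• $F' = F \times \{\epsilon\}$
• $\delta'$ is given by
\[ \delta'((q, v), x, a) = \begin{cases} \{((q, h(x)), \epsilon)\} & \text{if } v = a = \epsilon \\ \{((p, u), b) \mid (p, b) \in \delta(q, y, a)\} & \text{if } v = yu,\ x = \epsilon, \text{ and } y \in (\Delta \cup \{\epsilon\}) \end{cases} \]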
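The buffer construction can be sketched as follows. This is an illustration under assumptions, not the notes' own code: the dictionary encoding of PDAs, the names `inverse_hom_pda` and `accepts`, the stack bound `max_stack`, and the example PDA and homomorphism are hypothetical. As a simplification, the sketch uses only buffers that are suffixes of some $h(x)$ rather than all of $\Delta^{\le n}$, since those are the only buffers that can arise.

```python
def inverse_hom_pda(pda, h):
    """Build the buffered PDA from (delta, q0, F) and a homomorphism h.

    delta maps (state, input or "", pop or "") to a set of (state, push or "").
    h maps each symbol of the new input alphabet to a string over Delta.
    """
    delta, start, final = pda
    states = {q for (q, _, _) in delta} | {p for ms in delta.values() for (p, _) in ms}
    buffers = {""} | {img[i:] for img in h.values() for i in range(len(img) + 1)}
    new_delta = {}
    for q in states:
        for v in buffers:
            if v == "":
                # Case v = a = epsilon: read x and load h(x) into the buffer.
                for x, img in h.items():
                    new_delta.setdefault(((q, v), x, ""), set()).add(((q, img), ""))
            for (q0, y, a), moves in delta.items():
                # Case v = yu, x = epsilon: feed y (possibly epsilon) to delta.
                if q0 != q or not v.startswith(y):
                    continue
                u = v[len(y):]
                for (p, b) in moves:
                    new_delta.setdefault(((q, v), "", a), set()).add(((p, u), b))
    return new_delta, (start, ""), {(q, "") for q in final}

def accepts(pda, word, max_stack):
    """Search configurations (state, position, stack); accept by final state."""
    delta, start, final = pda
    seen, frontier = set(), [(start, 0, "")]
    while frontier:
        config = frontier.pop()
        if config in seen:
            continue
        seen.add(config)
        state, i, stack = config
        if i == len(word) and state in final:
            return True
        for (q, x, a), moves in delta.items():
            if q != state:
                continue
            if x and (i >= len(word) or word[i] != x):
                continue
            if a and not stack.endswith(a):
                continue
            rest = stack[:-len(a)] if a else stack
            for (q2, b) in moves:
                if len(rest) + len(b) <= max_stack:
                    frontier.append((q2, i + len(x), rest + b))
    return False

# Hypothetical PDA for {a^n b^n | n >= 0}; "$" marks the stack bottom.
PDA = ({
    (0, "", ""): {(1, "$")},
    (1, "a", ""): {(1, "A")},
    (1, "b", "A"): {(2, "")},
    (2, "b", "A"): {(2, "")},
    (1, "", "$"): {(3, "")},
    (2, "", "$"): {(3, "")},
}, 0, {3})

# Hypothetical homomorphism: h(0) = aa, h(1) = b, h(2) = epsilon.
H = {"0": "aa", "1": "b", "2": ""}
INVERSE = inverse_hom_pda(PDA, H)
```

For instance, `accepts(INVERSE, "011", 8)` holds because $h(011) = aabb$, while `accepts(INVERSE, "01", 6)` does not because $h(01) = aab$. Acceptance requires an empty buffer, matching $F' = F \times \{\epsilon\}$.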