
Formal Language & Automata Theory

Introduction :- Alphabet

Automata – What is it?


The term "Automata" is derived from the Greek word "αὐτόματα" which means "self-
acting". An automaton (Automata in plural) is an abstract self-propelled computing
device which follows a predetermined sequence of operations automatically.
An automaton with a finite number of states is called a Finite Automaton (FA)
or Finite State Machine (FSM).

Formal definition of a Finite Automaton


An automaton can be represented by a 5-tuple (Q, ∑, δ, q0, F), where −
• Q is a finite set of states.
• ∑ is a finite set of symbols, called the alphabet of the automaton.
• δ is the transition function.
• q0 is the initial state from where any input is processed (q0 ∈ Q).
• F is a set of final state/states of Q (F ⊆ Q).

Related Terminologies

Alphabet
• Definition − An alphabet is any finite set of symbols.
• Example − ∑ = {a, b, c, d} is an alphabet set where ‘a’, ‘b’, ‘c’, and ‘d’
are symbols.
String
• Definition − A string is a finite sequence of symbols taken from ∑.
• Example − ‘cabcad’ is a valid string on the alphabet set ∑ = {a, b, c, d}
Length of a String
• Definition − It is the number of symbols present in a string. (Denoted
by |S|).
• Examples −
o If S = ‘cabcad’, |S|= 6
o If |S|= 0, it is called an empty string (Denoted by λ or ε)

Kleene Star
• Definition − The Kleene star, ∑*, is a unary operator on a set of symbols
or strings, ∑, that gives the infinite set of all possible strings of all
possible lengths over ∑ including λ.
• Representation − ∑* = ∑^0 ∪ ∑^1 ∪ ∑^2 ∪ … where ∑^p is the set of all
possible strings of length p.
• Example − If ∑ = {a, b}, ∑* = {λ, a, b, aa, ab, ba, bb,………..}
Kleene Closure / Plus
• Definition − The set ∑+ is the infinite set of all possible strings of all
possible lengths over ∑ excluding λ.
• Representation − ∑+ = ∑^1 ∪ ∑^2 ∪ ∑^3 ∪ …
∑+ = ∑* − { λ }
• Example − If ∑ = { a, b } , ∑+ = { a, b, aa, ab, ba, bb,………..}
Language
• Definition − A language is a subset of ∑* for some alphabet ∑. It can be
finite or infinite.
• Example − If the language takes all possible strings of length 2 over ∑
= {a, b}, then L = { ab, aa, ba, bb }
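These definitions can be sketched in Python. Since ∑* is an infinite set, the sketch below only enumerates it up to a length bound; the helper name `kleene_star` and the parameter `max_len` are illustrative, not standard notation:

```python
from itertools import product

def kleene_star(alphabet, max_len):
    """Enumerate Sigma* up to a length bound (the full set is infinite)."""
    strings = [""]  # Sigma^0 contains only the empty string (lambda)
    for length in range(1, max_len + 1):
        strings += ["".join(p) for p in product(alphabet, repeat=length)]
    return strings

words = kleene_star(["a", "b"], 2)
print(words)      # ['', 'a', 'b', 'aa', 'ab', 'ba', 'bb']  -- Sigma^0..Sigma^2
print(words[1:])  # Sigma+ up to the same bound: everything except lambda
```

Dropping the empty string from the result gives ∑+ up to the same bound, matching ∑+ = ∑* − {λ}.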

Theory of Automata
Theory of automata is a theoretical branch of computer science and mathematics. It
is the study of abstract machines and the computation problems that can be solved
using these machines. The abstract machine is called an automaton. The main
motivation behind developing automata theory was to develop methods to
describe and analyse the dynamic behaviour of discrete systems.

An automaton consists of states and transitions. States are represented by circles,
and transitions by arrows.

An automaton is a machine that takes a string as input; the input drives the machine
through a finite number of states and may leave it in a final state.

The following basic terminologies are important and frequently used in automata theory.

Symbols:

A symbol is an individual object, which can be any letter, digit, or any other
character.

Example:
1, a, b, #

Alphabets:
An alphabet is a finite set of symbols. It is denoted by ∑.

Examples:

1. ∑ = {a, b}
2. ∑ = {A, B, C, D}
3. ∑ = {0, 1, 2}
4. ∑ = {0, 1, ....., 5}
5. ∑ = {#, β, Δ}

String:

A string is a finite sequence of symbols from the alphabet. It is denoted by w.

Example 1:
If ∑ = {a, b}, various strings that can be generated from ∑ are {ab, aa, aaa, bb, bbb, ba,
aba, .....}.

o A string with zero occurrences of symbols is known as an empty string. It is
represented by ε.
o The number of symbols in a string w is called the length of the string. It is denoted
by |w|.

Example 2:

w = 010
Length of the string |w| = 3

Language:

A language is a collection of strings. A language formed over Σ can
be finite or infinite.

Example: 1
L1 = {Set of strings of length 2}
= {aa, ab, ba, bb} Finite Language

Example: 2
L2 = {Set of all strings that start with 'a'}

= {a, aa, aaa, abb, abbb, ababb, .....} Infinite Language

Language and Grammar :-


In the literary sense of the term, grammars denote syntactical rules for conversation
in natural languages. Linguists have attempted to define grammars since the
inception of natural languages like English, Sanskrit, Mandarin, etc.
The theory of formal languages finds its applicability extensively in the fields of
Computer Science. Noam Chomsky gave a mathematical model of grammar in 1956
which is effective for writing computer languages.

Grammar
A grammar G can be formally written as a 4-tuple (N, T, S, P) where −
• N or VN is a set of variables or non-terminal symbols.
• T or ∑ is a set of Terminal symbols.
• S is a special variable called the Start symbol, S ∈ N
• P is a set of production rules for Terminals and Non-terminals. A production
rule has the form α → β, where α and β are strings over VN ∪ ∑ and at
least one symbol of α belongs to VN.
Example
Grammar G1 −
({S, A, B}, {a, b}, S, {S → AB, A → a, B → b})
Here,
• S, A, and B are Non-terminal symbols;
• a and b are Terminal symbols
• S is the Start symbol, S ∈ N
• Productions, P : S → AB, A → a, B → b
Example
Grammar G2 −
({S, A}, {a, b}, S, {S → aAb, aA → aaAb, A → ε})
Here,
• S and A are Non-terminal symbols.
• a and b are Terminal symbols.
• ε is an empty string.
• S is the Start symbol, S ∈ N
• Production P : S → aAb, aA → aaAb, A → ε

Derivations from a Grammar


Strings may be derived from other strings using the productions in a grammar. If a
grammar G has a production α → β, we can say that x α y derives x β y in G. This
derivation is written as −
x α y ⇒G x β y

Example
Let us consider the grammar −
G2 = ({S, A}, {a, b}, S, {S → aAb, aA → aaAb, A → ε } )
Some of the strings that can be derived are −
S ⇒ aAb using production S → aAb
⇒ aaAbb using production aA → aaAb
⇒ aaaAbbb using production aA → aaAb
⇒ aaabbb using production A → ε
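The derivation above can be replayed mechanically. This is a minimal sketch in which `apply` (a hypothetical helper, not standard notation) rewrites the leftmost occurrence of a production's left-hand side:

```python
def apply(production, sentential_form):
    """Rewrite the leftmost occurrence of lhs with rhs (one derivation step)."""
    lhs, rhs = production
    assert lhs in sentential_form, "production must be applicable"
    return sentential_form.replace(lhs, rhs, 1)

form = "S"
form = apply(("S", "aAb"), form)    # S => aAb
form = apply(("aA", "aaAb"), form)  # aAb => aaAbb
form = apply(("aA", "aaAb"), form)  # aaAbb => aaaAbbb
form = apply(("A", ""), form)       # aaaAbbb => aaabbb   (A -> epsilon)
print(form)  # aaabbb
```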

Grammar in theory of computation is a finite set of formal rules that are generating
syntactically correct sentences.
The formal definition of grammar is that it is defined as four tuples −
G=(V,T,P,S)
• G is a grammar, which consists of a set of production rules. It is used to
generate the strings of a language.
• T is the finite set of terminal symbols, denoted by lower-case letters.
• V is the finite set of non-terminal symbols, denoted by capital letters.
• P is a set of production rules, which is used for replacing non-terminal
symbols (on the left side of production) in a string with other terminals
(on the right side of production).
• S is the start symbol used to derive the string.
Grammar is composed of two basic elements
Terminal Symbols - Terminal symbols are the components of the sentences that are
generated using grammar and are denoted using small case letters like a, b, c etc.
Non-Terminal Symbols - Non-Terminal Symbols take part in the generation of the
sentence but are not the component of the sentence. These types of symbols are
also called Auxiliary Symbols and Variables. They are represented using a capital
letter like A, B, C, etc.

Example 1
Consider a grammar
G = (V , T , P , S)
Where,
V = { S , A , B } ⇒ Non-Terminal symbols
T={a,b} ⇒ Terminal symbols
Production rules P = { S → ABa , A → BB , B → ab , AA → b }
S={S} ⇒ Start symbol

Example 2
Consider a grammar
G=(V,T,P,S)
Where,
V= {S, A, B} ⇒ non terminal symbols
T = { 0,1} ⇒ terminal symbols
Production rules P = { S→A1B
A→0A| ε
B→0B| 1B| ε }
S= {S} ⇒ start symbol.

Types of grammar
The different types of grammar −

Grammar   Language                  Automata                             Production rules

Type-0    Recursively enumerable    Turing machine                       No restriction

Type-1    Context-sensitive         Linear-bounded non-deterministic     αAβ → αγβ
                                    machine

Type-2    Context-free              Non-deterministic pushdown           A → γ
                                    automata

Type-3    Regular                   Finite state automata                A → αB, A → α
The diagram representing the types of grammar in the theory of computation (TOC)
is as follows −

Chomsky hierarchy of language :-


Chomsky Hierarchy in Theory of Computation
According to Chomsky hierarchy, grammar is divided into 4 types as follows:
1. Type 0 is known as unrestricted grammar.
2. Type 1 is known as context-sensitive grammar.
3. Type 2 is known as a context-free grammar.
4. Type 3 Regular Grammar.

Type 0: Unrestricted Grammar:


Type-0 grammars include all formal grammars. Type-0 grammar languages are
recognized by a Turing machine. These languages are also known as the
Recursively Enumerable languages.

Grammar productions are in the form α → β, where
α is in (V + T)* V (V + T)*
V : Variables
T : Terminals
and β is in (V + T)*.
In Type 0 there must be at least one variable on the left side of a production.
For example:
Sab --> ba
A --> S
Here, Variables are S, A, and Terminals a, b.
Type 1: Context-Sensitive Grammar
Type-1 grammars generate context-sensitive languages. The language generated
by the grammar is recognized by the Linear Bound Automata
In Type 1:
• First of all, a Type 1 grammar should be Type 0.
• Grammar productions are in the form α → β with

|α| <= |β|

that is, the count of symbols in α is less than or equal to the count of symbols in β.

• Also β ∈ (V + T)+,
i.e. β cannot be ε.
For Example:
S --> AB
AB --> abc
B --> b
Type 2: Context-Free Grammar: Type-2 grammars generate context-free
languages. The language generated by the grammar is recognized by a Pushdown
automaton. In Type 2:
• First of all, it should be Type 1.
• The left-hand side of a production can have only one variable, i.e.
|α| = 1, and there is no restriction on the right-hand side β.
For example:
S --> AB
A --> a
B --> b
Type 3: Regular Grammar: Type-3 grammars generate regular languages.
These languages are exactly all languages that can be accepted by a finite-state
automaton. Type 3 is the most restricted form of grammar.
Type 3 should be in the given form only :
V --> VT / T (left-regular grammar)
(or)
V --> TV /T (right-regular grammar)
For example:
S --> a
The above form is called strictly regular grammar.
There is another form of regular grammar called extended regular grammar. In
this form:
V --> VT* / T*. (extended left-regular grammar)
(or)
V --> T*V /T* (extended right-regular grammar)
For example :
S --> ab.

Introduction
According to the given hierarchy, Grammar is divided into four types
Type 0 Unrestricted Grammar

Type 1 Context-Sensitive Grammar

Type 2 Context-Free Grammar

Type 3 Regular Grammar

Type 0: Unrestricted Grammar


Language recognized by Turing Machine is known as Type 0 Grammar. They are
also known as Recursively Enumerable Languages.
Grammar Production for Type 0 is given by
α —> β
For Example:
Sba —> a
S —> B
Where S and B are Variables.
And a and b are Terminals.
Type 1: Context-Sensitive Grammar
Languages recognized by Linear Bound Automata are known as Type 1 Grammar.
Context-sensitive grammar represents context-sensitive languages.
For grammar to be context-sensitive, it must be unrestricted. Grammar Production
for Type 1 is given by
α —> β (ensuring the count of symbols on the LHS is less than or equal to that on the RHS, i.e. |α| ≤ |β|)
For Example,
S —> BA
BA —> bca
B —> b
Type 2: Context-Free Grammar
Languages recognized by Pushdown Automata are known as Type 2
Grammar. Context-free grammar represents context-free languages.
For grammar to be context-free, it must be context-sensitive. Grammar Production
for Type 2 is given by
A —> α
Where A is a single non-terminal.
For Example,
A —> aB
B —> b
Type 3: Regular Grammar
Languages recognized by Finite Automata are known as Type 3 Grammar. Regular
grammar represents regular languages.
For grammar to be regular, it must be context-free. Grammar Production for Type 3
is given by
V → T*V / T*
For Example,
A —> ab
Frequently Asked Questions
Name the types of grammar based on Chomsky Hierarchy.
The four types of Grammar are Type 0, Type 1, Type 2, and Type 3.
Which languages are recognized by Turing Machine?
Type 0 Grammar
Which languages are recognized by Pushdown Automata?
Type 2 Grammar
Which languages are recognized by Linear Bound Automata?
Type 1 Grammar
Which languages are recognized by Finite Automata?
Type 3 Grammar

Module 2
Regular language and finite automata : (Regular expression and
language)
A Regular Expression can be recursively defined as follows −
• ε is a Regular Expression indicates the language containing an empty
string. (L (ε) = {ε})
• φ is a Regular Expression denoting an empty language. (L (φ) = { })
• x is a Regular Expression where L = {x}
• If X is a Regular Expression denoting the language L(X) and Y is a
Regular Expression denoting the language L(Y), then
o X + Y is a Regular Expression corresponding to the
language L(X) ∪ L(Y) where L(X+Y) = L(X) ∪ L(Y).
o X . Y is a Regular Expression corresponding to the
language L(X) . L(Y) where L(X.Y) = L(X) . L(Y)
o R* is a Regular Expression corresponding to the
language L(R*)where L(R*) = (L(R))*
• Any expression obtained by applying the above rules (1 to 5) a finite
number of times is also a Regular Expression.

Some RE Examples

Regular Expressions     Regular Set

(0 + 10*)               L = { 0, 1, 10, 100, 1000, 10000, … }

(0*10*)                 L = {1, 01, 10, 010, 0010, …}

(0 + ε)(1 + ε)          L = {ε, 0, 1, 01}

(a+b)*                  Set of strings of a's and b's of any length including the
                        null string. So L = { ε, a, b, aa, ab, bb, ba, aaa, … }

(a+b)*abb               Set of strings of a's and b's ending with the string abb.
                        So L = {abb, aabb, babb, aaabb, ababb, …}

(11)*                   Set consisting of an even number of 1's including the
                        empty string. So L = {ε, 11, 1111, 111111, …}

(aa)*(bb)*b             Set of strings consisting of an even number of a's
                        followed by an odd number of b's. So L = {b, aab, aabbb,
                        aabbbbb, aaaab, aaaabbb, …}

(aa + ab + ba + bb)*    Strings of a's and b's of even length, obtained by
                        concatenating any combination of the strings aa, ab, ba
                        and bb, including null. So L = {ε, aa, ab, ba, bb, aaab,
                        aaba, …}
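A few rows of the table can be checked with Python's `re` module (note that `re` writes the union operator as `|` where the table writes `+`):

```python
import re

# (a+b)*abb: strings of a's and b's ending in abb
assert re.fullmatch(r"(a|b)*abb", "ababb")
assert re.fullmatch(r"(a|b)*abb", "abba") is None

# (11)*: an even number of 1's, including the empty string
assert re.fullmatch(r"(11)*", "")
assert re.fullmatch(r"(11)*", "1111")
assert re.fullmatch(r"(11)*", "111") is None

# (aa)*(bb)*b: an even number of a's followed by an odd number of b's
assert re.fullmatch(r"(aa)*(bb)*b", "aabbb")
assert re.fullmatch(r"(aa)*(bb)*b", "b")
print("all rows verified")
```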
Regular Expressions
Regular Expressions are used to denote regular languages. An expression is
regular if:
• ɸ is a regular expression for regular language ɸ.
• ɛ is a regular expression for regular language {ɛ}.
• If a ∈ Σ (Σ represents the input alphabet), a is regular expression
with language {a}.
• If a and b are regular expression, a + b is also a regular expression
with language {a,b}.
• If a and b are regular expression, ab (concatenation of a and b) is
also regular.
• If a is regular expression, a* (0 or more times a) is also regular.

Regular Expression                      Regular Language

set of vowels: (a∪e∪i∪o∪u)              {a, e, i, o, u}

a followed by 0 or more b: (a.b*)       {a, ab, abb, abbb, abbbb, …}

any no. of vowels followed by any       { ε, a, aou, aiou, b, abcd, … } where ε
no. of consonants: v*.c* (where v       represents the empty string (in case of
represents vowels and c consonants)     0 vowels and 0 consonants)

Regular Grammar : A grammar is regular if it has rules of form A -> a


or A -> aB or A -> ɛ where ɛ is a special symbol called NULL.

Regular Languages : A language is regular if it can be expressed in


terms of regular expression.

Closure Properties of Regular Languages


Union : If L1 and L2 are two regular languages, their union L1 ∪ L2
will also be regular. For example, L1 = {a^n | n ≥ 0} and L2 = {b^n | n ≥ 0};
L3 = L1 ∪ L2 = {a^n | n ≥ 0} ∪ {b^n | n ≥ 0} is also regular.
Intersection : If L1 and L2 are two regular languages, their
intersection L1 ∩ L2 will also be regular. For example,
L1 = {a^m b^n | n ≥ 0 and m ≥ 0} and
L2 = {a^m b^n | n ≥ 0 and m ≥ 0} ∪ {b^n a^m | n ≥ 0 and m ≥ 0};
L3 = L1 ∩ L2 = {a^m b^n | n ≥ 0 and m ≥ 0} is also regular.
Concatenation : If L1 and L2 are two regular languages, their
concatenation L1.L2 will also be regular. For example,
L1 = {a^m | m ≥ 0} and L2 = {b^n | n ≥ 0};
L3 = L1.L2 = {a^m b^n | m ≥ 0 and n ≥ 0} is also regular.
Kleene Closure : If L1 is a regular language, its Kleene closure L1*
will also be regular. For example,
L1 = (a ∪ b)
L1* = (a ∪ b)*
Complement : If L(G) is a regular language, its complement L'(G) will
also be regular. The complement of a language can be found by
subtracting the strings which are in L(G) from the set of all possible
strings. For example,
L(G) = {a^n | n > 3}
L'(G) = {a^n | n <= 3}

Note : Two regular expressions are equivalent if languages generated


by them are same. For example, (a+b*)* and (a+b)* generate same
language. Every string which is generated by (a+b*)* is also generated
by (a+b)* and vice versa.
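The equivalence claimed in the note can be spot-checked by brute force: enumerate every string up to a length bound and compare which ones each expression matches. This is a sanity check under an assumed bound, not a proof; `language` is an illustrative helper name:

```python
import re
from itertools import product

def language(pattern, alphabet, max_len):
    """The subset of strings over alphabet (up to max_len) that pattern matches."""
    words = [""] + ["".join(p) for n in range(1, max_len + 1)
                    for p in product(alphabet, repeat=n)]
    return {w for w in words if re.fullmatch(pattern, w)}

# (a+b*)* and (a+b)* agree on every string up to the bound
assert language(r"(a|b*)*", "ab", 5) == language(r"(a|b)*", "ab", 5)
print("equivalent on all strings up to length 5")
```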

Deterministic finite automata and equivalence with regular


expression :-

Deterministic Finite Automaton (DFA)


In DFA, for each input symbol, one can determine the state to which the machine will
move. Hence, it is called Deterministic Automaton. As it has a finite number of
states, the machine is called Deterministic Finite Machine or Deterministic Finite
Automaton.

Formal Definition of a DFA


A DFA can be represented by a 5-tuple (Q, ∑, δ, q0, F) where −
• Q is a finite set of states.
• ∑ is a finite set of symbols called the alphabet.
• δ is the transition function where δ: Q × ∑ → Q
• q0 is the initial state from where any input is processed (q0 ∈ Q).
• F is a set of final state/states of Q (F ⊆ Q).

Graphical Representation of a DFA


A DFA is represented by digraphs called state diagram.

• The vertices represent the states.


• The arcs labeled with an input alphabet show the transitions.
• The initial state is denoted by an empty single incoming arc.
• The final state is indicated by double circles.
Example
Let a deterministic finite automaton be →

• Q = {a, b, c},
• ∑ = {0, 1},
• q0 = {a},
• F = {c}, and
Transition function δ as shown by the following table −

Present State Next State for Input 0 Next State for Input 1

a a b

b c a

c b c

Its graphical representation would be as follows −
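The same DFA can be run in code. The sketch below stores δ from the table above as a dictionary and implements the extended transition function δ* by stepping through the input string; the helper name `accepts` is illustrative:

```python
# Transition table of the example DFA: delta[(state, symbol)] -> next state
delta = {("a", "0"): "a", ("a", "1"): "b",
         ("b", "0"): "c", ("b", "1"): "a",
         ("c", "0"): "b", ("c", "1"): "c"}

def accepts(w, start="a", final=("c",)):
    """delta*(q0, w) in F: run the DFA on w and test the final state."""
    state = start
    for symbol in w:
        state = delta[(state, symbol)]
    return state in final

print(accepts("10"))   # True:  a -1-> b -0-> c, and c is accepting
print(accepts("1"))    # False: a -1-> b, and b is not accepting
```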

Regular Expression to DFA

Utility – To construct DFA from a given regular expression, we can first construct
an NFA for the given expression and then convert this NFA to DFA by a subset
construction method. But to avoid this two-step procedure, the other way round
is to directly construct a DFA for the given expression.
DFA refers to Deterministic Finite Automata. In DFA, for each state, and for each
input symbol there is one and only one state to which the automaton can have a
transition from its current state. DFA does not accept any ∈-transition.
In order to construct a DFA directly from a regular expression, we need to follow
the steps listed below:
Example: Suppose given regular expression r = (a|b)*abb
1. Firstly, we construct the augmented regular expression for the given
expression. By concatenating a unique right-end marker ‘#’ to a regular expression
r, we give the accepting state for r a transition on ‘#’ making it an important state
of the NFA for r#.
So, r' = (a|b)*abb#
2. Then we construct the syntax tree for r#.

Syntax tree for (a|b)*abb#

3. Next we need to evaluate four functions nullable, firstpos, lastpos, and


followpos.
1. nullable(n) is true for a syntax tree node n if and only if the regular
expression represented by n has ε in its language.
2. firstpos(n) gives the set of positions that can match the first symbol of a
string generated by the subexpression rooted at n.
3. lastpos(n) gives the set of positions that can match the last symbol of a
string generated by the subexpression rooted at n.
We refer to an interior node as a cat-node, or-node, or star-node if it is labeled by
a concatenation, | or * operator, respectively.
Rules for computing nullable, firstpos, and lastpos:
Node n                           nullable(n)         firstpos(n)             lastpos(n)

n is a leaf node labeled ε       true                ∅                       ∅

n is a leaf node labeled with    false               {i}                     {i}
position i

n is an or-node with left        nullable(c1) or     firstpos(c1) ∪          lastpos(c1) ∪
child c1 and right child c2      nullable(c2)        firstpos(c2)            lastpos(c2)

n is a cat-node with left        nullable(c1) and    if nullable(c1) then    if nullable(c2) then
child c1 and right child c2      nullable(c2)        firstpos(c1) ∪          lastpos(c1) ∪
                                                     firstpos(c2), else      lastpos(c2), else
                                                     firstpos(c1)            lastpos(c2)

n is a star-node with child c1   true                firstpos(c1)            lastpos(c1)

Rules for computing followpos:


1. If n is a cat-node with left child c1 and right child c2 and i is a position in
lastpos(c1), then all positions in firstpos(c2) are in followpos(i).
2. If n is a star-node and i is a position in lastpos(n), then all positions in firstpos(n)
are in followpos(i).
Now that we have seen the rules for computing firstpos and lastpos, we
proceed to calculate their values for the syntax tree of the given regular
expression (a|b)*abb#.
firstpos and lastpos for nodes in syntax tree for (a|b)*abb#

Let us now compute the followpos bottom up for each node in the syntax tree.

NODE followpos

1 {1, 2, 3}

2 {1, 2, 3}

3 {4}

4 {5}

5 {6}

6 ∅

4. Now we construct Dstates, the set of states of DFA D and Dtran, the transition
table for D. The start state of DFA D is firstpos(root) and the accepting states are
all those containing the position associated with the endmarker symbol #.
According to our example, the firstpos of the root is {1, 2, 3}. Let this state be A and
consider the input symbol a. Positions 1 and 3 are for a, so let B = followpos(1) ∪
followpos(3) = {1, 2, 3, 4}. Since this set has not yet been seen, we set Dtran[A, a]
:= B.
When we consider input b, we find that out of the positions in A, only 2 is
associated with b, thus we consider the set followpos(2) = {1, 2, 3}. Since this set
has already been seen before, we do not add it to Dstates but we add the transition
Dtran[A, b]:= A.
Continuing like this with the rest of the states, we arrive at the below transition
table.

Input

State a b

⇢A B A

B B C

C B D

D B A

Here, A is the start state and D is the accepting state.


5. Finally we draw the DFA for the above transition table.
The final DFA will be :
DFA for (a|b)*abb

NonDeterminstic and equivalence with regular expression :-


In NDFA, for a particular input symbol, the machine can move to any combination of
the states in the machine. In other words, the exact state to which the machine moves
cannot be determined. Hence, it is called Non-deterministic Automaton. As it has
finite number of states, the machine is called Non-deterministic Finite
Machine or Non-deterministic Finite Automaton.

Formal Definition of an NDFA


An NDFA can be represented by a 5-tuple (Q, ∑, δ, q0, F) where −
• Q is a finite set of states.
• ∑ is a finite set of symbols called the alphabet.
• δ is the transition function where δ: Q × ∑ → 2^Q
(Here the power set of Q, written 2^Q, has been taken because in case of NDFA,
from a state, transition can occur to any combination of Q states)
• q0 is the initial state from where any input is processed (q0 ∈ Q).
• F is a set of final state/states of Q (F ⊆ Q).
Graphical Representation of an NDFA: (same as DFA)
An NDFA is represented by digraphs called state diagram.

• The vertices represent the states.


• The arcs labeled with an input alphabet show the transitions.
• The initial state is denoted by an empty single incoming arc.
• The final state is indicated by double circles.
Example
Let a non-deterministic finite automaton be →

• Q = {a, b, c}
• ∑ = {0, 1}
• q0 = {a}
• F = {c}
The transition function δ as shown below −

Present State Next State for Input 0 Next State for Input 1

a a, b b

b c a, c

c b, c c

Its graphical representation would be as follows −
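Non-determinism can be simulated directly by tracking the *set* of states the machine could currently be in. A minimal sketch for the example NFA above (the helper name `nfa_accepts` is illustrative):

```python
# delta[(state, symbol)] -> set of possible next states
delta = {("a", "0"): {"a", "b"}, ("a", "1"): {"b"},
         ("b", "0"): {"c"},      ("b", "1"): {"a", "c"},
         ("c", "0"): {"b", "c"}, ("c", "1"): {"c"}}

def nfa_accepts(w, start="a", final=("c",)):
    """Accept iff at least one possible run ends in a final state."""
    current = {start}
    for symbol in w:
        # union of all moves from every state the NFA could be in
        current = set().union(*(delta.get((q, symbol), set()) for q in current))
    return bool(current & set(final))

print(nfa_accepts("00"))  # True: {a} -0-> {a, b} -0-> {a, b, c} contains c
print(nfa_accepts(""))    # False: the start state a is not accepting
```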

DFA vs NDFA
The following table lists the differences between DFA and NDFA.

DFA                                            NDFA

The transition from a state is to a single     The transition from a state can be to
particular next state for each input symbol.   multiple next states for each input
Hence it is called deterministic.              symbol. Hence it is called
                                               non-deterministic.

Empty string transitions are not seen in       NDFA permits empty string
DFA.                                           transitions.

Backtracking is allowed in DFA.                In NDFA, backtracking is not always
                                               possible.

Requires more space.                           Requires less space.

A string is accepted by a DFA if it transits   A string is accepted by an NDFA if at
to a final state.                              least one of all possible transitions
                                               ends in a final state.

Acceptors, Classifiers, and Transducers

Acceptor (Recognizer)
An automaton that computes a Boolean function is called an acceptor. Every state
of an acceptor either accepts or rejects the input given to it.

Classifier
A classifier has more than two final states and it gives a single output when it
terminates.

Transducer
An automaton that produces outputs based on current input and/or previous state is
called a transducer. Transducers can be of two types −
• Mealy Machine − The output depends both on the current state and the
current input.
• Moore Machine − The output depends only on the current state.

Acceptability by DFA and NDFA


A string is accepted by a DFA/NDFA iff the DFA/NDFA starting at the initial state
ends in an accepting state (any of the final states) after reading the string wholly.
A string S is accepted by a DFA/NDFA (Q, ∑, δ, q0, F), iff
δ*(q0, S) ∈ F
The language L accepted by DFA/NDFA is
{S | S ∈ ∑* and δ*(q0, S) ∈ F}
A string S′ is not accepted by a DFA/NDFA (Q, ∑, δ, q0, F), iff
δ*(q0, S′) ∉ F
The language L′ not accepted by DFA/NDFA (Complement of accepted language L)
is
{S | S ∈ ∑* and δ*(q0, S) ∉ F}
Example
Let us consider the DFA shown in Figure 1.3. From the DFA, the acceptable strings
can be derived.

Strings accepted by the above DFA: {0, 00, 11, 010, 101, ...........}
Strings not accepted by the above DFA: {1, 011, 111, ........}

Conversion from NFA to DFA


In this section, we will discuss the method of converting NFA to its equivalent DFA. In
NFA, when a specific input is given to the current state, the machine goes to multiple
states. It can have zero, one or more than one move on a given input symbol. On the
other hand, in DFA, when a specific input is given to the current state, the machine
goes to only one state. DFA has only one move on a given input symbol.

Let M = (Q, ∑, δ, q0, F) be an NFA which accepts the language L(M). There should be
an equivalent DFA denoted by M' = (Q', ∑', δ', q0', F') such that L(M) = L(M').

Steps for converting NFA to DFA:


Step 1: Initially Q' = ϕ

Step 2: Add q0 of NFA to Q'. Then find the transitions from this start state.

Step 3: In Q', find the possible set of states for each input symbol. If this set of states
is not in Q', then add it to Q'.

Step 4: In DFA, the final state will be all the states which contain F(final states of NFA)
Example 1:
Convert the given NFA to DFA.

Solution: For the given transition diagram we will first construct the transition table.

State 0 1

→q0 q0 q1

q1 {q1, q2} q1

*q2 q2 {q1, q2}

Now we will obtain δ' transition for state q0.

1. δ'([q0], 0) = [q0]
2. δ'([q0], 1) = [q1]

The δ' transition for state q1 is obtained as:

1. δ'([q1], 0) = [q1, q2] (new state generated)


2. δ'([q1], 1) = [q1]

The δ' transition for state q2 is obtained as:

1. δ'([q2], 0) = [q2]
2. δ'([q2], 1) = [q1, q2]

Now we will obtain δ' transition on [q1, q2].


1. δ'([q1, q2], 0) = δ(q1, 0) ∪ δ(q2, 0)
2. = {q1, q2} ∪ {q2}
3. = [q1, q2]
4. δ'([q1, q2], 1) = δ(q1, 1) ∪ δ(q2, 1)
5. = {q1} ∪ {q1, q2}
6. = {q1, q2}
7. = [q1, q2]

The state [q1, q2] is the final state as well because it contains a final state q2. The
transition table for the constructed DFA will be:

State 0 1

→[q0] [q0] [q1]

[q1] [q1, q2] [q1]

*[q2] [q2] [q1, q2]

*[q1, q2] [q1, q2] [q1, q2]

The Transition diagram will be:


The state q2 can be eliminated because q2 is an unreachable state.
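Steps 1–4 of the conversion can be sketched as a worklist algorithm in which each DFA state is a frozenset of NFA states. `nfa_to_dfa` is an illustrative helper, and the NFA below is the one from Example 1:

```python
def nfa_to_dfa(delta, start, finals):
    """Subset construction: each DFA state is a frozenset of NFA states."""
    start_set = frozenset([start])
    dstates, worklist, dtran = {start_set}, [start_set], {}
    symbols = {sym for (_, sym) in delta}
    while worklist:
        S = worklist.pop()
        for symbol in symbols:
            # union of NFA moves from every state in S
            T = frozenset().union(*(delta.get((q, symbol), set()) for q in S))
            dtran[(S, symbol)] = T
            if T not in dstates:
                dstates.add(T)
                worklist.append(T)
    dfinals = {S for S in dstates if S & set(finals)}
    return dstates, dtran, dfinals

delta = {("q0", "0"): {"q0"},       ("q0", "1"): {"q1"},
         ("q1", "0"): {"q1", "q2"}, ("q1", "1"): {"q1"},
         ("q2", "0"): {"q2"},       ("q2", "1"): {"q1", "q2"}}
states, tran, finals = nfa_to_dfa(delta, "q0", {"q2"})
print(sorted("".join(sorted(S)) for S in states))
# ['q0', 'q1', 'q1q2'] -- only subsets reachable from [q0] appear
```

Because the construction only explores states reachable from [q0], the unreachable state [q2] of the hand-built table is never created.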

Example 2:
Convert the given NFA to DFA.

Solution: For the given transition diagram we will first construct the transition table.

State 0 1

→q0 {q0, q1} {q1}

*q1 ϕ {q0, q1}

Now we will obtain δ' transition for state q0.

1. δ'([q0], 0) = {q0, q1}


2. = [q0, q1] (new state generated)
3. δ'([q0], 1) = {q1} = [q1]

The δ' transition for state q1 is obtained as:

1. δ'([q1], 0) = ϕ
2. δ'([q1], 1) = [q0, q1]

Now we will obtain δ' transition on [q0, q1].

1. δ'([q0, q1], 0) = δ(q0, 0) ∪ δ(q1, 0)


2. = {q0, q1} ∪ ϕ
3. = {q0, q1}
4. = [q0, q1]

Similarly,

1. δ'([q0, q1], 1) = δ(q0, 1) ∪ δ(q1, 1)


2. = {q1} ∪ {q0, q1}
3. = {q0, q1}
4. = [q0, q1]

Since q1 is a final state in the given NFA, every DFA state that contains q1
becomes a final state. Hence in the DFA the final states are [q1] and [q0, q1].
Therefore the set of final states F = {[q1], [q0, q1]}.

The transition table for the constructed DFA will be:

State 0 1

→[q0] [q0, q1] [q1]

*[q1] ϕ [q0, q1]

*[q0, q1] [q0, q1] [q0, q1]

The Transition diagram will be:


We can also rename the states of the DFA.

Suppose

1. A = [q0]
2. B = [q1]
3. C = [q0, q1]

With these new names the DFA will be as follows:


Grammar and Equivalence with finite Automata :-

Type-3 grammar/regular grammar:

Regular grammars generate regular languages. They have a single non-terminal
on the left-hand side and a right-hand side consisting of a string of
terminals, optionally preceded or followed by a single non-terminal.
The productions must be in the form:
A ⇢ xB
A ⇢ x
A ⇢ Bx
where A, B ∈ Variable(V) and x ∈ T* i.e. string of terminals.

Types of regular grammar:

• Left Linear grammar(LLG)


• Right linear grammar(RLG)
1. Left linear grammar(LLG):
In an LLG, all the productions are of the form
A ⇢ Bx
A ⇢ x
where A,B ∈ V and x ∈ T*
2. Right linear grammar(RLG):
In an RLG, all the productions are of the form
A ⇢ xB
A ⇢ x
where A,B ∈ V and x ∈ T*
The language generated by type-3 grammar is a regular language, for which a FA
can be designed. The FA can also be converted into type-3 grammar
Example: FA for accepting strings that start with b

∑ = {a,b}
Initial state(q0) = A
Final state(F) = B
The RLG corresponding to FA is
A ⇢ bB
B ⇢ ∈/aB/bB
The above grammar is RLG, which can be written directly through FA.

This grammar derives strings that start with b.

The above RLG can derive strings that start with b; after that, any input
symbol from ∑ = {a, b} can follow.
The regular language corresponding to RLG is
L= {b, ba, bb, baa, bab ,bba,bbb ..... }
If we reverse the above production of the above RLG, then we get
A ⇢ Bb
B ⇢ ∈/Ba/Bb
It derives the language that contains all the strings which end with
b.
i.e. L' = {b, bb, ab, aab, bab, abb, bbb .....}
So we can conclude that if we have an FA that represents a language L and we
convert it into an RLG, the RLG again represents L; but after reversing the
RLG we get an LLG which represents the language L' (i.e. the reverse of L).
For converting the RLG into LLG for language L, the following procedure
needs to be followed:
Step 1: Reverse the FA for language L
Step 2: Write the RLG for it.
Step 3: Reverse the right linear grammar.
after this we get the grammar that generates the language that
represents the LLG for the same language L.

This represents the same procedure as above for converting RLG to LLG

Here L is a language for FA and LR is a reversal of the language L.


Example:
The above FA represents language L(i.e. set of all strings over input symbols a
and b which start with b).
We are converting it into LLG.
Step1: The reversal of the FA is

The reversed FA accepts the reversal of L, i.e. all strings ending with b.

Step 2: The corresponding RLG for this reversed FA is


B ⇢ aB/bB/bA
A ⇢ ∈
Step 3: The reversing the above RLG we get
B ⇢ Ba/Bb/Ab
A ⇢ ∈
So this is LLG for language L( which represents all strings that start with b).
L= {b, ba, bb, baa, bab ,bba, bbb ….. }
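The reversal step at the heart of this procedure is purely mechanical: every right-hand side is reversed symbol by symbol, turning a right-linear rule A ⇢ xB into the left-linear rule A ⇢ Bx and vice versa. A minimal sketch (the dictionary encoding, with single-character symbols and uppercase non-terminals, is an assumption):

```python
# Reverse every production body: A -> xB becomes A -> Bx (and vice versa).
# With single-character symbols, Python string reversal does the job.
def reverse_grammar(productions):
    return {head: [rhs[::-1] for rhs in bodies]
            for head, bodies in productions.items()}

# The RLG of Step 2 above: B -> aB | bB | bA,  A -> ε (empty string)
rlg = {'B': ['aB', 'bB', 'bA'], 'A': ['']}
llg = reverse_grammar(rlg)
print(llg)   # {'B': ['Ba', 'Bb', 'Ab'], 'A': ['']}
```

The output is exactly the LLG of Step 3: B ⇢ Ba/Bb/Ab, A ⇢ ∈.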
Conversion of RLG to FA:
• Start from the first production.
• From every left alphabet (or variable) go to the symbol followed by it.
• Start state: It will be the first production state.
• Final state: Take those states which end up with terminals without
further non-terminals.
Example: The RLG for a language L, the set of all strings
which end with 0:
A ⇢ 0A/1A/0B
B ⇢ ∈
So the FA corresponding to the RLG can be found out as follows.
Start with variable A and use its productions.
• For production A ⇢ 0A, this means after getting input symbol 0, the
transition will remain in the same state.
• For production A ⇢ 1A, this means after getting input symbol 1, the
transition will likewise remain in state A.
• For production A ⇢ 0B, this means after getting input symbol 0, the
state transition will take place from State A to B.
• For production B ⇢ ∈, this means there is no need for state transition.
This means it would be the final state in the corresponding FA as RHS is
terminal.
So the final NFA for the corresponding RLG is

Set of all strings that end with 0
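The bullets above translate directly into an NFA transition relation. A sketch, assuming the intended grammar for "strings ending in 0" is A ⇢ 0A | 1A | 0B with B ⇢ ∈ (so A is the start state and B, from its ε-production, is the final state):

```python
# Each rule A -> xB contributes a transition A --x--> B;
# a rule B -> ε makes B a final state.
def run_nfa(delta, start, finals, w):
    states = {start}
    for ch in w:
        states = {t for s in states for t in delta.get((s, ch), ())}
    return bool(states & finals)

delta = {('A', '0'): {'A', 'B'},   # A -> 0A and A -> 0B
         ('A', '1'): {'A'}}        # A -> 1A

assert run_nfa(delta, 'A', {'B'}, '10')      # ends in 0: accepted
assert not run_nfa(delta, 'A', {'B'}, '01')  # ends in 1: rejected
assert not run_nfa(delta, 'A', {'B'}, '')    # ε is not in L
```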

Conversion of LLG to FA:

Explanation: First convert the LLG, which represents the language L, into the RLG
that represents the reversal of L (i.e. LR); then design the FA corresponding
to it (i.e. the FA for language LR); then reverse that FA. The resulting FA is the FA for
language L.
Conversion of LLG to RLG: As an example, take the grammar above which
represents language L (i.e. the set of all strings that start with b).
The LLG for this language is
B ⇢ Ba/Bb/Ab
A ⇢ ∈
Step 1: Reverse the LLG to obtain an RLG (for LR) and convert that RLG into an FA
(the conversion procedure is the same as above).

Step 2: Reverse the FA(i.e. initial state is converted into final state and convert
final state to initial state and reverse all edges)

Step 3: Write RLG corresponding to reversed FA.


A ⇢ bB
B ⇢ aB/bB/∈

RLG, LLG and FA all have the same expressive power and can easily be converted
into one another.

Equivalence of Regular Grammar and Finite Automata


Regular Grammar / type -3 grammar
A regular grammar, or type-3 grammar, defines a language called a regular language
that is accepted by finite automata. A regular grammar G consists of a 4-tuple
(V, T, P, S).
Linear Grammar: A grammar is called linear if at most one non-terminal
can occur on the right side of any production rule. Following are the
types of Linear Grammar:

• Right Linear Grammar


• Left Linear Grammar

Right Linear Grammar: A right linear grammar is a grammar G = (V, T, P, S)


such that all the production rules P are one of the following forms:
A -> a
A -> aB
Here A and B are variables in V (i.e. A, B belong to V) and a is a
terminal. The left-hand side of a production rule in a right linear grammar consists
of only one symbol from the set of variables, and the right-hand side contains either
a string of terminals or one variable at the rightmost position.
Left Linear Grammar: A left linear grammar is a grammar G = (V, T, P, S)
such that all the production rules P are one of the following forms:
A -> a
A -> Ba
Here A and B are variables in V (i.e. A, B belong to V) and a is a
terminal.
Finite Automata
A finite automaton is the simplest machine model; it accepts the class of languages
called regular languages. The term finite refers to the limited number of states and
the finite alphabet used in the input strings. A finite automaton
consists of a 5-tuple (Q, Σ, δ, q0, F).
The relationship of regular grammar and finite automata is shown below:

Theorem:
If G is a regular grammar then L (G) is a regular language.
Proof:
The regular languages are recognized by finite automata. So, first of all we will
construct a NFA equivalent to given right linear grammar which accepts the
language defined by the given regular grammar G.
Let G = (V, T, P, S) be a right linear grammar. Let V = {A0, A1, ….., An}, where A0
is the start symbol S. We define an NFA N = ({q0, q1, …., qn, qf}, Σ, δ, q0,
{qf}) where δ is defined as:
• For each production Ai -> bAj, N has a transition from qi to qj with
label b.
• For each production Ak -> b, N has a transition from qk to qf with
label b.

From the construction, it is clear that A0 => b1A1 => b1b2A2 => b1b2b3A3 =>
….. => b1….bn-1An-1 => b1….bn, if and only if there is a path in N starting from the initial
state q0 and ending at the final state qf with path value b1b2….bn.
Therefore L (G) = T (N).
Example:
Let G = (V, T, P, S) be a regular grammar, where
V = {A0, A1, A2}
T = {0, 1}
S is the start symbol of the grammar.
P is the set of production rules defined as:
A0 -> 0A1
A0 -> 1A2
A1 -> 0A2
A2 -> 0
Construct a finite-automata that accepts the language generated by a given
grammar G.
Solution:
Let M = (Q, Σ, δ, q0, F) be a finite automaton that accepts L (G), where
Q = {q0, q1, q2, qf}
Σ = {0, 1}
q0 is the initial state
F = {qf}
The states q0, q1, q2 corresponds to A0, A1, A2, and qf is the new final state of
M.
Initially we have 4 states of finite automata M.

The production rule A0 -> 0A1 includes a transition from q0 to q1 with label 0.
After this production rule, we have following partial diagram of finite automata.
The production rule A0 -> 1A2 includes a transition from q0 to q2 with label 1.
After this production rule we have following partial diagram of finite automata.

The production rule A1 -> 0A2 includes a transition from q1 to q2 with label 0.
After this production rule we have following partial diagram of finite automata.

Similarly, the production rule A2 -> 0 includes a transition from q2 to qf with
label 0. After this production rule we have the following final diagram of the finite
automaton accepting L (G).
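The four transitions built above assemble into a small NFA that can be checked directly. A sketch (the dictionary encoding of the states is an assumption); note that the grammar can only derive the two strings 000 (via A0 ⇒ 0A1 ⇒ 00A2 ⇒ 000) and 10 (via A0 ⇒ 1A2 ⇒ 10):

```python
# Transitions from the construction: q0-0->q1 (A0 -> 0A1),
# q0-1->q2 (A0 -> 1A2), q1-0->q2 (A1 -> 0A2), q2-0->qf (A2 -> 0).
def accepts(delta, start, finals, w):
    states = {start}
    for ch in w:
        states = {t for s in states for t in delta.get((s, ch), ())}
    return bool(states & finals)

delta = {('q0', '0'): {'q1'}, ('q0', '1'): {'q2'},
         ('q1', '0'): {'q2'}, ('q2', '0'): {'qf'}}

assert accepts(delta, 'q0', {'qf'}, '000')   # A0 => 0A1 => 00A2 => 000
assert accepts(delta, 'q0', {'qf'}, '10')    # A0 => 1A2 => 10
assert not accepts(delta, 'q0', {'qf'}, '00')
```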

Properties of Regular Languages:-


In automata theory, there are several closure properties for regular languages.
They are as follows −

• Union
• Intersection
• Concatenation
• Kleene closure
• Complement
Let us see them one by one with examples.

Union
If L1 and If L2 are two regular languages, their union L1 U L2 will also be regular.
Example
L1 = {a^n | n > 0} and L2 = {b^n | n > 0}
L3 = L1 U L2 = {a^n | n > 0} U {b^n | n > 0} is also regular.

Intersection
If L1 and If L2 are two regular languages, their intersection L1 ∩ L2 will also be
regular.
Example
L1 = {a^m b^n | n > 0 and m > 0} and
L2 = {a^m b^n U b^n a^m | n > 0 and m > 0}
L3 = L1 ∩ L2 = {a^m b^n | n > 0 and m > 0} is also regular.

Concatenation
If L1 and If L2 are two regular languages, their concatenation L1.L2 will also be
regular.
Example
L1 = {a^m | m > 0} and L2 = {b^n | n > 0}
L3 = L1.L2 = {a^m b^n | m > 0 and n > 0} is also regular.

Kleene Closure
If L1 is a regular language, its Kleene closure L1* will also be regular.
Example
L1 = (a U b )
L1* = (a U b)*

Complement
If L(G) is a regular language, its complement L'(G) will also be regular. Complement
of a language can be found by subtracting strings which are in L(G) from all possible
strings.
Example
L(G) = {a^n | n > 3}
L'(G) = {a^n | n <= 3}
Note − Two regular expressions are equivalent, if languages generated by them are
the same. For example, (a+b*)* and (a+b)* generate the same language. Every string
which is generated by (a+b*)* is also generated by (a+b)* and vice versa.

(A) Closure Properties


1. Complementation
If a language L is regular its complement L' is regular.
Let DFA(L) denote the DFA for the language L. Modify the DFA as follows to obtain DFA(L').

1. Change the final states to non-final states.


2. Change the non-final states to final states.

Since there exists a DFA(L') now, L' is regular.


This can be shown by an example using a DFA. Let L denote the language of strings
that begin and end with a, over Σ = {a, b}. The DFA for L is given below.

Note: q3 denotes the dead state. Once you enter q3, you remain in it forever.

L' denotes the language that does not contain strings that begin and end with a. This implies L'
contains strings that

• begin with a and end with b
• begin with b and end with a
• begin with b and end with b

together with the empty string ε.

The DFA for L' is obtained by flipping the final states of DFA(L) to non-final states and vice-
versa. The DFA for L' is given below.
• q0 ensures ε is accepted.

• q1 ensures all strings that begin with a and end with b are accepted.

• q3 ensures all strings that begin with b (ending with either a or b) are accepted.

Important Note: While specifying the DFA for L, we have also included the dead state q3. It is
important to include the dead state(s) if we are going to derive the complement DFA since, the
dead state(s) too would become final in the complementation. If we didn't add the dead state(s)
originally, the complement will not accept all strings supposed to be accepted.
In the above example, if we didn't include q3 originally, the complement will not accept strings
starting with b. It will only accept strings that begin with a and end with b which is only a subset
of the complement.
CONCLUSION: REGULAR LANGUAGES ARE CLOSED UNDER COMPLEMENTATION.
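The flip-the-finals construction, including the dead-state caveat from the Important Note, can be sketched in a few lines. The transition table below is one possible encoding of the DFA for "begins and ends with a" (the state names are assumptions, since the diagram is not reproduced here); q3 is the dead state that makes the DFA complete.

```python
# Complement a DFA by flipping final and non-final states.
# This only works on a COMPLETE DFA, hence the dead state q3.
def run_dfa(delta, start, finals, w):
    q = start
    for ch in w:
        q = delta[(q, ch)]
    return q in finals

delta = {('q0', 'a'): 'q1', ('q0', 'b'): 'q3',
         ('q1', 'a'): 'q1', ('q1', 'b'): 'q2',
         ('q2', 'a'): 'q1', ('q2', 'b'): 'q2',
         ('q3', 'a'): 'q3', ('q3', 'b'): 'q3'}  # q3: dead state
states, finals = {'q0', 'q1', 'q2', 'q3'}, {'q1'}
comp_finals = states - finals                   # the flip

assert run_dfa(delta, 'q0', finals, 'aba')        # begins & ends with a
assert not run_dfa(delta, 'q0', comp_finals, 'aba')
assert run_dfa(delta, 'q0', comp_finals, 'b')     # only reachable via q3
assert run_dfa(delta, 'q0', comp_finals, '')      # ε is in L'
```

Dropping q3 from the table would make `run_dfa` fail on strings starting with b, which is exactly the subset-of-the-complement problem described above.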

2. Union
If L1 and L2 are regular, then L1 ∪ L2 is regular.
This is easier proved using regular expressions. If L1 is regular, there exists a regular
expression R1 to describe it. Similarly, if L2 is regular, there exists a regular expression R2 to
describe it. R1 + R2 denotes the regular expression that describe L1 ∪ L2. Therefore, L1 ∪ L2 is
regular.
This again can be shown using an example. If L1 is the language of strings that begin
with a and L2 is the language of strings that end with a, then L1 ∪ L2 denotes the
language of strings that either begin with a or end with a.
- a(a+b)* is the regular expression that denotes L1.
- (a+b)*a is the regular expression that denotes L2.
- L1 ∪ L2 is denoted by the regular expression a(a+b)* + (a+b)*a. Therefore, L1 ∪ L2 is
regular.
In terms of DFA, we can say that a DFA(L1 ∪ L2) accepts those strings that are accepted by
either DFA(L1) or DFA(L2) or both.

• DFA(L1 ∪ L2) can be constructed by adding a new start state and new final state.
• The new start state connects to the two start states of DFA(L1) and DFA(L2) by
ε-transitions.
• Similarly, two ε transitions are added from the final states of DFA(L1) and DFA(L2) to the
new final state.
• Convert this resulting NFA to its equivalent DFA.
As an exercise you can try this approach of DFA construction for union for the given example.
CONCLUSION: REGULAR LANGUAGES ARE CLOSED UNDER UNION.

3. Intersection
If L1 and L2 are regular, then L1 ∩ L2 is regular.
Since a language denotes a set of (possibly infinite) strings, and we have shown above that
regular languages are closed under union and complementation, De Morgan's law can be
applied to show that regular languages are closed under intersection too.
L1 and L2 are regular ⇒ L1' and L2' are regular (by the complementation property)
⇒ L1' ∪ L2' is regular (by the union property)
⇒ L1 ∩ L2 = (L1' ∪ L2')' is regular (by De Morgan's law and complementation)
In terms of DFA, we can say that a DFA(L1 ∩ L2) accepts those strings that are accepted by
both DFA(L1) and DFA(L2).
CONCLUSION: REGULAR LANGUAGES ARE CLOSED UNDER INTERSECTION.
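Besides the De Morgan argument, the DFA for L1 ∩ L2 can be built directly as a product construction: run both DFAs in lockstep and accept only when both components accept. A sketch, using two small illustrative DFAs (their encodings are assumptions): d1 accepts strings beginning with a (with dead state 'd' to keep it complete), d2 accepts strings ending with a.

```python
# Product construction: simulate both DFAs simultaneously;
# a pair state is accepting iff both components are accepting.
def product_accepts(d1, s1, f1, d2, s2, f2, w):
    p, q = s1, s2
    for ch in w:
        p, q = d1[(p, ch)], d2[(q, ch)]
    return p in f1 and q in f2

d1 = {('s', 'a'): 'y', ('s', 'b'): 'd',     # L1: begins with a
      ('y', 'a'): 'y', ('y', 'b'): 'y',
      ('d', 'a'): 'd', ('d', 'b'): 'd'}
d2 = {('0', 'a'): '1', ('0', 'b'): '0',     # L2: ends with a
      ('1', 'a'): '1', ('1', 'b'): '0'}

assert product_accepts(d1, 's', {'y'}, d2, '0', {'1'}, 'aba')  # in both
assert not product_accepts(d1, 's', {'y'}, d2, '0', {'1'}, 'ab')
assert not product_accepts(d1, 's', {'y'}, d2, '0', {'1'}, 'ba')
```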

4. Concatenation
If L1 and L2 are regular, then L1 . L2 is regular.
This can be easily proved by regular expressions. If R1 is a regular expression denoting L1 and
R2 is a regular expression denoting L2, then R1 . R2 is the regular expression
denoting L1 . L2. Therefore, L1 . L2 is regular.
In terms of DFA, we can say that a DFA(L1 . L2) can be constructed by adding an ε-transition
from each final state of DFA(L1) - which now ceases to be a final state - to the start state of
DFA(L2). You can try showing this using an example.
CONCLUSION: REGULAR LANGUAGES ARE CLOSED UNDER CONCATENATION.

5. Kleene star
If L is regular, then L* is regular.
This can be easily proved by regular expression. If L is regular, then there exists a regular
expression R. We know that if R is a regular expression, R* is a regular expression too. R*
denotes the language L*. Therefore L* is regular.
In terms of DFA, in the DFA(L) we add two ε transitions, one from start state to final state and
another from final state to start state. This denotes DFA(L*). You can try showing this for an
example.
CONCLUSION: REGULAR LANGUAGES ARE CLOSED UNDER KLEENE STAR.

6. Difference
If L1 and L2 are regular, then L1 - L2 is regular.
We know that L1 - L2 = L1 ∩ L2'
L1 and L2 are regular ⇒ L1 and L2' are regular (by the complementation property)
⇒ L1 ∩ L2' is regular (by the intersection property)
⇒ L1 - L2 is regular
In terms of DFA, we can say that a DFA(L1 - L2) accepts those strings that are accepted by both
DFA(L1) and not accepted by DFA(L2). You can try showing this for an example.
CONCLUSION: REGULAR LANGUAGES ARE CLOSED UNDER DIFFERENCE.

7. Reverse
If L is regular, then LR is regular.
Let DFA(L) denote the DFA of L. Make the following modifications to construct DFA(LR).

1. Change the start state of DFA(L) to the final state.


2. Change the final state of DFA(L) to the start state.

In case there is more than one final state in DFA(L), first add a new final state,
add ε-transitions from the old final states (which now cease to be final states)
to it, and then perform this step.

3. Reverse the direction of the arrows.

You can try showing this using an example.


CONCLUSION: REGULAR LANGUAGES ARE CLOSED UNDER REVERSAL.

(B) Decision Properties


1. Membership question
Does a string w belong to L? i.e. Is w ∈ L?
This can be validated as follows.

• Construct DFA(L).
• Run w on DFA(L).
• If DFA(L) accepts w, then w ∈ L. Else w ∉ L.

CONCLUSION: MEMBERSHIP QUESTION IN REGULAR LANGUAGES IS DECIDABLE.

2. Emptiness question
Is L = φ?
This can be validated as follows.

• Construct DFA(L).
• If there exists no path from start state to final state, L = φ. Else L ≠ φ.

CONCLUSION: EMPTINESS OF REGULAR LANGUAGES IS DECIDABLE.
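The two steps above amount to a reachability search over the transition graph. A sketch (the dictionary DFA encoding is an assumption):

```python
# L = φ iff no final state is reachable from the start state (BFS).
from collections import deque

def is_empty(delta, start, finals):
    seen, todo = {start}, deque([start])
    while todo:
        q = todo.popleft()
        if q in finals:
            return False            # a final state is reachable: L ≠ φ
        for (p, _), r in delta.items():
            if p == q and r not in seen:
                seen.add(r)
                todo.append(r)
    return True                     # no final state reachable: L = φ

delta = {('q0', 'a'): 'q1', ('q1', 'a'): 'q1'}
assert not is_empty(delta, 'q0', {'q1'})   # q1 reachable -> L ≠ φ
assert is_empty(delta, 'q0', {'q2'})       # q2 unreachable -> L = φ
```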

3. Equivalence question
Is L1 = L2?
This can be validated as follows.
• Construct DFA(L1) and DFA(L2).
• Reduce them to their minimal DFAs: MinDFA(L1) and MinDFA(L2).
• If MinDFA(L1) = MinDFA(L2) then L1 = L2. Else L1 ≠ L2.

CONCLUSION: EQUIVALENCE OF REGULAR LANGUAGES IS DECIDABLE.

4. Subset question
Is L1 ⊂ L2?
This can be validated as follows.

• If (L1 - L2) = φ and (L2 - L1) ≠ φ then L1 ⊂ L2. Else L1 ⊄ L2.

CONCLUSION: SUBSET PROPERTY OF REGULAR LANGUAGES IS DECIDABLE.

5. Infinite question
Is L infinite?
This can be validated as follows.

• Construct DFA(L).
• If DFA(L) has at least one loop on a path from the start state to a final state, then L is infinite. Else L is finite.

CONCLUSION: THE INFINITE PROPERTY OF REGULAR LANGUAGES IS DECIDABLE.

Pumping lemma for regular language :-

What is the pumping lemma for regular language

There are two Pumping Lemmas (PL), which are defined for Regular Languages and
Context - Free Languages.

Pumping lemma for Regular languages

• It gives a method for pumping (generating) many new strings from a given
string.
• In other words, we say it provides a means to break a given long input
string into several substrings.
• It gives necessary condition(s) to prove that a set of strings is not regular.

Theorem
For any regular language L, there exists an integer P such that every w in L with
|w| >= P
can be broken into three strings, w = xyz, such that:
(1) |xy| <= P
(2) |y| >= 1
(3) for all k >= 0: the string xy^k z is also in L

Application of pumping lemma


Pumping lemma is to be applied to show that certain languages are not regular.
It should never be used to show a language is regular.

• If L is regular, it satisfies the Pumping lemma.


• If L does not satisfy the Pumping Lemma, it is not regular.
Steps to prove that a language is not regular by using the PL are as follows −

• step 1 − We have to assume that L is regular


• step 2 − So, the pumping lemma should hold for L.
• step 3 − It has to have a pumping length (say P).
• step 4 − All strings longer than P can be pumped: |w| >= P.
• step 5 − Now find a string 'w' in L such that |w|>=P
• step 6 − Divide w into xyz.
• step 7 − Show that xy^i z ∉ L for some i.
• step 8 − Then consider all ways that w can be divided into xyz.
• step 9 − Show that none of these can satisfy all the 3 pumping conditions
at same time.
• step 10 − w cannot be pumped = CONTRADICTION.

Introduction
The language accepted by Finite Automata is known as Regular Language.
Pumping Lemma is used to prove that a Language is not Regular. It cannot be used
to prove that a language is Regular.
The term Pumping Lemma is made up of two words:-
Pumping: The word pumping refers to generating many input strings by pushing a
symbol in an input string repeatedly.
Lemma: The word Lemma refers to the intermediate theorem in a proof.
There are two Pumping Lemmas, that are defined for
• Regular Languages
• Context-Free Languages
Let’s now learn about Pumping Lemma for Regular Languages in-depth.
Pumping Lemma For Regular Languages
Theorem: If A is a Regular Language, then A has a Pumping Length ‘P’ such that
any string ‘S’ with |S| ≥ P may be divided into three parts S = xyz such that the
following conditions must be true:
1.) xy^i z ∈ A for every i ≥ 0
2.) |y| > 0
3.) |xy| ≤ P
In simple words, if the string y is ‘pumped’, i.e. repeated any number of times, the
resultant string still remains in A.
Pumping Lemma is used as a proof of the irregularity of a language. It means that if
a language is regular, it always satisfies the pumping lemma. If at least one string
made by pumping is not in language A, then A is not regular.
We use the CONTRADICTION method to prove that a language is not Regular.
To prove that a language is not Regular using Pumping Lemma, follow the below
steps:
Step 1: Assume that Language A is Regular.
Step 2: It has to have a Pumping Length (say P).
Step 3: All strings longer than P can be pumped |S| ≥ P.
Step 4: Now, find a string ‘S’ in A such that |S| ≥ P.
Step 5: Divide S into x y z strings.
Step 6: Show that xy^i z ∉ A for some i.
Step 7: Then consider how S can be divided into x y z.
Step 8: Show that none of the above strings satisfies all three pumping conditions
simultaneously.
Step 9: S cannot be pumped == CONTRADICTION.
Let’s apply these above steps to check whether a Language is not Regular with the
help of Pumping Lemma.
Implementation of Pumping lemma for regular languages
Example: Using the Pumping Lemma, prove that the language A = {a^n b^n | n ≥ 0} is not
regular.
Solution: We will follow the steps we have learned above to prove this.
Assume that A is Regular and has a Pumping length = P.
Let a string S = a^P b^P.
Now divide S into the parts x, y, z.
To divide S, let’s take the value of P = 7.
Therefore, S = aaaaaaabbbbbbb (by putting P = 7 in S = a^P b^P).
Case 1: Y consists of a string having the letter only ‘a’.

Case 2: Y consists of a string having the letter only ‘b’.


Case 3: Y consists of a string with the letters ‘a’ and ‘b’.

For all the above cases, we need to show xy^i z ∉ A for some i.
Let the value of i = 2, so xy^i z => xy^2 z.
In Case 1: xy^2 z = aa aaaa aaaa abbbbbbb
No. of ‘a’ = 11, No. of ‘b’ = 7.
Since the no. of ‘a’ ≠ no. of ‘b’, but the original language has an equal number of
‘a’ and ‘b’, this string will not lie in our language.
In Case 2: xy^2 z = aaaaaaabb bbbb bbbb b
No. of ‘a’ = 7, No. of ‘b’ = 11.
Since the no. of ‘a’ ≠ no. of ‘b’, but the original language has an equal number of
‘a’ and ‘b’, this string will not lie in our language.
In Case 3: xy^2 z = aaaa aabb aabb bbbbb
No. of ‘a’ = 8, No. of ‘b’ = 9.
Since the no. of ‘a’ ≠ no. of ‘b’ (and the string also does not follow the a^n b^n
pattern), this string will not lie in our language.
We can see that at i = 2 all the above three strings do not lie in the language
A = {a^n b^n | n ≥ 0}.
Therefore, the language A = {a^n b^n | n ≥ 0} is not regular.
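The case analysis above can be checked mechanically: for S = a^P b^P, every split S = xyz with |xy| ≤ P and |y| ≥ 1 forces y to lie inside the leading a's, and pumping to xy^2 z breaks the a/b balance. A small brute-force sketch:

```python
# Membership test for A = {a^n b^n | n >= 0}.
def in_lang(w):
    n = len(w) // 2
    return w == 'a' * n + 'b' * n

p = 7
s = 'a' * p + 'b' * p
for i in range(0, p + 1):            # |x| = i
    for j in range(1, p - i + 1):    # |y| = j >= 1, so |xy| = i + j <= p
        x, y, z = s[:i], s[i:i + j], s[i + j:]
        # y is all a's here, so pumping adds a's without adding b's:
        assert not in_lang(x + y * 2 + z)
print("no valid decomposition survives pumping, so A is not regular")
```

Every candidate decomposition fails condition (1) of the lemma when i = 2, which is exactly the contradiction the proof needs.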
Pumping Lemma for Context-free Languages
The Bar-Hillel lemma, also referred to as the pumping lemma for context-free
languages, is a generalisation of the pumping lemma for regular languages in formal
language theory and computer science. It establishes a property that all context-free
languages share.

The pumping lemma can be used to show, by contradiction, that a particular language
is not context-free. It gives a necessary but not a sufficient condition for a language
to be context-free; stronger necessary conditions include Ogden's lemma and the
Interchange lemma.
Implementation of Pumping lemma for Context-free Languages
Suppose a language L is context-free. Then there exists an integer p ≥ 1, also
called the pumping length, such that every string s in the context-free language L which
has a length of p or more can be written as:
s = uvwxy
with the substrings satisfying the following properties:
1.) u v^n w x^n y ∈ L for every n ≥ 0
2.) |vx| ≥ 1
3.) |vwx| ≤ p
Applications of Pumping Lemma
The pumping lemma is applied to demonstrate that certain languages are not regular.
It should never be used to demonstrate that a language is regular.
• Pumping Lemma is satisfied if L is regular.
• L is not regular if it does not satisfy the Pumping Lemma.
Frequently Asked Questions
What do you mean by pumping lemma?
The term Pumping Lemma is made up of two words:-
Pumping: The word pumping refers to generating many input strings by pushing a
symbol in an input string repeatedly.
Lemma: The word Lemma refers to the intermediate theorem in a proof.
What is pumping lemma and its application?
The pumping lemma is applied to demonstrate that certain languages are not regular.
It should never be used to demonstrate that a language is regular.
• Pumping Lemma is satisfied if L is a regular language.
• L is not regular if it does not satisfy the Pumping Lemma.
What do you mean by pumping Lemma for regular language?
If A is a Regular Language, then A has a Pumping Length ‘P’ such that any string ‘S’
with |S| ≥ P may be divided into three parts S = xyz such that the following
conditions must be true:
1.) xy^i z ∈ A for every i ≥ 0
2.) |y| > 0
3.) |xy| ≤ P
What do you mean by pumping Lemma for context free
languages?
Suppose a language L is context-free. Then there exists an integer p ≥ 1, also
called the pumping length, such that every string s in the context-free language L which
has a length of p or more can be written as:
s = uvwxy
The properties are:
1.) u v^n w x^n y ∈ L for every n ≥ 0
2.) |vx| ≥ 1
3.) |vwx| ≤ p
What is pumping lemma in finite automata?
The pumping lemma is applied to demonstrate that certain languages are not regular;
it should never be used to demonstrate that a language is regular. The pumping lemma
is satisfied if L is a regular language; L is not regular if it does not satisfy the
pumping lemma.
What are the conditions of the pumping lemma?
Suppose a language L is context-free. Then there exists an integer p ≥ 1, also
called the pumping length, such that every string s in the context-free language L which
has a length of p or more can be written as:
s = uvwxy
The conditions for the pumping lemma are as follows:
1.) u v^n w x^n y ∈ L for every n ≥ 0
2.) |vx| ≥ 1
3.) |vwx| ≤ p

Minimization of finite automata :-

DFA Minimization using the Myhill-Nerode Theorem

Algorithm
Input − DFA
Output − Minimized DFA
Step 1 − Draw a table for all pairs of states (Qi, Qj) not necessarily connected directly
[All are unmarked initially]
Step 2 − Consider every state pair (Qi, Qj) in the DFA where Qi ∈ F and Qj ∉ F or vice
versa and mark them. [Here F is the set of final states]
Step 3 − Repeat this step until we cannot mark any more pairs −
If there is an unmarked pair (Qi, Qj), mark it if the pair {δ(Qi, A), δ(Qj, A)} is marked for
some input symbol A.
Step 4 − Combine all the unmarked pairs (Qi, Qj) and make them a single state in the
reduced DFA.

Example
Let us use this algorithm to minimize the DFA shown below.

Step 1 − We draw a table for all pairs of states.

      a    b    c    d    e
 b
 c
 d
 e
 f

Step 2 − We mark the state pairs.

      a    b    c    d    e
 b
 c    ✔    ✔
 d    ✔    ✔
 e    ✔    ✔
 f              ✔    ✔    ✔

Step 3 − We now try to mark more state pairs transitively. If we input 1 to states ‘a’
and ‘f’, they go to states ‘c’ and ‘f’ respectively.
(c, f) is already marked, hence we mark the pair (a, f). Now, we input 1 to states ‘b’
and ‘f’; they go to states ‘d’ and ‘f’ respectively. (d, f) is already marked, hence we
mark the pair (b, f).

      a    b    c    d    e
 b
 c    ✔    ✔
 d    ✔    ✔
 e    ✔    ✔
 f    ✔    ✔    ✔    ✔    ✔

After step 3, we have got state combinations {a, b} {c, d} {c, e} {d, e} that are
unmarked.
We can recombine {c, d} {c, e} {d, e} into {c, d, e}
Hence we got two combined states as − {a, b} and {c, d, e}
So the final minimized DFA will contain three states {f}, {a, b} and {c, d, e}

DFA Minimization using Equivalence Theorem


If X and Y are two states in a DFA, we can combine these two states into {X, Y} if they
are not distinguishable. Two states are distinguishable, if there is at least one string
S, such that one of δ (X, S) and δ (Y, S) is accepting and another is not accepting.
Hence, a DFA is minimal if and only if all the states are distinguishable.

Algorithm
Step 1 − All the states Q are divided in two partitions − final states and non-final
states and are denoted by P0. All the states in a partition are 0th equivalent. Take a
counter k and initialize it with 0.
Step 2 − Increment k by 1. For each partition in Pk, divide the states in Pk into two
partitions if they are k-distinguishable. Two states within this partition X and Y are k-
distinguishable if there is an input S such that δ(X, S) and δ(Y, S) are (k-1)-
distinguishable.
Step 3 − If Pk ≠ Pk-1, repeat Step 2, otherwise go to Step 4.
Step 4 − Combine kth equivalent sets and make them the new states of the reduced
DFA.

Example
Let us consider the following DFA −

q δ(q,0) δ(q,1)

a b c

b a d

c e f

d e f

e e f

f f f

Let us apply the above algorithm to the above DFA −

• P0 = {(c,d,e), (a,b,f)}
• P1 = {(c,d,e), (a,b),(f)}
• P2 = {(c,d,e), (a,b),(f)}
Hence, P1 = P2.
There are three states in the reduced DFA. The reduced DFA is as follows −
Q δ(q,0) δ(q,1)

(a, b) (a, b) (c,d,e)

(c,d,e) (c,d,e) (f)

(f) (f) (f)
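The partition sequence P0, P1, P2 above can be replayed in code with Moore-style partition refinement; the transition table is copied from the example (the string encoding of states is an assumption of this sketch):

```python
# Moore's partition refinement: start from {finals, non-finals},
# then repeatedly split blocks whose states lead to different blocks.
delta = {('a','0'):'b', ('a','1'):'c', ('b','0'):'a', ('b','1'):'d',
         ('c','0'):'e', ('c','1'):'f', ('d','0'):'e', ('d','1'):'f',
         ('e','0'):'e', ('e','1'):'f', ('f','0'):'f', ('f','1'):'f'}
states, finals = set('abcdef'), set('cde')

def minimize(states, finals, delta, alphabet='01'):
    parts = [set(finals), set(states - finals)]          # P0
    while True:
        index = {q: i for i, b in enumerate(parts) for q in b}
        new_parts = []
        for b in parts:
            groups = {}
            for q in sorted(b):
                # states are k-equivalent iff they hit the same blocks
                key = tuple(index[delta[(q, a)]] for a in alphabet)
                groups.setdefault(key, set()).add(q)
            new_parts.extend(groups.values())
        if len(new_parts) == len(parts):                 # Pk == Pk-1
            return new_parts
        parts = new_parts

blocks = {frozenset(b) for b in minimize(states, finals, delta)}
assert blocks == {frozenset('ab'), frozenset('cde'), frozenset('f')}
```

The fixed point reproduces exactly the reduced DFA's states (a, b), (c, d, e) and (f).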

Minimization of Finite Automata


The term minimization refers to the construction of a finite automaton with a
minimum number of states that is equivalent to the given finite
automaton. The number of states directly determines the size of the
automaton, so it is important to reduce the
number of states. We minimize a finite automaton by detecting those states
whose presence or absence does not affect the language accepted
by the automaton.
Some important concepts used in the minimization of finite automata

• Unreachable state
• Dead State

Unreachable state: An unreachable state is a state that the finite automaton
can never reach, by any sequence of transitions, from the initial state.

In the above DFA, we have unreachable state E, because on any input from
the initial state, we are unable to reach to that state. This state is useless in
finite automata. So, the best solution is to eliminate these types of states to
minimize the finite automata.
Dead State: It is a non-accepting state which goes to itself on every possible
input symbol.

In the above DFA, q5 and q6 are dead states because every
possible input symbol leads back to the same state.
Minimization of Deterministic Finite Automata
The following steps are used to minimize a Deterministic Finite Automata.
Step1: First of all, we detect the unreachable state.
Step2: After detecting the unreachable state, in the second step, we
eliminate the unreachable state (if found).
Step3: Identify the equivalent states and merge them.

a. In this, we divide all the states into two groups:

b. Group A: This group contains all accepting states of the automaton.
c. Group B: This group contains all non-accepting states of the automaton.

This step is repeated for every group: for each state, find the group that each
input symbol leads to. If there are differences, partition the group into sets
containing states which go to the same groups under all inputs.

• The resulting final partition contains only equivalent states; now merge each
set into a single state.

Step4: In this step, we detect dead states.


Step5: After detecting the dead states, the last step is to eliminate dead
states.
Example:
Minimize the following DFA.
Solution
Step1: Detect unreachable states.

• Start from the initial state. Add q0 to the temporary state set (T).

T = {q0}

• Now, for all states in the temporary state set T, find the transition from each
state on each input symbol in Σ. If the resulting state is not in T, add that
state to T.

δ(q0, a) = q1

δ(q0, b) = q2
Now, T = {q0, q1, q2}

Again

δ(q1, a) = q3

δ(q1, b) = q4

Now, T = {q0, q1, q2, q3, q4}

Again

δ(q2, a) = q3
δ(q2, b) = q5

Now, T = {q0, q1, q2, q3, q4, q5}

Again

δ(q3, a) = q3

δ(q3, b) = q1

Now, there is no change in T because q1 and q3 are already in set T.

T = {q0, q1, q2, q3, q4, q5}

Again

δ(q4, a) = q4

δ(q4, b) = q5

Now, there is no change in T because q4 and q5 are already in set T.

T = {q0, q1, q2, q3, q4, q5}

Again

δ(q5, a) = q5

δ(q5, b) = q4

T = {q0, q1, q2, q3, q4, q5}

• Repeat previous step until T does not change

Finally we get T as:


T = {q0, q1, q2, q3, q4, q5}

• Now we will find unreachable states

U = Q – T

Q = {q0, q1, q2, q3, q4, q5, q6}

U = {q0, q1, q2, q3, q4, q5, q6} – {q0, q1, q2, q3, q4, q5}

U = {q6}, is the unreachable state
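The reachability computation traced above is a worklist algorithm; a sketch with the same transitions (q6's self-loops are an assumption, since the unreachable state's outgoing edges never matter):

```python
# Compute T = states reachable from q0, then U = Q - T.
delta = {('q0','a'):'q1', ('q0','b'):'q2', ('q1','a'):'q3', ('q1','b'):'q4',
         ('q2','a'):'q3', ('q2','b'):'q5', ('q3','a'):'q3', ('q3','b'):'q1',
         ('q4','a'):'q4', ('q4','b'):'q5', ('q5','a'):'q5', ('q5','b'):'q4',
         ('q6','a'):'q6', ('q6','b'):'q6'}   # q6's self-loops are assumed
Q = {'q0', 'q1', 'q2', 'q3', 'q4', 'q5', 'q6'}

T, todo = {'q0'}, ['q0']
while todo:                      # repeat until T stops changing
    q = todo.pop()
    for a in 'ab':
        r = delta[(q, a)]
        if r not in T:
            T.add(r)
            todo.append(r)

U = Q - T
assert T == {'q0', 'q1', 'q2', 'q3', 'q4', 'q5'}
assert U == {'q6'}               # the unreachable state, as computed above
```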


Step2: In this step, we eliminate the unreachable state found in the first step.
Step3: Identify the equivalent states and merge them.

• First of all, divide the states into two groups

Group A – q3, q4,q5 (contains accepting state)


Group B – q0, q1,q2 (contains non-accepting state)
Check Group A for input a:
δ(q3, a) = q3
δ(q4, a) = q4
δ(q5, a) = q5
In a similar way, check Group A for input b:
δ(q3, b) = q1
δ(q4, b) = q5
δ(q5, b) = q4
q1 belongs to group B for input b, and q4 and q5 belong to group A for input b.
So, we partition group A as:
Group A1 - q3
Group A2 – q4, q5
Group B – q0, q1, q2
Now, we check Group B – q0, q1, q2 for both input symbols.
For input a, we have:
δ(q0, a) = q1
δ(q1, a) = q3
δ(q2, a) = q3
For input b, we have:
δ(q0, b) = q2
δ(q1, b) = q4
δ(q2, b) = q5
On input b, q0 goes to q2 (group B), while q1 and q2 go to q4 and q5 (group A2). So, we
partition group B as:
Group B1 – q0
Group B2 – q1, q2
Check Group A2 for input a:
δ(q4, a) = q4

δ(q5, a) = q5
Check Group A2 for input b:
δ(q4, b) = q5

δ(q5, b) = q4
As both belong to the same group, the further division is not possible.
Now, we check Group B2 for inputs a and b:
δ(q1, a) = q3
δ(q2, a) = q3
δ(q1, b) = q4
δ(q2, b) = q5
Both q4 and q5 belong to group A2 for input b, so no further partitioning is possible.
Finally, the following groups are formed:
Group A1 – q3
Group A2 – q4, q5
Group B1 – q0
Group B2 – q1, q2
The resulting automata is given below:

Step4: In this step, we detect dead states. There are no dead states in the
above DFA; hence it is already minimized.
-: Module 3:-
-:Context-sensitive languages :-(context-sensitive grammars and
context-sensitive languages.)
The Context sensitive grammar (CSG) is defined as G=(V,Σ,P,S)
Where,

• V: Non-terminals or variables.
• Σ: Input symbols.
• P: Production rules, of the form {αAβ → αγβ | A ϵ V; α, β ϵ (V∪Σ)*; γ ϵ (V∪Σ)+}.
• S: Starting symbol.

Example

• aS→SAa|aA
• aA→abc
In a context-sensitive grammar, a variable is rewritten only within a context (in
αAβ, α is the left context and β is the right context).
In a context-free grammar (CFG), there is no such context.
For example, with the production rules
S → 0BS2,
B0 → 0B
we cannot rewrite B on its own; it can only be rewritten when it appears in the
context B0. Therefore, a CSG is harder to work with than a CFG.
The CFG, CSG and the unrestricted grammar are depicted below −
Context-sensitive Grammar (CSG) and Language (CSL)
Context-Sensitive Grammar –
A context-sensitive grammar is an unrestricted grammar in which all the
productions are of the form α → β, with |α| ≤ |β|,

where α and β are strings of non-terminals and terminals.


Context-sensitive grammars are more powerful than context-free grammars
because there are some languages that can be described by CSG but not by
context-free grammars and CSL are less powerful than Unrestricted grammar.
That’s why context-sensitive grammars are positioned between context-free and
unrestricted grammars in the Chomsky hierarchy.

A context-sensitive grammar is a 4-tuple G = (N, Σ, P, S), where


N = Set of non-terminal symbols
Σ = Set of terminal symbols
S = Start symbol of the production
P = Finite set of productions
All rules in P are of the form α1 A α2 → α1 β α2 (with β non-empty)

Context-sensitive Language: The language that can be defined by context-


sensitive grammar is called CSL. Properties of CSL are :
• Union, intersection and concatenation of two context-sensitive
languages is context-sensitive.
• Complement of a context-sensitive language is context-sensitive.
Example –
Consider the following CSG.
S → abc/aAbc
Ab → bA
Ac → Bbcc
bB → Bb
aB → aa/aaA
What is the language generated by this grammar?
Solution:
S → aAbc
→ abAc
→ abBbcc
→ aBbbcc
→ aaAbbcc
→ aabAbcc
→ aabbAcc
→ aabbBbccc
→ aabBbbccc
→ aaBbbbccc
→ aaabbbccc
The language generated by this grammar is {anbncn | n≥1}.
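The derivation above can be replayed mechanically as string rewriting. In this minimal Python sketch (the `step` helper is our own, for illustration only), each call applies one production at the leftmost matching position:

```python
def step(s, lhs, rhs):
    # Apply one production lhs -> rhs at the leftmost occurrence of lhs.
    assert lhs in s, f'{lhs} does not occur in {s}'
    return s.replace(lhs, rhs, 1)

# Replay the derivation of aaabbbccc from the example above.
s = 'S'
s = step(s, 'S', 'aAbc')    # S -> aAbc
s = step(s, 'Ab', 'bA')     # -> abAc
s = step(s, 'Ac', 'Bbcc')   # -> abBbcc
s = step(s, 'bB', 'Bb')     # -> aBbbcc
s = step(s, 'aB', 'aaA')    # -> aaAbbcc
s = step(s, 'Ab', 'bA')     # -> aabAbcc
s = step(s, 'Ab', 'bA')     # -> aabbAcc
s = step(s, 'Ac', 'Bbcc')   # -> aabbBbccc
s = step(s, 'bB', 'Bb')     # -> aabBbbccc
s = step(s, 'bB', 'Bb')     # -> aaBbbbccc
s = step(s, 'aB', 'aa')     # -> aaabbbccc
print(s)  # aaabbbccc
```

Running the same rule sequence reproduces exactly the sentential forms listed in the solution, ending in a^3 b^3 c^3.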

Linear Bounded Automata


A linear bounded automaton is a multi-track non-deterministic Turing machine with


a tape of some bounded finite length.
Length = function (Length of the initial input string, constant c)
Here,
Memory information ≤ c × Input information
The computation is restricted to the constant bounded area. The input alphabet
contains two special symbols which serve as left end markers and right end markers
which mean the transitions neither move to the left of the left end marker nor to the
right of the right end marker of the tape.
A linear bounded automaton can be defined as an 8-tuple (Q, X, ∑, q0, ML, MR, δ, F)
where −
• Q is a finite set of states
• X is the tape alphabet
• ∑ is the input alphabet
• q0 is the initial state
• ML is the left end marker
• MR is the right end marker where MR ≠ ML
• δ is a transition function which maps each pair (state, tape symbol) to
(state, tape symbol, Constant ‘c’) where c can be 0 or +1 or -1
• F is the set of final states
Every language accepted by a deterministic linear bounded automaton is context-
sensitive, and the emptiness problem for linear bounded automata is undecidable.

Introduction to Linear Bounded Automata (LBA)


History :
In 1960, an automaton model was introduced by Myhill; today this model is
known as the deterministic linear bounded automaton. Later, another
scientist named Landweber worked on this model and showed that the
languages accepted by a deterministic LBA are always context-sensitive
languages.
In 1964, Kuroda introduced a new and more general model, the
non-deterministic linear bounded automaton, and established that the
languages accepted by non-deterministic linear bounded automata are
exactly the context-sensitive languages.
Introduction to Linear Bounded Automata :
A Linear Bounded Automaton (LBA) is similar to Turing Machine with some
properties stated below:
• Turing Machine with Non-deterministic logic,
• Turing Machine with Multi-track, and
• Turing Machine with a bounded finite length of the tape.

Tuples Used in LBA :


LBA can be defined with eight tuples (elements that help to design automata) as:
M = (Q , T , E , q0 , ML , MR , S , F),

where,
Q -> A finite set of transition states
T -> Tape alphabet
E -> Input alphabet
q0 -> Initial state
ML -> Left bound of tape
MR -> Right bound of tape
S -> Transition Function
F -> A finite set of final states
Diagrammatic Representation of LBA :
Examples:
Languages accepted by LBAs (with the tape bounded as shown above):
• L = {a^(n!) | n >= 0}
• L = {w^n | w from {a, b}+, n >= 1}
• L = {w w w^R | w from {a, b}+}
Facts :
Suppose that a given LBA M has
--> q states,
--> m characters in the tape alphabet, and
--> an input of length n.
1. Then M can be in at most f(n) = q * n * m^n configurations: with a tape of n
cells over m symbols, there are only m^n different tape contents.
2. The tape head can be on any of the n cells, and the machine can be in any
of the q states.
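The configuration bound f(n) = q * n * m^n can be computed directly. A small Python sketch (the function name `max_configs` and the sample values are our own, for illustration):

```python
def max_configs(q, m, n):
    # Upper bound on distinct LBA configurations:
    # q states x n head positions x m**n possible tape contents.
    # If a run exceeds this many steps, some configuration has
    # repeated, so the LBA is looping.
    return q * n * m ** n

# e.g. q = 3 states, m = 2 tape symbols, input length n = 4:
print(max_configs(3, 2, 4))  # 3 * 4 * 2**4 = 192
```

Because this bound is finite, membership for an LBA can be decided by cutting off any run after f(n) steps.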
-: Module 5:-
Undecidability :-

Church-Turing thesis:-


The Church-Turing thesis says that every solvable decision problem can be
transformed into an equivalent Turing machine problem.
It can be explained in two ways, as given below −
• The Church-Turing thesis for decision problems.
• The extended Church-Turing thesis for decision problems.
Let us understand these two ways.

The Church-Turing thesis for decision problems


There is some effective procedure to solve any decision problem if and only if there
is a Turing machine which halts for all input strings and solves the problem.

The extended Church-Turing thesis for decision problems


A decision problem Q is said to be partially solvable if and only if there is a Turing
machine which accepts precisely the elements of Q whose answer is yes.

Proof
A proof by the Church-Turing thesis is a shortcut often taken in establishing the
existence of a decision algorithm.
For any decision problem, rather than constructing a Turing machine solution, let us
describe an effective procedure which solves the problem.
The Church-Turing thesis explains that a decision problem Q has a solution if and
only if there is a Turing machine that determines the answer for every q ϵ Q. If no
such Turing machine exists, the problem is said to be undecidable.

Church-Turing Thesis

The Church-Turing thesis (formerly commonly known simply as Church's thesis) says that
any real-world computation can be translated into an equivalent computation involving
a Turing machine. In Church's original formulation (Church 1935, 1936), the thesis says
that real-world calculation can be done using the lambda calculus, which is equivalent to
using general recursive functions.
The Church-Turing thesis encompasses more kinds of computations than those originally
envisioned, such as those involving cellular automata, combinators, register machines,
and substitution systems. It also applies to other kinds of computations found in
theoretical computer science such as quantum computing and probabilistic computing.
There are conflicting points of view about the Church-Turing thesis. One says that it can be
proven, and the other says that it serves as a definition for computation. There has never
been a proof, but the evidence for its validity comes from the fact that every realistic
model of computation, yet discovered, has been shown to be equivalent. If there were a
device which could answer questions beyond those that a Turing machine can answer,
then it would be called an oracle.
Some computational models are more efficient, in terms of computation time and
memory, for different tasks. For example, it is suspected that quantum computers can
perform many common tasks with lower time complexity, compared to modern
computers, in the sense that for large enough versions of these problems, a quantum
computer would solve the problem faster than an ordinary computer. In contrast, there
exist questions, such as the halting problem, which an ordinary computer cannot answer,
and according to the Church-Turing thesis, no other computational device can answer
such a question.
The Church-Turing thesis has been extended to a proposition about the processes in the
natural world by Stephen Wolfram in his principle of computational equivalence (Wolfram
2002), which also claims that there are only a small number of intermediate levels of
computing power before a system is universal and that most natural systems are
universal.

Universal Turing Machine:-


Explain the universal Turing machine in TOC

The Turing Machine (TM) is the machine-level equivalent of a digital computer.
It was proposed by the mathematician Alan Turing in 1936 and has become the
most widely used model of computation in computability and complexity theory.
The model consists of an input and an output. The input is given in binary format
on the machine’s tape, and the output consists of the contents of the tape when
the machine halts.
The problem with the Turing machine is that a different machine must be constructed
for every new computation to be performed, for every input-output relation.
This is the reason the Universal Turing machine was introduced which along with
input on the tape takes the description of a machine M.
The Universal Turing machine can go on then to simulate M on the rest of the content
of the input tape.
A Universal Turing machine can thus simulate any other machine.
The idea of connecting multiple Turing machine gave an idea to Turing −
• Can a Universal machine be created that can ‘simulate’ other machines?
• This machine is called as Universal Turing Machine
This machine would have three pieces of information for the machine it is simulating:

• A basic description of the machine.


• The contents of machine tape.
• The internal state of the machine.
The Universal machine would simulate the machine by looking at the input on the
tape and the state of the machine.
It would control the machine by changing its state based on the input. This leads to
the idea of a “computer running another computer”.
The schematic diagram of the Universal Turing Machine is as follows −

What is Turing Machine?


A Turing machine is a computational mathematical model. It is an abstract model of
the part of a computer that controls all data manipulation. It was proposed by the
mathematician Alan Turing in 1936 and has become the most extensively used
computation model in computability and complexity theory.
A Turing machine can also compute everything that a real computer can compute.
For example, a Turing machine can simulate any function used in a programming
language. Some common examples include recursion and parameter passing. A
Turing machine can also be used to simplify algorithm statements.
Turing machines can be either halting or non-halting, depending on the algorithm
and the input associated with the algorithm.
The model consists of an input and an output. The input is passed in binary format to
the machine's tape, and the output is the contents of the tape after the machine
halts.
But the problem with the Turing machine is that a new machine must be constructed
for each new computation to be performed, for each input-output relation. This is why
the Universal Turing Machine (UTM) was invented.
What is Universal Turing Machine?
Turing was inspired by the idea of connecting multiple Turing machines. He asked
whether a universal machine could be constructed that could simulate other
machines, and named this machine the Universal Turing Machine.
A Universal Turing Machine, in more specific terms, can imitate the behavior of an
arbitrary Turing machine over any collection of input symbols. Therefore, it is
possible to create a single machine to calculate any computable sequence.
The input of a UTM includes:
• The description of a machine M on the tape.
• The input data.

The UTM can then simulate M on the rest of the input tape's content. As a result, a
Universal Turing Machine can simulate any other machine.
Creating a general-purpose Turing Machine (UTM) is a more difficult task. Once a
Turing machine's transitions are defined, the machine is restricted to performing a
specific type of computation.
We can create a universal Turing machine by modifying our fundamental Turing
machine model. For even simple behavior to be simulated, the modified Turing
machine must have a huge number of states. We modify our basic model by doing
the following:
• Increase the number of read/write heads.
• Increase the number of input tape dimensions.
• Increasing memory space.

The UTM would include three pieces of data for the machine it is simulating:
• A basic description of the machine.
• The contents of the machine tape.
• The internal state of the machine.

The Universal machine would simulate the machine by checking the tape input and
the machine's state.
It would command the machine by modifying its state in response to the input. This
will be like a computer running another computer.
The schematic diagram of a Universal Turing Machine is as follows:

Difference Between Turing Machine and Universal Turing Machine
Frequently Asked Questions
Frequently Asked Questions
What is Universal Turing Machine(UTM)?
A Universal Turing Machine is a machine that can simulate an arbitrary Turing
machine over any collection of input symbols. It takes two inputs. The first is the
description of the machine, and the other is the input data.
Why do we need a Universal Turing Machine?
A Universal Turing Machine, in more specific terms, can imitate the behavior of any
Turing machine over any set of input symbols. Hence, we needed to build a single
machine that could be used to calculate any computable sequence.
What is the input of Universal Turing machine?
To use a Universal Turing Machine, you need to write some input on its tape and
start the machine. When the machine computes and halts, the value on the tape is
the output. The input of a UTM includes the description of a machine M and its input
data
Introduction to Undecidability
In the theory of computation, we often come across such problems that are answered
either 'yes' or 'no'. The class of problems which can be answered as 'yes' are called
solvable or decidable. Otherwise, the class of problems is said to be unsolvable or
undecidable.

Undecidability of Universal Languages:


The universal language Lu is a recursively enumerable language and we have to prove
that it is undecidable (non-recursive).

We know that Lu is a recursively enumerable language. Assume that Lu is recursive.

Then the complement of Lu, that is L`u, is also recursive. However, if we have a TM M
accepting L`u, then we can construct a TM for Ld. But Ld, the diagonalization
language, is not RE. Thus our assumption that Lu is recursive is wrong (a language
that is not RE cannot be recursive). Hence we can say that Lu is RE but not recursive.
The construction of M for Ld is shown in the following diagram:

The universal and Diagonalization languages


https://www.youtube.com/watch?v=fXW2X1-huso

Reduction between languages and Rice’s theorem :-


Rice theorem states that any non-trivial semantic property of a language which is
recognized by a Turing machine is undecidable. A property, P, is the language of all
Turing machines that satisfy that property.

Formal Definition
If P is a non-trivial property, and the language holding the property, Lp , is recognized
by Turing machine M, then Lp = {<M> | L(M) ∈ P} is undecidable.

Description and Properties


• Property of languages, P, is simply a set of languages. If any language
belongs to P (L ∈ P), it is said that L satisfies the property P.
• A property is said to be trivial if either it is not satisfied by any
recursively enumerable language, or it is satisfied by all recursively
enumerable languages.
• A non-trivial property is satisfied by some recursively enumerable
languages and not satisfied by others. Formally speaking, for a non-
trivial property, where L ∈ P, both the following properties hold:
o Property 1 − There exists Turing Machines, M1 and M2
that recognize the same language, i.e. either ( <M1>, <M2>
∈ L ) or ( <M1>,<M2> ∉ L )
o Property 2 − There exists Turing Machines M1 and M2,
where M1 recognizes the language while M2 does not, i.e.
<M1> ∈ L and <M2> ∉ L

Proof
Suppose a property P is non-trivial and φ ∉ P (if instead φ ∈ P, work with the
complement of P). Since P is non-trivial, at least one language satisfies P, i.e.,
L(M0) ∈ P for some Turing Machine M0.
Let, w be an input in a particular instant and N is a Turing Machine which follows −
On input x

• Run M on w
• If M does not accept (or doesn't halt), then do not accept x (or do not
halt)
• If M accepts w then run M0 on x. If M0 accepts x, then accept x.
This gives a function that maps an instance <M, w> of ATM = {<M, w> | M accepts
input w} to a machine N such that:

• If M accepts w, then N accepts the same language as M0, so L(N) =
L(M0) ∈ P
• If M does not accept w, then N accepts φ, so L(N) = φ ∉ P
Since ATM is undecidable and it can be reduced to Lp, Lp is also undecidable.
-: Undecidable problem about language :-
For an undecidable language, there is no Turing Machine which accepts the language
and makes a decision for every input string w (TM can make decision for some input
string though). A decision problem P is called “undecidable” if the language L of all
yes instances to P is not decidable. Undecidable languages are not recursive
languages, but sometimes, they may be recursively enumerable languages.

Example

• The halting problem of Turing machine


• The mortality problem
• The mortal matrix problem
• The Post correspondence problem, etc.

What are the undecidable problems in TOC

The problems for which we cannot construct an algorithm that answers the problem
correctly in finite time are termed undecidable problems in the theory of
computation (TOC).
A problem is undecidable if there is no Turing machine that always halts in a finite
amount of time with a ‘yes’ or ‘no’ answer.
Examples
The examples of undecidable problems are explained below. Here, CFG refers to
Context Free Grammar.
• Whether two CFGs L and M are equal − Since we cannot enumerate and
compare all the strings of a CFG, we cannot decide whether two CFGs
are equal.
• Given a context-free grammar, there is no Turing machine (TM) that will
always halt in a finite amount of time and answer whether the
grammar is ambiguous or not.
• Given two context-free languages, there is no Turing machine that will
always halt in a finite amount of time and answer whether the two
context-free languages are equal or not.
• Whether a CFG will generate all possible strings of the input alphabet
(∑*) is undecidable.
Halting Problem
The Halting problem is the most famous of the undecidable problems.
Consider the code

num = 1;
while (num != 0)
{
    num = num + 1;
}

It counts up forever since it will never equal 0. This is an example of the halting
problem.
Note: Every context-free language is decidable.
Some of the other undecidable problems are:
Totality problem − It decides whether an arbitrary TM halts on all inputs. This is
equivalent to the problem of whether a program can ever enter an infinite loop, for
any input. It differs from the halting problem, which asks whether the machine enters
an infinite loop for a particular input.
Equivalence problem − It decides whether two TMs accept the same language. This
is equivalent to the problem of whether two programs compute the same output for
every input.
-: MODULE 4 :-
Turing Machines :- (The basic model for Turing machines)

A Turing Machine is an accepting device which accepts the languages (recursively


enumerable set) generated by type 0 grammars. It was invented in 1936 by Alan
Turing.

Definition
A Turing Machine (TM) is a mathematical model which consists of an infinite length
tape divided into cells on which input is given. It consists of a head which reads the
input tape. A state register stores the state of the Turing machine. After reading an
input symbol, it is replaced with another symbol, its internal state is changed, and it
moves from one cell to the right or left. If the TM reaches the final state, the input
string is accepted, otherwise rejected.
A TM can be formally described as a 7-tuple (Q, X, ∑, δ, q0, B, F) where −
• Q is a finite set of states
• X is the tape alphabet
• ∑ is the input alphabet
• δ is a transition function; δ : Q × X → Q × X × {Left_shift, Right_shift}.
• q0 is the initial state
• B is the blank symbol
• F is the set of final states
Comparison with the previous automaton
The following table shows a comparison of how a Turing machine differs from Finite
Automaton and Pushdown Automaton.

Machine Stack Data Structure Deterministic?

Finite Automaton N.A Yes

Pushdown Automaton Last In First Out(LIFO) No

Turing Machine Infinite tape Yes

Example of Turing machine


Turing machine M = (Q, X, ∑, δ, q0, B, F) with

• Q = {q0, q1, q2, qf}


• X = {a, b}
• ∑ = {1}
• q0 = {q0}
• B = blank symbol
• F = {qf }
δ is given by −

Tape alphabet symbol Present State ‘q0’ Present State ‘q1’ Present State ‘q2’

a 1Rq1 1Lq0 1Lqf

b 1Lq2 1Rq1 1Rqf

Here the transition 1Rq1 implies that the write symbol is 1, the tape moves right, and
the next state is q1. Similarly, the transition 1Lq2 implies that the write symbol is 1,
the tape moves left, and the next state is q2.

Time and Space Complexity of a Turing Machine


For a Turing machine, the time complexity refers to the measure of the number of
times the tape moves when the machine is initialized for some input symbols and the
space complexity is the number of cells of the tape written.
Time complexity (over all reasonable functions) −
T(n) = O(n log n)
The TM's space complexity −
S(n) = O(n)
--------------------------------------------------------------------------------------------------------------------------------------

Turing Machine was invented by Alan Turing in 1936 and it is used to accept
Recursive Enumerable Languages (generated by Type-0 Grammar).
Turing machines are a fundamental concept in the theory of computation and
play an important role in the field of computer science. They were first
described by the mathematician and computer scientist Alan Turing in 1936
and provide a mathematical model of a simple abstract computer.
In the context of automata theory and the theory of computation, Turing
machines are used to study the properties of algorithms and to determine
what problems can and cannot be solved by computers. They provide a way
to model the behavior of algorithms and to analyze their computational
complexity, which is the amount of time and memory they require to solve a
problem.
A Turing machine is a finite automaton that can read, write, and erase
symbols on an infinitely long tape. The tape is divided into squares, and each
square contains a symbol. The Turing machine can only read one symbol at
a time, and it uses a set of rules (the transition function) to determine its next
action based on the current state and the symbol it is reading.
The Turing machine’s behavior is determined by a finite state machine,
which consists of a finite set of states, a transition function that defines the
actions to be taken based on the current state and the symbol being read,
and a set of start and accept states. The Turing machine begins in the start
state and performs the actions specified by the transition function until it
reaches an accept or reject state. If it reaches an accept state, the
computation is considered successful; if it reaches a reject state, the
computation is considered unsuccessful.
Turing machines are an important tool for studying the limits of computation
and for understanding the foundations of computer science. They provide a
simple yet powerful model of computation that has been widely used in
research and has had a profound impact on our understanding of algorithms
and computation.
A Turing machine consists of a tape of infinite length on which read and
write operations can be performed. The tape consists of infinite cells, each
of which contains either an input symbol or a special symbol called
blank. It also has a head pointer which points to the cell currently being
read, and the head can move in both directions.

Figure: Turing Machine

A TM is expressed as a 7-tuple (Q, T, B, ∑, δ, q0, F) where:

• Q is a finite set of states


• T is the tape alphabet (symbols which can be written on Tape)
• B is blank symbol (every cell is filled with B except input alphabet
initially)
• ∑ is the input alphabet (symbols which are part of input alphabet)
• δ is a transition function which maps Q × T → Q × T × {L,R}.
Depending on its present state and present tape alphabet (pointed
by head pointer), it will move to new state, change the tape symbol
(may or may not) and move head pointer to either left or right.
• q0 is the initial state
• F is the set of final states. If any state of F is reached, input string is
accepted.
Let us construct a turing machine for L={0^n1^n|n>=1}

• Q = {q0,q1,q2,q3} where q0 is initial state.


• T = {0,1,X,Y,B} where B represents blank.
• ∑ = {0,1}
• F = {q3}
Transition function δ is given in Table 1 as:

Illustration
Let us see how this turing machine works for 0011. Initially head points to 0
which is underlined and state is q0 as:

The move will be δ(q0, 0) = (q1, X, R). It means, it will go to state q1, replace
0 by X and head will move to right as:

The move will be δ(q1, 0) = (q1, 0, R) which means it will remain in same
state and without changing any symbol, it will move to right as:
The move will be δ(q1, 1) = (q2, Y, L) which means it will move to q2 state
and changing 1 to Y, it will move to left as:

Working on it in the same way, the machine will reach state q3 and head will
point to B as shown:

Using the move δ(q3, B) = halt, it will stop, and the string is accepted.
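The walkthrough above can be checked with a small simulator. The sketch below is a reconstruction, not the course's official table: only δ(q0, 0), δ(q1, 0) and δ(q1, 1) are quoted in the text; the remaining entries follow the standard construction for {0^n 1^n | n >= 1}, with an explicit accepting state q4 reached via δ(q3, B) in place of "halt":

```python
def run_tm(delta, tape, state='q0', blank='B', accept='q4', max_steps=10_000):
    # Simulate a single-tape TM. The machine halts when no move is
    # defined for (state, symbol); it accepts iff it halts in `accept`.
    tape = list(tape) or [blank]
    head = 0
    for _ in range(max_steps):
        key = (state, tape[head])
        if key not in delta:
            return state == accept, ''.join(tape)
        state, write, move = delta[key]
        tape[head] = write
        head += 1 if move == 'R' else -1
        if head < 0:
            tape.insert(0, blank)   # grow tape to the left
            head = 0
        elif head == len(tape):
            tape.append(blank)      # grow tape to the right
    raise RuntimeError('no halt within step limit')

# Transition table for L = {0^n 1^n | n >= 1}: the first three moves are
# from the walkthrough above, the rest follow the standard construction.
delta = {
    ('q0', '0'): ('q1', 'X', 'R'), ('q0', 'Y'): ('q3', 'Y', 'R'),
    ('q1', '0'): ('q1', '0', 'R'), ('q1', '1'): ('q2', 'Y', 'L'),
    ('q1', 'Y'): ('q1', 'Y', 'R'),
    ('q2', '0'): ('q2', '0', 'L'), ('q2', 'X'): ('q0', 'X', 'R'),
    ('q2', 'Y'): ('q2', 'Y', 'L'),
    ('q3', 'Y'): ('q3', 'Y', 'R'), ('q3', 'B'): ('q4', 'B', 'R'),
}
print(run_tm(delta, '0011')[0])  # True: 0011 is in the language
print(run_tm(delta, '0010')[0])  # False: rejected
```

Note that `run_tm` takes the transition table as ordinary data; feeding a machine description to a generic simulator like this is exactly the Universal Turing Machine idea from the previous section.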


Note:

• In a non-deterministic Turing machine, there can be more than one
possible move for a given state and tape symbol, but non-determinism
does not add any power.
• Every non-deterministic TM can be converted into deterministic TM.
• In multi-tape turing machine, there can be more than one tape and
corresponding head pointers, but it does not add any power to
turing machine.
• Every multi-tape TM can be converted into single tape TM.
Question: A single tape Turing Machine M has two states q0 and q1, of
which q0 is the starting state. The tape alphabet of M is {0, 1, B} and its input
alphabet is {0, 1}. The symbol B is the blank symbol used to indicate end of
an input string. The transition function of M is described in the following
table.
The table is interpreted as illustrated below. The entry (q1, 1, R) in row q0
and column 1 signifies that if M is in state q0 and reads 1 on the current tape
square, then it writes 1 on the same tape square, moves its tape head one
position to the right and transitions to state q1. Which of the following
statements is true about M?

1. M does not halt on any string in (0 + 1)+


2. M does not halt on any string in (00 + 1)*
3. M halts on all string ending in a 0
4. M halts on all string ending in a 1
Solution: Let us see whether machine halts on string ‘1’. Initially state will be
q0, head will point to 1 as:

Using δ(q0, 1) = (q1, 1, R), it will move to state q1 and head will move to
right as:

Using δ(q1, B) = (q0, B, L), it will move to state q0 and head will move to left
as:
It will run in the same way again and again and not halt.
Option D says M halts on all string ending with 1, but it is not halting for 1.
So, option D is incorrect.
Let us see whether the machine halts on string ‘0’. Initially the state will be q0, and
the head will point to 0 as:

Using δ(q0, 0) = (q1, 1, R), it will move to state q1 and head will move to
right as:

Using δ(q1,B)=(q0,B,L), it will move to state q0 and head will move to left
as:

It will run in the same way again and again and not halt.
Option C says M halts on all string ending with 0, but it is not halting for 0.
So, option C is incorrect.
Option B says that TM does not halt for any string (00 + 1)*. But NULL string
is a part of (00 + 1)* and TM will halt for NULL string. For NULL string, tape
will be,
Using δ(q0, B) = halt, TM will halt. As TM is halting for NULL, this option is
also incorrect.
So, option (A) is correct.
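The looping behaviour in the solution can be demonstrated with a step-bounded simulation. Only the transitions actually used in the traces above are encoded; a step limit stands in for "runs forever", since non-termination cannot be observed directly:

```python
def runs_forever(delta, tape, limit=100):
    # Run M for at most `limit` steps; report whether it failed to halt.
    tape = list(tape) or ['B']
    state, head = 'q0', 0
    for _ in range(limit):
        key = (state, tape[head])
        if key not in delta:
            return False            # halted: no move defined
        state, write, move = delta[key]
        tape[head] = write
        head += 1 if move == 'R' else -1
        if head < 0:
            tape.insert(0, 'B'); head = 0
        elif head == len(tape):
            tape.append('B')
    return True                     # did not halt within the limit

# Transitions quoted in the worked solution above (partial table).
delta = {('q0', '1'): ('q1', '1', 'R'),
         ('q0', '0'): ('q1', '1', 'R'),
         ('q1', 'B'): ('q0', 'B', 'L')}

print(runs_forever(delta, '1'))  # True: M loops on '1'
print(runs_forever(delta, '0'))  # True: M loops on '0'
print(runs_forever(delta, ''))   # False: M halts at once on the empty string
```

The three results mirror the argument: M loops on every non-empty string, but halts on the empty string, so only option (A) survives.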
This article is contributed by Sonal Tuteja.

Turing-recognizable (recursively enumerable) and Turing-
decidable (recursive) languages and their closure properties :-

Recursive and Recursive Enumerable Languages in TOC


Recursive Enumerable (RE) or Type -0 Language
RE languages or type-0 languages are generated by type-0 grammars. An RE
language can be accepted or recognized by Turing machine which means it will
enter into final state for the strings of language and may or may not enter into
rejecting state for the strings which are not part of the language. It means TM can
loop forever for the strings which are not a part of the language. RE languages are
also called as Turing recognizable languages.
Recursive Language (REC)
A recursive language (subset of RE) can be decided by Turing machine which
means it will enter into final state for the strings of language and rejecting state for
the strings which are not part of the language. e.g.; L= {anbncn|n>=1} is recursive
because we can construct a turing machine which will move to final state if the
string is of the form anbncn else move to non-final state. So the TM will always halt
in this case. REC languages are also called as Turing decidable languages. The
relationship between RE and REC languages can be shown in Figure 1.
Closure Properties of Recursive Languages
• Union: If L1 and If L2 are two recursive languages, their union L1∪L2
will also be recursive because if TM halts for L1 and halts for L2, it will
also halt for L1∪L2.
• Concatenation: If L1 and If L2 are two recursive languages, their
concatenation L1.L2 will also be recursive. For Example:
L1= {anbncn|n>=0}
L2= {dmemfm|m>=0}
L3= L1.L2
= {anbncndm emfm|m>=0 and n>=0} is also recursive.
• L1 says n no. of a’s followed by n no. of b’s followed by n no. of c’s. L2
says m no. of d’s followed by m no. of e’s followed by m no. of f’s. Their
concatenation first matches no. of a’s, b’s and c’s and then matches no. of
d’s, e’s and f’s. So it can be decided by TM.
• Kleene Closure: If L1is recursive, its kleene closure L1* will also be
recursive. For Example:
L1= {anbncn|n>=0}
L1* = {anbncn | n>=0}* is also recursive.
• Intersection and complement: If L1 and If L2 are two recursive
languages, their intersection L1 ∩ L2 will also be recursive. For
Example:
L1= {anbncndm | n>=0 and m>=0}
L2= {ambncndn | n>=0 and m>=0}
L3= L1 ∩ L2
= {anbncndn | n>=0} will be recursive.

L1 says n no. of a’s followed by n no. of b’s followed by n no. of c’s and then any no.
of d’s. L2 says any no. of a’s followed by n no. of b’s followed by n no. of c’s followed
by n no. of d’s. Their intersection says n no. of a’s followed by n no. of b’s followed
by n no. of c’s followed by n no. of d’s. So it can be decided by turing machine, hence
recursive.
Similarly, complement of recursive language L1 which is ∑*-L1, will also be
recursive.
Note: As opposed to REC languages, RE languages are not closed under
complementation which means complement of RE language need not be RE.
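The decider-based closure arguments above can be phrased directly as function composition. In this Python sketch, a decider is any total function from strings to booleans; the combinators and example deciders are our own names, for illustration:

```python
# Closure of recursive languages under union, intersection and
# complement amounts to composing total deciders.
def union(d1, d2):        return lambda w: d1(w) or d2(w)
def intersection(d1, d2): return lambda w: d1(w) and d2(w)
def complement(d):        return lambda w: not d(w)

# Example deciders (illustrative): membership in {a^n b^n c^n | n >= 0},
# and the even-length strings.
is_anbncn = lambda w: w == 'a' * (len(w) // 3) + 'b' * (len(w) // 3) + 'c' * (len(w) // 3)
has_even_len = lambda w: len(w) % 2 == 0

d = intersection(is_anbncn, has_even_len)
print(d('aabbcc'))  # True: in both languages
print(d('abc'))     # False: odd length
```

Both combined functions still halt on every input, which is why the resulting languages stay recursive. No such composition exists for complement of RE languages, since a recognizer may loop instead of rejecting.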
GATE Questions
Question 1: Which of the following statements is/are FALSE?
1.For every non-deterministic TM, there exists an equivalent deterministic TM.
2.Turing recognizable languages are closed under union and complementation.
3.Turing decidable languages are closed under intersection and complementation.
4.Turing recognizable languages are closed under union and intersection.
A.1 and 4
B.1 and 3
C.2
D.3
Solution:
Statement 1 is true as we can convert every non-deterministic TM to deterministic
TM.
Statement 2 is false as Turing recognizable languages (RE languages) are not
closed under complementation.
Statement 3 is true as Turing decidable languages (REC languages) are closed
under intersection and complementation.
Statement 4 is true as Turing recognizable languages (RE languages) are closed
under union and intersection.
Question 2 : Let L be a language and L’ be its complement. Which one of the
following is NOT a viable possibility?
A.Neither L nor L’ is RE.
B.One of L and L’ is RE but not recursive; the other is not RE.
C.Both L and L’ are RE but not recursive.
D.Both L and L’ are recursive.
Solution:
Option A is viable: it is possible that neither L nor L’ is RE.
Option B is viable: if L is RE but not recursive, then L’ cannot be RE (if both
were RE, L would be recursive), so this pairing is possible.
Option C is not viable: if both L and L’ were RE, we could run their two machines
in parallel and decide membership, making both recursive. This contradicts “RE
but not recursive”, so option C is the answer.
Option D is viable: if L is recursive, L’ is also recursive.
Question 3: Let L1 be a recursive language, and let L2 be a recursively enumerable
but not a recursive language. Which one of the following is TRUE?
A.L1′ is recursive and L2′ is recursively enumerable
B.L1′ is recursive and L2′ is not recursively enumerable
C.L1′ and L2′ are recursively enumerable
D.L1′ is recursively enumerable and L2′ is recursive
Solution:
Option A is false: L1’ is recursive, but L2’ can’t be recursively enumerable (if
both L2 and L2’ were RE, L2 would be recursive, contradicting the premise).
Option B is correct: L1’ is recursive (REC languages are closed under
complementation) and L2’ is not recursively enumerable, by the argument above.
Option C is false: L2’ can’t be recursively enumerable.
Option D is false: L2’ can’t be recursively enumerable, and since REC languages
are a subset of RE languages, L2’ can’t be recursive either.

What is a recursive and recursively enumerable language

Let us understand the concept of recursive language before learning about the
recursively enumerable language in the theory of computation (TOC).
Recursive Language
A language L is recursive (decidable) if L is the set of strings accepted by some Turing
Machine (TM) that halts on every input.
Example
When a Turing machine reaches a final state, it halts. We can also say that a Turing
machine M halts when M reaches a state q and a current symbol ‘a’ to be scanned
so that δ(q, a) is undefined.
There are TMs that never halt on some inputs in any one of these ways, So we make
a distinction between the languages accepted by a TM that halts on all input strings
and a TM that never halts on some input strings.
Recursive Enumerable Language
A language L is recursively enumerable if L is the set of strings accepted by some
TM.
If L is a recursive enumerable language then −
If w ∈ L then a TM halts in a final state,
If w ∉ L then a TM halts in a non-final state or loops forever.
If L is a recursive language then −
If w ∈ L then a TM halts in a final state,
If w ∉ L then TM halts in a non-final state.
Recursive Languages are also recursive enumerable
Proof − If L is recursive, then there is a TM M that decides membership in L:

• M accepts x if x is in language L.
• M rejects x if x is not in language L.
Since M halts on every input and accepts exactly the strings of L, M also
recognizes L, so L is recursively enumerable.
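The practical difference between a decider and a mere recognizer is that a recognizer may loop forever on strings outside its language, so we can only run it with a step budget. Everything below is a hypothetical toy (the step function is not a real TM), but it mirrors the definitions above:

```python
# Sketch (hypothetical machines): a recognizer may loop forever on strings
# outside its language, so we can only run it with a step budget.

def run_with_budget(step_fn, w, budget):
    """step_fn(state, w) -> 'accept', 'reject', or a new state.  Returns the
    verdict, or None if the budget is exhausted (machine may be looping)."""
    state = 0
    for _ in range(budget):
        state = step_fn(state, w)
        if state in ("accept", "reject"):
            return state
    return None  # no answer yet -- cannot conclude w is outside the language

# A toy recognizer for strings containing "ab": accepts when found,
# loops forever otherwise (it never rejects).
def step(state, w):
    return "accept" if "ab" in w else state + 1

print(run_with_budget(step, "xxab", 100))  # 'accept'
print(run_with_budget(step, "xxxx", 100))  # None -- still running
```

A decider would return "accept" or "reject" within some finite budget on every input; the `None` case is exactly what separates RE from REC.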

Recursive Enumerable Language


If a Turing machine can be designed that accepts a given language, then that language
is a recursively enumerable language.

Halting Problem
The Turing machine answers YES for every string belonging to the language.
But for a string not belonging to the language, the Turing machine can either
answer NO or go into an infinite loop.
This is the halting problem, and such a language is called a recursively
enumerable language.
The picture below illustrates this:

In the picture above, the outer rectangle is Σ* (the set of all possible strings over
the given input alphabet) and the language is a subset of it.

Recursive Enumerable and Recursive Languages


Recursive enumerable languages are a superset of recursive languages.
That is: every recursive language is recursively enumerable, but the converse is not true.

Closure Property                        Status

Union                                   Yes
Intersection                            Yes
Set Difference                          No
Complement                              No
Intersection with Regular Language      Yes
Concatenation                           Yes
Kleene Closure                          Yes
Kleene Plus                             Yes
Reversal                                Yes
Homomorphism                            Yes
ε-free Homomorphism                     Yes
Inverse Homomorphism                    Yes
Substitution                            Yes
ε-free Substitution                     Yes

Recursive Enumerable Language Closure Properties.
Variants of Turing Machines :-
Variation of Turing Machine

1. Multiple track Turing Machine:

• A k-track Turing machine (for some k > 0) has k tracks and one R/W
head that reads and writes all of them one by one.
• A k-track Turing Machine can be simulated by a single track Turing
machine
2. Two-way infinite Tape Turing Machine:

• Infinite tape of two-way infinite tape Turing machine is unbounded


in both directions left and right.
• Two-way infinite tape Turing machine can be simulated by one-way
infinite Turing machine(standard Turing machine).
3. Multi-tape Turing Machine:

• It has multiple tapes; each tape is accessed with its own R/W head, under a single finite control.


• The Multi-tape Turing machine is different from k-track Turing
machine but expressive power is the same.
• Multi-tape Turing machine can be simulated by single-tape Turing
machine.
4. Multi-tape Multi-head Turing Machine:

• The multi-tape Turing machine has multiple tapes and multiple


heads
• Each tape is controlled by a separate head
• Multi-Tape Multi-head Turing machine can be simulated by a
standard Turing machine.
5. Multi-dimensional Tape Turing Machine:

• It has multi-dimensional tape where the head can move in any


direction that is left, right, up or down.
• Multi dimensional tape Turing machine can be simulated by one-
dimensional Turing machine
6. Multi-head Turing Machine:

• A multi-head Turing machine contains two or more heads to read


the symbols on the same tape.
• In one step all the heads sense the scanned symbols and move or
write independently.
• Multi-head Turing machine can be simulated by a single head
Turing machine.
7. Non-deterministic Turing Machine:

• A non-deterministic Turing machine has a single, one-way infinite


tape.
• For a given state and input symbol has at least one choice to move
(finite number of choices for the next move), each choice has
several choices of the path that it might follow for a given input
string.
• A non-deterministic Turing machine is equivalent to the
deterministic Turing machine.

What are the Turing machine variations in TOC

Turing machines (TM) can also be deterministic or non-deterministic, but this does
not make them any more or less powerful.
However, if the tape is restricted so that you can only see use of the part of the tape
with the input, the TM becomes less powerful (linear bounded automata) and can
only recognise context sensitive languages.
Many other TM variations are equivalent to the original TM. This includes the
following −
• Multi-track
• Multi-tape
• Multi-head
• Multi-dimensional tape
• The off-line Turing machine
Multi-tape Turing Machine
A Turing machine with several tapes is called a multi-tape Turing machine.
Every tape has its own read/write head.
For an N-tape Turing machine
M = (Q, X, ∑, δ, q0, B, F)
we define
δ : Q × X^N → Q × X^N × {L, R}^N
Example
If N = 2, a possible move is δ(q0, a, e) = (q1, X, Y, L, R): reading a and e on
the two tapes, the machine writes X and Y, then moves the first head left and
the second head right.
Non Deterministic Turing Machine
It is similar to DTM except that for any input and current state it has a number of
choices.
A string is accepted by a NDTM if there is a sequence of moves that leads to a final
state
The transition function −
δ : Q × X → 2^(Q × X × {L, R})
An NDTM is allowed to have more than one transition for a given state and tape symbol.

Multi-head Turing machine


It has a number of heads instead of one.
Each head independently reads/writes symbols and moves left/right or stays
stationary.
Off-line Turing Machine
An offline Turing machine has two tapes, which are as follows −
• One tape is read-only and contains the input.
• The other is read-write and is initially blank.

Nondeterministic TMs and Equivalence with deterministic TMs

Non-Deterministic Turing Machine


In a Non-Deterministic Turing Machine, for every state and symbol, there are a group
of actions the TM can have. So, here the transitions are not deterministic. The
computation of a non-deterministic Turing Machine is a tree of configurations that
can be reached from the start configuration.
An input is accepted if there is at least one node of the tree which is an accept
configuration, otherwise it is not accepted. If all branches of the computational tree
halt on all inputs, the non-deterministic Turing Machine is called a Decider and if for
some input, all branches are rejected, the input is also rejected.
A non-deterministic Turing machine can be formally defined as a 7-tuple (Q, X, ∑, δ,
q0, B, F) where −
• Q is a finite set of states
• X is the tape alphabet
• ∑ is the input alphabet
• δ is a transition function;
δ : Q × X → P(Q × X × {Left_shift, Right_shift}).
• q0 is the initial state
• B is the blank symbol
• F is the set of final states

Deterministic vs. Nondeterministic


Computations


To understand class P and NP, first we should know the computational model.
Hence, in this chapter we will discuss two important computational models.

Deterministic Computation and the Class P

Deterministic Turing Machine


One of these models is deterministic one-tape Turing machine. This machine consists
of a finite state control, a read-write head and a two-way tape with infinite sequence.
Following is the schematic diagram of a deterministic one-tape Turing machine.

A program for a deterministic Turing machine specifies the following information −

• A finite set of tape symbols (input symbols and a blank symbol)


• A finite set of states
• A transition function
In algorithmic analysis, if a problem is solvable in polynomial time by a deterministic
one tape Turing machine, the problem belongs to P class.

Nondeterministic Computation and the Class NP

Nondeterministic Turing Machine


To solve the computational problem, another model is the Non-deterministic Turing
Machine (NDTM). The structure of NDTM is similar to DTM, however here we have
one additional module known as the guessing module, which is associated with one
write-only head.
Following is the schematic diagram.

If the problem is solvable in polynomial time by a non-deterministic Turing machine,


the problem belongs to NP class.

https://www.youtube.com/watch?v=GitLr3MwsFE
https://www.youtube.com/watch?v=CQNBBz_e4ss

Unrestricted grammar and equivalence with Turing machines:-

Unrestricted grammar
In automata theory, the class of unrestricted grammars (also called semi-Thue,
type-0 or phrase structure grammars) is the most general class of grammars in
the Chomsky hierarchy. No restrictions are made on the productions of an
unrestricted grammar, other than each of their left-hand sides being non-empty.
This grammar class can generate arbitrary recursively enumerable languages.

Formal definition
An unrestricted grammar is a formal grammar G = (N, Σ, P, S), where

• N is a finite set of nonterminal symbols,

• Σ is a finite set of terminal symbols, with N and Σ disjoint,

• P is a finite set of production rules of the form α → β, where α and β are
strings of symbols in N ∪ Σ and α is not the empty string, and

• S ∈ N is a specially designated start symbol.

As the name implies, there are no real restrictions on the types of production rules
that unrestricted grammars can have.

Equivalence to Turing machines


The unrestricted grammars characterize the recursively enumerable languages. This
is the same as saying that for every unrestricted grammar G there exists
some Turing machine capable of recognizing L(G), and vice versa. Given an
unrestricted grammar, such a Turing machine is simple enough to construct, as a
two-tape nondeterministic Turing machine. The first tape contains the input
word w to be tested, and the second tape is used by the machine to
generate sentential forms from G. The Turing machine then does the following:

1. Start at the left of the second tape and repeatedly choose to move right or
select the current position on the tape.

2. Nondeterministically choose a production α → β from the productions in G.

3. If α appears at some position on the second tape,
replace α by β at that point, possibly shifting the symbols on the tape
left or right depending on the relative lengths of α and β (e.g. if α is
longer than β, shift the tape symbols left).

4. Compare the resulting sentential form on tape 2 to the word on tape 1. If they
match, then the Turing machine accepts the word. If they don't, the Turing
machine will go back to step 1.
It is easy to see that this Turing machine will generate all and only the sentential
forms of G on its second tape after the last step is executed an arbitrary number
of times, thus the language L(G) must be recursively enumerable.
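The nondeterministic search over sentential forms can be mirrored, very loosely, by a breadth-first search in ordinary code. This is a sketch, not the two-tape machine itself; the length-based pruning is an added assumption that is only safe for non-contracting grammars, such as the standard a^n b^n c^n grammar used below:

```python
from collections import deque

# Sketch: breadth-first search over sentential forms of a grammar,
# mirroring the nondeterministic two-tape construction in the text.
# In general derivation search need not terminate (the language is only
# recursively enumerable), so we also cap the number of forms explored.

def derives(rules, start, target, max_forms=100000):
    """rules: list of (lhs, rhs) string pairs.  True if `target` is
    derivable from `start` within the exploration cap."""
    seen = {start}
    queue = deque([start])
    while queue and len(seen) < max_forms:
        form = queue.popleft()
        if form == target:
            return True
        for lhs, rhs in rules:
            i = form.find(lhs)
            while i != -1:
                new = form[:i] + rhs + form[i + len(lhs):]
                # prune forms longer than the target -- safe only because
                # no rule below shortens a form (an assumption of the sketch)
                if new not in seen and len(new) <= len(target):
                    seen.add(new)
                    queue.append(new)
                i = form.find(lhs, i + 1)
    return False

# a^n b^n c^n via a standard non-contracting (type-1 style) grammar
rules = [("S", "aSBC"), ("S", "aBC"), ("CB", "BC"),
         ("aB", "ab"), ("bB", "bb"), ("bC", "bc"), ("cC", "cc")]
print(derives(rules, "S", "aabbcc"))  # True
print(derives(rules, "S", "aabbc"))   # False
```

The `find`/replace loop plays the role of steps 2 and 3 of the construction, and the comparison against `target` plays the role of step 4.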


The reverse construction is also possible. Given some Turing machine, it is possible
to create an equivalent unrestricted grammar, which even uses only productions
with one or more non-terminal symbols on their left-hand sides. Therefore, an
arbitrary unrestricted grammar can always be equivalently converted to obey the
latter form, by converting it to a Turing machine and back again. Some authors
use the latter form as the definition of unrestricted grammar.
Computational properties
The decision problem of whether a given string can be generated by a given
unrestricted grammar is equivalent to the problem of whether it can be accepted by
the Turing machine equivalent to the grammar. This membership problem, like
the halting problem, is undecidable.
Recursively enumerable languages are closed under Kleene star, concatenation,
union, and intersection, but not under set difference or complementation.
The equivalence of unrestricted grammars to Turing machines implies the existence
of a universal unrestricted grammar, a grammar capable of accepting any other
unrestricted grammar's language given a description of the language. For this
reason, it is theoretically possible to build a programming language based on
unrestricted grammars (e.g. Thue).


Automata theory: formal languages and formal grammars

Chomsky hierarchy   Grammars                          Languages                         Abstract machines

Type-0              Unrestricted                      Recursively enumerable            Turing machine
—                   (no common name)                  Decidable                         Decider
Type-1              Context-sensitive                 Context-sensitive                 Linear-bounded automaton
—                   Positive range concatenation      Positive range concatenation*     PTIME Turing machine
—                   Indexed                           Indexed*                          Nested stack automaton
—                   —                                 —                                 Thread automaton
—                   Linear context-free rewriting     Linear context-free rewriting     Restricted tree stack automaton
                    systems                           language
—                   Tree-adjoining                    Tree-adjoining                    Embedded pushdown automaton
Type-2              Context-free                      Context-free                      Nondeterministic pushdown automaton
—                   Deterministic context-free        Deterministic context-free        Deterministic pushdown automaton
—                   Visibly pushdown                  Visibly pushdown                  Visibly pushdown automaton
Type-3              Regular                           Regular                           Finite automaton
—                   —                                 Star-free                         Counter-free automaton (with
                                                                                        aperiodic finite monoid)
—                   Non-recursive                     Finite                            Acyclic finite automaton

Each category of languages, except those marked by a *, is a proper subset of the category directly above it.
Any language in each category is generated by a grammar and recognized by an automaton in the category on
the same line.


Chomsky Hierarchy in Theory of Computation

According to Chomsky hierarchy, grammar is divided into 4 types as follows:


1. Type 0 is known as unrestricted grammar.
2. Type 1 is known as context-sensitive grammar.
3. Type 2 is known as a context-free grammar.
4. Type 3 Regular Grammar.

Type 0: Unrestricted Grammar:


Type-0 grammars include all formal grammars. Type 0 grammar languages are
recognized by Turing machines. These languages are also known as the
recursively enumerable languages.

Grammar productions are of the form α → β where

α ∈ (V + T)* V (V + T)*
V : Variables
T : Terminals.

β ∈ (V + T)*.
In type 0 there must be at least one variable on the left side of a production.
For example:
Sab --> ba
A --> S
Here, the variables are S, A and the terminals are a, b.
Type 1: Context-Sensitive Grammar
Type-1 grammars generate context-sensitive languages. The language generated
by the grammar is recognized by a Linear Bounded Automaton.
In Type 1
• First of all, a Type 1 grammar should be Type 0.
• Grammar productions are of the form α → β with

|α| <= |β|

That is, the count of symbols in α is less than or equal to the count in β.

Also β ∈ (V + T)+,
i.e. β cannot be ε.

For example:
S --> AB
AB --> abc
B --> b
Type 2: Context-Free Grammar: Type-2 grammars generate context-free
languages. The language generated by the grammar is recognized by a Pushdown
automaton. In Type 2:
• First of all, it should be Type 1.
• The left-hand side of a production can have only one variable and there is
no restriction on β:
|α| = 1.
For example:
S --> AB
A --> a
B --> b
Type 3: Regular Grammar: Type-3 grammars generate regular languages.
These languages are exactly all languages that can be accepted by a finite-state
automaton. Type 3 is the most restricted form of grammar.
Type 3 should be in the given form only :
V --> VT / T (left-regular grammar)
(or)
V --> TV /T (right-regular grammar)
For example:
S --> a
The above form is called strictly regular grammar.
There is another form of regular grammar called extended regular grammar. In
this form:
V --> VT* / T*. (extended left-regular grammar)
(or)
V --> T*V /T* (extended right-regular grammar)
For example :
S --> ab.
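A rough classifier for the four types can be written by simply inspecting production shapes. This sketch assumes single uppercase letters are variables and lowercase letters are terminals, and checks only the strict right-regular form for type 3 (one of the two forms described above):

```python
# Sketch: classify a grammar by the shape of its productions.
# Assumption: single uppercase letters are variables, lowercase are terminals.

def is_variable(s):
    return len(s) == 1 and s.isupper()

def grammar_type(rules):
    """rules: list of (lhs, rhs) string pairs.  Returns the most restricted
    Chomsky type (3, 2, 1 or 0) that every production satisfies."""
    def right_linear(r):
        # terminals optionally followed by a single variable, e.g. "aS" or "b"
        body = r[:-1] if is_variable(r[-1]) else r
        return body == "" or body.islower()
    if all(is_variable(l) and r and right_linear(r) for l, r in rules):
        return 3
    if all(is_variable(l) for l, r in rules):
        return 2
    if all(r and len(l) <= len(r) for l, r in rules):
        return 1
    return 0

print(grammar_type([("S", "aS"), ("S", "b")]))       # 3 (regular)
print(grammar_type([("S", "aSb"), ("S", "ab")]))     # 2 (context-free)
print(grammar_type([("S", "aBC"), ("CB", "BC")]))    # 1 (context-sensitive)
print(grammar_type([("Sab", "ba"), ("A", "S")]))     # 0 (unrestricted)
```

Note how each test in `grammar_type` encodes exactly the restriction from the corresponding level: right-linear shape, single-variable left side, non-contracting rules, and no restriction at all.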

Turing machines as enumerators

Enumerator (computer science)

An enumerator is a Turing machine with an attached printer. The Turing machine


can use that printer as an output device to print strings. Every time the Turing
machine wants to add a string to the list, it sends the string to the printer.
An enumerator is a Turing machine variant and is equivalent in power to the
standard Turing machine.

Formal definition
An enumerator can be defined as a 2-tape Turing machine E (a multitape Turing
machine with k = 2). Initially, E receives no input, and all the tapes are blank
(i.e., filled with blank symbols). A newly defined symbol # is the delimiter that
marks the end of an element of the list. The second tape can be regarded as the
printer; strings on it are separated by #. The language enumerated by an
enumerator E, denoted L(E), is defined as the set of strings printed on the
second tape (the printer).

Equivalence of Enumerator and Turing Machines


A language over a finite alphabet is Turing Recognizable if and only if it can be
enumerated by an enumerator. This shows Turing recognizable languages are also
recursively enumerable.
Proof
A Turing Recognizable language can be Enumerated by an Enumerator

Consider a Turing Machine M and let the language accepted by it be L. Since the
set of all possible strings over the input alphabet, i.e. the Kleene closure Σ*, is
a countable set, we can enumerate the strings in it as s1, s2, s3, ... etc. Then the
Enumerator E enumerating the language L will follow the steps:

1 for i = 1,2,3,...

2 Run M with input strings s1, s2, ..., si for i steps each

3 If any string is accepted, then print it.

Now the question comes whether every string in the language L will be printed by
the Enumerator we constructed. For any string sj in the language L, the
TM M will run a finite number of steps (let it be k) to accept it. Then in
the max(j, k)-th iteration of the Enumerator, sj will be printed. Thus the
Enumerator will print every string M recognizes, but a single string may be
printed several times.
An Enumerable Language is Turing Recognizable

It is easy to construct a Turing Machine M that recognizes the enumerable
language L. We can have two tapes. On one tape we take the input string, and on
the other tape we run the enumerator to enumerate the strings in the language one
after another. Once a string is printed on the second tape, we compare it with the
input on the first tape. If it is a match, we accept the input; otherwise we
continue with the next enumerated string.
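The dovetailing in the first half of the proof can be sketched directly. Here `accepts_within` is a hypothetical step-bounded recognizer standing in for "run M on w for i steps"; everything else follows the three numbered steps above:

```python
from itertools import count, product

# Sketch of the proof's dovetailing: for i = 1, 2, ... run the recognizer
# on the first i strings of Sigma* for i steps each.

def strings(alphabet):
    """Enumerate Sigma* in length-then-lexicographic order."""
    yield ""
    for n in count(1):
        for tup in product(alphabet, repeat=n):
            yield "".join(tup)

def enumerate_language(accepts_within, alphabet, rounds):
    """accepts_within(w, steps) -> True iff the machine accepts w within
    `steps` steps.  Returns the strings printed in the first `rounds` rounds
    (deduplicated here for readability; the proof's enumerator may print
    the same string several times)."""
    printed = []
    for i in range(1, rounds + 1):
        gen = strings(alphabet)
        for _ in range(i):
            w = next(gen)
            if accepts_within(w, i) and w not in printed:
                printed.append(w)
    return printed

# Toy step-bounded recognizer: equal numbers of a's and b's, pretending
# each check costs len(w) steps.
def acc(w, steps):
    return steps >= len(w) and w.count("a") == w.count("b")

print(enumerate_language(acc, "ab", 8))  # ['', 'ab', 'ba']
```

A string appears only once both its index and its acceptance time fit inside the round number i, which is exactly the max(j, k) bound in the proof.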

MODULE 2
Context-free language and pushdown automata :-
Context free grammar and context free language

Context-Free Grammar Introduction

Definition − A context-free grammar (CFG) consisting of a finite set of grammar rules
is a quadruple (N, T, P, S) where
• N is a set of non-terminal symbols.
• T is a set of terminals where N ∩ T = NULL.
• P is a set of rules, P: N → (N ∪ T)*, i.e., the left-hand side of a
production rule does not have any right context or left context.
• S is the start symbol.
Example

• The grammar ({A}, {a, b, c}, P, A), P : A → aA, A → abc.


• The grammar ({S}, {a, b}, P, S), P: S → aSa, S → bSb, S → ε
• The grammar ({S, F}, {0, 1}, P, S), P: S → 00S | 11F, F → 00F | ε

Generation of Derivation Tree


A derivation tree or parse tree is an ordered rooted tree that graphically represents
how a string is derived from a context-free grammar.

Representation Technique
• Root vertex − Must be labeled by the start symbol.
• Vertex − Labeled by a non-terminal symbol.
• Leaves − Labeled by a terminal symbol or ε.
If S → x1x2 …… xn is a production rule in a CFG, then the parse tree / derivation tree
will be as follows −

There are two different approaches to draw a derivation tree −


Top-down Approach −
• Starts with the starting symbol S
• Goes down to tree leaves using productions
Bottom-up Approach −
• Starts from tree leaves
• Proceeds upward to the root which is the starting symbol S
Derivation or Yield of a Tree
The derivation or the yield of a parse tree is the final string obtained by concatenating
the labels of the leaves of the tree from left to right, ignoring the Nulls. However, if
all the leaves are Null, derivation is Null.
Example
Let a CFG {N,T,P,S} be
N = {S}, T = {a, b}, Starting symbol = S, P = S → SS | aSb | ε
One derivation from the above CFG is “abaabb”
S → SS → aSbS → abS → abaSb → abaaSbb → abaabb
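The derivation above can be checked mechanically by a brute-force leftmost-derivation search. This is only a depth-bounded sketch (uppercase letters are treated as variables, and `''` encodes ε), not an efficient parser:

```python
from collections import deque

# Sketch: depth-bounded leftmost-derivation search for a CFG.
def cfg_derives(rules, start, target, max_depth=10):
    """rules: dict variable -> list of right-hand sides ('' for epsilon)."""
    queue = deque([(start, 0)])
    while queue:
        form, depth = queue.popleft()
        if form == target:
            return True
        if depth == max_depth:
            continue
        # locate the leftmost variable
        i = next((k for k, c in enumerate(form) if c.isupper()), None)
        if i is None or not target.startswith(form[:i]):
            continue  # all-terminal mismatch, or terminal prefix is wrong
        for rhs in rules[form[i]]:
            new = form[:i] + rhs + form[i + 1:]
            # prune forms whose terminal count already exceeds the target
            if sum(c.islower() for c in new) <= len(target):
                queue.append((new, depth + 1))
    return False

rules = {"S": ["SS", "aSb", ""]}    # P = S -> SS | aSb | epsilon
print(cfg_derives(rules, "S", "abaabb"))  # True
print(cfg_derives(rules, "S", "aab"))     # False
```

The search expands only the leftmost variable at each step, so every accepting path it finds corresponds to a leftmost derivation like the one written out above.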

Sentential Form and Partial Derivation Tree


A partial derivation tree is a sub-tree of a derivation tree/parse tree such that either
all of its children are in the sub-tree or none of them are in the sub-tree.
Example
If in any CFG the productions are −
S → AB, A → aaA | ε, B → Bb| ε
the partial derivation tree can be the following −

If a partial derivation tree contains the root S, it is called a sentential form. The above
sub-tree is also in sentential form.

Leftmost and Rightmost Derivation of a String


• Leftmost derivation − A leftmost derivation is obtained by applying
production to the leftmost variable in each step.
• Rightmost derivation − A rightmost derivation is obtained by applying
production to the rightmost variable in each step.
Example
Let any set of production rules in a CFG be
X → X+X | X*X |X| a
over an alphabet {a}.
The leftmost derivation for the string "a+a*a" may be −
X → X+X → a+X → a + X*X → a+a*X → a+a*a
The stepwise derivation of the above string is shown as below −
The rightmost derivation for the above string "a+a*a" may be −
X → X*X → X*a → X+X*a → X+a*a → a+a*a
The stepwise derivation of the above string is shown as below −
Left and Right Recursive Grammars
In a context-free grammar G, if there is a production in the form X → Xa where X is a
non-terminal and ‘a’ is a string of terminals, it is called a left recursive production.
The grammar having a left recursive production is called a left recursive grammar.
And if in a context-free grammar G, if there is a production is in the form X →
aX where X is a non-terminal and ‘a’ is a string of terminals, it is called a right
recursive production. The grammar having a right recursive production is called
a right recursive grammar.

Ambiguity in Context-Free Grammars


If a context free grammar G has more than one derivation tree for some string w ∈
L(G), it is called an ambiguous grammar. There exist multiple right-most or left-most
derivations for some string generated from that grammar.

Problem
Check whether the grammar G with production rules −
X → X+X | X*X |X| a
is ambiguous or not.

Solution
Let’s find out the derivation tree for the string "a+a*a". It has two leftmost derivations.
Derivation 1 − X → X+X → a +X → a+ X*X → a+a*X → a+a*a
Parse tree 1 −

Derivation 2 − X → X*X → X+X*X → a+ X*X → a+a*X → a+a*a


Parse tree 2 −
Since there are two parse trees for a single string "a+a*a", the grammar G is
ambiguous.

CFL Closure Property


Context-free languages are closed under −

• Union
• Concatenation
• Kleene Star operation

Union
Let L1 and L2 be two context free languages. Then L1 ∪ L2 is also context free.

Example
Let L1 = { a^n b^n , n > 0 }. The corresponding grammar G1 will have P: S1 → aS1b | ab
Let L2 = { c^m d^m , m ≥ 0 }. The corresponding grammar G2 will have P: S2 → cS2d | ε
Union of L1 and L2, L = L1 ∪ L2 = { a^n b^n } ∪ { c^m d^m }
The corresponding grammar G will have the additional production S → S1 | S2

Concatenation
If L1 and L2 are context free languages, then L1L2 is also context free.
Example
Concatenation of the languages L1 and L2, L = L1L2 = { a^n b^n c^m d^m }
The corresponding grammar G will have the additional production S → S1 S2

Kleene Star
If L is a context free language, then L* is also context free.

Example
Let L = { a^n b^n , n ≥ 0 }. The corresponding grammar G will have P: S → aSb | ε
Kleene star: L1 = { a^n b^n }*
The corresponding grammar G1 will have additional productions S1 → SS1 | ε
Context-free languages are not closed under −
• Intersection − If L1 and L2 are context free languages, then L1 ∩ L2 is
not necessarily context free.
• Intersection with Regular Language − If L1 is a regular language and
L2 is a context free language, then L1 ∩ L2 is a context free language.
• Complement − If L1 is a context free language, then L1’ may not be
context free.

CFG Simplification


In a CFG, it may happen that all the production rules and symbols are not needed for
the derivation of strings. Besides, there may be some null productions and unit
productions. Elimination of these productions and symbols is called simplification of
CFGs. Simplification essentially comprises the following steps −

• Reduction of CFG
• Removal of Unit Productions
• Removal of Null Productions

Reduction of CFG
CFGs are reduced in two phases −
Phase 1 − Derivation of an equivalent grammar, G’, from the CFG, G, such that each
variable derives some terminal string.
Derivation Procedure −
Step 1 − Include all symbols, W1, that derive some terminal and initialize i=1.
Step 2 − Include all symbols, Wi+1, that derive Wi.
Step 3 − Increment i and repeat Step 2, until Wi+1 = Wi.
Step 4 − Include all production rules that have Wi in it.
Phase 2 − Derivation of an equivalent grammar, G”, from the CFG, G’, such that each
symbol appears in a sentential form.
Derivation Procedure −
Step 1 − Include the start symbol in Y1 and initialize i = 1.
Step 2 − Include all symbols, Yi+1, that can be derived from Yi and include all
production rules that have been applied.
Step 3 − Increment i and repeat Step 2, until Yi+1 = Yi.

Problem
Find a reduced grammar equivalent to the grammar G, having production rules, P: S
→ AC | B, A → a, C → c | BC, E → aA | e

Solution
Phase 1 −
T = { a, c, e }
W1 = { A, C, E } from rules A → a, C → c and E → aA
W2 = { A, C, E } U { S } from rule S → AC
W3 = { A, C, E, S } U ∅
Since W2 = W3, we can derive G’ as −
G’ = { { A, C, E, S }, { a, c, e }, P, {S}}
where P: S → AC, A → a, C → c , E → aA | e
Phase 2 −
Y1 = { S }
Y2 = { S, A, C } from rule S → AC
Y3 = { S, A, C, a, c } from rules A → a and C → c
Y4 = { S, A, C, a, c }
Since Y3 = Y4, we can derive G” as −
G” = { { A, C, S }, { a, c }, P, {S}}
where P: S → AC, A → a, C → c
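Both phases are fixed-point computations, and can be sketched on the same example (uppercase = variables, lowercase = terminals; the helper names are this sketch's own):

```python
# Sketch: the two reduction phases as fixed-point computations.

def generating(rules):
    """Phase 1: variables that derive some terminal string."""
    gen, changed = set(), True
    while changed:
        changed = False
        for lhs, rhs in rules:
            if lhs not in gen and all(c.islower() or c in gen for c in rhs):
                gen.add(lhs)
                changed = True
    return gen

def reachable(rules, start):
    """Phase 2: symbols reachable from the start symbol."""
    reach, changed = {start}, True
    while changed:
        changed = False
        for lhs, rhs in rules:
            if lhs in reach:
                for c in rhs:
                    if c not in reach:
                        reach.add(c)
                        changed = True
    return reach

# P: S -> AC | B, A -> a, C -> c | BC, E -> aA | e
rules = [("S", "AC"), ("S", "B"), ("A", "a"), ("C", "c"),
         ("C", "BC"), ("E", "aA"), ("E", "e")]
gen = generating(rules)
kept = [(l, r) for l, r in rules
        if l in gen and all(c.islower() or c in gen for c in r)]
print(sorted(gen))                    # ['A', 'C', 'E', 'S']
print(sorted(reachable(kept, "S")))   # ['A', 'C', 'S', 'a', 'c']
```

The two printed sets match W3 and Y4 from the worked problem: E survives phase 1 but is dropped in phase 2, and B never generates a terminal string.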

Removal of Unit Productions


Any production rule of the form A → B, where A and B are non-terminals, is called a
unit production.

Removal Procedure −
Step 1 − To remove A → B, add production A → x to the grammar rule whenever B
→ x occurs in the grammar. [x ∈ Terminal, x can be Null]
Step 2 − Delete A → B from the grammar.
Step 3 − Repeat from step 1 until all unit productions are removed.
Problem
Remove unit production from the following −
S → XY, X → a, Y → Z | b, Z → M, M → N, N → a
Solution −
There are 3 unit productions in the grammar −
Y → Z, Z → M, and M → N
At first, we will remove M → N.
As N → a, we add M → a, and M → N is removed.
The production set becomes
S → XY, X → a, Y → Z | b, Z → M, M → a, N → a
Now we will remove Z → M.
As M → a, we add Z→ a, and Z → M is removed.
The production set becomes
S → XY, X → a, Y → Z | b, Z → a, M → a, N → a
Now we will remove Y → Z.
As Z → a, we add Y→ a, and Y → Z is removed.
The production set becomes
S → XY, X → a, Y → a | b, Z → a, M → a, N → a
Now Z, M, and N are unreachable, hence we can remove those.
The final CFG is unit production free −
S → XY, X → a, Y → a | b
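The removal procedure amounts to computing each variable's unit closure (all variables reachable through chains of unit productions) and copying over their non-unit rules. A sketch on the same example:

```python
# Sketch: eliminate unit productions via unit closures.

def unit_closure(rules, a):
    """All variables reachable from `a` through unit productions."""
    variables = set(rules)
    closure, stack = {a}, [a]
    while stack:
        b = stack.pop()
        for r in rules[b]:
            if len(r) == 1 and r in variables and r not in closure:
                closure.add(r)
                stack.append(r)
    return closure

def remove_units(rules):
    """rules: dict variable -> set of RHS strings.  Returns a unit-free grammar."""
    variables = set(rules)
    return {a: {r for b in unit_closure(rules, a) for r in rules[b]
                if not (len(r) == 1 and r in variables)}
            for a in rules}

rules = {"S": {"XY"}, "X": {"a"}, "Y": {"Z", "b"},
         "Z": {"M"}, "M": {"N"}, "N": {"a"}}
out = remove_units(rules)
print(sorted(out["Y"]))  # ['a', 'b']
```

This reproduces the worked result Y → a | b in one pass; Z, M and N become unreachable and can then be dropped by the phase-2 reachability check.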

Removal of Null Productions


In a CFG, a non-terminal symbol ‘A’ is a nullable variable if there is a production A →
ε or there is a derivation that starts at A and finally ends up with
ε: A → .......… → ε

Removal Procedure
Step 1 − Find out nullable non-terminal variables which derive ε.
Step 2 − For each production A → a, construct all productions A → x where x is
obtained from ‘a’ by removing one or multiple non-terminals from Step 1.
Step 3 − Combine the original productions with the result of step 2 and remove ε -
productions.
Problem
Remove null production from the following −
S → ASA | aB | b, A → B, B → b | ε
Solution −
There are two nullable variables − A and B
At first, we will remove B → ε.
After removing B → ε, the production set becomes −
S → ASA | aB | b | a, A → B | b | ε, B → b
Now we will remove A → ε.
After removing A → ε, the production set becomes −
S → ASA | aB | b | a | SA | AS | S, A → B | b, B → b
This is the final production set without null transition.
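The same procedure in code: compute the nullable set as a fixed point, then add every variant of each right-hand side with some subset of its nullable occurrences deleted (`''` encodes ε in this sketch):

```python
from itertools import combinations

# Sketch: null-production removal.  rules: dict variable -> set of RHS
# strings, with '' encoding epsilon; uppercase letters are variables.

def nullables(rules):
    null, changed = set(), True
    while changed:
        changed = False
        for a, rhss in rules.items():
            # '' makes the all(...) vacuously true, so epsilon rules qualify
            if a not in null and any(all(c in null for c in r) for r in rhss):
                null.add(a)
                changed = True
    return null

def remove_null(rules):
    null = nullables(rules)
    new = {a: set() for a in rules}
    for a, rhss in rules.items():
        for r in rhss:
            positions = [i for i, c in enumerate(r) if c in null]
            for k in range(len(positions) + 1):
                for drop in combinations(positions, k):
                    v = "".join(c for i, c in enumerate(r) if i not in drop)
                    if v:                   # discard the epsilon variants
                        new[a].add(v)
    return new

rules = {"S": {"ASA", "aB", "b"}, "A": {"B"}, "B": {"b", ""}}
out = remove_null(rules)
print(sorted(out["S"]))  # ['AS', 'ASA', 'S', 'SA', 'a', 'aB', 'b']
```

The result for S matches the worked example. Note that A keeps the unit rule A → B here; that is handled by the separate unit-production step rather than by this one.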

Chomsky Normal Form



A CFG is in Chomsky Normal Form if the Productions are in the following forms −

• A→a
• A → BC
• S→ε
where A, B, and C are non-terminals and a is terminal.

Algorithm to Convert into Chomsky Normal Form −


Step 1 − If the start symbol S occurs on some right side, create a new start
symbol S’ and a new production S’→ S.
Step 2 − Remove Null productions. (Using the Null production removal algorithm
discussed earlier)
Step 3 − Remove unit productions. (Using the Unit production removal algorithm
discussed earlier)
Step 4 − Replace each production A → B1…Bn where n > 2 with A → B1C where C →
B2 …Bn. Repeat this step for all productions having two or more symbols in the right
side.
Step 5 − If the right side of any production is in the form A → aB where a is a terminal
and A, B are non-terminal, then the production is replaced by A → XB and X → a.
Repeat this step for every production which is in the form A → aB.
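Steps 4 and 5 can be sketched as a single pass that first renames terminals inside long right-hand sides and then binarises them. The fresh variable names X1, X2, ... are an assumption of this sketch, and it presumes null and unit productions are already gone:

```python
from itertools import count

# Sketch of CNF steps 4 and 5.  rules: dict variable -> list of RHS tuples
# of symbols; single uppercase letters are the original variables.

def to_cnf(rules):
    fresh = (f"X{i}" for i in count(1))     # fresh variable names (assumed)
    new = {a: [] for a in rules}
    term_var, extra = {}, {}
    for a, rhss in rules.items():
        for rhs in rhss:
            rhs = list(rhs)
            if len(rhs) >= 2:               # step 5: terminal -> new variable
                for i, s in enumerate(rhs):
                    if s.islower():
                        if s not in term_var:
                            term_var[s] = next(fresh)
                            extra[term_var[s]] = [(s,)]
                        rhs[i] = term_var[s]
            while len(rhs) > 2:             # step 4: binarise long RHSs
                v = next(fresh)
                extra[v] = [(rhs[-2], rhs[-1])]
                rhs = rhs[:-2] + [v]
            new[a].append(tuple(rhs))
    new.update(extra)
    return new

g = {"S": [("A", "S", "A"), ("a", "B"), ("a",)],
     "A": [("b",)], "B": [("b",)]}
cnf = to_cnf(g)
print(cnf["S"])   # [('A', 'X1'), ('X2', 'B'), ('a',)]
print(cnf["X1"])  # [('S', 'A')]
```

This mirrors the worked problem: the helper X1 plays the role of X → SA and X2 the role of Y → a in the final production set below.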

Problem
Convert the following CFG into CNF
S → ASA | aB, A → B | S, B → b | ε

Solution
(1) Since S appears on the R.H.S., we add a new start symbol S0, and S0 → S is added
to the production set, which becomes −
S0 → S, S → ASA | aB, A → B | S, B → b | ε
(2) Now we will remove the null productions −
B → ε and A → ε
After removing B → ε, the production set becomes −
S0 → S, S → ASA | aB | a, A → B | S | ε, B → b
After removing A → ε, the production set becomes −
S0 → S, S → ASA | aB | a | AS | SA | S, A → B | S, B → b
(3) Now we will remove the unit productions.
After removing S → S, the production set becomes −
S0→S, S→ ASA | aB | a | AS | SA, A → B | S, B → b
After removing S0→ S, the production set becomes −
S0→ ASA | aB | a | AS | SA, S→ ASA | aB | a | AS | SA
A → B | S, B → b
After removing A→ B, the production set becomes −
S0 → ASA | aB | a | AS | SA, S→ ASA | aB | a | AS | SA
A→S|b
B→b
After removing A→ S, the production set becomes −
S0 → ASA | aB | a | AS | SA, S→ ASA | aB | a | AS | SA
A → b |ASA | aB | a | AS | SA, B → b
(4) Now we will find the productions with more than two symbols on the R.H.S.
Here, S0 → ASA, S → ASA, and A → ASA violate the two-non-terminal limit on the R.H.S.
Hence we will apply step 4 and step 5 to get the following final production set which
is in CNF −
S0→ AX | aB | a | AS | SA
S→ AX | aB | a | AS | SA
A → b |AX | aB | a | AS | SA
B→b
X → SA
(5) We have to change the productions S0→ aB, S→ aB, A→ aB
And the final production set becomes −
S0→ AX | YB | a | AS | SA
S→ AX | YB | a | AS | SA
A → b | AX | YB | a | AS | SA
B→b
X → SA
Y→a
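Steps 4 and 5 of the algorithm can be sketched as follows. The tuple-based grammar encoding and the generated helper names N1, N2, … are illustrative assumptions; the chain is built from the right end of the body, which is equivalent to repeatedly applying the A → B1C rewriting described in step 4.

```python
def to_cnf_body(productions):
    """Apply CNF steps 4 and 5 to {head: [RHS tuples]}. Assumes ε- and
    unit-productions are already gone; non-terminals are single uppercase
    letters, terminals lowercase. Sketch only."""
    out = {}
    term_var = {}       # terminal -> fresh variable introduced in step 5
    counter = [0]

    def fresh():
        counter[0] += 1
        return f"N{counter[0]}"

    def var_for(t):
        if t not in term_var:
            v = fresh()
            term_var[t] = v
            out.setdefault(v, []).append((t,))   # new production N -> t
        return term_var[t]

    for head, bodies in productions.items():
        for body in bodies:
            body = tuple(body)
            # Step 5: in bodies of length >= 2, replace each terminal
            # by a fresh variable.
            if len(body) >= 2:
                body = tuple(s if s.isupper() else var_for(s) for s in body)
            # Step 4: break bodies longer than 2 into a chain of pairs.
            while len(body) > 2:
                v = fresh()
                out.setdefault(v, []).append(body[-2:])
                body = body[:-2] + (v,)
            out.setdefault(head, []).append(body)
    return out
```

On a body like (A, S, A) this yields A S A → A N1 with N1 → S A, exactly the X → SA helper used in the worked example.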

Greibach Normal Form


A CFG is in Greibach Normal Form if the Productions are in the following forms −
A→b
A → bD1…Dn
S→ε
where A, D1,....,Dn are non-terminals and b is a terminal.

Algorithm to Convert a CFG into Greibach Normal Form


Step 1 − If the start symbol S occurs on some right side, create a new start
symbol S’ and a new production S’ → S.
Step 2 − Remove Null productions. (Using the Null production removal algorithm
discussed earlier)
Step 3 − Remove unit productions. (Using the Unit production removal algorithm
discussed earlier)
Step 4 − Remove all direct and indirect left-recursion.
Step 5 − Do proper substitutions of productions to convert it into the proper form of
GNF.

Problem
Convert the following CFG into GNF
S → XY | Xn | p
X → mX | m
Y → Xn | o

Solution
Here, S does not appear on the right side of any production and there are no unit or
null productions in the production rule set. So, we can skip Step 1 to Step 3.
Step 4
Now after replacing
X in S → XY | Xn | p
with
mX | m
we obtain
S → mXY | mY | mXn | mn | p.
And after replacing
X in Y → Xn | o
with the right side of
X → mX | m
we obtain
Y → mXn | mn | o.
A new production N → n is added for the terminal n that appears in a non-leading position,
and we arrive at the final GNF −
S → mXY | mY | mXN | mN | p
X → mX | m
Y → mXN | mN | o
N→n
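The substitution of step 5, expanding a leading non-terminal by all of its bodies, can be sketched like this. The string-based grammar encoding is an assumption for illustration, and the function presumes that left recursion has already been removed in step 4 (otherwise the loop would not terminate).

```python
def expand_leading(productions, head):
    """One GNF substitution pass on {nonterminal: [RHS strings]}: wherever a
    body of `head` starts with a non-terminal (an uppercase letter), replace
    that non-terminal by each of its bodies. Assumes no left recursion."""
    changed = True
    while changed:
        changed = False
        new_bodies = []
        for body in productions[head]:
            if body[0].isupper():                  # leading non-terminal
                for repl in productions[body[0]]:
                    new_bodies.append(repl + body[1:])
                changed = True
            else:
                new_bodies.append(body)            # already starts with a terminal
        productions[head] = new_bodies
    return productions
```

Applied to S → XY | Xn | p with X → mX | m, it produces S → mXY | mY | mXn | mn | p, matching the step above.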

Pumping Lemma for CFG


Lemma
If L is a context-free language, there is a pumping length p such that any string w ∈
L of length ≥ p can be written as w = uvxyz, where vy ≠ ε, |vxy| ≤ p, and for all i ≥ 0,
uv^i xy^i z ∈ L.

Applications of Pumping Lemma


The pumping lemma is used to show that a language is not context-free. Let us
take an example and show how it works.

Problem
Find out whether the language L = {0^n 1^n 2^n | n ≥ 1} is context free or not.

Solution
Assume L is context-free. Then L must satisfy the pumping lemma.
Let p be the pumping length and take w = 0^p 1^p 2^p.
Write w = uvxyz, where |vxy| ≤ p and vy ≠ ε.
Since |vxy| ≤ p, the substring vxy cannot contain both 0s and 2s, because the last 0 and
the first 2 are at least (p+1) positions apart. There are two cases −
Case 1 − vxy has no 2s. Then vy consists only of 0s and 1s, so uxz (the case i = 0),
which would have to be in L, has p 2s but fewer than p 0s or fewer than p 1s.
Case 2 − vxy has no 0s. Symmetrically, uxz has p 0s but fewer than p 1s or fewer than p 2s.
In both cases a contradiction occurs.
Hence, L is not a context-free language.
Hence, L is not a context-free language.
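For a small, fixed pumping length the argument can be checked mechanically. The sketch below (with an assumed p = 4) enumerates every decomposition w = uvxyz allowed by the lemma and verifies that pumping with i = 0 or i = 2 always leaves the language.

```python
def in_L(s):
    """Membership test for L = {0^n 1^n 2^n | n >= 1}."""
    n = len(s) // 3
    return n >= 1 and s == "0" * n + "1" * n + "2" * n

def pumping_fails_everywhere(w, p):
    """For every legal split w = uvxyz (|vxy| <= p, vy != ε), check that some
    i in {0, 2} gives u v^i x y^i z not in L. True means no split survives."""
    n = len(w)
    for a in range(n + 1):                         # u = w[:a]
        for b in range(a, min(a + p, n) + 1):      # vxy = w[a:b], |vxy| <= p
            for c in range(a, b + 1):              # v = w[a:c]
                for d in range(c, b + 1):          # x = w[c:d], y = w[d:b]
                    u, v, x, y, z = w[:a], w[a:c], w[c:d], w[d:b], w[b:]
                    if v + y == "":
                        continue                   # vy must be non-empty
                    if all(in_L(u + v * i + x + y * i + z) for i in (0, 2)):
                        return False               # this split can be pumped
    return True

p = 4
w = "0" * p + "1" * p + "2" * p
# Every legal decomposition of w fails to pump, so w witnesses the contradiction.
```

This does not replace the proof (it fixes one p), but it illustrates exactly the case analysis above.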

Pushdown Automata Introduction


Basic Structure of PDA


A pushdown automaton is a way to implement a context-free grammar, in the same way
that we design a DFA for a regular grammar. A DFA can remember only a finite amount of
information, whereas a PDA can remember an infinite amount of information.
Basically a pushdown automaton is −
"Finite state machine" + "a stack"
A pushdown automaton has three components −

• an input tape,
• a control unit, and
• a stack with infinite size.
The stack head scans the top symbol of the stack.
A stack does two operations −
• Push − a new symbol is added at the top.
• Pop − the top symbol is read and removed.
A PDA may or may not read an input symbol, but it has to read the top of the stack
in every transition.

A PDA can be formally described as a 7-tuple (Q, ∑, S, δ, q0, I, F) −


• Q is the finite number of states
• ∑ is input alphabet
• S is stack symbols
• δ is the transition function: Q × (∑ ∪ {ε}) × S → finite subsets of Q × S*
• q0 is the initial state (q0 ∈ Q)
• I is the initial stack top symbol (I ∈ S)
• F is the set of accepting states (F ⊆ Q)
The following diagram shows a transition in a PDA from a state q1 to state q2, labeled
as a,b → c −
This means at state q1, if we encounter an input symbol ‘a’ and the top symbol of the stack
is ‘b’, then we pop ‘b’, push ‘c’ on top of the stack and move to state q2.

Terminologies Related to PDA

Instantaneous Description
The instantaneous description (ID) of a PDA is represented by a triplet (q, w, s) where
• q is the state
• w is unconsumed input
• s is the stack contents
Turnstile Notation
The "turnstile" notation is used for connecting pairs of ID's that represent one or many
moves of a PDA. The process of transition is denoted by the turnstile symbol "⊢".
Consider a PDA (Q, ∑, S, δ, q0, I, F). A transition can be mathematically represented
by the following turnstile notation −
(p, aw, Tβ) ⊢ (q, w, αβ)
This implies that while taking a transition from state p to state q, the input
symbol ‘a’ is consumed, and the top of the stack ‘T’ is replaced by a new string ‘α’.
Note − If we want zero or more moves of a PDA, we have to use the symbol (⊢*) for
it.

Pushdown Automata Acceptance


There are two different ways to define PDA acceptability.


Final State Acceptability
In final state acceptability, a PDA accepts a string when, after reading the entire
string, the PDA is in a final state. From the starting state, we can make moves that
end up in a final state with any stack values. The stack values are irrelevant as long
as we end up in a final state.
For a PDA (Q, ∑, S, δ, q0, I, F), the language accepted by the set of final states F is −
L(PDA) = {w | (q0, w, I) ⊢* (q, ε, x), q ∈ F}
for any stack contents x.

Empty Stack Acceptability


Here a PDA accepts a string when, after reading the entire string, the PDA has
emptied its stack.
For a PDA (Q, ∑, S, δ, q0, I, F), the language accepted by the empty stack is −
L(PDA) = {w | (q0, w, I) ⊢* (q, ε, ε), q ∈ Q}

Example
Construct a PDA that accepts L = {0^n 1^n | n ≥ 0}

Solution

This language contains L = {ε, 01, 0011, 000111, …}

Here, the number of 0s and the number of 1s have to be the same.
• Initially we put a special symbol ‘$’ into the empty stack.
• Then at state q2, if we encounter input 0, we push 0 onto the stack. This
may iterate.
• When we encounter input 1 and the top of the stack is 0, we pop this 0
and move to state q3.
• Then at state q3, for each further input 1 with 0 on top of the stack, we
pop the top element. This may also iterate.
• If the special symbol ‘$’ is encountered at the top of the stack, it is popped
out and the PDA finally goes to the accepting state q4.
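A direct simulation of this PDA makes the behaviour concrete. The state names follow the description above (q2 for pushing, q3 for popping); the Python encoding itself is only an illustrative sketch, not a formal construction.

```python
def accepts_0n1n(s):
    """Simulate the PDA for L = {0^n 1^n | n >= 0} sketched above."""
    stack = ["$"]                        # initially push the special symbol $
    state = "q2"
    for ch in s:
        if state == "q2" and ch == "0":
            stack.append("0")            # push a 0 for every 0 read
        elif ch == "1" and stack[-1] == "0":
            stack.pop()                  # pop one 0 per 1 read
            state = "q3"
        else:
            return False                 # dead configuration
    # accept iff $ is back on top with the input fully consumed
    return stack == ["$"]
```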
Example
Construct a PDA that accepts L = { ww^R | w ∈ (a+b)* }
Solution

Initially we put a special symbol ‘$’ into the empty stack. At state q2, w is read and each
symbol is pushed. At state q3, each a or b is popped when it matches the current input
symbol; if any other input is given, the PDA goes to a dead state. When we reach the
special symbol ‘$’, we go to the accepting state q4.

PDA & Context-Free Grammar


If a grammar G is context-free, we can build an equivalent nondeterministic PDA


which accepts the language that is produced by the context-free grammar G. A
parser can be built for the grammar G.
Also, if P is a pushdown automaton, an equivalent context-free grammar G can be
constructed where
L(G) = L(P)
In the next two topics, we will discuss how to convert from CFG to PDA and vice
versa.

Algorithm to find PDA corresponding to a given CFG


Input − A CFG, G = (V, T, P, S)
Output − Equivalent PDA, P = (Q, ∑, S, δ, q0, I, F)
Step 1 − Convert the productions of the CFG into GNF.
Step 2 − The PDA will have only one state {q}.
Step 3 − The start symbol of CFG will be the start symbol in the PDA.
Step 4 − All non-terminals of the CFG will be the stack symbols of the PDA and all
the terminals of the CFG will be the input symbols of the PDA.
Step 5 − For each production of the form A → aX, where a is a terminal and X is a
(possibly empty) string of non-terminals, add the transition (q, X) to δ(q, a, A).

Problem
Construct a PDA from the following CFG.
G = ({S, X}, {a, b}, P, S)
where the productions are −
S → XS | ε , X → aXb | Xb | ab

Solution
Let the equivalent PDA,
P = ({q}, {a, b}, {a, b, X, S}, δ, q, S)
where δ −
δ(q, ε , S) = {(q, XS), (q, ε )}
δ(q, ε , X) = {(q, aXb), (q, Xb), (q, ab)}
δ(q, a, a) = {(q, ε )}
δ(q, b, b) = {(q, ε )}

Algorithm to find CFG corresponding to a given PDA


Input − A PDA, P = (Q, ∑, S, δ, q0, I, F)
Output − Equivalent CFG, G = (V, T, P, S), where the non-terminals of the
grammar G will be {Xwx | w, x ∈ Q} and the start symbol will be Xq0,F.
Step 1 − For every w, x, y, z ∈ Q, m ∈ S and a, b ∈ ∑, if δ(w, a, ε) contains (y, m) and
δ(z, b, m) contains (x, ε), add the production rule Xwx → a Xyz b to grammar G.
Step 2 − For every w, x, y ∈ Q, add the production rule Xwx → Xwy Xyx to grammar G.
Step 3 − For every w ∈ Q, add the production rule Xww → ε to grammar G.
Pushdown Automata & Parsing


Parsing is used to derive a string using the production rules of a grammar, and thereby
to check the acceptability of the string. A compiler uses a parser to check whether or not
a string is syntactically correct. A parser takes the input and builds a parse tree.
A parser can be of two types −
• Top-Down Parser − Top-down parsing starts from the top with the
start-symbol and derives a string using a parse tree.
• Bottom-Up Parser − Bottom-up parsing starts from the bottom with
the string and comes to the start symbol using a parse tree.

Design of Top-Down Parser


For top-down parsing, a PDA has the following four types of transitions −
• Push the start symbol ‘S’ into the stack.
• Pop the non-terminal at the top of the stack and push the right-hand
side of one of its productions.
• If the top symbol of the stack matches the input symbol being read,
pop it.
• If the input string is fully read and the stack is empty, go to the final
state ‘F’.
Example
Design a top-down parser for the expression "x+y*z" for the grammar G with the
following production rules −
P: S → S+X | X, X → X*Y | Y, Y → (S) | id
Solution
If the PDA is (Q, ∑, S, δ, q0, I, F), then the top-down parsing is −
(x+y*z, I) ⊢ (x+y*z, SI) ⊢ (x+y*z, S+XI) ⊢ (x+y*z, X+XI)
⊢ (x+y*z, Y+XI) ⊢ (x+y*z, x+XI) ⊢ (+y*z, +XI) ⊢ (y*z, XI)
⊢ (y*z, X*YI) ⊢ (y*z, Y*YI) ⊢ (y*z, y*YI) ⊢ (*z, *YI) ⊢ (z, YI) ⊢ (z, zI) ⊢ (ε, I)
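The nondeterministic choice of which production to push can be simulated by a breadth-first search over PDA configurations (input position, stack). In the sketch below, Y → id is specialised to Y → x | y | z so the identifiers of the example can be matched directly, and the stack bound is an assumption that tames the left-recursive rule S → S+X; both are illustrative choices, not part of the formal construction.

```python
from collections import deque

# Grammar P: S → S+X, S → X, X → X*Y, X → Y, Y → (S), Y → id (here: x, y, z)
PRODS = {"S": ["S+X", "X"], "X": ["X*Y", "Y"], "Y": ["(S)", "x", "y", "z"]}

def top_down_accepts(inp, max_stack=12):
    """BFS over configurations (position, stack string); top of stack is stack[0].
    max_stack bounds expansion of the left-recursive rule S → S+X."""
    start = (0, "S")                          # push the start symbol
    seen = {start}
    queue = deque([start])
    while queue:
        pos, stack = queue.popleft()
        if pos == len(inp) and stack == "":
            return True                       # input consumed, stack empty
        if stack == "" or len(stack) > max_stack:
            continue
        top, rest = stack[0], stack[1:]
        if top in PRODS:                      # expand: pop non-terminal, push a RHS
            nxt = [(pos, body + rest) for body in PRODS[top]]
        elif pos < len(inp) and inp[pos] == top:
            nxt = [(pos + 1, rest)]           # match: pop the terminal
        else:
            nxt = []                          # dead configuration
        for conf in nxt:
            if conf not in seen:
                seen.add(conf)
                queue.append(conf)
    return False
```

The search finds exactly the derivation traced above for "x+y*z".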

Design of a Bottom-Up Parser


For bottom-up parsing, a PDA has the following four types of transitions −
• Push the current input symbol into the stack.
• Replace the right-hand side of a production at the top of the stack with
its left-hand side.
• If the top of the stack element matches with the current input symbol,
pop it.
• If the input string is fully read and only if the start symbol ‘S’ remains in
the stack, pop it and go to the final state ‘F’.
Example
Design a bottom-up parser for the expression "x+y*z" for the grammar G with the
following production rules −
P: S → S+X | X, X → X*Y | Y, Y → (S) | id
Solution
If the PDA is (Q, ∑, S, δ, q0, I, F), then the bottom-up parsing is −
(x+y*z, I) ⊢ (+y*z, xI) ⊢ (+y*z, YI) ⊢ (+y*z, XI) ⊢ (+y*z, SI)
⊢(y*z, +SI) ⊢ (*z, y+SI) ⊢ (*z, Y+SI) ⊢ (*z, X+SI) ⊢ (z, *X+SI)
⊢ (ε, z*X+SI) ⊢ (ε, Y*X+SI) ⊢ (ε, X+SI) ⊢ (ε, SI)

Non-deterministic Pushdown Automata


The non-deterministic pushdown automaton (NPDA) is very similar to an NFA. Every
language accepted by a deterministic PDA (DPDA) is also accepted by some NPDA, but
there are context-free languages that are accepted only by an NPDA and not by any
DPDA. Thus the NPDA is strictly more powerful than the DPDA.

Example:
Design a PDA for palindrome strings.

Solution:


Suppose the language consists of strings L = {aba, aa, bb, bab, bbabb, aabaa, …}. A
string can be an odd or an even palindrome. The idea behind the PDA is to push symbols
onto the stack up to the middle of the string, and then to pop one symbol for each
remaining input symbol, comparing the popped symbol with the symbol being read.
When we reach the end of the input, we expect the stack to be empty.

This PDA is non-deterministic because guessing the middle of the string, so that the
right half can be matched in reverse against the pushed left half, requires
non-deterministic moves. Here is the ID.

Simulation of abaaba

(q1, abaaba, Z) ⊢ (q1, baaba, aZ) ⊢ (q1, aaba, baZ) ⊢ (q1, aba, abaZ)
⊢ (q2, ba, baZ) ⊢ (q2, a, aZ) ⊢ (q2, ε, Z) ⊢ (q2, ε, ε) Accept
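The non-deterministic guess of the middle can be simulated by exploring both options at every step. The recursive search below stands in for the PDA's non-determinism; the Python encoding is an illustrative sketch, and it accepts exactly the palindromes over {a, b}.

```python
def npda_accepts_palindrome(s):
    """Simulate the palindrome NPDA: in q1 push input symbols; guess the
    middle (with or without a centre symbol) and switch to q2; in q2 pop a
    symbol iff it matches the input; accept with input and stack exhausted."""
    def run(state, pos, stack):
        if state == "q1":
            if run("q2", pos, stack):            # guess: even-length middle
                return True
            if pos < len(s):
                if run("q2", pos + 1, stack):    # guess: s[pos] is the centre
                    return True
                # otherwise keep pushing the left half
                return run("q1", pos + 1, stack + s[pos])
            return False
        # state q2: pop while matching the reversed left half
        if pos == len(s):
            return stack == ""
        if stack and stack[-1] == s[pos]:
            return run("q2", pos + 1, stack[:-1])
        return False
    return run("q1", 0, "")
```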
Parse tree
o A parse tree is the graphical representation of a derivation; its nodes are labeled with
terminal or non-terminal symbols.
o In parsing, the string is derived from the start symbol, and the root of the parse tree is
that start symbol.
o The parse tree follows the precedence of operators: the deepest sub-tree is traversed
first, so the operator in a parent node has lower precedence than the operator in its
sub-tree.

The parse tree follows these points:


o All leaf nodes have to be terminals.
o All interior nodes have to be non-terminals.
o An in-order traversal gives the original input string.

Example:
Production rules:

1. T → T + T | T * T
2. T → a | b | c

Input:

a * b + c

(The parse tree for a * b + c is built step by step; the intermediate tree diagrams are
not reproduced here.)
Deterministic Pushdown Automata
The deterministic pushdown automaton (DPDA) is a variation of the pushdown automaton
that accepts exactly the deterministic context-free languages.
A language L(A) is accepted by a deterministic pushdown automaton if and only if
there is a single computation from the initial configuration to an accepting one for
every string belonging to L(A). It is not as powerful as a non-deterministic pushdown
automaton. That is why it is less widely used, and is applied only where determinism
is easy to achieve.
A PDA is said to be deterministic if its transition function δ(q, a, X) has at most one
member for every q ∈ Q, a ∈ Σ ∪ {ε} and X ∈ Γ, and whenever δ(q, ε, X) is non-empty,
δ(q, a, X) is empty for every a ∈ Σ.

So, for a deterministic PDA, there is at most one transition possible in any
combination of state, input symbol and stack top.
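Both clauses of this condition can be checked mechanically over a finite transition table. The dictionary encoding below, mapping (state, input-or-ε, stack-top) to a set of (next-state, push-string) moves, is an assumption made for illustration.

```python
def is_deterministic(delta):
    """Check the DPDA condition on a transition relation given as
    {(state, input_or_eps, stack_top): {(next_state, push_string), ...}}.
    ε is written as the empty string ""."""
    for (q, a, X), moves in delta.items():
        if len(moves) > 1:
            return False        # more than one choice for one configuration
        if a != "" and len(delta.get((q, "", X), ())) > 0:
            return False        # an ε-move competes with an input move
    return True
```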
Formal Definition of Deterministic PDA
A deterministic PDA can be given as a 5-tuple −

M = (Σ, Γ, Q, δ, q)

Σ − a finite set of input symbols,

Γ − a finite set of stack symbols,
Q − a finite set of states,
q − the start state,
δ − a transition function, denoted as −
δ : Q × (Σ ∪ {ε}) × Γ → Q × Γ*

Non-deterministic Pushdown Automata

A non-deterministic PDA can accept languages that a deterministic PDA cannot. It is
more powerful than a deterministic PDA, which is why pushdown automata are allowed
to be non-deterministic in general.
A non-deterministic pushdown automaton is
a 7-tuple
M = (Q,Σ,Γ,δ,q0,Z0,F)

Q − the finite set of states,

Σ − the finite set of input symbols,
Γ − the finite set of stack symbols,
δ − the transition function,
q0 − the initial state,
Z0 − the stack start symbol,
F − the set of final states.
