Atometa Book
Preface
Features of the Book
Brief Contents
DETAILED CONTENTS
1. PRELIMINARIES
1.1 Introduction
1.2 Basic Concepts
1.2.1 Symbol
1.2.2 Alphabet
1.2.3 String (or Word)
1.3 Sets
1.3.1 Operations
1.3.2 Cardinality
1.3.3 Countable and Uncountable Sets
1.4 Relations
1.4.1 Properties
1.4.2 Closure Properties
1.5 Graphs
1.5.1 Directed Graph (or Digraph)
1.5.2 Tree
1.6 Languages
1.6.1 Formal Language
1.7 Mathematical Induction
2. FINITE STATE MACHINES
2.1 Introduction
2.1.1 Concept of Basic Machine
2.2 Finite State Machine
2.2.1 Examples
2.2.2 Transition Diagram (or Transition Graph)
2.2.3 Transition Matrix
2.3 Finite Automata
2.3.1 Transition Graph
2.3.2 Functions
2.3.3 Acceptance of a String
2.3.4 Acceptance of a Language
2.3.5 Some Examples of FA as Acceptor
2.3.6 FA as Finite Control
2.4 Deterministic Finite Automata
2.5 Non-deterministic Finite Automata
2.6 Equivalence of NFA and DFA
2.6.1 NFA to DFA Conversion (Method I)
2.6.2 DFA Minimization
2.6.3 NFA to DFA Conversion (Method II)
2.7 NFA with ε-Transitions
2.7.1 Significance of NFA with ε-Transitions
2.7.2 State Transition Table for NFA with ε-Transitions
2.7.3 ε-Closure of a State
2.8 Equivalence of NFA and NFA with ε-Transitions
2.9 Equivalence of DFA and NFA with ε-Transitions
2.9.1 Indirect Conversion Method
2.9.2 Direct Conversion Method
2.10 Finite Automata with Output
2.10.1 Moore Machine
2.10.2 Mealy Machine
2.10.3 Finite State Transducer
2.11 Equivalence of Moore and Mealy Machines
2.11.1 Moore to Mealy Conversion
2.11.2 Mealy to Moore Conversion
2.11.3 Additional Examples on Moore and Mealy Machines
2.12 FSM Equivalence
2.12.1 Moore's Algorithm
2.13 DFA Minimization (Another Approach)
2.14 Properties and Limitations of FSM
Preliminaries
LEARNING OBJECTIVES
After completing this chapter, the reader will be able to
understand the following:
1.1 INTRODUCTION
Mathematical preliminaries, such as set theory, different operations on sets, relations, and
graphs, and principle of mathematical induction, play an important role in efficiently grasp-
ing the concepts of theory of computation. The aim of this chapter is to acquaint the reader
with all such necessary concepts and serve as a quick reference guide.
1.2.1 Symbol
A symbol is an abstract or user-defined entity. It is analogous to a point
in geometry, and
cannot be formally defined. For example, letters, digits, or any other characters that one
wishes to consider as a part of the language that is being designed, are said to be symbols.
It is the basic unit (or constituent) of any language.
1.2.2 Alphabet
An alphabet is a finite set of symbols. It is denoted by Σ.
2 THEORY OF COMPUTATION
For example:
D = {0, 1, 2, ..., 9}
X = {+, -, *, /, %}
Y = {a, b, D, $, @}
A 'finite set' has a finite (bounded or limited) number of elements. For example, set D
given here consists of numbers 0 to 9, which is 10 elements; similarly, set X and set Y
have five elements each.
Here, sets D, X, and Y are all alphabets. Note that alphabet Y contains heterogeneous
(not of the same type) entities.
In natural languages such as English, we call each character (or symbol) an alphabet.
However, in computer science, a slightly different nomenclature is followed: here, a finite
set of all symbols is known as an alphabet.
As the symbols are user-defined, one can decide any alphabet set of his/her choice and
define a language over it. Just as English and French might have similar symbols, but Marathi
and Japanese have altogether different symbols, programming (or formal) languages, such as
C, C++, Java, and C#, might have differences in their respective set of symbols. Therefore,
it is up to the language designer or programmer to choose suitable symbols.
A set is defined as a collection of well-defined and distinct objects. These objects, or enti-
ties, are called the members (or elements) of the set.
For example, consider a set A such that:
A = {1, 2, 3}
1.3.1 Operations
There are many operations that are typically performed over sets, namely, union, intersection,
difference, concatenation, and closure. Let us learn about each of them in detail.
Set Union
The union of two sets is defined as:
A ∪ B = {x | x ∈ A or x ∈ B}
A union of two sets includes elements from both the sets. If a given
element exists in both the sets, then it appears only once in their
union; this operation is analogous to the Boolean OR operation.
Any element in the union of two sets A and B is an element of
either A or B, or both. The Venn diagram for the union operation is
shown in Fig. 1.1. The grey area in the figure represents the result
of the union.
Figure 1.1 Venn diagram for A ∪ B
Note: Venn diagrams are used to represent logical relations among sets diagrammatically. A
Venn diagram is constructed with a collection of simple closed curves drawn in a plane, each
representing a set.
For example:
Note that although the elements 1 and 3 are members of both the sets A and B,
they appear only once in the set A ∪ B.
Set Intersection
The intersection of two sets is defined as:
A ∩ B = {x | x ∈ A and x ∈ B}
Set Difference
The difference of two sets is defined as:
A - B = {x | x ∈ A and x ∉ B}
or A - B = A - (A ∩ B)
Hence, the difference set (A - B) includes those elements in set A that
are not present in set B. The Venn diagram in Fig. 1.3 represents the
difference operation (A - B). The grey area in the figure represents the
result of the operation.
For example:
If A = {1, 2, 3, 7, 9} and B = {1, 3, 4, 6}, then, A - B = {2, 7, 9}.
The set difference (A - B) is also called the relative complement of
B in A.
Figure 1.3 Venn diagram for A - B
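The three operations defined so far map directly onto Python's built-in set type. The following is a minimal sketch, not part of the original text; it reuses the sets A and B from the difference example, and the variable names are chosen here only for illustration.

```python
# Sets from the difference example in the text.
A = {1, 2, 3, 7, 9}
B = {1, 3, 4, 6}

union = A | B          # elements in A or B (or both), each listed once
intersection = A & B   # elements in both A and B
difference = A - B     # elements of A that are not in B

# The alternative form of the difference: A - B = A - (A ∩ B)
difference_alt = A - (A & B)

print(sorted(union))
print(sorted(intersection))
print(sorted(difference))
```

Because a Python set stores each member only once, the shared elements 1 and 3 appear a single time in the union, as the text notes.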
Cartesian Product
The Cartesian product of two sets is defined as:
A × B = {(a, b) | a ∈ A and b ∈ B, ∀a and ∀b}
It defines the association of every element of set A with each element of set B.
For example:
{1, 2} × {red, blue} = {(1, red), (1, blue), (2, red), (2, blue)}
{a, b} × {a, b} = {(a, a), (a, b), (b, a), (b, b)}
In the Cartesian product (A × B), each pair of the form (a, b) is called an ordered pair,
because there is an ordering imposed on how to formulate the pair; the first symbol comes
from set A, while the second comes from set B. An ordered pair is also referred to as a
tuple.
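As an illustration, the Cartesian product can be computed with `itertools.product` from the Python standard library. This is a small sketch; the helper name `cartesian_product` is an assumption introduced here, not something from the original text.

```python
from itertools import product


def cartesian_product(a, b):
    """Return A x B as a set of ordered pairs (Python tuples)."""
    return set(product(a, b))


pairs = cartesian_product({1, 2}, {"red", "blue"})
square = cartesian_product({"a", "b"}, {"a", "b"})

print(sorted(pairs))
print(sorted(square))
```

Each pair is a tuple, so (1, "red") and ("red", 1) are different objects, which mirrors the ordering imposed on an ordered pair.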
Subset
If every member of set A is a member of set B, then set A is said to be a subset of set B.
We write this as A ⊆ B
Here, set B is said to be the superset of set A.
For example: {1, 4} ⊆ {1, 2, 3, 4, 5}
Sets A and B are said to be equal if they have the same members.
Note: The empty set ∅ is a subset of every set, and every set is a subset of itself. For example,
let us consider a set A. Then,
∅ ⊆ A
A ⊆ A
Power Set
The power set of a set A is the set of all subsets of A, including A itself and the empty set,
∅. It is denoted by 2^A.
For example, if A = {0, 1, 2},
then, 2^A = {∅, {0}, {1}, {2}, {0, 1}, {0, 2}, {1, 2}, {0, 1, 2}}
We see that A in this example has 3 elements, while 2^A has 2^3 = 8 elements. In short, if we
denote the number of elements in A as '|A|', then the number of elements in 2^A is:
|2^A| = 2^|A|
In general, the power set of a finite set with n elements has 2^n
elements.
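The counting argument can be checked by enumerating subsets directly. The sketch below builds the power set from `itertools.combinations`; the function name `power_set` and the use of `frozenset` (so that subsets can themselves be members of a set) are choices made here for illustration.

```python
from itertools import combinations


def power_set(a):
    """Return the set of all subsets of a, each subset as a frozenset."""
    items = list(a)
    return {
        frozenset(subset)
        for size in range(len(items) + 1)   # subset sizes 0, 1, ..., |A|
        for subset in combinations(items, size)
    }


A = {0, 1, 2}
subsets = power_set(A)
print(len(subsets))  # number of subsets, expected to equal 2 ** len(A)
```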
Complement of a Set
A set that contains every element under consideration in a given context is called a
universal set, and is denoted by U.
The complement of any set A is defined as:
A' = U - A
Figure 1.5 shows a diagrammatic representation of the universal set
U; the grey area denotes A'.
Figure 1.5 Venn diagram for A' (Complement of A)
Note: Some important observations:
• A ∩ A' = ∅
• (A ∪ B) ∪ C = A ∪ (B ∪ C)
• A ⊆ (A ∪ B)
• A ∪ A = A
• A ∪ ∅ = A
• A ∩ B = B ∩ A
• (A ∩ B) ∩ C = A ∩ (B ∩ C)
• (A ∩ B) ⊆ A
• A ∩ A = A
• A ∩ ∅ = ∅
• A ∪ B = B ∪ A
• A ∪ U = U
• A - A = ∅
• U' = ∅
• ∅' = U
• A - B = A ∩ B'
• |A × B| = |B × A|
• A × ∅ = ∅
• A × (B ∪ C) = (A × B) ∪ (A × C)
• (A ∪ B) × C = (A × C) ∪ (B × C)
Set Concatenation
Concatenation of two sets A and B is defined as:
A · B = AB = {x | x = ab, ∀a ∈ A and ∀b ∈ B}
This means that every string from set A is concatenated with each string in set B.
For example, if A = {000, 111}, and B = {101, 010},
then, AB = {000101, 000010, 111101, 111010}
Note that
1. AB ≠ BA
For these example sets,
BA = {101000, 101111, 010000, 010111}
2. A(BC) = (AB)C, where, A, B, and C are sets.
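Set concatenation is easy to reproduce with a set comprehension over strings. The following sketch uses the example sets A and B from the text; the helper name `concat` and the third set C are assumptions introduced only to exercise the associativity note.

```python
def concat(a, b):
    """Concatenate every string of a with every string of b."""
    return {x + y for x in a for y in b}


A = {"000", "111"}
B = {"101", "010"}
C = {"0", "1"}  # hypothetical third set, used only for the associativity check

AB = concat(A, B)
BA = concat(B, A)

print(sorted(AB))
print(sorted(BA))
print(AB == BA)                                        # order matters
print(concat(A, concat(B, C)) == concat(concat(A, B), C))  # grouping does not
```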
Set Closure
Closure of a set is defined as:
S* = S^0 ∪ S^1 ∪ S^2 ∪ ...,
where, S^0 = {ε},
and S^i = S^(i-1) · S, for i > 0.
Closure of a set is thus a repetitive concatenation of the set to itself.
For example:
If S = {01, 11}, then, from the definition:
S^0 = {ε}
S^1 = S^0 · S = {ε} · {01, 11}
= {01, 11}
S^2 = S^1 · S
= {01, 11} · {01, 11}
= {0101, 0111, 1101, 1111}
S^3 = S^2 · S
= {0101, 0111, 1101, 1111} · {01, 11}
= {010101, 010111, 011101, 011111, 110101, 110111, 111101, 111111}
Thus, we have:
S* = S^0 ∪ S^1 ∪ S^2 ∪ S^3 ∪ ...
= {ε, 01, 11, 0101, 0111, 1101, 1111, 010101, 010111, ...}
Similarly, if S = {0, 1},
then, S* = {ε, 0, 1, 00, 01, 10, 11, 000, ...}.
Note: S* essentially records all possible combinations of the strings from set S. S* is an
infinite set whenever S contains at least one non-empty string, even if the original set S is finite.
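Since S* is generally infinite, a program can only enumerate a finite part of it. The sketch below computes the union of the powers of S from the zeroth up to the k-th, following the recursive definition above. The function names and the cut-off parameter `k` are assumptions made for this illustration, and the empty Python string `""` stands in for the empty string ε.

```python
def concat(a, b):
    """Concatenate every string of a with every string of b."""
    return {x + y for x in a for y in b}


def closure_up_to(s, k):
    """Return S^0 ∪ S^1 ∪ ... ∪ S^k, a finite approximation of S*."""
    power = {""}          # S^0 = {ε}
    result = set(power)
    for _ in range(k):
        power = concat(power, s)   # S^i = S^(i-1) · S
        result |= power
    return result


print(sorted(closure_up_to({"01", "11"}, 2), key=lambda w: (len(w), w)))
print(sorted(closure_up_to({"0", "1"}, 2), key=lambda w: (len(w), w)))
```

Raising `k` adds longer strings but never removes any, which is the sense in which S* is the limit of this growing union.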
1.3.2 Cardinality
Cardinality of a set is defined as the number of elements in the set. If A is any set, then its
cardinality is denoted as '|A|'.
1. Two sets S1 and S2 are said to have the same cardinality if there is a one-to-one mapping
of the elements of S1 onto S2. One-to-one mapping among the sets is defined if one
element of a given set is associated with at most one element of the other set. Refer to
Section 1.4 for more details.
2. For finite sets, if S1 is a proper subset of S2, then S1 and S2 have different cardinalities.
In fact, |S1| < |S2|.
3. Statement 2 is not always true for infinite sets. For example:
S1 = set of positive even integers, that is, S1 = {2, 4, 6, 8, ...}
S2 = set of all positive integers, that is, S2 = {1, 2, 3, 4, 5, 6, 7, ...}
We observe that S1 is a proper subset of S2. However, they have the same cardinality,
because there exists a one-to-one and onto mapping between the positive even integers
and all positive integers (refer to Fig. 1.6). As we can observe, every element of
S2 is mapped to exactly one element of S1 (one-to-one) and all the elements of
S1 are mapped to at least one element of S2 (onto).
Figure 1.6 Subset having same cardinality
Example 1.1 Show that if set S is uncountable and set T is countable, then (S - T) is
uncountable.
Solution Let us consider the set S as a set of real numbers, R, which is an uncountable set,
and T as the set of all integers, I, which is a countable set. As we know, all integers
are included in the set of real numbers; hence, I ⊆ R.
If we consider the set (R - I), only the integers will be removed from the set R;
this means, (R - I) still consists of all real numbers, except the integers. Hence, the
property of the real numbers that between any two real numbers there are infinitely many real
numbers still holds true, and the ability to find the successor is missing even with (R - I).
Hence, (R - I) still remains uncountable.
Therefore, if S is uncountable and T is countable, then (S - T) is uncountable.
1.4 RELATIONS
A relation is a set of ordered pairs (or tuples), where the first component of the pair is
from the set called the domain, and the second component is from the set called the
range (or co-domain). In a relation, if the domain and range are the same set S, then
we say that the relation is on set S (refer to Fig. 1.8).
A binary relation can be defined as follows:
A R B = {(a, b) | a ∈ A and b ∈ B}
where, set A is the domain set, and set B is the range set.
For example, the relation in Fig. 1.7 can be listed as:
R = {(2, 1), (4, 2), (6, 3), ...}
Figure 1.8 Relation
Note: In this section, we will discuss binary relations. There is a more generic form of relations,
which is known as n-ary relations or finitary relations.
The associations, one-to-one (Fig. 1.9a) and many-to-one (Fig. 1.9c), are referred to as
functions, as they yield a single value from the range (or output) set. Domain and range
can thus be respectively envisioned as input and output in terms of programming nomen-
clature. Most of the programming languages define functions, which have a single return
value of some return type. For example, refer to the following function that returns the
balance for a bank account.
Integer getBalance (String bankAccNumber);
Here, the set of all bank account numbers is a domain, while the integer amount that it
returns is the range.
The associations, one-to-many (Fig. 1.9b) and many-to-many (Fig. 1.9d), are sometimes
referred to as multi-valued functions, or set-valued functions. For example, every real number
greater than zero has two square roots; the square roots of 4, for instance, are {+2, -2}.
If every member of the range set is associated with at least one element of the domain
set, then the relation is considered as an onto relation; otherwise it is considered as an
into relation. For any machine (or program) to be predictable (or deterministic), it is
required to know what input (domain) generates what output. In other words, every
machine (or program) needs to be an onto relation. We observe that none of the asso-
ciations in Fig. 1.9 are onto relations—they are all into relations. Figure 1.10 depicts
1.4.1 Properties
If R is a relation on set S (domain and range is the same set S), then it is said to be:
1. Reflexive, if aRa holds for all a in S
2. Transitive, if aRb and bRc imply aRc, for all a, b, and c in S
3. Symmetric, if aRb implies bRa, for all a and b in S
4. Anti-symmetric, if aRb and bRa together imply a = b, for all a and b in S
If a relation is reflexive, transitive, as well as symmetric, then it is said to be an equivalence
relation. If a relation is reflexive, transitive, and anti-symmetric, then it is said to be a partial
ordering relation.
For example, let us consider a trivial relation '=' on the set of integers I:
1. It is reflexive because, a = a, ∀a ∈ I
2. It is transitive because, if a = b and b = c, then a = c, ∀a, ∀b, ∀c ∈ I
3. It is symmetric because, if a = b then b = a, ∀a, ∀b ∈ I
Therefore, it is an equivalence relation.
An equivalence relation on a set S divides it into disjoint equivalence classes; the elements
of each class (subset) have similar properties. This is the reason behind the origin of the
concept of a class in object-oriented languages. For example, if we consider the operation
to find the 'remainder after dividing a decimal digit by 3', we can distribute the digits
{0, 1, 2, ..., 9} into three equivalence classes, namely, {0, 3, 6, 9}, {1, 4, 7}, and {2, 5, 8},
that are associated with the remainders 0, 1, and 2 respectively when divided by 3.
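The partition into equivalence classes can be reproduced by grouping elements under a key that is equal exactly for related elements. The following is a minimal sketch; the helper name `equivalence_classes` and the use of the remainder as the grouping key are choices made here for illustration.

```python
from collections import defaultdict


def equivalence_classes(elements, key):
    """Group elements so that two elements share a class iff key() agrees."""
    classes = defaultdict(set)
    for element in elements:
        classes[key(element)].add(element)
    return dict(classes)


# Digits 0..9 under the relation "same remainder when divided by 3".
classes = equivalence_classes(range(10), lambda digit: digit % 3)
for remainder, members in sorted(classes.items()):
    print(remainder, sorted(members))
```

The resulting classes are pairwise disjoint and together cover all ten digits, which is what it means for an equivalence relation to partition a set.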
In case of partial ordering relations, anti-symmetry imposes some ordering on the set
elements. For example, a partial ordering relation '<=' on the set of integers I imposes
ordering of the type:
... <= -2 <= -1 <= 0 <= +1 <= +2 <= +3 <= ...
Note: '<=' is actually a total ordering relation. It means that every two set elements are related
to each other. Partial ordering is a more generic class, whereas total ordering is a specialization
of the same. The ancestor relation among the nodes of a binary tree is an example of a partial
ordering relation in which not every two nodes are related.
Symmetric Closure
Symmetric closure of a relation R is defined as follows:
If (a, b) ∈ R, then (a, b) and (b, a) are in the symmetric closure of R.
Thus, symmetric closure of R = R ∪ {(b, a) | (a, b) ∈ R}
In other words, symmetric closure of R is the union of R with its inverse relation, R^-1.
For example, let us consider relation, R = {(1, 2), (2, 2), (2, 3)} over set S = {1, 2, 3},
Then, symmetric closure of R = {(1, 2), (2, 2), (2, 3), (2, 1), (3, 2)}.
Example 1.3 Find the transitive closure and symmetric closure of the relation:
R = {(1, 2), (2, 3), (3, 4), (5, 4)}
Solution Transitive closure R+ = {(1, 2), (2, 3), (3, 4), (5, 4), (1, 3), (2, 4), (1, 4)}
As (1, 2) and (2, 3) are members of R+, (1, 3) is added. Similarly, since (2, 3) and (3, 4)
are members of R+, (2, 4) is added; and since (1, 2) and (2, 4) are members of R+, (1, 4)
is added.
Symmetric closure of R = {(1, 2), (2, 3), (3, 4), (5, 4), (2, 1), (3, 2), (4, 3), (4, 5)}
Since (1, 2) is a member of R, (2, 1) is added. Similarly, (3, 2), (4, 3), and (4, 5) are also
added in the symmetric closure of R.
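Both closures in Example 1.3 can be computed mechanically. The sketch below is one straightforward way to do so, assuming a relation is stored as a Python set of pairs. The transitive closure repeatedly adds (a, c) whenever (a, b) and (b, c) are present, until nothing new appears; this simple fixed-point loop is an illustration chosen here, not an algorithm prescribed by the text.

```python
def symmetric_closure(relation):
    """Return R ∪ R^-1: every pair together with its reverse."""
    return set(relation) | {(b, a) for (a, b) in relation}


def transitive_closure(relation):
    """Return R+ by adding implied pairs until a fixed point is reached."""
    closure = set(relation)
    while True:
        implied = {
            (a, d)
            for (a, b) in closure
            for (c, d) in closure
            if b == c
        }
        if implied <= closure:      # nothing new: closure is complete
            return closure
        closure |= implied


R = {(1, 2), (2, 3), (3, 4), (5, 4)}
print(sorted(transitive_closure(R)))
print(sorted(symmetric_closure(R)))
```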
1.5 GRAPHS
G = (V, E)
where, V: Finite set of vertices, and E: Finite set of ordered pairs of vertices called arcs.
An arc (v1, v2) from vertex v1 to vertex v2 is denoted by: 'v1 -> v2'; here, v1 is called the
predecessor of v2, and v2 is called the successor of v1.
For example, let us refer to Fig. 1.11(b). For the graph given in the figure, we have:
V = {1, 2, 3}
E = {(1, 2), (2, 3), (3, 3), (3, 1)}
The order of vertices is important in a directed graph. The ordered pair (1, 2) denotes the
edge from the node labelled '1' to the node labelled '2', and not vice versa.
Note: A graph can be defined as a relation over a set of vertices. It is not merely a diagram, but
a visualization of the underlying relation.
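Treating the edge set as a relation makes predecessor and successor queries one-line set comprehensions. The sketch below uses the vertex and edge sets listed for Fig. 1.11(b); the function names `successors` and `predecessors` are introduced here for illustration only.

```python
V = {1, 2, 3}
E = {(1, 2), (2, 3), (3, 3), (3, 1)}   # ordered pairs: (from, to)


def successors(vertex, edges):
    """Vertices reachable from `vertex` by a single arc."""
    return {to for (frm, to) in edges if frm == vertex}


def predecessors(vertex, edges):
    """Vertices that have a single arc leading into `vertex`."""
    return {frm for (frm, to) in edges if to == vertex}


for v in sorted(V):
    print(v, sorted(predecessors(v, E)), sorted(successors(v, E)))
```

Because the pairs are ordered, (1, 2) being in E says nothing about (2, 1), which is exactly the point made about directed graphs.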
1.5.2 Tree
A tree is a digraph with the following properties:
1. There exists one vertex called the root vertex that does not have a predecessor and
from which there is a path to every other vertex in the graph.
2. Each vertex other than the root has exactly one predecessor; the immediate predecessor
of a node is called the parent node.
3. The successors of each vertex are ordered from the left; the immediate successor of a
node is called the child node.
Figure 1.12 shows an example tree, whose root vertex is A. Nodes B, C, and
F are the interior vertices (or intermediate nodes), and D, E, G, H are the leaf
nodes or leaves; these are nodes that do not have any successors. Node C is
the parent of nodes E and F, while E and F are the child nodes of C. Further,
node C is the ancestor of nodes G and H, as the path from C leads to G or H,
while G and H are said to be the descendants of C. Similarly, A is the ancestor
of D, and D is the descendant of A.
Figure 1.12 A tree
An ancestor is not necessarily the immediate parent; it could be the parent of
a parent, or the parent of that node, and so on. This might involve many levels, with the only
condition that the path from the ancestor should lead to its descendants.
Note: Though a tree is a digraph, the edges in Fig. 1.12 are not drawn as arcs. This is because,
in case of a tree, it is assumed that the edges are directed from parent to the children and not
vice versa. They may or may not be shown by an arrow.
1.6 LANGUAGES
A language is defined as a set of strings comprising symbols from one alphabet.
Note that the null set ∅ and the set consisting of empty string, that is, {ε}, are also
considered as languages.
The set of all strings over a fixed alphabet Σ is a language, and is denoted by Σ*.
For example:
Let Σ = {a}; then,
Σ* = {ε, a, aa, aaa, ...}
Similarly, let Σ = {0, 1}; then,
Σ* = {ε, 0, 1, 00, 01, 10, 11, 000, ...}
Theorem 1.1
For any alphabet Σ,
Σ* = Σ**
Proof
We know that language Σ* over an alphabet Σ is the set of all possible strings of symbols from Σ.
Now, Σ** = (Σ*)* = Set of all possible combinations of strings of symbols over Σ*.
Therefore, in turn, Σ** is the set of all possible string combinations of symbols from Σ.
Therefore, Σ* = Σ**
Note: As we have already included all possible strings from Σ, there is nothing more we can
add. Hence, Σ* is called the closure of Σ. All strings are obtained by the basic operation of
concatenation of symbols to one another. The term closure comes from the fact that the set Σ*
is closed, and no matter how many more concatenations we might perform over it, we will not
get any additional strings than those in Σ*. We can extend this further to say:
Σ* = Σ** = Σ*** = Σ****, and so on.
Principle Let S(n) denote the statement to be proved involving variable n, and let us
suppose:
1. S(1) is true;
2. Whenever S(k) is true for n = k, S(k + 1) is also true;
then, S(n) is true for all values of n.
A statement is proved using induction by showing that the first statement in the infinite
sequence of statements is true, and then proving that if any one statement in the infinite
sequence of statements is true, then so is the next one. This is based on the successor
relationship among the infinite sequence of statements.
The following steps are involved in inductive proof:
Induction basis This step tests if the statement, S(n) holds true when n is equal to its
lowest possible value. Usually, n = 0 or n = 1.
Induction hypothesis (or inductive hypothesis) In this step, it is assumed that S(n) is
true for some value of n, that is, for n = k.
Inductive step This step tests if the statement also holds when n = k + 1. If the statement
is true for n = k + 1 whenever it is true for n = k, then, together with the basis, it is true
for all values of n.
Inductive step: Now we need to prove the statement for n = k + 1. From the inductive
hypothesis, we have:
1 + 3 + 5 + ... + (2k - 1) = k^2
Therefore, if n = k + 1, we have:
1 + 3 + 5 + ... + (2k - 1) + (2k + 1)
= k^2 + (2k + 1)
= (k + 1)^2
Thus, the given statement is true for n = k + 1 as well; so, by the principle of mathematical
induction, we conclude that it is true for all n = 1, 2, 3, ...
1 + 2 + 3 + ... + k = k (k + 1) / 2
Inductive step: Now we need to prove the statement for n = k + 1. From the inductive
hypothesis, we have:
1 + 2 + 3 + ... + k = k (k + 1) / 2
Therefore, if n = k + 1, we have:
1 + 2 + 3 + ... + k + (k + 1)
= [k (k + 1) / 2] + (k + 1)
= [k (k + 1) / 2] + 2(k + 1) / 2
= (k + 1) (k + 2) / 2
= (k + 1) [(k + 1) + 1] / 2
Thus, the given statement is true for n = k + 1 as well; so, by the principle of
mathematical induction, we conclude that it is true for all n = 1, 2, 3, ...
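A quick numerical check can build confidence in a closed form before (or after) proving it, although it is not a substitute for the inductive proof, since it only covers finitely many values of n. The sketch below compares both sums discussed above against their closed forms; the function names and the upper limit of 200 are arbitrary choices made here.

```python
def sum_first_n(n):
    """1 + 2 + 3 + ... + n, computed term by term."""
    return sum(range(1, n + 1))


def sum_first_n_odd(n):
    """1 + 3 + 5 + ... + (2n - 1), computed term by term."""
    return sum(2 * i - 1 for i in range(1, n + 1))


def formulas_hold_up_to(limit):
    """Check both closed forms for every n from 1 to limit (inclusive)."""
    return all(
        sum_first_n(n) == n * (n + 1) // 2 and sum_first_n_odd(n) == n ** 2
        for n in range(1, limit + 1)
    )


print(formulas_hold_up_to(200))
```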
SUMMARY
A symbol is an abstract or a user-defined entity. It is analogous to a point in geometry, and
cannot be formally defined. For example, letters, digits, or any other characters that one
wishes to consider as a part of the language that is being designed, are said to be symbols.
It is the basic unit (or constituent) of any language.
An 'alphabet' is a finite set of symbols. It is denoted by Σ. For example, D = {0, 1, 2, ..., 9}.
As the symbols are user defined, one can decide any alphabet set of his/her choice and
define a language over it.
A string or word is defined as a finite sequence of symbols over a given alphabet. Here, the
phrase, 'over a given alphabet', means that all the symbols of a word should come from the
same alphabet set. A string must contain the symbols from only one given alphabet in order
to be meaningful. We can formulate words by appending a finite number of symbols from
one given alphabet set into finite sequences.
The prefix of a string includes any number of leading symbols of the string. For example, if
string x = abc, then the prefixes of x are: ε, a, ab, and abc. The prefix of a string, other than
the string itself, is called proper prefix. In this example, the three prefixes {ε, a, ab} are
proper prefixes of abc.
The suffix of a string is any number of trailing symbols of the string. For this example,
suffixes of string x are: ε, c, bc, and abc. The suffix of a string, other than itself, is called
proper suffix. In this example: ε, c, and bc are proper suffixes of string abc.
If w and x are two strings, then wx is called the concatenation of these two strings.
A set is a collection of well-defined and distinct objects. These objects or entities in the set
are called the members (or elements) of the set.
Typical operations on sets include: union, intersection, difference, and complement. Other
operations include concatenation and closure.
If every member of set A is a member of set B, then A is said to be a subset of B; and B is
called the superset of A.
The power set of a set A is the set of all subsets of A, including A itself and the empty set ∅.
It is written as 2^A.
The Cartesian product of two sets is defined as:
A × B = {(a, b) | a ∈ A and b ∈ B, ∀a and ∀b}
It defines the association of every element of set A with each element of set B.
For example:
{1, 2} × {red, blue} = {(1, red), (1, blue), (2, red), (2, blue)}
Cardinality of a set is defined as the number of elements in the set. If A is any set, then its
cardinality is denoted by |A|.
Countability is the property which signifies the existence of a successor. For example, given
any integer i one can always find its successor, as 'i + 1'. Finite sets are always countable.
Infinite sets that can be placed in one-to-one correspondence with the set of natural numbers,
N = {1, 2, 3, 4, 5, ...}, are said to be countably infinite, or just countable or enumerable.
Some infinite sets are uncountable. For example, the set of real numbers R is infinite and
uncountable, as one cannot find the successor for any given real number. This is because
between any two real numbers, there are infinitely many other real numbers. Hence, the set
of real numbers is uncountable.
A relation is a set of ordered pairs (or tuples), where the first component of the pair is from
the set called the domain, and the second component is from the set called the range (or
co-domain). In a relation, if the domain and range are the same set S, then we say that the
relation is on set S.
A binary relation is defined as:
A R B = {(a, b) | a ∈ A and b ∈ B},
where, set A is the domain set and set B is the range set.
Every relation is a subset of the Cartesian product of the domain and range set:
A R B ⊆ (A × B)
A relation can encompass four types of associations: one-to-one, one-to-many, many-to-one,
and many-to-many. The domain and range can be respectively visualized as input and output,
in terms of programming nomenclature.
If R is a relation on set S (domain and range is the same set S), it is said to be:
1. Reflexive, if aRa holds for all a in S
2. Transitive, if aRb and bRc imply aRc, for all a, b, and c in S
3. Symmetric, if aRb implies bRa, for all a and b in S
4. Anti-symmetric, if aRb and bRa together imply a = b, for all a and b in S
If a relation is reflexive, transitive, as well as symmetric, then it is said to be an equivalence
relation. If a relation is reflexive, transitive, and anti-symmetric, then it is called a partial
ordering relation.
Transitive closure of a relation R, that is, R+, is defined as:
1. If (a, b) ∈ R, then (a, b) is in R+
2. If (a, b) ∈ R+ and (b, c) ∈ R+, then (a, c) is in R+
Reflexive and Transitive closure of a relation R, that is, R*, is defined as:
R* = R+ ∪ {(a, a) | ∀a ∈ S}, where R is a relation defined over set S.
Symmetric closure of a relation R is defined as:
If (a, b) ∈ R, then (a, b) and (b, a) are in the symmetric closure of R
Thus, symmetric closure of R = R ∪ {(b, a) | (a, b) ∈ R}. In other words, the symmetric
closure of R is the union of R with its inverse relation, R^-1.
A 'graph' is formally defined by a tuple:
G = (V, E)
where, V = Finite set of vertices or nodes
E = {(v1, v2) | v1, v2 ∈ V}, that is, finite set of edges connecting the vertices.
In case of a directed graph (or digraph), set E is a finite set of ordered pairs of vertices called
arcs. An arc (v1, v2) from vertex v1 to vertex v2 is denoted by, 'v1 -> v2'; here, v1 is called
the predecessor of v2, and v2 is called the successor of v1.
A graph is defined as a relation over the set of vertices. It is not merely a diagram, but a
visualization of the underlying relation.
A tree is a digraph having the following properties:
1. There exists one vertex called the root vertex that does not have a predecessor and from
which, there is a path to every other vertex in the graph.
2. Each vertex other than the root has exactly one predecessor.
3. The successors of each vertex are ordered from the left.
A language is defined as a set of strings of symbols from one alphabet. The null set ∅ and the
set consisting of empty string, that is, {ε}, are also languages.
The set of all strings over a fixed alphabet Σ is a language, and is denoted by Σ*.
For example:
Let Σ = {a}; then,
Σ* = {ε, a, aa, aaa, ...}
For any given alphabet Σ, any subset of Σ* is a language. Hence, Σ* is considered as a
universal language over Σ, as it includes all possible strings over Σ.
Mathematical induction is a method of mathematical proof, typically used to establish that a
given statement, S(n) is true for all values of n = 1, 2, 3, 4, ...
The principle of mathematical induction involves the following steps:
1. Test whether the statement is true for n = 1. If S(1) is true, then we proceed to the next
step.
2. Assume that the statement is true for some value of n, that is, for n = k. This means, we
assume S(k) is true.
3. Test whether the statement is true for 'n = k + 1'. If S(k + 1) follows from S(k), it means
that the statement is true for all values of n.
EXERCISES
This section lists a few unsolved problems, to help the readers understand the topic better and practise examples
related to the preliminaries.
Objective Questions
1.1 A relation R is defined on the set of integers as xRy iff (x + y) is even. Which of the following
statements is true?
(a) R is not an equivalence relation.
(b) R is an equivalence relation having one equivalence class.
(c) R is an equivalence relation having two equivalence classes.
(d) R is an equivalence relation having three equivalence classes.
1.2 If A and B are sets, then which of the following statements is false?
(a) (A ∪ B) ∪ C = A ∪ (B ∪ C)
(b) A ⊆ (A ∪ B)
(c) (A ∩ B) ∩ C = A ∩ (B ∩ C')
(d) (A ∪ B) × C = (A × C) ∪ (B × C)
1.3 The less than relation '<' on real numbers is
(a) A partial ordering relation since it is asymmetric and reflexive
(b) A partial ordering relation since it is anti-symmetric and reflexive
(c) Not a partial ordering relation since it is not asymmetric and not reflexive
(d) Not a partial ordering relation since it is not anti-symmetric and is reflexive
(e) None of these
Remember (R), Understand (U), Apply (A), Analyse (L), Evaluate (E), and Create (C)
LEARNING OBJECTIVES
After completing this chapter, the reader will be able to
understand the following:
2.1 INTRODUCTION
The term finite state machine (FSM) is used for all such programs that have a finite number
of functions, but do not have any memory for storing the intermediate results. It is a simple
and primitive computational model, which has many applications, but limited power due
to lack of memory.
The term 'machine' is used throughout this book to refer to predictable programs, whose
behaviour can be understood without executing them. The term 'state' is typically used
for different functions that are constituents of a program. A 'program', here, is defined
as a collection of unique functions, each one performing an atomic and unique task. A
program is said to be completely executed when the task of each function is performed
one by one, in order, until the last.
FINITE STATE MACHINES 21
Here, we observe that the input is a combination of multiple input symbols. Such a
machine is also called a 'combinational machine'.
2. A decimal-to-binary converter can be treated as a basic machine having the following
inputs and outputs:
I = {0, 1, 2, ...}
O = {000, 001, 010, ...}
3. A weighing machine that we normally see at railway stations or bus stands is also
a good example of a basic machine. Here the output is the weight ticket, which is
obtained after inserting a coin:
I = {coin}
O = {printed weight ticket}
4. Electrical appliances, such as electric fans, are basic machines with regulator positions
as the input set, and the speed in revolutions per minute (rpm) as the output set:
I = {pos1, pos2, pos3, pos4, pos5}
O = {speed A, speed B, speed C, speed D, speed E}
In all the aforementioned examples, we see that there is an input, which, after going into
the machine, gives a particular output. Thus, ignoring the internal details and concerning
ourselves only with inputs and outputs, every machine can be viewed as a basic machine. A basic
machine performs only a table look-up from a finite-sized table called the MAF table. This
table has neither memory nor internal states.
It is impossible to create a virtual table that can store infinite word sets in a tabular
form. Let us consider a basic machine that produces output in the form of 'yes' or 'no' to
check whether a given word is from a given infinite language. This is possible only with
the help of a machine having internal states; on reading the input, the machine selects a
particular path from the initial state to reach the final state, and produces a valid output.
Such a machine has a finite number of internal states, and is called an FSM.
2.2.1 Examples
Let us try to understand the concept through a few simple examples:
S = {carry, no carry}.
Initially, the machine is always in the 'no carry' state; and at any given time, the machine
may be in one of the two states, that is, 'carry' or 'no carry', depending on the result of
the previous addition.
Let us consider a situation in which the current state of the machine is 'carry' and the
current input is '(0, 0)'. Then the output will be 0 + 0 + 1 (carry) = 1, and the machine
moves to the next state, which will be a 'no carry' state. Similarly, if the current state of
the machine is 'no carry' and the input is '(1, 1)', then the output will be 1 + 1 + 0 (no
carry) = 0. The machine again moves to the next state, that is, the 'carry' state.
Thus, the addition of the two input symbols at any given point depends not only on the
current input, but also on the current state of the machine. For example, '(0 + 0)' is not
always 0; it can also be 1, if the machine is in the 'carry' state, as we have already seen.
The MAF and STF for the binary adder are shown in Table 2.2.
Table 2.2 MAF and STF for binary adder (a) MAF: I × S → O (b) STF: I × S → S

(a) Outputs
I        S = Carry    S = No carry
(0, 0)   1            0
(0, 1)   0            1
(1, 0)   0            1
(1, 1)   1            0

(b) Next states
I        S = Carry    S = No carry
(0, 0)   No carry     No carry
(0, 1)   Carry        No carry
(1, 0)   Carry        No carry
(1, 1)   Carry        Carry
Let us simulate the working of the binary adder that we have designed for adding two
numbers: 1011 and 1111. The initial state will always be 'no carry' when we begin the
addition. The simulation is shown in Fig. 2.1.
[Figure 2.1: simulation of the binary adder for Number 1 = 1011 and Number 2 = 1111;
initial state 'no carry', final state 'carry']

Step   Current state   Input    Next state   Output
1.     No carry (0)    (1, 1)   Carry (1)    0
2.     Carry (1)       (1, 1)   Carry (1)    1
3.     Carry (1)       (0, 1)   Carry (1)    0
4.     Carry (1)       (1, 1)   Carry (1)    1
The two input binary numbers considered for simulation in Fig. 2.1 are 1011 and 1111.
The first pair of input symbols is (1, 1), which is obtained from the least significant bits
(right-most bits) of both the binary input numbers. 'No carry' is the initial state, or the
beginning state. Upon reading the input (1, 1), the FSM makes a transition from the 'no carry'
state to the 'carry' state, and generates an intermediate output, 0. Then, it reads the next pairs of
digits, (1, 1), (0, 1), and (1, 1), and transits at every stage, generating some intermediate
output. Thus, the addition of the two input binary numbers yields the output '1010' with
carry, that is, 11010. Figure 2.1 explains each step in detail.
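The table look-up performed by the binary adder can be sketched in a few lines of Python. This is only an illustrative sketch: the dictionaries MAF and STF mirror Table 2.2, while the function name add_binary and the encoding of the two numbers as equal-length bit strings are assumptions made for the example.

```python
# MAF (machine function): (current state, input pair) -> output bit, as in Table 2.2(a)
MAF = {
    ("no carry", (0, 0)): 0, ("no carry", (0, 1)): 1,
    ("no carry", (1, 0)): 1, ("no carry", (1, 1)): 0,
    ("carry", (0, 0)): 1, ("carry", (0, 1)): 0,
    ("carry", (1, 0)): 0, ("carry", (1, 1)): 1,
}

# STF (state transition function): (current state, input pair) -> next state, Table 2.2(b)
STF = {
    ("no carry", (0, 0)): "no carry", ("no carry", (0, 1)): "no carry",
    ("no carry", (1, 0)): "no carry", ("no carry", (1, 1)): "carry",
    ("carry", (0, 0)): "no carry", ("carry", (0, 1)): "carry",
    ("carry", (1, 0)): "carry", ("carry", (1, 1)): "carry",
}


def add_binary(number1, number2):
    """Feed two equal-length bit strings to the adder, least significant bits first."""
    state = "no carry"  # the adder always starts in the 'no carry' state
    outputs = []
    for a, b in zip(reversed(number1), reversed(number2)):
        pair = (int(a), int(b))
        outputs.append(str(MAF[(state, pair)]))  # output depends on input AND state
        state = STF[(state, pair)]
    # Outputs were produced right to left, so reverse them for display.
    return "".join(reversed(outputs)), state


print(add_binary("1011", "1111"))
```

For the inputs 1011 and 1111, the sketch produces the output bits '1010' and ends in the 'carry' state, which agrees with the simulation described in the text.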
A more convenient representation of the FSM is possible using a transition diagram
instead of state tables.
Figure 2.2 Transition graph (TG) for binary adder
The initial state, or 'Start' state, is normally represented by an arrow, pointing towards
the state. Many times, the word 'Start' is also written above this arrow.
Each transition, that is, each arc of the graph, gives information about the output symbol
as well as the next state, if we know the current input and the current state. Thus, it is easier
to understand the working of a machine with the help of this diagram, rather than looking
at two different tables simultaneously.
In order to represent a transition from state Si to Sj for more than one input symbol, the
'logical OR' symbol, that is, the 'V' symbol, is used; and if there is no transition from state
Si to state Sj, then the symbol '-' is placed for that entry in the matrix.
The transition matrix for the binary adder in Example 2.1 is shown in Table 2.3.
                 Next state
Current state    No carry                          Carry
No carry         (0, 0)/0 V (0, 1)/1 V (1, 0)/1    (1, 1)/0
Carry            (0, 0)/1                          (0, 1)/0 V (1, 0)/0 V (1, 1)/1
Example 2.2 Design an FSM for a divisibility-by-3 tester for decimal numbers.
Solution Decimal numbers are base-10 numerals containing digits from 0 to 9. Hence,
the input set, I, is:
I= {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
From this set, digits 0, 3, 6, and 9 are of the same type, that is, they are all divisible by 3.
Similarly, digits 1, 4, and 7 generate remainder 1, when divided by 3; and digits 2, 5, and
8 generate remainder 2 when divided by 3.
Based on this similarity, we can group the digits together, and the resultant set of inputs
will effectively consist of only three different classes of inputs:
I = {(0, 3, 6, 9), (1, 4, 7), (2, 5, 8)}
Let us consider the possible outputs: If the number is divisible by 3, the output is '1',
which means yes, it is divisible; and if the number is not divisible by 3, then the output is
'0', which implies no, it is not divisible.
Hence, the output set is:
O = {0, 1}
When we divide a number by 3, it is possible to get three different remainder values: 0,
1, and 2. If we get 0 as the remainder, then the decimal number is divisible by 3, and the
machine produces output 1 (yes, it is divisible); and if the remainder is either 1 or 2, the
machine produces output 0 (no, it is not divisible).
Depending on these remainder values, we can have three different states for the machine:
S = {S0, S1, S2}
where, S0: Zero-remainder state
S1: One-remainder state
S2: Two-remainder state
The state tables for the divisibility-by-3 tester are shown in Table 2.4.
Table 2.4 State tables for the divisibility-by-3 tester (a) MAF: I × S → O (b) STF: I × S → S

(a) Outputs
S     (0, 3, 6, 9)   (1, 4, 7)   (2, 5, 8)
S0    1              0           0
S1    0              0           1
S2    0              1           0

(b) Next states
S     (0, 3, 6, 9)   (1, 4, 7)   (2, 5, 8)
S0    S0             S1          S2
S1    S1             S2          S0
S2    S2             S0          S1
At every step in the division of any multi-digit decimal input, the remainder from the
previous division step is concatenated to the next input digit to form the number to be
considered for division in the next step. For example, if the machine is in state S1, that is,
one-remainder state, and the input is (1, 4, 7) (that is, either 1, or 4, or 7), then the next
step considers for division either '11', or '14', or '17', respectively. Now, if we divide
these numbers by 3, we get remainder 2. Hence, the machine moves from state S1 to state
S2 with output '0'. This output indicates that the number '11', '14', or '17', as the case
may be, is not divisible by 3.
Similarly, if the machine is in state S2, and the input is (1, 4, 7), then the numbers formed
are respectively '21', '24', or '27', which are divisible by 3. Therefore, the machine moves
to zero-remainder state, that is, state S0, with output '1', meaning that the number is
divisible by 3. The transition diagram for the machine is shown in Fig. 2.3.
Figure 2.3 Transition graph (TG) for divisibility-by-3 tester
We see that S0, the zero-remainder state, is the initial state. The final state could be
either S0, S1, or S2, depending on the input number.
Now, let us simulate the working of the machine that we have designed on two arbitrary
numbers, '112' and '1416', as shown in Fig. 2.4.
Figure 2.4 Divisibility-by-3 tester simulation for input numbers '112' and '1416'
(final output for '112': non-divisible; final output for '1416': divisible)
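The divisibility-by-3 tester can also be traced programmatically. The following is a minimal sketch, assuming the STF of Table 2.4(b); the function name run_tester and the use of 'digit mod 3' to pick the input class are choices made for this illustration only.

```python
# STF from Table 2.4(b): next state for each (current state, digit class).
# Class index 0 stands for (0, 3, 6, 9), 1 for (1, 4, 7), 2 for (2, 5, 8).
STF = {
    "S0": ("S0", "S1", "S2"),
    "S1": ("S1", "S2", "S0"),
    "S2": ("S2", "S0", "S1"),
}


def run_tester(number):
    """Return (final state, final output) after reading a decimal string."""
    state = "S0"  # zero-remainder state is the initial state
    for digit in number:
        digit_class = int(digit) % 3  # which of the three input classes
        state = STF[state][digit_class]
    output = 1 if state == "S0" else 0  # 1 means 'divisible by 3'
    return state, output


print(run_tester("112"))   # ends in the one-remainder state
print(run_tester("1416"))  # ends in the zero-remainder state
```

Since 112 leaves remainder 1 and 1416 leaves remainder 0 on division by 3, the sketch reports output 0 for the first number and output 1 for the second.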
The transition matrix for the divisibility-by-3 tester is shown in Table 2.5.
Example 2.3 Design an FSM for divisibility-by-5 tester for decimal numbers.
Solution We can use an approach similar to the one in Example 2.2. In this case, it is
possible to have five different remainder values: 0, 1, 2, 3, and 4, as we are dividing by
5. If we do not want to know whether the machine is in state S1, S2, S3, or S4 after reading the
input number, that is, if we do not want to know the exact value of the remainder,
but only want an answer in the form of 'yes' (it is divisible), or 'no' (it is not divisible),
then we require only two states:
S = {q0 (divisible), q1 (not divisible)}
We know that if any decimal number ends in 0 or 5, then it is divisible by 5; else it is not
divisible by 5. Depending on this fact, we can group the input digits into two categories:
(0, 5) and (1, 2, 3, 4, 6, 7, 8, 9). Therefore, the input set, I, and the state set, S, are:
I = {(0, 5), (1, 2, 3, 4, 6, 7, 8, 9)}
S = {q0, q1}
In this example, we are not interested in the different remainder values. Hence, when the
remainder is either 1, 2, 3, or 4, that is, when the number is not divisible by 5, the machine
will lie in state q1; and if the remainder is 0, that is, the number is divisible by 5, then it
will lie in state q0. However, the initial state of the machine is always q0.
The STF and MAF tables for the divisibility-by-5 tester are shown in Table 2.6.
Table 2.6 STF and MAF tables for the divisibility-by-5 tester (a) MAF: I × S → O (b) STF: I × S → S

(a) Outputs
S     (0, 5)   (1, 2, 3, 4, 6, 7, 8, 9)
q0    1        0
q1    1        0

(b) Next states
S     (0, 5)   (1, 2, 3, 4, 6, 7, 8, 9)
q0    q0       q1
q1    q0       q1
                 Next state
Current state    q0          q1
q0               (0, 5)/1    (1, 2, 3, 4, 6, 7, 8, 9)/0
q1               (0, 5)/1    (1, 2, 3, 4, 6, 7, 8, 9)/0
Example 2.4 Design an FSM for divisibility-by-2 tester for unary numbers.
Solution Unary numbers use a single letter or digit to represent a number. For example,
decimal number 5 can be represented in unary form as '11111', or 'aaaaa', or 'bbbbb', or
'00000', and so on, depending on what letter or digit we choose to represent.
Let us consider the case I = {a}, which means that we choose the letter 'a' to represent
the numbers. Hence, the set of all non-zero positive integers over I can be represented as
{a, aa, aaa, aaaa, ...}
Now, if we divide a number by 2, the possible remainder values are 0, which means it
is divisible; or 1, which means it is not divisible by 2. Therefore, the set of states can be
represented as:
S = {p (divisible), q (non-divisible)}
As we are going to represent unary numbers using symbol 'a', the set of input symbols is:
I = {a}
the number becomes 2, which is divisible by 2. Therefore, the machine moves to state p,
and gives the output as 1.
The transition diagram for this machine is shown in Fig. 2.6, and the transition matrix
is shown in Table 2.9.
Formal Definition
FA is denoted by a five-tuple:
M = (Q, Σ, δ, q0, F)
where,
Q: Finite set of states
Σ: Finite input alphabet
q0: Initial state of FA, q0 ∈ Q
F: Set of final states, F ⊆ Q
δ: STF that maps Q × Σ to Q, i.e., δ: Q × Σ → Q
The aforementioned definition of the transition function, δ, is for deterministic FA, or DFA.
The transition function for non-deterministic FA, or NFA, is different and is discussed
later in this chapter.
As the definition is formalized, we only need to get familiar with the standard names
or symbols that are used. A few notational differences between FA and FSM that we can
identify are: Q is analogous to state set, S; Σ is analogous to input set, I; and the state
transition function, 'δ: Q × Σ → Q', is analogous to 'STF: I × S → S'.
We observe that the formal definition for FA does not include output set, O; therefore,
this formalism may be called FA without output. We further observe that FA, instead, has
final states that can be reached only when the input is acceptable (or valid).
Any subset of Q can be marked as the set of final states, F, depending on the solution.
Upon reading the entire input string, if the machine resides in any of the final states, then the
input string is considered to be 'accepted' by the machine. On the other hand, if the machine
resides in any of the non-final states, then the input read is considered to be 'rejected' by the
machine. Conceptually, this is equivalent to generating the output 'true' if valid and 'false'
otherwise, much like a Boolean function. Thus, FA acts as an input acceptor or rejecter.
Usually, in the transition graph of the problem solution, rejection is not explicitly shown:
the paths in the transition graph that are unspecified are considered as rejection paths.
At times, these can be explicitly shown with the help of non-accepting 'trap states'. This
will be dealt with in greater detail while discussing the examples.
In the case of a DFA, for no state q in Q and no input symbol 'a' in Σ does 'δ(q, a)'
contain more than one element. Thus, the transition would be of the form:
δ(q, a) = p
where, p is the unique next state in Q, to which the machine makes the transition (p may
or may not be equal to q). This means that if the DFA is in state q and reads a symbol 'a',
then it moves to the unique next state 'p'.
For example, let us consider the FA:
[Figure 2.7: example FA over Σ = {0, 1}]
Figure 2.8 Some additional notations for transition graph of FA (a) Initial state (b) Final
state (c) If initial and final states are same
This figure explains the different types of notations used to denote the initial and final
states. According to notation (2) in Fig. 2.8(a), the initial state can be denoted by a symbol,
'-' (minus sign), inside the state circle, whereas according to notation (1), it is represented
by an arrow pointing to the state circle along with the string 'start' attached to it. The final
state is denoted by a symbol, '+' (plus sign), inside a state circle according to notation (2)
in Fig. 2.8(b), whereas according to notation (1), it is represented by two concentric circles.
If the initial state is also a final state, then we write a '±' sign inside the state circle in notation
(2) in Fig. 2.8(c), while according to notation (1), this is represented by an arrow pointing
towards two concentric circles.
2.3.2 Functions
The example FA in Fig. 2.7 accepts only the strings over Σ = {0, 1} that contain even
number of 0's and even number of 1's. All these acceptable strings can be aggregated into
an infinite set or language L defined as:
L = { set of all strings over Σ = {0, 1} containing even number of 0's and even
number of 1's }
A conclusion that can be drawn from the aforementioned finding is that though the language
L is infinite, we can construct a machine (FA) that can accept any string from the
language L and reject any other string which is not part of the language L.
In other words, the FA constructed for a given language checks the validity of the strings
from the language. It checks whether the input string is a part of the language or not. If the
input string is part of the language, the FA accepts it by making a transition to the final
state after reading the string. Otherwise, it resides in one of the non-final states, indicating
that it has rejected the string.
Let us simulate the working of the FA represented in Fig. 2.7 for the input string '0011'
(refer to Fig. 2.9).

Current state    Current input symbol    Next state
q0 (Initial)     0                       q2
q2               0                       q0
q0               1                       q1
q1               1                       q0 (Final) (Accepted the input)

Figure 2.9 Example FA accepting the input '0011'
Input string '0011' contains even number of 0's as well as even number of 1's. Hence, the
input string is accepted by the FA, which is indicated by the final state, q0, in which the
machine resides after reading the string '0011'.
Let us now simulate the working of FA for the input '01011'; we see that this is not a valid
string as it does not contain even number of 1's (refer to Fig. 2.10).

Current state    Current input symbol    Next state
q0 (Initial)     0                       q2
q2               1                       q3
q3               0                       q1
q1               1                       q0
q0               1                       q1 (Non-final state) (Input rejected)

Figure 2.10 Example FA rejecting the input '01011'

We see that the input string '01011' is rejected by the FA as it resides in the non-final
state, q1, after reading the string.
Thus, the FA is used as a language acceptor. It only relies on the transitions based on the
input symbols, and these transitions are based on the input patterns only. Hence, the FA can
accept an infinite language as well.
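The two simulations can be reproduced with a small table-driven acceptor. This is a sketch under stated assumptions: six of the eight transitions are read off the traces in Figs 2.9 and 2.10, whereas the transitions of q1 on '0' and of q3 on '1' do not occur in those traces and are inferred here from the even/odd meaning of the states. The names DELTA, START, FINALS, and accepts are illustrative.

```python
# Transition function of the 'even 0s and even 1s' FA.
DELTA = {
    ("q0", "0"): "q2", ("q0", "1"): "q1",
    ("q1", "0"): "q3",                      # inferred, not shown in the traces
    ("q1", "1"): "q0",
    ("q2", "0"): "q0", ("q2", "1"): "q3",
    ("q3", "0"): "q1",
    ("q3", "1"): "q2",                      # inferred, not shown in the traces
}
START = "q0"
FINALS = {"q0"}  # q0 is both the initial and the final state


def accepts(string):
    """Read the string one symbol at a time and test the last state."""
    state = START
    for symbol in string:
        state = DELTA[(state, symbol)]
    return state in FINALS


print(accepts("0011"))   # accepted: two 0s and two 1s
print(accepts("01011"))  # rejected: three 1s
```

The table has only four states, yet the loop accepts strings of any length, which is the sense in which a finite machine can accept an infinite language.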
δ(q1, a2) = q2
This means that if the current state is q0, then after reading string 'a1a2', the state reached
is q2.
Similarly, the definition, 'δ(q0, x) = p' means that after reading string 'x' symbol by
symbol (one symbol at a time), the machine reaches state p. If this state p is a final state,
that is, if p is a member of the set F, then 'x' is accepted by the automaton M; else if p ∉ F,
then it is rejected by M.
Example 2.5 Design an FA that reads strings made up of the letters in the word 'CHARIOT'
and recognizes those strings that contain the word 'CAT' as a substring.
Solution Here the alphabet Σ consists of seven symbols:
Σ = {C, H, A, R, I, O, T}
We need to consider all possible strings made up of these seven symbols and design an
FA, which would recognize only those which contain 'CAT' as a substring. In turn, we
can say that the FA we are going to design would make a transition to the final state only
if the input string contains 'CAT' as a substring.
The working of the FA will be as follows:
If the FA reads the symbol 'C' while in the start state, that is, q0, then it makes a
transition to state q1; otherwise, it remains in the same state. Now in state q1, if the
FA reads symbol 'A', then it makes a transition to another state, that is, state q2.
Similarly, in q2, if FA reads symbol 'T', then it makes a transition to q3, which is the
final state.
Essentially, states q1, q2, and q3 respectively indicate that the substrings 'C', 'CA', and
'CAT' have been read.
Figure 2.11 shows the transition graph for the recognizer FA. The FA that is con-
structed here is actually a sequence detector program, while the sequence it is trying to
detect is 'CAT' .
[Figure 2.11: transition graph for the recognizer FA]
We observe that the state q0 is the start (or initial) state, and q3 is the final state. If the
FA reads any symbol from Σ while in q3 (final state), it does not make a transition to any
other state as it has already recognized the substring 'CAT' by then.
The state transition table for the FA is shown in Table 2.11.

Q     C     H     A     R     I     O     T
q0    q1    q0    q0    q0    q0    q0    q0
q1    q1    q0    q2    q0    q0    q0    q0
q2    q1    q0    q0    q0    q0    q0    q3
q3    q3    q3    q3    q3    q3    q3    q3
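Table 2.11 translates directly into a look-up structure. The sketch below is illustrative only; the names ALPHABET, ROWS, and contains_cat are assumptions, and each row lists the next states in the column order C, H, A, R, I, O, T.

```python
ALPHABET = "CHARIOT"  # column order of Table 2.11

# One row per state; entry i is the next state on the i-th symbol of ALPHABET.
ROWS = {
    "q0": ["q1", "q0", "q0", "q0", "q0", "q0", "q0"],
    "q1": ["q1", "q0", "q2", "q0", "q0", "q0", "q0"],
    "q2": ["q1", "q0", "q0", "q0", "q0", "q0", "q3"],
    "q3": ["q3"] * 7,  # once 'CAT' has been seen, the FA stays in q3
}


def contains_cat(string):
    """Return True if the FA ends in the final state q3."""
    state = "q0"
    for symbol in string:
        state = ROWS[state][ALPHABET.index(symbol)]
    return state == "q3"


print(contains_cat("CHARIOT"))  # no 'CAT' substring
print(contains_cat("CACATOR"))  # 'CAT' occurs after a false start 'CA'
```

Note how the entry for q2 on 'C' sends the machine back to q1 rather than q0, so that a 'C' following 'CA' is remembered as the possible start of a new 'CAT'.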
Example 2.6 Design an FA that reads strings made up of {0, 1} and accepts only those
strings which end in either '00' or '11'.
Solution Here, the alphabet Σ is given as:
Σ = {0, 1}
We need to construct an FA that is capable of reading all possible strings over Σ, but accepts
only those strings which end in '00' or '11'. This is yet another example of a sequence
detector (refer to Fig. 2.12).
The FA has two final states: One that accepts strings ending with '00', that is, state q2,
and the other that accepts strings ending with '11', that is, state q4.
In Fig. 2.12, states q1 and q2 respectively indicate that the substrings '0' and '00' have
been read. Similarly, states q3 and q4 respectively indicate that the substrings '1' and '11'
have been read. The self-loop in state q2 on symbol '0' indicates that any string that ends
in three or more 0's anyway ends with two 0's; thus, once two 0's have been read, more
0's do not change the acceptance criterion that the string should end in '00'. Similarly,
one can explain the self-loop on state q4 on symbol '1'.
The states q1 and q2, upon reading '1', transit to state q3, indicating that one '1' has been
read. Similarly, states q3 and q4 transit to state q1 upon reading '0', indicating that the
string ends with '0'. If q1 reads one more '0', then it transits to the final state q2, indicating
that the sequence ends with '00'.
The state transition table for the FA in Fig. 2.12 is shown in Table 2.12.
Q     0     1
q0    q1    q3
q1    q2    q3
q2    q2    q3
q3    q1    q4
q4    q1    q4

Figure 2.12 FA accepting strings which end in either '00' or '11'
Let us trace the FA for input, '111'. The sequence of states starting with the initial (start)
state q0 will be: q0 → q3 → q4 → q4. Thus, the last state in the sequence is q4, which is a
final state, and hence, the input '111' is accepted by the FA.
Now, let us trace the input, '0110', starting with the start state q0. On input '0', the FA
makes a transition from state q0 to state q1. On reading the next symbol, that is, '1', it
changes to state q3. Similarly, for the third symbol, that is, '1', it switches from state q3
to state q4. Though q4 is a final state, there is one more symbol, that is, '0', that needs to
be read; and which is the last symbol of the string '0110'. On reading the last symbol, the
FA makes a transition from state q4 to state q1. Thus, the sequence of states for the input
string '0110' is: q0 → q1 → q3 → q4 → q1. As q1 is a non-final state, and is the last state
of the sequence after reading the input string '0110', the string is not accepted (that is, it
is rejected) by the FA.
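The two traces can be checked mechanically. The following sketch assumes the transitions of Table 2.12; the function name trace and the choice to return the whole state sequence are illustrative.

```python
# Transition table of the FA accepting strings that end in '00' or '11' (Table 2.12).
DELTA = {
    "q0": {"0": "q1", "1": "q3"},
    "q1": {"0": "q2", "1": "q3"},
    "q2": {"0": "q2", "1": "q3"},
    "q3": {"0": "q1", "1": "q4"},
    "q4": {"0": "q1", "1": "q4"},
}
FINALS = {"q2", "q4"}


def trace(string):
    """Return the visited state sequence and whether the input is accepted."""
    states = ["q0"]
    for symbol in string:
        states.append(DELTA[states[-1]][symbol])
    return states, states[-1] in FINALS


for word in ("111", "0110"):
    sequence, accepted = trace(word)
    print(word, " -> ".join(sequence), "accepted" if accepted else "rejected")
```

Only the last state of the sequence decides acceptance; passing through the final state q4 in the middle of '0110' has no effect on the result.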
2.3.6 FA as Finite Control
The FA can be visualized as a finite control with the input string written on a tape, and
each tape cell containing one symbol from the string; let '$' denote the end of the string.
A pointer or head points to a cell on the tape from which the next symbol will be read. The
head is labelled by the state label that represents the current state of the machine
(refer to Fig. 2.13).
Figure 2.13 Finite control representation of FA
The FA reads symbols one by one from a finite tape, whose end of string is indicated by
'$'. Let us say the FA in Fig. 2.12 resides in state q2 at a given instance, and is about
to read the symbol '0' from the tape cell. After reading the symbol, the FA's head will point
to the next adjacent cell on the right-hand side, and will transit to the next state (which may
or may not be the same as the current state). This means that after reading the current
symbol in the cell, the head always moves to the right to read the next symbol. If it reads
'$' (indicating the end of input), and at this stage, if the FA is in the final state, then the
input string is accepted, otherwise it is rejected.
Theorem 2.1
Every FA can be represented using a transition graph (TG), but not every TG satisfies the
definition of an FA.
Proof
Consider the transition graph consisting of only one node as shown in Fig. 2.14(a).
Figure 2.14 Transition graph and FA (a) TG (b) FA
Figure 2.14(a) represents a TG that accepts no string, not even an empty string (ε) of zero
length, as there is no final state. This is because, in order to be able to accept anything,
the FA must have a final state. Hence, the TG in Fig. 2.14(a) cannot be called an FA.
Now, Fig. 2.14(b) represents an FA, which accepts only the empty string, that is, ε, because
its initial and final states are same; there is only one state. We also observe from the figure
that there is no other transition. Therefore, Fig. 2.14(b) represents the TG for an FA
accepting only the ε string.
Hence, the theorem is proved.
We have already mentioned the deterministic finite automata (DFA) in Section 2.3.
An FA is said to be deterministic if, for every state and every input symbol, there is a
unique next state. This means that, given a state, the same input symbol does not cause the
FA to move into more than one state; there is always a unique next state for an input
symbol, for any state transition (refer to Fig. 2.15).
Figure 2.15 Example DFA
Figure 2.15 shows that from state S0, there is only one transition on reading
symbol '0' and only one on reading symbol '1'. Similarly, for states S1 and S2, there are
unique transitions on reading symbols '0' and '1'. No state exists from which there
is more than one transition for the same input symbol. Therefore, the example
FA, represented using Fig. 2.15, is a DFA.
We now see that all the examples we have seen before (i.e., Figures 2.7, 2.11, and 2.12)
are DFAs.
M = (Q, Σ, δ, q0, F)
where, Q, Σ, q0, and F have their usual meanings.
The only change is the state transition function, δ, that maps from Q × Σ to 2^Q:
δ: Q × Σ → 2^Q
For an example, refer to Fig. 2.16. The FA shown in the figure is an NFA.
Figure 2.16 Example NFA
We observe that from state q0, on reading the input symbol '0', there are two
different transitions: first to state q0, and the other to state q1. There is no
unique next state for the transition on symbol '0' from state q0. This means
that state q0, on reading symbol '0', can go to either q1 or to itself, that is,
'q0'. Hence, it is possibilistic; and so, the behaviour of the NFA cannot be predicted.
The transitions can be represented by the following equation:
δ(q0, 0) = {q0, q1}
Table 2.13 State transition function, δ, for NFA in Fig. 2.16
We observe from Table 2.13 that even from state q2 there are two different
transitions on reading the same input symbol '1', and it is not possible
to determine the next state. Therefore, this is called a non-deterministic
FA or NFA.
Figure 2.17 NFA vs DFA (a) Example NFA (b) Equivalent DFA
Figure 2.17(a) shows an example NFA, with two different transitions from state q0
on reading symbol '0'. The string is said to be accepted only if, after reading that string,
the machine resides in one of the final states. Therefore, only two strings, '00' and '01', are
accepted by the NFA in Fig. 2.17(a). The second path, on reading symbol '0' from state
q0, does not reach the final state anyway.
Thus, the language accepted by the NFA in Fig. 2.17(a) is:
L1 = {00, 01}
Now, the DFA in Fig. 2.17(b) accepts the language:
L2 = {00, 01}
We observe that L1 = L2.
Therefore, the NFA and DFA in Fig. 2.17 are equivalent to each other, and accept the
same language {00, 01}. Therefore, their power is the same.
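Acceptance by an NFA can be simulated by tracking the set of all states the machine could be in. The sketch below uses a hypothetical NFA, since the exact transitions of Fig. 2.17(a) are not reproduced in the text: it only assumes, as described, that q0 has two transitions on '0', one of which never reaches a final state. The state names q1, q2, q3 and the function nfa_accepts are assumptions made for illustration.

```python
from itertools import product

# Hypothetical NFA: on '0', q0 may move to q1 (useful path) or q2 (dead path).
NFA_DELTA = {
    ("q0", "0"): {"q1", "q2"},
    ("q1", "0"): {"q3"},
    ("q1", "1"): {"q3"},
}
NFA_FINALS = {"q3"}


def nfa_accepts(string):
    """Accept if at least one possible path ends in a final state."""
    current = {"q0"}
    for symbol in string:
        # Union of the moves of every state the NFA could currently be in;
        # a missing entry means 'no transition' (empty set).
        current = set().union(*(NFA_DELTA.get((q, symbol), set()) for q in current))
    return bool(current & NFA_FINALS)


# Collect every accepted string of length 0 to 3 over {0, 1}.
accepted = {
    "".join(w)
    for n in range(4)
    for w in product("01", repeat=n)
    if nfa_accepts("".join(w))
}
print(sorted(accepted))
```

Under these assumptions the only accepted strings are '00' and '01', the same language as the one quoted for the NFA and DFA of Fig. 2.17.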
Let us now study some methods for converting an NFA to its equivalent DFA.
M = (Q, Σ, δ, q0, F)
where,
δ: Q × Σ → 2^Q
The main difference between NFA and DFA is the state transition function, δ. The entries
in the state transition table for NFA are sets, instead of singular entries as in DFA.
Hence, while converting an NFA into its equivalent DFA, let us consider Q' = 2^Q as
the set of states for the resulting DFA.
Then, the state function for the DFA will be:
δ': Q' × Σ → Q'
This means that every combination of states can be considered as a new state, and can be
given a new label.
For example, for the DFA equivalent to the example NFA in Fig. 2.16, we consider {q2},
{q0, q1}, {q0, q2} as three different states: Let us label them as p0, p1, and p2, respectively.
The transitions from these states can be obtained by combining the transitions for the
constituent states. Thus, Q' can be represented as:
Q' = {∅, [q0], [q1], [q2], [q0, q1], [q0, q2], [q1, q2], [q0, q1, q2]}
The transition δ'([q0, q1], 1) can be obtained as:
δ'([q0, q1], 1) = δ(q0, 1) ∪ δ(q1, 1) = {[q2] ∪ [q1]} = {q1, q2}
Thus, we can say that state {q0, q1} for the equivalent DFA makes transition on reading
symbol '1' to state {q1, q2}. If we label these states with the singular labels, as discussed
earlier, the equivalent DFA will contain the unique next states for each symbol on which
the transition is made.
A combination of states is considered as a final state for the resultant DFA, if at least
one of the states constituting the combination is a final state for the given NFA. The initial
state remains unchanged for the resultant DFA.
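The two rules just stated (take the union of the constituent transitions, and mark a combination final if any constituent is final) can be sketched as follows. The NFA used is the one given in Table 2.14 for Example 2.7; the names NFA_DELTA, dfa_step, and is_final are assumptions made for this illustration.

```python
# NFA of Example 2.7 (Table 2.14): entries are sets of next states.
NFA_DELTA = {
    ("q0", "0"): {"q0", "q1"}, ("q0", "1"): {"q1"},
    ("q1", "0"): set(),        ("q1", "1"): {"q0", "q1"},
}
NFA_FINALS = {"q1"}


def dfa_step(combined_state, symbol):
    """Next DFA state: union of the NFA moves of every constituent state."""
    result = set()
    for q in combined_state:
        result |= NFA_DELTA[(q, symbol)]
    return frozenset(result)


def is_final(combined_state):
    """A combination is final if at least one constituent is an NFA final state."""
    return bool(set(combined_state) & NFA_FINALS)


both = frozenset({"q0", "q1"})
print(sorted(dfa_step(both, "0")), sorted(dfa_step(both, "1")))
print(is_final(frozenset({"q0"})), is_final(both))
```

Representing a combined state as a frozenset keeps it hashable, so it can itself serve as a key in the DFA's transition table.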
Let us study some examples to explore this further.
Example 2.7 Convert the NFA: M = [{q0, q1}, {0, 1}, δ, q0, {q1}], where δ is as shown in
Table 2.14, to its equivalent DFA.
Solution For the given NFA, Q = {q0, q1}, Σ = {0, 1}, F = {q1}, and initial state is q0.
Let us denote the resultant DFA as:
M' = (Q', Σ, δ', q0, F')
where,
Σ = {0, 1} and the initial state, q0, are the same as that of the given NFA; the entry point
cannot be changed if the language accepted is required to be same for equivalence.
The power set of Q is given by:
2^Q = {∅, [q0], [q1], [q0, q1]}

Table 2.14 State transition table for example NFA
Q     0           1
q0    {q0, q1}    {q1}
q1    ∅           {q0, q1}

As per the method discussed, the finite set of states for the resultant (and equivalent)
DFA, that is, Q' can be written as:
Q' = {[q0], [q1], [q0, q1]}
We see that ∅ is excluded, as it does not denote any state of the given NFA, and hence,
is irrelevant.
Now, we need to find the state transition function for the DFA:
δ': Q' × Σ → Q'
This means that we need to find all the transitions from every state in Q' on reading both
the input symbols '0' and '1'.
From Table 2.14, which shows the state table of the given NFA, we have:
δ'([q0, q1], 0) = [δ(q0, 0) ∪ δ(q1, 0)] = [{q0, q1} ∪ ∅] = [q0, q1]
δ'([q0, q1], 1) = [δ(q0, 1) ∪ δ(q1, 1)] = [{q1} ∪ {q0, q1}] = [{q0, q1}] = [q0, q1]
Using the aforementioned information, the state transition table for resultant DFA is
written as shown in Table 2.15.

Table 2.15 State transition table for resultant DFA
Q'          0           1
[q0]        [q0, q1]    [q1]
[q1]        ∅           [q0, q1]
[q0, q1]    [q0, q1]    [q0, q1]

The transition diagram for the resultant equivalent DFA can be drawn as shown in
Fig. 2.18(a).
Figure 2.18 DFA equivalent to the NFA in Table 2.14 (a) Equivalent DFA
(b) Equivalent DFA after relabelling
As q0 is the initial state of the given NFA, [q0] is the initial state of the resultant DFA.
Similarly, 'q1' is the final state for the given NFA;
hence, for the resultant DFA, we need to consider all such states, which contain q1 as one
of the constituents, as final states.
Therefore, F' = {[q1], [q0, q1]}
The machine necessarily should only have one initial state. However, it can have multiple
final states.
Let us now change the labels of the states for the DFA as:
Example 2.8 Convert the NFA [{p, q, r, s}, {0, 1}, δ, p, {s}] to its equivalent DFA, where
the state transition function, δ, is as shown in Table 2.17.

Table 2.17 State transition table for example NFA
Q    0         1
p    {p, q}    p
q    r         r
r    s         ∅
s    s         s

Solution For the resultant DFA, Σ = {0, 1}, the initial state is also p, and
Q' = {p, q, r, s, pq, pr, ps, qr, qs, rs, pqr, pqs, prs, qrs, pqrs}
Note that we have excluded ∅ (null set) from 2^Q. All the states having 's' as one of the
constituents are the final states for the resultant DFA, because 's' is the final state for
the given NFA. Therefore, the set of final states for the resultant DFA is given by:
F' = {s, ps, qs, rs, pqs, prs, qrs, pqrs}
We can relabel the states from Q' by numbers from 1 to 15, in sequence.
For the states p, q, r, and s, the state transition function, δ', can be directly
obtained from the given state transition table in Table 2.17.
For example: δ'(p, 0) = pq
δ'(q, 1) = r
δ'(s, 1) = s
For the combination states, which have two or more symbols together representing a state, δ'
can be obtained from the union of their respective transitions in the NFA state transition table.
For example:
δ'(pqr, 0) = [δ(p, 0) ∪ δ(q, 0) ∪ δ(r, 0)] = [{p, q} ∪ {r} ∪ {s}] = [pqrs]
Thus, the entire state transition table for the resultant DFA is as shown in Table 2.18.
Observe that along with the normal state symbol, the new labels are also shown in the
brackets. The symbol, '*', indicates the final states.
Q'            0            1
(1) p         pq (5)       p (1)
(2) q         r (3)        r (3)
(3) r         s (4)        ∅
*(4) s        s (4)        s (4)
(5) pq        pqr (11)     pr (6)
(6) pr        pqs (12)     p (1)
*(7) ps       pqs (12)     ps (7)
(8) qr        rs (10)      r (3)
*(9) qs       rs (10)      rs (10)
*(10) rs      s (4)        s (4)
(11) pqr      pqrs (15)    pr (6)
*(12) pqs     pqrs (15)    prs (13)
*(13) prs     pqs (12)     ps (7)
*(14) qrs     rs (10)      rs (10)
*(15) pqrs    pqrs (15)    prs (13)
Numbers in parentheses: New labels
'*': Final states
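The rows of this table can be generated rather than worked out by hand. The following is a sketch of the power-set approach of Method I applied to Example 2.8, assuming the NFA transitions implied by the worked equations and by the rows above; the names STATES, label, and power_set_dfa are illustrative, and the empty string is used here to stand for ∅.

```python
from itertools import combinations

STATES = "pqrs"
# NFA of Example 2.8: each entry lists the next states as a string of state names.
NFA_DELTA = {
    ("p", "0"): "pq", ("p", "1"): "p",
    ("q", "0"): "r",  ("q", "1"): "r",
    ("r", "0"): "s",  ("r", "1"): "",   # no transition from r on '1'
    ("s", "0"): "s",  ("s", "1"): "s",
}
NFA_FINALS = {"s"}


def label(states):
    """Name a combination by its constituents in alphabetical order, e.g. 'pqs'."""
    return "".join(sorted(states))


def power_set_dfa():
    """Build the DFA table over all 15 non-empty subsets of the NFA states."""
    table = {}
    for size in range(1, len(STATES) + 1):
        for subset in combinations(STATES, size):
            row = []
            for symbol in "01":
                targets = set()
                for q in subset:
                    targets |= set(NFA_DELTA[(q, symbol)])
                row.append(label(targets))
            table[label(subset)] = tuple(row)
    finals = {name for name in table if NFA_FINALS & set(name)}
    return table, finals


table, finals = power_set_dfa()
print(table["pqr"], table["prs"])
print(sorted(finals, key=lambda name: (len(name), name)))
```

The sketch builds all fifteen rows, including those for combinations that turn out to be unreachable from p; removing such rows is a separate step.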
Equivalent States
Two or more states are said to be equivalent states if they undergo the same transitions on
reading the same input symbol (true for all the transitions) and are of the same type, that is,
either final states or non-final. This means that all these states move to the same next state
after reading the same input symbol. Furthermore, they all are of the same type (either
final or non-final). Hence, if two or more states have the same transitions, but they are of
different types (that is, one is a final state and the other, a non-final state), they
are not considered as equivalent states.
The equivalent states are used to minimize the DFA: If there is more than one state
that is equivalent, we can remove all the equivalent states but one. The one state thus kept
replaces all other equivalent states in the DFA. However, there are certain rules that we
need to follow.

Table 2.20 Further minimization of the state transition table

Q'     0      1
1      5      1
2      3      3
3      4      φ
*4     4      4
5      11     6
6      7•     1
*7     7•     7
8      4      3
11     7•     6

Unreachable States
A state is said to be an 'unreachable state' if it cannot be reached from the initial state on reading
any input string. These unreachable states do not take part in string acceptance, and hence
must be removed in order to minimize the DFA; they are like pieces of dead code that never
get executed in any run of a computer program. For example, the states labelled 2, 3, 4, and
8 in Fig. 2.19(a) are unreachable states; they make a disconnected graph, as discussed earlier.
Hence, these states along with their transitions are removed in order to minimize the DFA.
Method II, which is described in this section, is much easier and more direct: it takes fewer
steps and less time to build an equivalent DFA using this method. Instead of considering
the set Q' = 2^Q and then removing the states that are not required, this method
considers only those states (or combinations of states) that are required. The effort required
for minimization is hence less compared to the previous method.
This method directly starts with the transition diagram. Instead of considering all possible
states, it starts finding the transitions, one state at a time. If the next state of a given
transition is a new combination, then it gets added to the set of states for the resultant
DFA. Due to this, we never get a disconnected transition graph as we obtained in method I
(refer to Fig. 2.19).
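
The idea of method II can be sketched in a few lines of Python. This is only an illustrative sketch, not the book's own procedure: the dictionary `NFA` is transcribed from the singleton rows (p, q, r, s) of Table 2.18, and the names `subset_construction` and `label` are hypothetical helpers introduced here. A work queue holds the state combinations whose transitions are still to be found, so only reachable combinations are ever created.

```python
from collections import deque

# NFA transitions taken from the singleton rows of Table 2.18.
# A missing key stands for the empty set (no transition).
NFA = {
    ("p", "0"): {"p", "q"}, ("p", "1"): {"p"},
    ("q", "0"): {"r"},      ("q", "1"): {"r"},
    ("r", "0"): {"s"},
    ("s", "0"): {"s"},      ("s", "1"): {"s"},
}
ALPHABET = ("0", "1")


def subset_construction(nfa, start, finals):
    """Build only those DFA states that are reachable from the start state."""
    start_state = frozenset([start])
    seen = {start_state}
    queue = deque([start_state])
    dfa = {}
    while queue:
        current = queue.popleft()
        for symbol in ALPHABET:
            # Union of the NFA moves of every constituent state.
            target = frozenset().union(
                *(nfa.get((q, symbol), set()) for q in current)
            )
            dfa[(current, symbol)] = target
            if target and target not in seen:
                seen.add(target)
                queue.append(target)
    # A DFA state is final if it contains at least one final NFA state.
    dfa_finals = {state for state in seen if state & finals}
    return seen, dfa, dfa_finals


def label(state):
    """Render a combination state such as {p, q} as the string 'pq'."""
    return "".join(sorted(state))


states, dfa, dfa_finals = subset_construction(NFA, "p", {"s"})
print(sorted(label(s) for s in states))
```

For this NFA the sketch creates eight combination states instead of the fifteen listed in Q', because states that cannot be reached from p are never generated.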
Let us now look at some examples to illustrate this method.
Example 2.9 Convert the NFA: M = ({q0, q1}, {0, 1}, δ, q0, {q1}) where the state
transition function, δ, is given in Table 2.21, to its equivalent DFA.

Table 2.21 State transition table for example NFA

Q      0           1
q0     {q0, q1}    {q1}
q1     φ           {q0, q1}

Solution This is the same example we considered earlier (refer to Example 2.7).
The transition graph for the given NFA is as shown in Fig. 2.20.

Figure 2.20 TG for example NFA given in Table 2.21

We observe from Fig. 2.20 that state q0, on reading input '0', makes a transition either
to state q0 or state q1, that is, to multiple states. This means that for the equivalent resultant
DFA, we need the combination state [q0 q1]. Let us therefore create a new state with label
'q0 q1' as shown in Fig. 2.21(a). As this new state contains the label q1, which is the final
state for the given NFA, we mark this state as a final state for the resultant DFA, as shown
with double concentric circles in Fig. 2.21(a).
State q0 on reading input '1' goes only to 'q1'; therefore, we add one more state to the
resultant DFA labelled 'q1', which is also a final state (refer to Fig. 2.21b).
Since there is no transition from state q1 on reading input '0' for the given NFA, no
transition from state q1 is labelled with a '0'. However, state q1, on reading input '1', goes
either to state q0 or state q1; therefore, we show a transition from state q1 to state q0 q1.
Since state q0 q1 already exists as a part of the resultant DFA, we do not need to create a
new state (refer to Fig. 2.21c).
Now, there is only one state, q0 q1, for which the transitions need to be realized:
δ'(q0 q1, 0) = δ(q0, 0) ∪ δ(q1, 0) = {q0, q1} ∪ φ = q0 q1
Similarly,
δ'(q0 q1, 1) = δ(q0, 1) ∪ δ(q1, 1) = {q1} ∪ {q0, q1} = q0 q1

Figure 2.21 DFA construction from the given NFA (a) Step 1 (b) Step 2 (c) Step 3 (d) Step 4 (e) Final DFA

Since there is no state remaining to be considered for finding the transitions, Fig. 2.21(d)
could be the final DFA. However, according to our convention, we change the labels for
the states so as to make them singular labels: We label state q0 as 'A', state q1 as 'B',
and state q0 q1 as 'C'. Thus, we draw the transition graph for the final DFA, as shown in
Fig. 2.21(e).
We observe that this DFA is exactly the same as the one we obtained by applying the first
method (refer to Fig. 2.18(b)). The state transition table will also be the same as Table 2.16.
For the aforementioned example, we see that the complexity of both the methods is the same,
as there are only three state combinations to work with, and all are required combinations.
Let us now look at some non-trivial examples to illustrate that method II is more efficient
than method I.
Example 2.10 Construct DFA equivalent to NFA: M = ({p, q, r, s}, {0, 1}, δ, p, {q, s})
where δ is given in Table 2.22.

Table 2.22 State transition table for example NFA

Q     0         1
p     {q, r}    {q}
q     {r}       {q, r}
r     {s}       {p}
s     φ         {p}

Solution The transitions from the states p, q, r, and s for the resultant
DFA can be obtained directly from Table 2.22, which are:
δ'(p, 0) = qr (new state required)    δ'(p, 1) = q
δ'(q, 0) = r                          δ'(q, 1) = qr
δ'(r, 0) = s                          δ'(r, 1) = p
δ'(s, 0) = φ                          δ'(s, 1) = p
where δ' is the state transition function for the equivalent DFA.
These transitions are reflected in Fig. 2.22(a), with five states p, q, r,
s, and qr, of which q, s, and qr are the final states because for the given
NFA, F = {q, s}. Hence, for the resultant DFA, we mark those states as final, which contain
either q, s, or both the symbols.

Figure 2.22 NFA to DFA conversion steps (a) Step 1 (b) Step 2 (c) DFA (without relabelling)
(d) Final DFA

The initial state will be same as that of the given NFA, that is, p (refer to Fig. 2.22(a)).
Now, we have finished with transitions from states p, q, r, and s. However, there is one new
state qr that we need to introduce as per the method, and find the transitions from the same:
δ'(qr, 0) = δ(q, 0) ∪ δ(r, 0)
= {r} ∪ {s}
= rs (new state)
Similarly,
δ'(qr, 1) = δ(q, 1) ∪ δ(r, 1)
= {q, r} ∪ {p}
= pqr (new state)
Thus, we must add two new states, rs and pqr, and both are final states. Let us now find
transitions for both these states:
δ'(rs, 0) = δ(r, 0) ∪ δ(s, 0) = {s} ∪ φ
= s (already existing state)
Formal Definition
NFA with ε-moves is denoted by a 5-tuple:
M = (Q, Σ, δ, q0, F)
where,
δ(q0, 0) = q0
δ(q0, ε) = q1
Therefore, δ(q0, 0ε) = δ(δ(q0, 0), ε)
= δ(q0, ε)
= q1                                   (2.1)
Here, '0ε' means zero concatenated with an empty string, that is, only '0'.
Therefore,
δ(q0, 0ε) = δ(q0, 0)
= q0                                   (2.2)
From Eqs (2.1) and (2.2), we can write,
δ(q0, 0) = {q0, q1}
Similarly,
δ(q1, 1) = {q1, q2}
Thus, on a single symbol, from state q0, the FA has two different transitions. The same
thing holds for state q1 as well. Therefore, the diagram must be an NFA (obviously, with
ε-moves).
L = Set of strings with zero or more number of 0's, followed by zero or more
number of 1's, followed by zero or more number of 2's
It becomes very difficult and may even seem impossible to directly draw an NFA or DFA
from the aforementioned language description. However, we may observe that the NFA
with ε-moves in Fig. 2.24 accepts exactly the same language described here. Hence, it
becomes very easy to draw an NFA with ε-moves from the language description, with the
help of ε-transitions.
Now, let us check whether the NFA accepts the string 'ε', that is, with zero number of
0's, followed by zero number of 1's, followed by zero number of 2's.
Let us start with the initial state:
δ(q0, ε) = q1
δ(q1, ε) = q2
Therefore, δ(q0, εε) = q2
Here, 'εε' is nothing but 'ε';
hence, δ(q0, ε) = q2 (final state)
We see that starting from the initial state, q0, and reading string ε, the machine reaches
the final state q2; hence, the machine accepts 'ε'.
Let us now check the acceptance of '002', that is, two 0's, followed by zero number
of 1's, followed by one 2:
δ(q0, 0) = q0
δ(q0, 0) = q0
δ(q0, ε) = q1
δ(q1, ε) = q2
δ(q2, 2) = q2 (final state)
Therefore, δ(q0, 00εε2) = q2
i.e., δ(q0, 002) = q2
This means that '002' is a valid string, as q2 is the final state, which is reached upon
reading the input string '002'.
Thus, we see that NFA with ε-moves helps us split complex language acceptance problems
into smaller ones. The solutions to these problems can then be integrated with the
help of ε-transitions.
For example, in Fig. 2.24, state q0 accepts inputs consisting of zero or more number of
'0's; state q1 accepts inputs with zero or more number of '1's; and state q2 accepts zero or
more number of '2's. Hence, the original language L is now formed by sequencing these
three parts, so as to make one follow the other.
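
The acceptance checks above can be replayed mechanically. The following Python sketch is an illustration only: the dictionary `DELTA` encodes the transitions of the NFA with ε-moves in Fig. 2.24 as used in the derivations above, the empty string `""` stands for ε, and the helper names `epsilon_closure` and `accepts` are assumptions introduced here.

```python
# Transitions of the NFA with epsilon-moves of Fig. 2.24 (language 0*1*2*).
# The empty string "" plays the role of epsilon; missing keys mean "no move".
DELTA = {
    ("q0", "0"): {"q0"}, ("q0", ""): {"q1"},
    ("q1", "1"): {"q1"}, ("q1", ""): {"q2"},
    ("q2", "2"): {"q2"},
}
START, FINALS = "q0", {"q2"}


def epsilon_closure(states):
    """All states reachable from `states` using epsilon-moves only."""
    closure = set(states)
    stack = list(states)
    while stack:
        q = stack.pop()
        for nxt in DELTA.get((q, ""), set()):
            if nxt not in closure:
                closure.add(nxt)
                stack.append(nxt)
    return closure


def accepts(word):
    """Track the set of possible current states while reading `word`."""
    current = epsilon_closure({START})
    for symbol in word:
        moved = set()
        for q in current:
            moved |= DELTA.get((q, symbol), set())
        current = epsilon_closure(moved)
    return bool(current & FINALS)


print(accepts(""), accepts("002"), accepts("10"))
```

Tracking a set of states, rather than a single state, is what allows the simulation to follow every ε-transition at once.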
The state transition table for the NFA with ε-moves in Fig. 2.24 is as shown in Table 2.25.

Table 2.25 State transition table for NFA with ε-moves in Fig. 2.24

               Σ ∪ {ε}
Q      0       1       2       ε
q0     {q0}    φ       φ       {q1}
q1     φ       {q1}    φ       {q2}
q2     φ       φ       {q2}    φ

Here, 'q0' is also added to the set because every state is at distance zero from itself.
Similarly, ε-closure (q1) = {q1, q2}, and ε-closure (q2) = {q2}.
We use a separate denotation, δ̂, to represent the ε-closure of a state. It is defined as:
Example 2.11 Convert the NFA with ε-moves in Fig. 2.24 to its equivalent NFA without
ε-moves, accepting the same language.
Solution Using the definition of ε-closure, we have:
ε-closure (q0) = {q0, q1, q2}
ε-closure (q1) = {q1, q2}
ε-closure (q2) = {q2}
We observe that 'q2' is a member of ε-closure (q0), which means that 'q2' is at zero distance
from 'q0'. Similarly, 'q2' is at distance zero from 'q1' as well. Note that 'q2' is the only
final state for the given NFA with ε-moves:
F = {q2}
From this observation, the set of final states for the resultant equivalent NFA without
ε-moves is given by:
F' = {q0, q1, q2}
Now, let us find the state transition function, δ', for the resultant NFA. This can be obtained
with the help of the rule in Eq. (2.1):
δ'(q0, 0) = ε-closure (δ (δ̂ (q0, ε), 0))
= ε-closure (δ ({q0, q1, q2}, 0))
= ε-closure (δ (q0, 0) ∪ δ (q1, 0) ∪ δ (q2, 0))
= ε-closure ({q0} ∪ φ ∪ φ)
= ε-closure (q0)
= {q0, q1, q2}
δ'(q0, 1) = ε-closure (δ (δ̂ (q0, ε), 1))
= ε-closure (δ ({q0, q1, q2}, 1))
= ε-closure (δ (q0, 1) ∪ δ (q1, 1) ∪ δ (q2, 1))
= ε-closure (φ ∪ {q1} ∪ φ)
= ε-closure (q1)
= {q1, q2}
δ'(q0, 2) = ε-closure (δ (δ̂ (q0, ε), 2))
= ε-closure (δ ({q0, q1, q2}, 2))
= ε-closure (δ (q0, 2) ∪ δ (q1, 2) ∪ δ (q2, 2))
δ'(q2, 1) = φ
δ'(q2, 2) = {q2}
From the aforementioned transitions, we construct the state transition table for the
resultant NFA without ε-transitions as shown in Table 2.26.

Table 2.26 State transition table for NFA without ε-transitions

Q      0               1           2
q0     {q0, q1, q2}    {q1, q2}    {q2}
q1     φ               {q1, q2}    {q2}
q2     φ               φ           {q2}

The resultant NFA without ε-moves can be represented by a TG shown in Fig. 2.25(a).
We observe that there is a transition from state q0 to state q1, labelled '0, 1', meaning
that there are two transitions from state q0 to state q1: one on reading input symbol '0',
and the other on reading input symbol '1'. Here they are combined for simplicity. For a
more specific diagram, refer to Fig. 2.25(b), where all transitions are shown separately.

Figure 2.25 NFA without ε-moves (a) Resultant NFA without ε-moves (b) NFA without
ε-moves (all transitions separately shown)

As we have already discussed, 'q2' is a final state here because it is also a final state
for the given NFA with ε-moves. In addition, 'q0' also becomes a final state for the
resultant NFA without ε-moves, because ε-closure (q0) contains 'q2', which is the final
state of the given NFA with ε-moves. For the same reason, 'q1' is also a final state now for the
resultant NFA without ε-moves.
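
The conversion carried out in Example 2.11 can be summarized as a short Python sketch. It is an illustration under stated assumptions: the rule applied is δ'(q, a) = ε-closure (δ (ε-closure (q), a)) as used in the derivations above, the transitions in `DELTA` are those of Fig. 2.24, `""` stands for ε, and the function names are hypothetical.

```python
# NFA with epsilon-moves of Fig. 2.24; "" stands for epsilon.
DELTA = {
    ("q0", "0"): {"q0"}, ("q0", ""): {"q1"},
    ("q1", "1"): {"q1"}, ("q1", ""): {"q2"},
    ("q2", "2"): {"q2"},
}
STATES = ("q0", "q1", "q2")
ALPHABET = ("0", "1", "2")
FINALS = {"q2"}


def epsilon_closure(state):
    """States at 'distance zero' from `state`, including the state itself."""
    closure, stack = {state}, [state]
    while stack:
        q = stack.pop()
        for nxt in DELTA.get((q, ""), set()):
            if nxt not in closure:
                closure.add(nxt)
                stack.append(nxt)
    return closure


def remove_epsilon_moves():
    """Return (new transition function, new final states) without epsilon-moves."""
    new_delta = {}
    for q in STATES:
        for a in ALPHABET:
            # Step 1: move on `a` from every state in the closure of q.
            moved = set()
            for p in epsilon_closure(q):
                moved |= DELTA.get((p, a), set())
            # Step 2: take the closure of everything reached.
            target = set()
            for p in moved:
                target |= epsilon_closure(p)
            new_delta[(q, a)] = target
    # A state becomes final if its closure contains an original final state.
    new_finals = {q for q in STATES if epsilon_closure(q) & FINALS}
    return new_delta, new_finals


new_delta, new_finals = remove_epsilon_moves()
for q in STATES:
    print(q, [sorted(new_delta[(q, a)]) for a in ALPHABET])
```

The printed rows correspond to the rows of the state transition table for the NFA without ε-transitions, and `new_finals` contains all three states.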
Example 2.12 Construct an equivalent DFA for the NFA with ε-moves shown in Fig. 2.24.
Solution For the given NFA with ε-moves in Fig. 2.24, we have already constructed an
equivalent NFA without ε-moves in Fig. 2.25.
For converting this NFA without ε-moves in Fig. 2.25 to its equivalent DFA, let us use
conversion method II (refer to Section 2.6.3).
From Table 2.26, which is the state transition table for the NFA in Fig. 2.25, we can
directly write the state transition function, δ1, for the resultant DFA:
These transitions are shown in Fig. 2.26(a); here, transitions from the states 'q0 q1 q2' and
'q1 q2' are unprocessed.
The new states 'q1 q2' and 'q0 q1 q2' are also final states, because 'q0', 'q1', and 'q2' are
the final states of the original NFA.
Now,
δ1 (q1 q2, 0) = δ' (q1, 0) ∪ δ' (q2, 0) = φ ∪ φ
= φ
δ1 (q1 q2, 1) = δ' (q1, 1) ∪ δ' (q2, 1) = {q1, q2} ∪ φ
= q1 q2 (already existing state)
δ1 (q1 q2, 2) = δ' (q1, 2) ∪ δ' (q2, 2)
= {q2} ∪ {q2}
= q2 (already existing state)

Figure 2.26 Construction of DFA from the given NFA (a) Step 1
(b) Step 2 (all transitions shown, no state unprocessed) (c) Final DFA

Since we are now left with no new state, the modified state transition diagram for the
equivalent DFA, with all the states and their respective transitions, can be drawn as shown
in Fig. 2.26(b). The state transition table for the same is as shown in Table 2.27.
From the table, we observe that we can replace state 'q0 q1 q2' by state 'q0', and state
'q1 q2' by 'q1' as these are equivalent states. We now have the minimized state transition
table as shown in Table 2.28. The TG for this can be drawn as in Fig. 2.26(c).
Example 2.13 Consider the NFA with ε-moves shown in Fig. 2.24. Construct the DFA
equivalent to it using direct approach.
Solution Let us begin with the start state q0 of the given NFA with ε-moves. We
create a new state as a part of the resultant DFA; the first part of the label is 'q0', and
the second part includes all the reachable states of state q0, that is, 'q1, q2', as shown
in Fig. 2.27(a).

Figure 2.27 DFA construction from NFA with ε-moves (a) Step 1 (b) Step 2
(c) Step 3 (d) DFA

Since state q2, which is a final state for the given NFA with ε-moves, is reachable from
state q0, we mark this new state as a final state.
We now proceed to find the transitions from this new state:
Let us create new states if we find any new combination for the next state. From Table 2.25,
which represents the state transition function, δ, for the NFA with ε-moves in Fig. 2.24,
we can directly write the state function, δ1, for the resultant DFA.
We observe that in the given NFA with ε-moves, state q0 on reading the input symbol
'0' goes to itself. We now proceed with the next symbol in the same state label, that is, 'q1'.
State q1 on reading input symbol '0' goes nowhere. Similarly, the other part of the label,
that is, state q2, also does not have any transition on reading input symbol '0'. Hence, the
next state is the same state; so we add the self loop on symbol '0'. This can be explained
with the help of the following equation:
δ1 ([q0, q1 q2], 2) = [δ (q0, 2) ∪ δ (q1, 2) ∪ δ (q2, 2), reachable states from the
resultant states]
= [φ ∪ φ ∪ {q2}, reachable states for q2]
= [q2,
Let us now proceed with the state whose first part is 'q1'. It consists of two symbols: q1
and q2. We observe that state q1 on reading input symbol '1' goes to itself; hence, there is
no need to create a new state; also, it goes nowhere on reading input symbols '0' and '2'.
Likewise, the next state, q2, transits to nowhere on reading input symbols '0' and '1', but
on reading '2', it goes to state q2, which is already created.
Therefore, we have:
In this section, we are going to study the formalism around these two different FSM types.
Formal Definition
A Moore machine is a six-tuple that is defined as follows:
M = (Q, Σ, Δ, δ, λ, q0)
where,
Q: Finite set of states
Σ: Finite input alphabet
Δ: An output alphabet
δ: State transition function (STF); δ: Q × Σ → Q
λ: Machine function (MAF); λ: Q → Δ
q0: Initial state of the machine
Thus, for a Moore machine, an output symbol is associated with each state. When the
machine is in a particular state, it generates the output, irrespective of the input that caused
the transition.
Let us look at an example to illustrate this machine.
Example 2.14 Construct a Moore machine to find out the residue-modulo-3 for binary
numbers.
Solution If i is a binary number, and if we write '0' after i, then its value becomes 2i.
For example, consider a binary number:
i = 1 (value = 1)
If we write '0' after i, then we have:
i.0 = 10 (value = 2 = 2 × 1)
As another example, let us consider:
i = 100 (value = 4)
Then, i.0 = 1000 (value = 8 = 2 × 4)
Similarly, if we write '1' after i, where i is any binary number, its value becomes '2i + 1'.
For example, consider the binary number:
i = 1 (value = 1)
Then, i.1 = 11 (value = 3 = 2 × 1 + 1)
As another example, let us consider:
i = 100 (value = 4)
Then, i.1 = 1001 (value = 9 = 2 × 4 + 1)
As we are constructing a machine to determine the remainder (or residue) when we divide
any binary number by 3, the different remainder values that we can have are: 0, 1, and 2.
In Table 2.29, let us consider a case in which the remainder from the previous result is
2, that is, 10 in binary form. Now, if we write 0 after it, it becomes 4, that is, 100 in binary
form. When we divide this by 3, the remainder will be 1. In the division process, we read
the input from left to right, one digit at a time. The digit we read is divided by the divisor
(in this example, the divisor is 3). The next digit is then concatenated to the remainder of
this division to form the next number to be divided. We continue this process till all the
digits in the input string are exhausted. We are going to use the same in our computations.
As we are interested in constructing a Moore machine for which the output depends only on
the current state of the machine, we can associate the three different remainder values with
three different states: remainder 0 with state q0, remainder 1 with state q1, and
remainder 2 with state q2, as described in the MAF shown in Table 2.30.

Table 2.30 Machine function for Moore machine, λ: Q → Δ

State Q      q0    q1    q2
Output Δ     0     1     2

Now, Table 2.31 shows the state transition table for the required Moore machine. This
is the same as Table 2.29, expressed in formal notation. The transition graph for this can
be drawn as in Fig. 2.28. In the diagram, along with every state vertex, there is a symbol
written below, which is the output symbol associated with that state.

Table 2.31 State transition table for the Moore machine

Q      0      1
q0     q0     q1
q1     q2     q0
q2     q1     q2

Figure 2.28 TG for Moore machine

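
The Moore machine for residue-modulo-3 can be simulated directly. The Python sketch below is illustrative only: `STF` holds the transitions that follow from the 2i and 2i + 1 rule (state q_r on bit b goes to q_((2r + b) mod 3)), `MAF` associates remainder r with state q_r, and the function name `run_moore` is an assumption made here.

```python
# State transition function: reading bit b in state q_r leads to q_((2r + b) mod 3).
STF = {
    ("q0", "0"): "q0", ("q0", "1"): "q1",
    ("q1", "0"): "q2", ("q1", "1"): "q0",
    ("q2", "0"): "q1", ("q2", "1"): "q2",
}
# Machine function: each state outputs the remainder it represents.
MAF = {"q0": 0, "q1": 1, "q2": 2}


def run_moore(bits, start="q0"):
    """Return the output sequence; a Moore machine also emits for the start state."""
    state = start
    outputs = [MAF[state]]
    for bit in bits:
        state = STF[(state, bit)]
        outputs.append(MAF[state])
    return outputs


print(run_moore("1010"))  # the last output is the residue of the whole number
```

The last element of the returned list is the remainder of the complete binary number; the earlier elements are the remainders of its prefixes.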
Formal Definition
A Mealy machine is denoted by a six-tuple:
M = (Q, Σ, Δ, δ, λ, q0)
where,
Thus, for this type of machine, the output depends on both the current state and the
current input symbol. We may recall that this is the same FSM that we have discussed in the
beginning of this topic, that is, Section 2.2. All the examples we have seen earlier in the
section are Mealy machines.
Example 2.15 Design a Mealy machine that accepts the language consisting of strings
from Σ*, where Σ = {0, 1}, and ending with double '0's or double '1's.
Solution Let us assume that the output alphabet, Δ = {y, n}, indicates whether the
input string is accepted or not: y (that is, yes) will be the output if the string is accepted,
and n (that is, no) will be the output if the string is not accepted by the machine.
Let us assume that the initial state of the machine is 'q0'. From state 'q0' there will be two
transitions: one on reading input symbol '0', and the other on reading input symbol '1',
as Σ = {0, 1}.
On reading input symbol '0', the machine makes the transition to some other state, say
'p0', which looks for a second consecutive zero to get double '0's. Similarly, on reading
input symbol '1', the machine transits to another state, say 'p1', which looks for a second
consecutive one to get double '1's. The situation is represented in Fig. 2.29(a).

Figure 2.29 Construction of Mealy machine (a) Step 1 (b) Step 2 (c) Final Mealy machine

In state p0, if the machine reads '0', it remains in the same state and produces output 'y', as
two consecutive 0's have been read. It continues to remain in the same state on reading more
0's, because the string may be of the form '0000': any number of '0's more than two
always ends in at least two '0's. Similarly, in state p1, if the machine reads '1', it transits to
the same state with output 'y', indicating it has read double '1's. This is reflected in Fig. 2.29(b).
Now, in state p0, if the machine reads symbol '1', it makes a transition to state p1, which
checks for a second consecutive '1'. Next, in state p1, if the machine reads '0', it transits
to state p0, which looks for a second consecutive '0'. The output in both these cases is 'n',
as the second consecutive letter is yet to be read.
Here ends the process of finding transitions on both '0' and '1' from every state of the machine.
Figure 2.29(c) shows the final Mealy machine that complies with the given requirements.
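
The behaviour of this Mealy machine can be checked with a small Python sketch. It is illustrative only: the dictionaries `STF` and `MAF` encode the transitions and outputs described in the solution above, and `run_mealy` is a hypothetical helper name.

```python
# State transition function and machine function of the Mealy machine
# described above (states q0, p0, p1; outputs y / n).
STF = {
    ("q0", "0"): "p0", ("q0", "1"): "p1",
    ("p0", "0"): "p0", ("p0", "1"): "p1",
    ("p1", "0"): "p0", ("p1", "1"): "p1",
}
MAF = {
    ("q0", "0"): "n", ("q0", "1"): "n",
    ("p0", "0"): "y", ("p0", "1"): "n",
    ("p1", "0"): "n", ("p1", "1"): "y",
}


def run_mealy(word, start="q0"):
    """Emit one output symbol per input symbol, based on (state, input)."""
    state, outputs = start, []
    for symbol in word:
        outputs.append(MAF[(state, symbol)])
        state = STF[(state, symbol)]
    return "".join(outputs)


print(run_mealy("0100"))  # the last output tells whether the string is accepted
```

A string is accepted exactly when the last output symbol is 'y', that is, when the string ends with '00' or '11'.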
Example 2.16 Design a Mealy machine for incrementing the value of any binary number
by one. The output should also be a binary number, whose value is one more than the
number given.
Solution In order to obtain the 2's complement of a binary number, we first take the 1's
complement and add '1' to it.
For example, let 1011 be the given binary number. Then we calculate the 2's complement
as follows:
  1011    given binary number
  0100    1's complement
+    1    add one
  0101    2's complement
We use a similar method to design a machine that will add '1' to a given binary number.
We will revise the aforementioned method, as shown here:
  1011    take 1's complement
  0100    then take 2's complement, that is,
+    1    add '1' to '0100'
  0101    result of addition
Thus, in order to add '1' to any binary number, we find the 1's complement of the given
number and then find the 2's complement of the 1's complement that we have obtained.
In other words, given a number, the output should be the incremented result of adding '1'
to the given number. For example:
0100 - Given number
0101 - Incremented result
Instead of first finding the 1's complement and then the 2's complement, we can adopt
a simple direct approach, by following the simple steps given here:
1. Read bit by bit from the least significant bit (LSB) of the given binary number. This is
more like reading the input string in the reverse order, that is, from right to left.
2. Keep on replacing the 1's by 0's, till we reach the first 0 (from the right).
3. Replace this first 0 by 1.
4. After this first 0, keep the remaining bits as they are.
In our example, 0100, the first '0' from the right is the LSB itself; replacing it by '1'
gives 0101.
State q0 is the initial state, which is associated with replacing all 1's by 0's till it reaches the
first 0. After reaching the first 0, while reading from right to left, the machine replaces it by
1 and moves to the next state q1. State q1 on reading '0' generates output 0, and on
reading '1', generates output 1. It thus ensures that the remaining bits are not changed.
Figure 2.30 depicts the final Mealy machine. For this machine, as we know:
Q = {q0, q1}
Σ = {0, 1}
Δ = {0, 1}
Tables 2.32(a) and 2.32(b) respectively represent the state transition function,
δ, and the machine function, λ.

Figure 2.30 Mealy machine to increment value of a binary number by '1'

Table 2.32 STF and MAF (a) δ: Q × Σ → Q (b) λ: Q × Σ → Δ

(a)                      (b)
Q      0      1          Q      0     1
q0     q1     q0         q0     1     0
q1     q1     q1         q1     0     1

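
The increment machine can be tried out with the following Python sketch, given purely as an illustration. The dictionaries `STF` and `MAF` encode the state transition function and the machine function described above; the function name `increment` and the convention of reversing the string to read from the LSB are assumptions of this sketch.

```python
# STF and MAF of the increment machine: q0 flips bits until the first 0,
# q1 copies the remaining bits unchanged.
STF = {
    ("q0", "0"): "q1", ("q0", "1"): "q0",
    ("q1", "0"): "q1", ("q1", "1"): "q1",
}
MAF = {
    ("q0", "0"): "1", ("q0", "1"): "0",
    ("q1", "0"): "0", ("q1", "1"): "1",
}


def increment(bits):
    """Feed the bits to the machine from right to left (LSB first)."""
    state = "q0"
    outputs = []
    for bit in reversed(bits):
        outputs.append(MAF[(state, bit)])
        state = STF[(state, bit)]
    # Outputs were produced LSB first, so reverse them back.
    return "".join(reversed(outputs))


print(increment("0100"), increment("1011"))
```

Note that the machine keeps the word length fixed: for an input consisting only of 1's, such as '1111', it produces '0000', because there is no state that emits an extra carry bit.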
Example 2.17 Design a Mealy machine to find the 2's complement of a given binary number.
Solution Let us have a simple algorithm similar to that of the previous example:
1. Read bit by bit from LSB (right to left, in the reverse order).
2. Keep the bits unchanged till you reach the first '1' from the right side; do not replace
this first '1' by '0'. That remains unchanged as well.
3. The remaining bits are to be changed from 0 to 1, and from 1 to 0.
The design here again requires two states:
Q = {q0, q1}
The initial state, q0, reads all the 0's (from right to left) without replacing
them, till it reaches the first '1'. Upon reading the first '1', it makes a
transition to state q1. In this state, the machine replaces each '0' that is
read by 1, and every '1' that is read by 0. The Mealy machine can be
constructed as shown in Fig. 2.31.

Figure 2.31 Mealy machine to find 2's complement of a binary number

Simulation
Let us now simulate the working of the Mealy machine in Fig. 2.31 on an input binary
number, '1010' (shown in Fig. 2.32). Note that the machine reads the input string from
right to left, one bit at a time.

Current state    Input    Next state    Output
q0 (initial)     0        q0            0
q0               1        q1            1
q1               0        q1            1
q1               1        q1            0
(input ends; the output is read bottom up)

The output can be obtained by reading from bottom to top as we are considering the input
in the reverse order, that is, right to left. The output will be '0110'. Hence, '0110' is the
2's complement of '1010'.
Let us check the validity of the answer that we have obtained, by our usual method:
  1010    number given
  0101    1's complement
+    1    add 1
  0110    2's complement
Hence, the answer obtained is correct.
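
The same simulation can be reproduced in Python. The sketch below is illustrative only: `STF` and `MAF` encode the two-state machine described in the solution, and the function name `twos_complement` and its trace format are assumptions introduced here.

```python
# q0 copies bits until (and including) the first 1 from the right;
# q1 inverts every remaining bit.
STF = {
    ("q0", "0"): "q0", ("q0", "1"): "q1",
    ("q1", "0"): "q1", ("q1", "1"): "q1",
}
MAF = {
    ("q0", "0"): "0", ("q0", "1"): "1",
    ("q1", "0"): "1", ("q1", "1"): "0",
}


def twos_complement(bits):
    """Return (result, trace); the trace lists (state, input, next state, output)."""
    state, trace = "q0", []
    for bit in reversed(bits):          # read from LSB to MSB
        nxt, out = STF[(state, bit)], MAF[(state, bit)]
        trace.append((state, bit, nxt, out))
        state = nxt
    # Reading the outputs "bottom up" restores the usual MSB-first order.
    result = "".join(step[3] for step in reversed(trace))
    return result, trace


result, trace = twos_complement("1010")
for step in trace:
    print(*step)
print(result)
```

The printed trace has one row per input bit, in the order in which the machine reads them.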
M2 = (Q, Σ, Δ, δ, λ', q0)
where, λ' (q, a) = λ (δ (q, a))
∀ q ∈ Q and ∀ a ∈ Σ
In case of a Moore machine, the output only depends on the machine's state. On the other
hand, in case of a Mealy machine, it depends on the machine's state as well as the input
symbol read. Hence, the only difference between the two is the machine function, λ.
The aforementioned rule, that is, λ' (q, a) = λ (δ (q, a)), associates the output
of the next state, that is, λ (δ (q, a)), with the transition for the resultant Mealy machine,
that is, λ' (q, a).
Example 2.18 Consider a Moore machine that we have already designed (refer to Fig. 2.28)
for finding residue-mod-3 for any binary number, redrawn as Fig. 2.33(a). Convert this Moore
machine to its equivalent Mealy machine.
Solution Consider the Moore machine in Fig. 2.33(a).
We may write:
Note: Let us consider both Moore and Mealy machines shown in Fig. 2.33 for the input sequence
(or string) '1010' of length, n = 4.
After simulation, the output sequence for the Moore machine is:

Input:                   1       0       1       0
States:           q0  →  q1  →   q2  →   q2  →   q1
Output sequence:  0      1       2       2       1

The output sequence for the Mealy machine is:

Input:                   1       0       1       0
States:           q0  →  q1  →   q2  →   q2  →   q1
Output sequence:         1       2       2       1

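
The conversion rule λ' (q, a) = λ (δ (q, a)) is simple enough to express as a short Python sketch. This is an illustration only: the tables are those of the residue-mod-3 Moore machine (q_r outputs r, and bit b in q_r leads to q_((2r + b) mod 3)), and the function names are hypothetical.

```python
# Residue-mod-3 Moore machine: STF and MAF.
MOORE_STF = {
    ("q0", "0"): "q0", ("q0", "1"): "q1",
    ("q1", "0"): "q2", ("q1", "1"): "q0",
    ("q2", "0"): "q1", ("q2", "1"): "q2",
}
MOORE_MAF = {"q0": 0, "q1": 1, "q2": 2}


def moore_to_mealy(stf, maf):
    """New machine function: the output of the *next* state goes on the transition."""
    return {(q, a): maf[nxt] for (q, a), nxt in stf.items()}


def run_moore(word, stf, maf, start="q0"):
    state, outputs = start, [maf[start]]
    for a in word:
        state = stf[(state, a)]
        outputs.append(maf[state])
    return outputs


def run_mealy(word, stf, maf, start="q0"):
    state, outputs = start, []
    for a in word:
        outputs.append(maf[(state, a)])
        state = stf[(state, a)]
    return outputs


MEALY_MAF = moore_to_mealy(MOORE_STF, MOORE_MAF)
print(run_moore("1010", MOORE_STF, MOORE_MAF))
print(run_mealy("1010", MOORE_STF, MEALY_MAF))
```

For an input of length n, the Moore machine produces n + 1 output symbols and the Mealy machine produces n; apart from the first Moore output, which belongs to the initial state, the two sequences agree.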
(Q, Σ, Δ, δ, λ, q0)
This conversion is slightly non-trivial. As we cannot determine the output symbol that the
equivalent Moore machine holds in each state, we end up creating all possible combinations
of the state symbols and the output symbols, Q × Δ. It is almost like creating multiple
variants of the same state that only differ in the associated output symbol.
Each state label, thus, has two symbols: one that is a state symbol and the other its
associated output symbol. Hence, the new machine function, λ', for the
resultant Moore machine simply needs to return the associated output symbol:
We see that the output symbol, b, has no role to play in determining the next state. This
means that all the state variants obtained by associating the different output symbols,
[q, b1], [q, b2], etc., make transitions to the same next state.
Another important thing is to decide the initial state for the resultant Moore machine.
Now, there exist multiple variants for the initial state, which could be of the form:
[q0, b1], [q0, b2], ..., [q0, bn], as there are multiple output symbols. As per the
rule, any state [q0, b0] can be considered as an initial state, where b0 is an arbitrarily
selected member of Δ. The output symbol that is associated with the initial state is
irrelevant here, as it gets generated even before any input symbol is read and is hence
insignificant.
Example 2.19 Consider the Mealy machine that we have designed to accept strings
ending with '00' or '11' as in Fig. 2.29(c), which is redrawn in Fig. 2.34(a). The state
transition tables for this machine are shown in Table 2.36. Construct the equivalent
Moore machine.

Table 2.36 STF and MAF for the Mealy machine (a) δ: Q × Σ → Q (b) λ: Q × Σ → Δ

(a)                      (b)
Q      0      1          Q      0     1
q0     p0     p1         q0     n     n
p0     p0     p1         p0     y     n
p1     p0     p1         p1     n     y

Solution According to the conversion rule, the resultant Moore machine consists of:
Q' = [Q × Δ] = {[q0, n], [q0, y], [p0, n], [p0, y], [p1, n], [p1, y]}
Similarly,
The transition graph for the equivalent Moore machine is as shown in Fig. 2.34(b).
Note that states [q0, n] and [q0, y] have the same transitions or behaviour. Similarly,
[p0, n] and [p0, y], as well as [p1, n] and [p1, y], have the same behaviour, except that these
states generate different outputs.

Figure 2.34 Mealy to Moore conversion (a) Mealy machine (b) Equivalent Moore machine
(c) Minimized Moore machine after relabelling

We may relabel the states as we wish. Note that the state [q0, n] has been arbitrarily
selected as the initial state, from among [q0, n] and [q0, y]. We can remove state [q0, y],
as there are no incoming edges to this state. After removing this state and relabelling the
remaining states, we get the final Moore machine as shown in Fig. 2.34(c).
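
The Mealy to Moore conversion can also be sketched in Python. The sketch below is an illustration under stated assumptions: it uses the usual pairing construction, in which state [q, b] on input a moves to [δ (q, a), λ (q, a)] and outputs b, and it generates only the state variants reachable from the chosen initial state instead of all of Q × Δ. The tables are those of the Mealy machine for strings ending in '00' or '11'; the function names are hypothetical.

```python
from collections import deque

# Mealy machine for strings ending in '00' or '11'.
STF = {
    ("q0", "0"): "p0", ("q0", "1"): "p1",
    ("p0", "0"): "p0", ("p0", "1"): "p1",
    ("p1", "0"): "p0", ("p1", "1"): "p1",
}
MAF = {
    ("q0", "0"): "n", ("q0", "1"): "n",
    ("p0", "0"): "y", ("p0", "1"): "n",
    ("p1", "0"): "n", ("p1", "1"): "y",
}


def mealy_to_moore(stf, maf, start, start_output, alphabet):
    """Build the reachable part of the Moore machine over states [q, b]."""
    initial = (start, start_output)      # the initial output is arbitrary
    new_stf, new_maf = {}, {initial: start_output}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        q, _ = state                     # b plays no role in the next state
        for a in alphabet:
            target = (stf[(q, a)], maf[(q, a)])
            new_stf[(state, a)] = target
            if target not in new_maf:
                new_maf[target] = target[1]
                queue.append(target)
    return initial, new_stf, new_maf


def run_moore(word, initial, stf, maf):
    state, outputs = initial, [maf[initial]]
    for a in word:
        state = stf[(state, a)]
        outputs.append(maf[state])
    return "".join(outputs)


initial, moore_stf, moore_maf = mealy_to_moore(STF, MAF, "q0", "n", "01")
print(sorted(moore_maf))
print(run_moore("0100", initial, moore_stf, moore_maf))
```

With [q0, n] as the initial state, the variant [q0, y] is never generated, which agrees with the observation that it has no incoming edges.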
Example 2.20 Construct Mealy and Moore machines for the following:
For input from Σ*, where Σ = {0, 1}, if the input ends in '101', the output should be 'x';
if the input ends in '110', the output should be 'y'; otherwise, the output should be 'z'.
Solution This is a simple sequence detector design problem. Let us start with the
construction of the Mealy machine:
Consider the first string pattern '101', and draw a minimal diagram as shown in Fig. 2.35(a).
We can modify this diagram to accept '110' as shown in Fig. 2.35(b). The remaining
possible transitions are as shown in Fig. 2.35(c). The STF and MAF for this Mealy machine
are as shown in Table 2.37.

Figure 2.35 Construction of Mealy machine (a) Step 1 (b) Step 2 (c) Step 3
(d) Mealy machines

Table 2.37 STF and MAF for Mealy machine in Figure 2.35(c)
(a) δ: Q × Σ → Q (b) λ: Q × Σ → Δ

(a)                      (b)
Q      0      1          Q      0     1
q0     q0     q1         q0     z     z
q1     q2     q4         q1     z     z
q2     q0     q3         q2     z     x
q3     q0     q4         q3     z     z
q4     q5     q4         q4     y     z
q5     q0     q3         q5     z     x

We observe that states q2 and q5 are equivalent; therefore, we replace q5 by q2. The
reduced STF and MAF tables are obtained as in Table 2.38.
The reduced Mealy machine transition graph is as shown in Fig. 2.35(d). The STF and
MAF tables for the equivalent Moore machine can be obtained by using the conversion method
(refer to Table 2.39). Out of the states, [q0, x], [q0, y], and [q0, z], any one can be considered
as the initial state, and the other two can be omitted in order to get the final answer.
Table 2.38 Reduced STF and MAF tables
(a) δ: Q × Σ → Q          (b) λ: Q × Σ → Δ
Q     0     1             Q     0     1
q0    q0    q1            q0    z     z
q1    q2    q4            q1    z     z
q2    q0    q3            q2    z     x
q3    q0    q4            q3    z     z
q4    q2*   q4            q4    y     z
* modified entry
Table 2.39 STF and MAF for equivalent Moore machine
(a) δ': Q' × Σ → Q'                  (b) λ': Q' → Δ
Q'         0          1              Q'         Δ
[q0, x]    [q0, z]    [q1, z]        [q0, x]    x
[q0, y]    [q0, z]    [q1, z]        [q0, y]    y
[q0, z]    [q0, z]    [q1, z]        [q0, z]    z
[q1, x]    [q2, z]    [q4, z]        [q1, x]    x
[q1, y]    [q2, z]    [q4, z]        [q1, y]    y
[q1, z]    [q2, z]    [q4, z]        [q1, z]    z
[q2, x]    [q0, z]    [q3, x]        [q2, x]    x
[q2, y]    [q0, z]    [q3, x]        [q2, y]    y
[q2, z]    [q0, z]    [q3, x]        [q2, z]    z
[q3, x]    [q0, z]    [q4, z]        [q3, x]    x
[q3, y]    [q0, z]    [q4, z]        [q3, y]    y
[q3, z]    [q0, z]    [q4, z]        [q3, z]    z
[q4, x]    [q2, y]    [q4, z]        [q4, x]    x
[q4, y]    [q2, y]    [q4, z]        [q4, y]    y
[q4, z]    [q2, y]    [q4, z]        [q4, z]    z
Example 2.21 Construct Mealy and Moore machines for the following:
For the input from Σ*, where Σ = {0, 1, 2}, print the residue-modulo-5 of the input treated
as a ternary (base 3, with digits 0, 1, and 2) number.
Table 2.41 can be directly converted into the state transition table as shown in Table 2.42.
Each column in Table 2.41 is analogous to a row in Table 2.42.
Using the MAF (Table 2.40) and the STF (Table 2.42), we can draw TG for the Moore
machine as shown in Fig. 2.36.
Table 2.42 State transition table
Q     0     1     2
q0    q0    q1    q2
q1    q3    q4    q0
q2    q1    q2    q3
q3    q4    q0    q1
q4    q2    q3    q4
Figure 2.36 Moore machine
Now, the equivalent Mealy machine can be obtained using the conversion rule. The state
function, δ, for the resultant Mealy machine is the same as in Table 2.42; and the revised
machine function, λ', can be obtained using the rule:
λ'(q, a) = λ(δ(q, a))
Using the aforementioned rule, the machine function for the Mealy machine can be written
as shown in Table 2.43.
Using the STF (Table 2.42) and MAF (Table 2.43), the transition graph for the equivalent
Mealy machine can be drawn as shown in Fig. 2.37.
Table 2.43 MAF for the equivalent Mealy machine
Q     0     1     2
q0    0     1     2
q1    3     4     0
q2    1     2     3
q3    4     0     1
q4    2     3     4
Figure 2.37 Equivalent Mealy machine
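The tables of Example 2.21 can be regenerated mechanically. The following is a minimal sketch, assuming that state q_r stands for "the digits read so far have value r modulo 5", so that reading digit d leads to state (3r + d) mod 5. The names `delta`, `moore_out`, `mealy_out`, and `run_mealy` are illustrative and do not come from the text.

```python
# Sketch of Example 2.21; all names here are illustrative assumptions.
SIGMA = (0, 1, 2)      # ternary digits
STATES = range(5)      # state r means "value read so far is r modulo 5"

# STF: appending digit d to a base-3 number n gives 3*n + d.
delta = {(r, d): (3 * r + d) % 5 for r in STATES for d in SIGMA}

# Moore MAF: the output attached to state r is the residue r itself.
moore_out = {r: r for r in STATES}

# Mealy MAF from the conversion rule lambda'(q, a) = lambda(delta(q, a)).
mealy_out = {(r, d): moore_out[delta[(r, d)]] for r in STATES for d in SIGMA}


def run_mealy(digits):
    """Return the list of outputs produced while reading the digits."""
    state, outputs = 0, []
    for d in digits:
        outputs.append(mealy_out[(state, d)])
        state = delta[(state, d)]
    return outputs


if __name__ == "__main__":
    for r in STATES:
        print(f"q{r}:", [f"q{delta[(r, d)]}" for d in SIGMA])
    print(run_mealy([1, 2, 0, 2]))   # last output is 1202 (base 3) modulo 5
```

The printed rows can be compared with the state transition table, and the last element of the output list is the residue of the whole input.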
Example 2.22 Design a Moore machine that will read sequences made up of letters
A, E, I, O, U and will give an output having the same sequences, except that in those cases where
an 'I' directly follows an 'E', it will be changed to 'U'.
Solution This is again a simple sequence detector problem. Let us first design the Mealy
machine, which we can then convert to its equivalent Moore machine.
The constraint here is that if 'I' directly follows 'E' in any string over Σ = {A, E, I, O, U},
it must be replaced with 'U'; the rest of the symbols remain as they are.
Step 1 in Fig. 2.38 takes care of identifying the special case, where 'I' directly follows
'E' in any string over Σ = {A, E, I, O, U}.
State q0, on reading input symbol 'E', moves to a new state q1 just to remember that
'E' has been read. Now, in state q1, if 'I' is the current symbol read, it is the
symbol that directly follows the previous symbol 'E'; hence, it needs to be replaced with
'U' as shown in Fig. 2.38(a). State q1, on reading symbol 'E', retains the same state, as the
next symbol could still be 'I' (an 'E' may be followed by another 'E' and then an 'I'); hence the
self loop from state q1 on reading symbol 'E'.
Figure 2.38 Mealy machine as a sequence detector (a) Step 1 (b) Step 2
Figure 2.38(b) gives the complete Mealy machine. It keeps the rest of the symbols as
they are. We now need to convert this Mealy machine to its equivalent Moore machine.
As we know, the states of the equivalent Moore machine can be obtained as Q' = [Q × Δ],
where Q is the set of states and Δ is the output set of the Mealy machine. Thus, the STF
and MAF for the equivalent Moore machine can be obtained as shown in Table 2.44.
Table 2.44 STF and MAF for the equivalent Moore machine (a) λ': Q' → Δ (b) δ': Q' × Σ → Q'
(a)                  (b)
Q'         Δ         Q'         A          E          I          O          U
[q0, A]    A         [q0, A]    [q0, A]    [q1, E]    [q0, I]    [q0, O]    [q0, U]
[q0, E]    E         [q0, E]    [q0, A]    [q1, E]    [q0, I]    [q0, O]    [q0, U]
[q0, I]    I         [q0, I]    [q0, A]    [q1, E]    [q0, I]    [q0, O]    [q0, U]
[q0, O]    O         [q0, O]    [q0, A]    [q1, E]    [q0, I]    [q0, O]    [q0, U]
[q0, U]    U         [q0, U]    [q0, A]    [q1, E]    [q0, I]    [q0, O]    [q0, U]
[q1, A]    A         [q1, A]    [q0, A]    [q1, E]    [q0, U]    [q0, O]    [q0, U]
[q1, E]    E         [q1, E]    [q0, A]    [q1, E]    [q0, U]    [q0, O]    [q0, U]
[q1, I]    I         [q1, I]    [q0, A]    [q1, E]    [q0, U]    [q0, O]    [q0, U]
[q1, O]    O         [q1, O]    [q0, A]    [q1, E]    [q0, U]    [q0, O]    [q0, U]
[q1, U]    U         [q1, U]    [q0, A]    [q1, E]    [q0, U]    [q0, O]    [q0, U]
In Table 2.44(b), if we consider [q0, A] as the initial state, the states that can be included
in the minimized Moore machine are: [q0, A], [q1, E], [q0, I], [q0, O], and [q0, U]. No
transition from the initial state will ever reach any other state. It is clear from Table 2.44(b) that
these are the only five states that are the next states for all the transitions.
Let us now relabel the states as:
Table 2.45 (a) Machine function (b) State transition function
(a)            (b)
Q     Δ        Q     A     E     I     O     U
p0    A        p0    p0    p1    p2    p3    p4
p1    E        p1    p0    p1    p4    p3    p4
p2    I        p2    p0    p1    p2    p3    p4
p3    O        p3    p0    p1    p2    p3    p4
p4    U        p4    p0    p1    p2    p3    p4
Table 2.45(a) gives the machine function and Table 2.45(b) gives the state transition
function for the equivalent Moore machine.
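The conversion used in Example 2.22 can be sketched in code. This is a minimal illustration, assuming the Mealy machine described in the solution (q0 and q1, with 'I' rewritten to 'U' only in q1) and [q0, A] as the initial Moore state. The function names `mealy_to_moore` and `run_moore` are hypothetical. Only the states reachable from the initial state are generated, which is why five states remain rather than ten.

```python
from collections import deque

SIGMA = "AEIOU"

# Mealy machine of Example 2.22: (state, symbol) -> (next state, output).
mealy = {}
for a in SIGMA:
    mealy[("q0", a)] = ("q1" if a == "E" else "q0", a)
    mealy[("q1", a)] = ("q1" if a == "E" else "q0", "U" if a == "I" else a)


def mealy_to_moore(mealy, sigma, start):
    """Build the Moore states [q, output] reachable from `start`."""
    stf = {}                      # (Moore state, symbol) -> Moore state
    maf = {start: start[1]}       # Moore state -> output
    queue = deque([start])
    while queue:
        state = queue.popleft()
        q = state[0]
        for a in sigma:
            nxt = mealy[(q, a)]   # the pair [delta(q, a), lambda(q, a)]
            stf[(state, a)] = nxt
            if nxt not in maf:
                maf[nxt] = nxt[1]
                queue.append(nxt)
    return stf, maf


def run_moore(stf, start, text):
    """Collect the output of each state entered while reading `text`."""
    state, out = start, []
    for ch in text:
        state = stf[(state, ch)]
        out.append(state[1])
    return "".join(out)


stf, maf = mealy_to_moore(mealy, SIGMA, ("q0", "A"))

if __name__ == "__main__":
    print(sorted(maf))
    print(run_moore(stf, ("q0", "A"), "EIEEIA"))
```

The sketch ignores the output of the initial state itself, since a Moore machine emits that symbol before any input is read.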
Example 2.23 Consider the Moore machine described by the state transition table given
in Table 2.46. Construct the corresponding Mealy machine.
Table 2.46 State transition table for a Moore machine
0 1
0 0 0 -I-0
q1
0 1 Figure 2.39 TG for the equivalent Mealy
Q2
machine
q3 0 1
Algorithm
If M and M' are two FSMs over Σ, where |Σ| = n ('n' is the number of input symbols);
then, apply the following steps to check the equivalence of M and M':
1. Construct a comparison table consisting of (n + 1) columns. In column 1, list all
ordered pairs of vertices of the form (v, v'), where v ∈ M and v' ∈ M'.
In column 2, list all pairs of the form (va, va'), if there is a transition labelled a ∈ Σ
leading from v to va and from v' to va'.
In column 3, list all pairs of the form (vb, vb'), if there is a transition labelled b ∈ Σ
leading from v to vb and from v' to vb'. Repeat this for all 'n' symbols from Σ.
2. If the pair (va, va') has not occurred previously in column 1, place it in column 1 and
repeat step 1 for that pair.
Repeat step 2 for each pair, (vb, vb'), (vc, vc'), and so on.
3. If a pair (v, v') is reached in the table in which v is the final vertex (or final
state) of M, and v' is a non-final vertex of M', or vice-versa, then stop and declare that
the two FSMs, M and M', are not equivalent.
Otherwise, stop when there are no more new pairs in columns 2, 3, ..., n + 1 that do not
occur in column 1. In this case, the two FSMs, M and M', are equivalent.
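The steps above can be sketched as a breadth-first walk over pairs of states. This is a minimal sketch, assuming both machines are complete DFAs given as (initial state, set of final states, transition dictionary). The function name `equivalent` and the three sample machines are hypothetical and are not the machines of the figures.

```python
from collections import deque


def equivalent(m1, m2, sigma):
    """Comparison-table check; each machine is (start, finals, delta)."""
    (s1, f1, d1), (s2, f2, d2) = m1, m2
    seen = {(s1, s2)}                  # pairs already placed in column 1
    queue = deque(seen)
    while queue:
        v, w = queue.popleft()
        if (v in f1) != (w in f2):     # final state paired with non-final
            return False
        for a in sigma:                # one further column per input symbol
            pair = (d1[(v, a)], d2[(w, a)])
            if pair not in seen:
                seen.add(pair)
                queue.append(pair)
    return True                        # no new pairs are left to process


# Hypothetical machines: M accepts strings with an even number of 1's.
M = ("e", {"e"}, {("e", "0"): "e", ("e", "1"): "o",
                  ("o", "0"): "o", ("o", "1"): "e"})
# M2 accepts the same language with four states instead of two.
M2 = ("a", {"a", "c"}, {("a", "0"): "c", ("a", "1"): "b",
                        ("b", "0"): "d", ("b", "1"): "a",
                        ("c", "0"): "a", ("c", "1"): "d",
                        ("d", "0"): "b", ("d", "1"): "c"})
# M3 has the transitions of M but accepts an odd number of 1's.
M3 = ("e", {"o"}, M[2])

if __name__ == "__main__":
    print(equivalent(M, M2, "01"), equivalent(M, M3, "01"))
```

Each pair is examined once, so the table has at most |Q| × |Q'| rows.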
Example 2.24 Two FSMs, M and M', are given in Fig. 2.40. Check the equivalence of the
two FSMs by applying Moore's algorithm.
Figure 2.40 Example FSMs (a) FSM M (b) FSM M'
Solution For the given FSMs,
Σ = {0, 1}
Hence, |Σ| = n = 2
As per the algorithm, we must create (n + 1) = 2 + 1 = 3 columns
labelled (v, v'), (v0, v0'), and (v1, v1'). Table 2.49 shows the comparison
between M and M'.
Table 2.49 Comparison table for FSMs in Fig. 2.40
(v, v')       (v0, v0')     (v1, v1')
(q0, q0')     (q0, q0')     (q1, q3')
(q1, q3')     (q2, q1')     (q0, q2')
We begin with the pair of initial states (q0, q0'). State (q0, q0'),
on reading input symbol '0', goes to state (q0, q0'), where both q0
and q0' are final states; and on reading input symbol '1', it goes to
(q1, q3'), where both q1 and q3' are non-final states.
Therefore, we can proceed to the next step: we observe that (q0, q0') is already there
in column 1, but (q1, q3') is not there. Therefore, we place it in column 1 and repeat the
procedure. State (q1, q3'), on reading input symbol '0', goes to state (q2, q1'), where both q2
and q1' are non-final states; and on reading input symbol '1', it goes to state (q0, q2'). Here,
q0 is the final state in M, while q2' is a non-final state in M'; therefore, we stop here and
declare that the given FSMs, M and M', are non-equivalent.
Example 2.25 For the two FSMs given in Fig. 2.41, check the equivalence using Moore's
algorithm.
Solution For the given FSMs,
Σ = {0, 1}
Hence, |Σ| = n = 2
Therefore, we need to create (n + 1) = 3 columns labelled (v, v'), (v0, v0'), and (v1, v1').
The comparison between the two FSMs, M and M', can be made as in Table 2.50.
Table 2.50 Comparison table for FSMs in Fig. 2.41
(v, v')      (v0, v0')    (v1, v1')
(p, p')      (p, p')      (q, q')
(q, q')      (r, s')      (q, r')
(r, s')      (p, p')      (q, q')
(q, r')      (r, s')      (q, r')
We begin with both the initial states (p, p'), which yield a
new pair, (q, q'), for the transition on reading input symbol
'1'. Placing (q, q') in column 1 and finding the transitions
for input symbols '0' and '1', we see that the next states
are (r, s') and (q, r'), both new pairs.
Let us first consider (r, s'), and place it in column 1,
which yields pairs (p, p') and (q, q') for transitions on
reading input symbols '0' and '1' respectively; these are
already there in column 1.
Therefore, we consider the only remaining new pair,
(q, r'), and place it in column 1. Neither of the transitions from (q, r') yields any new pair.
Hence, there is no other new pair to process; so, we stop, and declare that the FSMs, M
and M', in Fig. 2.41 are equivalent.
Figure 2.42 DFA minimization (a) Example DFA (b) Minimized DFA
Let us consider the DFA in Fig. 2.42(a). We want to minimize this DFA. The state
transition table for this DFA is as shown in Table 2.51.
First draw a table as shown in Table 2.52 and put an 'X' (cross) for all combinations of
final and non-final states. Here, consider combinations of 'C' (final state) with all other states.
Now, start with the last column, that is, with combination (G, H).
δ(G, 0) = G    δ(G, 1) = E
δ(H, 0) = G    δ(H, 1) = C
On input '1', the pair (G, H) therefore leads to the pair (E, C).
For pair (E, C), there is already an 'X' marked in the table, which means that these two
states are non-equivalent. Therefore, we put 'X' for the pair (G, H).
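The marking procedure can be sketched as follows. The transition table used here is an assumption: only the rows for G and H (and the fact that C is the final state) are taken from the text, and the remaining rows are a stand-in chosen for illustration, since Table 2.51 is not reproduced here. The function name `distinguishable_pairs` is hypothetical.

```python
from itertools import combinations

# Assumed DFA: state -> (next state on 0, next state on 1).
# Only rows G and H are confirmed by the worked step in the text.
DELTA = {
    "A": ("B", "F"), "B": ("G", "C"), "C": ("A", "C"), "D": ("C", "G"),
    "E": ("H", "F"), "F": ("C", "G"), "G": ("G", "E"), "H": ("G", "C"),
}
FINALS = {"C"}


def distinguishable_pairs(delta, finals):
    """Return the set of state pairs that receive an 'X' in the table."""
    # Initial marks: every combination of a final and a non-final state.
    marked = {frozenset(p) for p in combinations(delta, 2)
              if (p[0] in finals) != (p[1] in finals)}
    changed = True
    while changed:                      # repeat until no new 'X' appears
        changed = False
        for p, q in combinations(delta, 2):
            if frozenset((p, q)) in marked:
                continue
            for a in (0, 1):
                succ = frozenset((delta[p][a], delta[q][a]))
                if len(succ) == 2 and succ in marked:
                    marked.add(frozenset((p, q)))
                    changed = True
                    break
    return marked


marked = distinguishable_pairs(DELTA, FINALS)
unmarked = [(p, q) for p, q in combinations(DELTA, 2)
            if frozenset((p, q)) not in marked]

if __name__ == "__main__":
    print(unmarked)   # pairs left without an 'X' can be merged
```

Pairs that remain unmarked at the end are equivalent, and each such pair collapses into a single state of the minimized DFA.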
Moreover, we have seen a finite control representation of the FSM, where the read head
always moves one position to the right after reading an input symbol (refer to Section 2.3.6).
The head can never move in the reverse direction. Therefore, the FSM cannot retrieve
what it has read previously, before coming to the current position on the tape; and since
it cannot retrieve anything that is read earlier, it cannot remember it. This also means
that the FSM will eventually always repeat a state or produce a periodic sequence of states.
State determination Since the initial state of an FSM and the input sequence given to
it determine the output sequence, it is always possible to discover the unknown state
in which the FSM resides at a particular instant.
Impossibility of multiplication As we have seen, an FSM cannot remember arbitrarily
long sequences. We know that during multiplication operation, it is required to remember
two full sequences corresponding to the multiplier and the multiplicand. Moreover, while
multiplying, it is also required to store the partial sums that are obtained during the inter-
mediate stages. Therefore, no FSM can multiply, given any two arbitrarily long numbers.
This is because, essentially, an FSM does not have any memory.
Impossibility of palindrome recognition No FSM can recognize a palindrome string,
because it does not have the capability to remember all the symbols it reads until the
half-way point of input sequence. Hence, it cannot match them in reverse order, with the
symbols in second half of the sequence. This is true even if we assume that the given FSM
can recognize the mid-point of the sequence, which is also actually impossible.
Impossibility to check if parentheses are well-formed The aforementioned reason holds
here also. As an FSM has no capability to remember the earlier input symbols that it reads,
it cannot compare with the remaining input symbols to check for well-formed parentheses.
This is an impossible task for any FSM.
In order to accomplish all these complex jobs, we require a more capable and powerful
machine, which has a finite number of states, but unlimited memory to remember
arbitrarily long sequences. Further, its head should be able to move to the left as well as
to the right, that is, in both directions, so that it can read whatever it has stored on its tape.
In other words, it should have the capability to retrieve whatever is stored in the memory.
Since the FSM lacks the ability to store, we can say that an FSM is a program without
variables and without assignment statements.
Example 2.26 Construct an NFA that accepts any positive number of occurrences of vari-
ous strings from the following language L:
Solution This is a typical sequence detector problem that we have earlier solved in
the context of Mealy machines. Let us construct an NFA, which moves to a final state if
Note: We may notice that Figure 2.42 is actually a DFA. Every DFA is, in a way, a specialization
of the NFA. The converse may not be true, though.
Example 2.27 Convert the Mealy machine in Fig. 2.44 to its equivalent Moore machine.
Solution We begin with four state labels, using Q' = [Q × Δ], for the equivalent Moore
machine that we want to construct.
Q' = {[S0, 0], [S0, 1], [S1, 0], [S1, 1]}
Hence, we have:
δ'([S0, 0], a) = [δ(S0, a), λ(S0, a)]
               = [S1, 0]
               = δ'([S0, 1], a)
Figure 2.44 Example Mealy machine
Similarly, the transitions for the remaining states are obtained. The STF and MAF tables
for the equivalent Moore machine are shown in Table 2.55.
Table 2.55 STF and MAF tables for the equivalent Moore machine
STF (δ')                            MAF (λ')
Q'         a          b             Q'         Δ
[S0, 0]    [S1, 0]    [S0, 0]       [S0, 0]    0
[S0, 1]    [S1, 0]    [S0, 0]       [S0, 1]    1
[S1, 0]    [S1, 1]    [S0, 1]       [S1, 0]    0
[S1, 1]    [S1, 1]    [S0, 1]       [S1, 1]    1
Q a
qo q1
q1 q4 q2 - -
q4 q5
q5 q3
Solution
(a) This is a sequence detector, where each string can contain exactly one 'b', immediately
following 'c'. This means that the required machine must detect the substring 'cb'.
Figure 2.46 provides the solution: a DFA that detects substring 'cb'.
Figure 2.46 DFA that detects substring 'cb'
(b) This is also a sequence detector problem, where each string should
start with '1', and the length of the string (i.e., |x|) should be divisible
by 3. As the string should start with '1', its length cannot be zero.
Therefore, the minimum length of the string is 3, though zero is also
divisible by 3.
If we begin with strings having length 3 (as this is the minimum length
of the string), the allowed strings are: '100', '101', '110', '111'. For strings
having length greater than 3, one can have any combination of 0's or 1's, with lengths in
multiples of 3. The DFA that provides the solution is shown in Fig. 2.47.
Figure 2.47 DFA that detects sequence required in Example (b)
We can see from Fig. 2.47 that state q0, which is the initial
state, makes a transition to state q1 on reading input symbol
'1', as every string is expected to begin with '1'.
States q1 and q2 help in recognizing the four combinations:
'100', '101', '110', '111'. State q3 is reached upon
reading any of these 4 strings.
Beyond state q3, the length of the string must be in multiples
of 3, with any combination of 0's and 1's. This is signified by the loop
'(0, 1) (0, 1) (0, 1)', which is repeated any number of times, in order to ensure that the
length of the string is divisible by 3.
(c) The required sequence detector can be depicted as shown in Fig. 2.48.
Figure 2.48 DFA that detects sequence in Example (c)
We observe that there is a loop labelled 'a' over the state q0, which represents
any number of a's, that is, zero or more a's. On reading one 'b', the DFA
reaches the final state q1. This state has a loop on 'b' in order to cater to further b's,
as required.
cell to its right, or to its left, upon reading the current input symbol. Furthermore, in case
of an FA, the machine stops after reading the entire input string, that is, once the right
end-marker or the end-of-input indicator is reached. However, in case of a 2FA, we need
to explicitly make the machine move to the halt (or final) state, through instructions, to
either 'accept halt' or 'reject halt', in order to stop the machine; otherwise, the machine
will keep on reading the input tape from left to right and right to left. This change is again
because of the fact that the read head of a 2FA can move in any direction on the input tape.
Just like the FA, the 2FA can also be classified as: two-way deterministic finite automaton
(2DFA), or two-way non-deterministic finite automaton (2NFA). The classification is based
on whether the machine makes a transition to a unique next state or not. This concept is the
same as that of the FA. Let us now formally define the 2FA.
Formal Definition
A 2FA is formally denoted by an eight-tuple (or octuple):
M = (Q, Σ, ⊢, ⊣, δ, q0, qA, qR)
where,
Q: Finite set of states
Σ: Finite input alphabet
q0: Initial state of 2FA, q0 ∈ Q
qA: Accept halt (final) state of 2FA, qA ∈ Q
qR: Reject halt (final) state of 2FA, qR ∈ Q
⊢: Left end-marker symbol; ⊢ ∉ Σ
⊣: Right end-marker symbol; ⊣ ∉ Σ
The 'δ' function for 2DFA is given by:
δ: Q × (Σ ∪ {⊢, ⊣}) → (Q × {L, R})
The 'δ' function for 2NFA is given by:
δ: Q × (Σ ∪ {⊢, ⊣}) → finite subsets of (Q × {L, R})
Thus, in case of the 2DFA, δ(q, a), for any current state, q, does not contain more than one
element, upon reading a symbol 'a'. Thus, the transition would be of the form:
δ(q, a) = (p, d)
where 'p' is the unique next state in Q to which the machine makes the transition ('p'
may or may not be equal to 'q') and 'd' is a direction, which is a member of {L, R}.
This means that the 2DFA reads a symbol 'a', which is either an input symbol or an
end-marker, and makes a transition to a unique next state. Then it moves one position to
the right or to the left on the input tape, without moving beyond the end-markers.
In case of a 2NFA, consider the following transition:
δ(q, a) = {(p1, d1), (p2, d2), ..., (pn, dn)}
where 'q', 'p1', 'p2', ..., 'pn' are states from Q; 'a' is any symbol in (Σ ∪ {⊢, ⊣});
and 'd1', 'd2', ..., 'dn' are directions from the set {L, R}.
Example 2.30 Construct a 2DFA that accepts the following regular set:
The 2DFA now starts reading in the reverse order, that is, from right to left. While
moving towards the left, the 2DFA counts the number of b's and ignores all a's. State q3
is reached if the right end-marker is read and the number of a's is found to be a multiple
of 3. In state q3, if the machine reads 'b', it transits to state q4, and moves one position to
the left; all the a's are ignored otherwise. While in state q4, if the machine reads a second
'b', it transits back to state q3 and moves one position to the left. Thus, the loop formed
by states q3 and q4 checks whether the number of b's is a multiple of 2, that is, whether the
number of b's is even. In state q3, if the machine reads the left end-marker, it moves
to the accept halt state qA, indicating that the string read contains an even number of b's
in addition to the number of a's being a multiple of 3. Otherwise, it transits to the reject
halt state, qR.
Now, let us consider:
X = remainder of division of number of a's in the input string by 3 (to check for
multiples of 3)
Y = remainder of division of number of b's in the input string by 2 (to check for even
numbers)
Then, X can take values: 0, 1, and 2; while Y can take values: 0 and 1.
We see that states q0, q1, and q2 in Table 2.57 represent the values 0, 1, and 2 respectively
for X; and similarly, states q3 and q4 represent the values 0 and 1 respectively for Y.
Let us simulate the working of the 2DFA for an input string 'abaabbb'. The string
consists of 3 a's and 4 b's; we observe that the number of a's is a multiple of 3, and the
number of b's is an even number. Hence, the string must be accepted by the 2DFA.
Current state    Tape          Transition applied
q0               ⊢abaabbb⊣     δ(q0, b) = (q0, R)
q0               ⊢abaabbb⊣     δ(q0, ⊣) = (q3, L)
q3               ⊢abaabbb⊣     δ(q3, b) = (q4, L)
q4               ⊢abaabbb⊣     δ(q4, b) = (q3, L)
q3               ⊢abaabbb⊣     δ(q3, b) = (q4, L)
q4               ⊢abaabbb⊣     δ(q4, a) = (q4, L)
q4               ⊢abaabbb⊣     δ(q4, a) = (q4, L)
q4               ⊢abaabbb⊣     δ(q4, b) = (q3, L)
q3               ⊢abaabbb⊣     δ(q3, a) = (q3, L)
q3               ⊢abaabbb⊣     δ(q3, ⊢) = (qA, R)
qA
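The two passes of this 2DFA can be simulated directly. The following is a minimal sketch reconstructed from the prose description rather than from the transition table. It assumes that the head starts on the left end-marker, that the characters '<' and '>' stand in for the end-markers, and that the machine rejects at the right end-marker when the count of a's is not a multiple of 3. The names `delta` and `accepts` are illustrative.

```python
LEFT, RIGHT = "<", ">"    # stand-ins for the left and right end-markers


def delta(state, sym):
    """Return (next state, head move); +1 means right, -1 means left."""
    if state in ("q0", "q1", "q2"):          # pass 1: count a's modulo 3
        if sym == "a":
            return "q" + str((int(state[1]) + 1) % 3), +1
        if sym in ("b", LEFT):               # b's and the left marker are skipped
            return state, +1
        # Right end-marker: continue only if the count of a's is 0 modulo 3.
        return ("q3", -1) if state == "q0" else ("qR", -1)
    # Pass 2 (states q3 and q4): count b's modulo 2 while moving left.
    if sym == "b":
        return ("q4" if state == "q3" else "q3"), -1
    if sym == "a":
        return state, -1
    # Left end-marker (the right marker is never read during pass 2).
    return ("qA", +1) if state == "q3" else ("qR", +1)


def accepts(word):
    """Run the 2DFA on `word`, a string over the alphabet {a, b}."""
    if set(word) - {"a", "b"}:
        raise ValueError("input must be a string over {a, b}")
    tape = LEFT + word + RIGHT
    state, pos = "q0", 0
    while state not in ("qA", "qR"):
        state, move = delta(state, tape[pos])
        pos += move
    return state == "qA"


if __name__ == "__main__":
    print(accepts("abaabbb"), accepts("aaab"))
```

The simulation halts on every input because the first pass only moves right and the second pass only moves left.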
A 2FA is equivalent to a read-only Turing machine (TM; refer to Chapter 4) that uses only
a finite and constant amount of space on the input tape, which, of course, limits its ability.
However, a 2FA cannot write anything onto the tape, while a TM can.
Though a 2FA seems to be more powerful than an FA, it is in fact equivalent to it. Hence,
a 2FA can only accept regular languages. The only difference between the two is that an
FA reads the input symbols only in one direction, left-to-right, and not in the reverse. This
makes the FA a special case of the 2FA. An FA that is equivalent to a 2FA generally
has more states than the 2FA, and in the worst case exponentially more. For the above 2DFA,
which has five states (excluding qA and qR), the equivalent DFA can be constructed with
six distinct states (refer to Fig. 2.50). The difference in the number of states is small
here, but this may not be the case every time.
Following is the explanation for the six states in the DFA in Fig. 2.50. Note
that X and Y have the same meaning as stated earlier.
Let:
q0: Represents X = 0 and Y = 0 (number of a's is a multiple of 3 and number
of b's is even)
q1: Represents X = 1 and Y = 0
q2: Represents X = 2 and Y = 0
q3: Represents X = 1 and Y = 1
q4: Represents X = 2 and Y = 1
q5: Represents X = 0 and Y = 1
Figure 2.50 DFA equivalent to the 2DFA in Table 2.56
State q0 is the initial state as well as the final state. We observe that as the DFA makes a
single read of the input string and only in one direction, each state needs to be associated
with some value for X as well as for Y. This means that each state needs to keep track
of both the number of a's and the number of b's read so far. This is unlike the 2DFA, which
associates either the value of X or the value of Y with each machine state. In case of the
2DFA this is possible, as it makes two passes over the input string: once while moving from
left to right and once in the reverse direction.
Thus, in case of the 2DFA, we have three values for X and two values for Y, making five states
in total. For the equivalent DFA, the number of states equals the number of combinations of
values of X and Y. Hence, this example DFA has six (3 × 2 = 6) states in total.
Let us consider an example, in which the machine accepts strings whose number of a's
is a multiple of 6, and whose number of b's is a multiple of 4. In such a case, the 2DFA will
have 6 + 4 = 10 states, while the equivalent DFA will have 6 × 4 = 24 states. Thus, a
DFA that is equivalent to a 2DFA generally has more states than the 2DFA, and in the
worst case exponentially more.
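The state counts quoted above can be checked by building the one-way DFA explicitly. This is a minimal sketch in which each DFA state is a pair (X, Y); the function names `build_product_dfa` and `accepts` are hypothetical.

```python
from itertools import product


def build_product_dfa(m, n):
    """DFA for: number of a's divisible by m and number of b's divisible by n."""
    states = list(product(range(m), range(n)))      # every (X, Y) combination
    delta = {}
    for x, y in states:
        delta[((x, y), "a")] = ((x + 1) % m, y)     # an 'a' advances X only
        delta[((x, y), "b")] = (x, (y + 1) % n)     # a 'b' advances Y only
    return states, delta, (0, 0)


def accepts(delta, start, word):
    """The start state (0, 0) is also the only final state."""
    state = start
    for ch in word:
        state = delta[(state, ch)]
    return state == start


states, delta, start = build_product_dfa(3, 2)

if __name__ == "__main__":
    print(len(states), len(build_product_dfa(6, 4)[0]))
    print(accepts(delta, start, "abaabbb"))
```

The one-way machine must remember both remainders at once, which is why its state count is the product m × n, whereas the two-pass machine needs only m + n counting states.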
SUMMARY
The term 'machine' refers to a predictable program, whose behaviour can be understood
without executing it. A 'state' is the term typically used for different functions that are
constituents of a program. A 'program' is a collection of unique functions, each performing
an atomic and unique task.
A basic machine is an abstract view of any program (or machine), where one is interested
only in determining the output that is generated for a given input. A basic machine can be
viewed as a function which maps the input set, I, to the output set, O. This function is called
the machine function (MAF), and is defined as:
MAF: I → O
A finite state machine (FSM) denotes all such programs that have a finite number of
functions, but no memory (to store intermediate results). The power of this machine is
limited due to lack of memory. An FSM is a simple and primitive computational model.
An FSM is represented by a pair of functions, namely:
Machine function:
MAF: I × S → O and
State transition function:
STF: I × S → S
where, S: Finite set of internal states of the machine
I: Finite set of input symbols (or input alphabet)
O: Finite set of output symbols (or output alphabet)
Finite automata (FA) is the mathematical model (formalism) of an FSM. Mathematical
models emphasize only the specific properties of these machines. An FA portrays an FSM
as a language acceptor.
FA is formally denoted by the five-tuple:
M = (Q, Σ, δ, q0, F)
where, Q: Finite set of states
Σ: Finite input alphabet
q0: Initial state of FA; q0 ∈ Q
F: Set of final states; it is a subset of Q, i.e., F ⊆ Q
δ: State transition function (STF)
The 'δ' function for a deterministic FA or DFA is given by:
δ: Q × Σ → Q
The 'δ' function for non-deterministic FA or NFA is given by:
δ: Q × Σ → 2^Q
An FA is said to be deterministic if, for every state and every input symbol, there is a
unique next state. This is explained by an equation of the form δ(q, a) = p. This means
that if the current state of the machine is q, and if the machine reads an input symbol 'a',
it makes a transition to a unique next state, p.
An NFA may have more than one possible transition on the same input symbol from some
state. Such a machine is not even probabilistic, as no weights are assigned to the different
possible transitions from the state on the same symbol; hence, it can also be called a
'possibilistic' machine. This is explained by an example equation of the form: δ(q0, 0) =
{q0, q1}. There could be multiple next states, and this is shown by the set entry. We note
that {q0, q1} is a member of 2^Q (the power set of Q, i.e., set of all subsets of Q).
NFA and DFA are equivalent to each other. In other words, for every NFA, there exists an
equivalent DFA accepting the same set of words (or language). A DFA is, in a way, a
specialization of an NFA. Hence, NFA and DFA have equal powers.
There is another formalism called 'NFA with ε-transitions' (or ε-moves). The δ function
for an NFA with ε-transitions is given by:
δ: Q × (Σ ∪ {ε}) → 2^Q
An NFA with ε-moves helps us divide a complex language acceptance problem into smaller
problems; the solutions to these problems can then be integrated with the help of
ε-transitions. Further, an NFA with ε-moves can be converted either to its equivalent NFA
without ε-moves, or directly to its equivalent DFA. These three machines are equivalent to
one another and have equal powers.
FSMs can be classified as Moore and Mealy machines. 'Moore' and 'Mealy' machines are
defined with the help of the six-tuple:
M = (Q, Σ, Δ, δ, λ, q0)
where,
Q: Finite set of states
Σ: Finite input alphabet
Δ: An output alphabet
δ: State transition function (STF); δ: Q × Σ → Q
λ: Machine function (MAF)
q0: Initial state of the machine
The machine function for a Moore machine is given by:
λ: Q → Δ (i.e., output depends only on the current state)
Whereas, the machine function for a Mealy machine is given by:
λ: Q × Σ → Δ (i.e., output depends on the current state as well as the input symbol read)
Moore and Mealy machines are equivalent to each other, which means that one can
construct an equivalent Moore machine given a Mealy machine, and vice versa. Moore and
Mealy machines can be implemented as finite state transducers (FST), which can write the
output string onto the output tape.
EXERCISES
This section lists a few unsolved problems for the readers to help understand the topic better and practice
some FSM construction examples.
Objective Questions
2.1 The smallest finite automaton which accepts the language {x | length of x is divisible by 3} has
(a) 2 states
(b) 3 states
(c) 4 states
(d) 5 states
2.2 Which of the following is true?
(a) DFA and NFA have same power
(b) DFA is more powerful than NFA
(c) NFA is more powerful than DFA
(d) All of these
2.3 Which of the following is true about finite automata?
(a) It has no memory
(b) A finite automaton can have more than one initial state
(c) Finite automaton uses stack as a memory
(d) All of these
2.4 The language accepted by DFA is called a ______ language.
2.5 NFA means ______
2.6 Which of the following statements are true for the NFA: ({p, q, r, s}, {0, 1}, δ, p, {q, s}), where
'δ' is given by:
Table 2.58 State transition table
0
p q, r q
r q, r
q
r
(a) NFA accepts the string 00
(b) NFA does not accept the string 001
(c) NFA accepts the string 1111110
(d) All of these
(e) None of these
2.7 M is a DFA accepting a language consisting of strings of 0's and 1's that end in either
'00' or '11'. What is the minimum number of states in M?
(a) 2
(b) 3
(c) 4
(d) 5
(e) 6
2.8 Every DFA is also an NFA. Is this statement true or false?
Review Questions
2.1 Construct Mealy and Moore machines for the following:
For the input from Σ*, where Σ = {0, 1, 2}, print the residue-modulo-5 of the input
treated as a ternary (base 3 with digits 0, 1, and 2) number.
2.2 Discuss the relative powers of NFA and DFA.
2.3 Write the machine function and the state transition function for a binary adder. Support
your answer with a transition diagram.
2.4 Prove the following statement: 'Corresponding to every transition graph, there need not
exist an FSM, but the converse is always true'.
2.5 Define and give suitable examples for a transition graph.
2.6 Construct a Mealy machine that accepts the strings from (0 + 1)* and produces the
following output:
Table 2.59 Output
End of string    Output
101              x
110              y
Otherwise
2.9 Write a short note on Mealy and Moore machines.
2.10 Explain with an example, the process of converting a Mealy machine to its
corresponding Moore machine.
2.11 Write a short note on the properties and limitations of FSM.
2.12 Compare Moore and Mealy machines.
2.13 Design an FSM to check divisibility by three, where Σ = {0, 1, ..., 9}.
2.14 What are finite automata? Construct the minimum state automata equivalent to the state
transition diagram in Fig. 2.51.
Figure 2.51 Example automata
2.15 Construct NFA without ε-transitions for the NFA with ε-transitions shown in Fig. 2.52.
Figure 2.52 Example NFA with ε-transitions
2.16 Design an FSM that reads strings made up of letters in the word 'CHARIOT' and
recognizes those strings that contain the word 'CAT' as a substring.
2.17 Construct a Moore machine equivalent to the Mealy machine represented by the TG in
Fig. 2.53.
2.18 Consider the DFA as shown in Fig. 2.54. 2.22 Convert the Mealy machine in Fig, 2.5s t,
Obtain the minimum state DFA. Moore machine.
b/0 a/0
Start a/l
a b c
P 4 {p} {q} {r}
q (p) (q} frl
Figure 2.54 Example DFA r {q} {r} 4, {p}
0 2.19 Consider the Moore machine described by
the transition table given here. Construct the (a) Compute the c-closure of each state
corresponding Mealy machine. (b) List all the strings of length three or less
accepted by the automata
Table 2.60 Example Moore Machine (c) Convert the automaton to its equivalent
Current Next state DFA
state Output 0 2.24 Construct a DFA for the NFA, whose state
a=0 a= I
transition function is given here. Assume 'p'
-4 ql ql q2 0 to be the initial state and F = {q, r}.
q2 ql q3 0
q3 ql q3 1 Table 2.63 Example NFA
O 2.28 Construct a DFA equivalent to the NFA: ({p, q, 0 2.31 Translate the Mealy machine in Fig. 2.56 into
r, s}, {0, 1}, SN, p, {g, s}), where SN is as given its equivalent Moore machine.
in the following table:
0/1
Table 2.64 Example NFA
0 1
Q
->p {g, r} {q)
*q {r} {g, r}
r {s} {p} 1/1
*s {p} Figure 2.56 Example Mealy machine
LEARNING OBJECTIVES
After completing this chapter, the reader will be able to
understand the following:
3.1 INTRODUCTION
The languages accepted by finite automata (FA) are described or represented by simple
expressions called regular expressions (RE). Since FA accept regular languages,
regular expressions are also used to denote regular languages.
Regular expressions are like the short-form notations that denote regular languages (or
regular sets). This is analogous to the set labels such as I, which denotes the set of integers,
and ∅, which denotes an empty set. The only difference in the case of regular expressions
is that these are composed of a few operators as well; hence, the term 'expression'.
As expressions are composed of some operands and some operators, it can be concluded
that regular expressions are short notations that can even denote complex and infinite
regular languages.
REGULAR EXPRESSIONS 95
Figure 3.1 Operators in regular expressions (a) Parallel paths (b) Series connection (c) Closure
For example:
If r = a + b, then L(r) = {a, b}.
If r = a • b, then L(r) = {ab}.
If r = a*, then L(r) = {ε, a, aa, aaa, aaaa, ...}.
Here ε stands for zero occurrences of a. Note that a* denotes an infinite set of strings,
which includes all possible strings that can be composed out of a.
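The three operators can be tried out directly on sets of strings. The following is a minimal sketch, not part of the original text: the function names `union`, `concat`, and `star` are chosen here for illustration, and because a* is infinite, `star` takes an assumed length bound `max_len` and returns only the strings up to that length.

```python
def union(A, B):
    """L(r + s): strings that belong to either language."""
    return A | B

def concat(A, B):
    """L(r . s): every string of A followed by every string of B."""
    return {x + y for x in A for y in B}

def star(A, max_len):
    """L(r*), truncated to strings of length <= max_len."""
    result = {""}        # zero occurrences gives the empty string
    frontier = {""}
    while frontier:
        # append one more occurrence to the strings found in the last round
        frontier = {x + y for x in frontier for y in A
                    if len(x + y) <= max_len} - result
        result |= frontier
    return result

a, b = {"a"}, {"b"}
print(sorted(union(a, b)))            # L(a + b)
print(sorted(concat(a, b)))           # L(a . b)
print(sorted(star(a, 4), key=len))    # L(a*) up to length 4
```

The empty Python string `""` plays the role of ε here.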
Example 3.1 Using a regular expression, describe the language consisting of all strings
over Σ = {0, 1} with at least two consecutive 0's.
Solution The set of all strings over Σ = {0, 1} is given as:
Σ* = {ε, 0, 1, 00, 01, 10, 11, 000, 001, 010, ...}
This set of all strings over Σ = {0, 1} can be represented by the regular expression (0 + 1)*.
This set includes all possible combinations of 0's and 1's.
However, we require at least one occurrence of '00', that is, two consecutive 0's. We
might have any number of trailing 1's and 0's, and any number of leading 1's and 0's.
Therefore, the required regular expression for the language described is:
r = (0 + 1)* • 0 • 0 • (0 + 1)*
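A regular expression obtained this way can be checked by brute force against the plain-English description. The sketch below is an illustration only: it assumes Python's `re` syntax, in which the book's '+' is written as `|` and concatenation is implicit, and the helper `all_strings` and the length bound 10 are choices made here.

```python
import re
from itertools import product

def all_strings(alphabet, max_len):
    """Yield every string over `alphabet` of length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)

# r = (0 + 1)* . 0 . 0 . (0 + 1)* in Python's regex syntax
pattern = re.compile(r"(?:0|1)*00(?:0|1)*")

def in_language(w):
    return pattern.fullmatch(w) is not None

# strings on which the regex and the description "contains 00" disagree
mismatches = [w for w in all_strings("01", 10)
              if in_language(w) != ("00" in w)]
print(mismatches)
```

An empty list means the expression and the description agree on every string up to the chosen length; it is evidence, not a proof.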
Example 3.2 Using a regular expression, represent the language defined over Σ = {0, 1, 2},
such that every string from the language contains any number of 0's followed by any number
of 1's followed by any number of 2's.
Solution 'Any number of 0's' means zero or more occurrences of 0. This can be
denoted by '0*'. Similarly, 'any number of 1's' can be denoted by '1*', and 'any number
of 2's' by '2*'.
Therefore, the regular expression is given by:
r = 0* • 1* • 2*
Example 3.3 If L(r) = set of all strings over Σ = {0, 1, 2}, such that at least one 0 is
followed by at least one 1, which is followed by at least one 2, find a regular expression r
representing this language.
Solution This is very similar to the previous problem. The only difference here is that
we require at least one occurrence of each symbol.
The phrase 'at least one occurrence' means one or more occurrences of the symbol.
Now, at least one occurrence of 0 can be represented by '0 • 0*', or 0⁺. We similarly
represent at least one occurrence of 1 and of 2.
Therefore, we can write r as:
r = 0 • 0* • 1 • 1* • 2 • 2*, or
r = 0⁺ • 1⁺ • 2⁺
Example 3.4 Using a regular expression, represent the language over Σ = {a, b} with all
strings starting and ending with a's and with any number of b's in between.
Solution In this language, each string must begin and end with an a, and in between
there may be zero or more b's.
Therefore, we may write L(r) as:
L(r) = {aa, aba, abba, abbba, ...}
Hence, the regular expression r is:
r = a • b* • a
Example 3.5 If L(r) = set of all strings over Σ = {0, 1} ending with '011', then find r.
Solution In this language, every string contains any combination of 0's and 1's, and
always ends in '011'.
Therefore, we may write L(r) as:
L(r) = {011, 0011, 1011, ...}
Hence, the regular expression r is:
r = (0 + 1)* • 011
Example 3.6 Describe in simple English the language represented by the regular expression
r = (1 + 10)*
Solution Let us try to list out all the strings in the language described by r:
L(r) = {ε, 1, 10, 11, 101, 110, 1010, ...}
As per the regular expression, we have two parallel paths, '1' and '10', which are put into
an iteration (or loop), that is, zero or more occurrences.
The empty string ε is obtained if we consider zero occurrences.
We get the string '1' if we choose 1 from the two parallel paths, and consider only one
occurrence. Similarly, we get the string '10' if we choose the other path.
The string '11' is obtained if we choose the path '1' for both the iterations.
98 THEORY OF COMPUTATION
Example 3.7 Represent the language over Σ = {0, 1} containing all possible combinations
of 0's and 1's, but not having two consecutive 0's.
Solution We have seen in the previous example that the language containing strings that
start with '1' and do not have two consecutive 0's is represented by the regular expression
r1 = (1 + 10)* (3.1)
Similarly, the language containing strings that start with '0' and do not have two consecutive
0's can be represented as:
r2 = 0 • (1 + 10)* (3.2)
Combining Eqs (3.1) and (3.2), we write the required regular expression as:
r = (1 + 10)* + 0 • (1 + 10)*
This can be simplified as:
r = (0 + ε) • (1 + 10)*
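Both claims in this solution can be tested exhaustively on short strings. This is a sketch under stated assumptions: Python's `re` syntax is used (`|` for '+', `0?` for (0 + ε)), and the names `r1`, `r`, `bad_r1`, and `bad_r` as well as the length bound are illustrative choices.

```python
import re
from itertools import product

def all_strings(alphabet, max_len):
    """Yield every string over `alphabet` of length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)

r1 = re.compile(r"(?:1|10)*")      # Eq. (3.1)
r = re.compile(r"0?(?:1|10)*")     # (0 + ε) . (1 + 10)*

words = list(all_strings("01", 10))

# r1: no leading 0 and no "00" (the empty string is included)
bad_r1 = [w for w in words
          if (r1.fullmatch(w) is not None)
          != (not w.startswith("0") and "00" not in w)]

# r: exactly the strings with no two consecutive 0's
bad_r = [w for w in words
         if (r.fullmatch(w) is not None) != ("00" not in w)]

print(bad_r1, bad_r)
```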
Therefore,
L(r2) = u • v
= {ε, a, aa, aaa, ...} • {ε, b, bb, bbb, ...}
= {ε, a, b, aa, bb, ab, ...} (3.4)
Example 3.10 If L(r) = {a, c, ab, cb, abb, cbb, abbb, ...}, what is r ?
Solution We observe that the strings in L(r) either begin with a or c and are followed by
zero or more occurrences of b.
Hence, the regular expression r that denotes this set is:
r = (a + c) • b*
Example 3.11 If L(r) = {aaa, aab, aba, abb, baa, bab, bba, bbb}, find the regular expres-
sion r which represents L(r).
Solution We observe that the length of each word in L(r) is three, and L(r) depicts all
possible combinations of a's and b's.
We also observe that each letter in any word is either a or b, that is, (a + b).
Therefore, the regular expression r can be written as:
r = (a + b) • (a + b) • (a + b)
= (a + b)³
Example 3.12 Represent the set of all words over Σ = {a, b} containing at least one a,
using a regular expression.
Solution According to the language description, every string must contain at least one
a, with any number (zero or more) of a's and b's before and after it.
Thus, the regular expression may be written as:
r = (a + b)* • a • (a + b)*
Example 3.13 Let r = (a + b)* • a • (a + b)* • a • (a + b)*. Describe the language L(r)
represented by the given regular expression using simple English.
Solution We can easily see that the regular expression r denotes the language L(r)
such that:
L(r) = language over Σ = {a, b} containing at least two a's
Example 3.15 Represent the language over Σ = {a, b} containing at least one a and at
least one b, using a regular expression.
Solution Referring to Example 3.13, we may write the regular expression as:
r1 = (a + b)* a (a + b)* b (a + b)*
Please note that we have not shown the concatenation operator '•' in the expression,
as it is assumed whenever one symbol follows another.
The language may also be represented using the regular expression r2, as the position
of a and b can be interchanged:
r2 = (a + b)* b (a + b)* a (a + b)*
Thus, the end result is either of the aforementioned forms, and hence can be written as:
r = (a + b)* a (a + b)* b (a + b)* + (a + b)* b (a + b)* a (a + b)*
Example 3.17 Using a regular expression, represent the set of all strings of a's and b's
containing at least one combination of double letters.
Solution A double letter combination can either be aa or bb.
Therefore, the regular expression can be written as:
r = (a + b)* • (aa + bb) • (a + b)*
Example 3.19 Let L(r) = set of all strings over Σ = {a, b} in which the strings either
contain all b's or else, there is an a followed by some b's; the set also contains ε. Find the
regular expression that represents this language.
Solution We may write the required L(r) as:
L(r) = {ε, a, b, ab, bb, abb, bbb, ...}
Therefore,
r = b* + a • b*
= (ε + a) • b*
Example 3.20 Find the regular expression for the language consisting of all strings of a's
and b's without any combination of double letters.
Solution The required regular expression is given by:
r = (ε + b) • (ab)* • (ε + a)
Note: We want all such strings that do not contain double letters. There are only two patterns
that, if iterated, do not generate double letters; they are: ab and ba. The aforementioned regular
expression uses the pattern ab, which is iterated zero or more times. Any string that is the
outcome of (ab)* never begins with b, and never ends with a. However, the strings can start
with either a or b and end in either a or b. Hence, '(ε + b)' is concatenated in the beginning,
and '(ε + a)' at the end of the regular expression.
Using the other pattern, that is, ba, we may write the regular expression as:
r = (ε + a) • (ba)* • (ε + b)
The two regular expressions are equivalent, though they appear to be different.
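The equivalence of the two forms can be checked on all short strings. The following sketch assumes Python's `re` syntax, where (ε + b) becomes `b?`; the names `form_ab`, `form_ba`, and `disagree`, and the length bound, are illustrative.

```python
import re
from itertools import product

def all_strings(alphabet, max_len):
    """Yield every string over `alphabet` of length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)

form_ab = re.compile(r"b?(?:ab)*a?")   # (ε + b) . (ab)* . (ε + a)
form_ba = re.compile(r"a?(?:ba)*b?")   # (ε + a) . (ba)* . (ε + b)

def no_double_letters(w):
    return "aa" not in w and "bb" not in w

# strings on which either form disagrees with the description
disagree = [w for w in all_strings("ab", 10)
            if not ((form_ab.fullmatch(w) is not None)
                    == (form_ba.fullmatch(w) is not None)
                    == no_double_letters(w))]
print(disagree)
```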
Example 3.22 Represent the language that contains strings over Σ = {0, 1}, and has an even
number of 0's.
Solution An even number of 0's means 0, 2, 4, 6, ... occurrences of 0.
Hence, the required regular expression is:
r = (1* • 0 • 1* • 0 • 1*)* + 1*
The path for 1* ensures there are no 0's, while the path for (1* • 0 • 1* • 0 • 1*)* ensures
2, 4, 6, ... occurrences of 0.
We observe that for the path (1* • 0 • 1* • 0 • 1*)*, the two 0's are placed in such a
way that they are preceded, followed, and separated by zero or more 1's. This
generates all possible strings as required.
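As a quick sanity check, the expression can be compared with a direct count of 0's. This is only a sketch: Python's `re` syntax is assumed, and the names `even_zeros` and `mismatches` and the bound of 8 symbols are choices made for the illustration.

```python
import re
from itertools import product

def all_strings(alphabet, max_len):
    """Yield every string over `alphabet` of length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)

# r = (1* 0 1* 0 1*)* + 1*
even_zeros = re.compile(r"(?:(?:1*01*01*)*|1*)")

mismatches = [w for w in all_strings("01", 8)
              if (even_zeros.fullmatch(w) is not None)
              != (w.count("0") % 2 == 0)]
print(mismatches)
```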
Proof
1. Section 3.4.2 establishes the equivalence between regular expressions and FA. The way
the regular languages are recursively defined is the reason why the regular expression
is so constructed. One may refer to the definition of regular language in Section 3.5.1.
2. Section 3.4.3 describes the method used to obtain the regular expression denoting the
regular language accepted by any FA.
Figure 3.3 Rules for constructing NFA with ε-moves from given regular expression
(a) Parallel paths (b) Series (c) Closure
Figure 3.3(a) shows how the '+' operator is converted to parallel paths.
One might question the use of ε-moves in the figure. The figure illustrates a very simple
example of 'r1 + r2', where r1 is 0 and r2 is 1. In reality, these can be complex expressions
themselves. In such a case, we introduce a new start state that connects with the initial
states of the individual FA representing r1 and r2 using ε-moves. The individual final
states of the two FA are similarly connected with a new final state.
Figure 3.3(b) shows how the final state of the FA for r1 is connected to the start state of the
FA for r2; here r1 is considered as '0' and r2 as '1'. It is very important to note here that if these
thumb rules are applied to obtain the NFA with ε-moves, then each NFA with ε-moves
will always have a single final state.
Figure 3.3(c) shows a transition from the first state to the last state on ε, which represents
zero occurrences of r (taken here as 0). This is done to bypass r. The other path from the
first state to the last state, which goes through some intermediate states, represents one or
more occurrences of r. In all, the figure depicts 'zero or more' occurrences of r.
In order to represent r*, we introduce a new start state and a new end state; between these,
we introduce a path from the start to the final state using ε to denote zero occurrences.
We then connect the new start state to the start state of the FA for r using ε. Likewise, we
connect the final state of the FA for r to the new end state using ε. Thus, the new FA denotes
one occurrence of r. We then connect the original final state of the FA for r to its original
initial state to achieve more than one occurrence. Overall, the new NFA with ε-moves thus
constructed represents zero or more occurrences of r, that is, r*.
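The three rules translate almost line by line into code. The sketch below is one possible rendering, not the book's own algorithm: the class `NFA`, the helpers `symbol`, `union`, `concat`, `star`, and `accepts`, and the use of `None` as the label of an ε-move are all assumptions made for illustration. Each construction keeps a single start state and a single final state, as noted above.

```python
import itertools

_ids = itertools.count()        # source of fresh state numbers

class NFA:
    def __init__(self, start, final, edges):
        self.start, self.final = start, final
        self.edges = edges      # list of (source, label, target); label None = ε

def symbol(a):
    s, f = next(_ids), next(_ids)
    return NFA(s, f, [(s, a, f)])

def union(m1, m2):              # parallel paths, Fig. 3.3(a)
    s, f = next(_ids), next(_ids)
    extra = [(s, None, m1.start), (s, None, m2.start),
             (m1.final, None, f), (m2.final, None, f)]
    return NFA(s, f, m1.edges + m2.edges + extra)

def concat(m1, m2):             # series connection, Fig. 3.3(b)
    return NFA(m1.start, m2.final,
               m1.edges + m2.edges + [(m1.final, None, m2.start)])

def star(m):                    # closure, Fig. 3.3(c)
    s, f = next(_ids), next(_ids)
    extra = [(s, None, f),                 # zero occurrences
             (s, None, m.start),           # enter r
             (m.final, None, f),           # leave after one occurrence
             (m.final, None, m.start)]     # repeat r
    return NFA(s, f, m.edges + extra)

def accepts(m, w):
    def closure(states):        # all states reachable using ε-moves only
        stack, seen = list(states), set(states)
        while stack:
            q = stack.pop()
            for src, label, dst in m.edges:
                if src == q and label is None and dst not in seen:
                    seen.add(dst)
                    stack.append(dst)
        return seen
    current = closure({m.start})
    for ch in w:
        moved = {dst for src, label, dst in m.edges
                 if src in current and label == ch}
        current = closure(moved)
    return m.final in current

# r = a . (a + b)*
m = concat(symbol("a"), star(union(symbol("a"), symbol("b"))))
print(accepts(m, "abba"), accepts(m, "ba"))
```

Dropping the first extra edge in `star` (the direct ε-path from the new start state to the new final state) would give the positive closure instead.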
Similarly, the positive closure of 0, that is, 0⁺, is represented as an
NFA with ε-moves as shown in Fig. 3.4. This can be generalized
to represent any complex r. Comparing Fig. 3.3(c) and Fig. 3.4,
we note that the path from the new initial state to the new final state on
symbol ε, representing zero occurrences, is removed now.
Figure 3.4 NFA with ε-moves representing positive closure of 0
Example 3.23 Draw an NFA with ε-moves for the regular expression, r = a • (a + b)*,
which represents the language consisting of strings of a's and b's, starting with a.
Solution Using the rules given in Fig. 3.3, the steps for converting the given regular
expression to NFA with ε-moves are shown in Fig. 3.5. Figure 3.5(c) shows the required NFA.
Figure 3.5 Steps for constructing NFA with ε-moves for a • (a + b)* (a) Step 1: NFA with
ε-moves for (a + b) (b) Step 2: NFA with ε-moves for (a + b)* (c) Step 3: Final NFA
with ε-moves for a • (a + b)*
Example 3.24 Draw the NFA with ε-moves for the regular expression r = (a* + b*).
Solution Using the rules for converting the given regular expression to NFA with ε-moves,
we construct the NFA with ε-moves for (a* + b*) as shown in Fig. 3.6.
Figure 3.6 NFA with ε-moves for regular expression (a* + b*) (a) Step 1: a*
(b) Step 2: b* (c) Step 3: (a* + b*)
2. Obtain the NFA with ε-transitions from the given regular expression. Convert it to its
equivalent NFA without ε-moves. Convert this NFA to DFA (refer to Chapter 2, Section 2.6).
There are again two methods of conversion from NFA to DFA that we have
discussed in Chapter 2. Use any one of these two methods.
Using this NFA with ε-transitions, we can obtain the equivalent DFA through the direct
method, as shown in Fig. 3.9(a).
Figure 3.9 DFA construction from NFA with ε-moves (a) DFA directly obtained
from NFA with ε-moves in Fig. 3.8
(b) Final DFA for the regular expression [1 • (00)* • 1 + 01* • 0]*
We begin with the initial state q0 of the given NFA with ε-moves, and collect all reachable
states (path labelled by ε) from q0. These are q1, q2, q12, and q21. As q21 is the final
state of the NFA and reachable from q0, we mark this new state as final. Out of states q1,
q2, q12, and q21, we see that there are no transitions on 0 or 1 from q1. Similarly, there are
no transitions from q21 also, as it is the final state. However, q2 goes to q3 on reading 1,
while q12 goes to q13 on reading 0.
Therefore, we create two new state symbols: q3, whose reachable states are q4, q5, q9, and
q10; and q13, whose reachable states are q14, q15, q17, and q18. We repeat the same process
for the newly-created states q3 and q13, and continue till we reach the stage where no more
new states can be added.
Finally, we obtain eight states in the equivalent DFA, and relabel these states from A to H,
as shown in Fig. 3.9(a). Using these new labels, the state transition table for the equivalent
DFA can be constructed as shown in Table 3.1.
Table 3.1 State transition table for DFA in Fig. 3.9(a)
Q     0     1
*A    B     C
B     H     F
C     D     G
D     E     -
E     D     G
F     H     F
*G    B     C
*H    B     C
'*': Final states
We observe that G and H are equivalent to A; F is equivalent to B, and E is equivalent to C.
The reduced state transition table after minimization based on the equivalent
states is shown in Table 3.2.
Table 3.2 Reduced state transition table for DFA
Q     0     1
*A    B     C
B     A•    B•
C     D     A•
D     C•    -
'*': Final states
'•': Modified entries
Using Table 3.2, the transition graph for the final DFA equivalent to the
given regular expression is drawn as shown in Fig. 3.9(b).
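The equivalent states named above can be found mechanically by repeatedly splitting groups of states until nothing changes. The sketch below applies this partition-refinement idea to Table 3.1; the function name `equivalence_classes`, the explicit `dead` state standing in for the missing transition of D, and the dictionary encoding of the table are assumptions made for illustration, and the procedure is not necessarily the one the book uses.

```python
def equivalence_classes(states, alphabet, delta, finals, dead="dead"):
    """Group states that cannot be distinguished by any input string."""
    everything = list(states) + [dead]

    def step(q, a):                     # missing entries lead to the dead state
        return delta.get((q, a), dead)

    # start by separating final states from non-final states
    block = {q: int(q in finals) for q in everything}
    while True:
        # a state's signature: its own block and the blocks it moves to
        signature = {q: (block[q],) + tuple(block[step(q, a)] for a in alphabet)
                     for q in everything}
        ids = {}
        refined = {q: ids.setdefault(signature[q], len(ids)) for q in everything}
        if len(ids) == len(set(block.values())):
            break                       # no block was split: done
        block = refined
    groups = {}
    for q in states:
        groups.setdefault(refined[q], []).append(q)
    return sorted(sorted(g) for g in groups.values())

# Table 3.1 (D has no transition on 1)
delta = {("A", "0"): "B", ("A", "1"): "C", ("B", "0"): "H", ("B", "1"): "F",
         ("C", "0"): "D", ("C", "1"): "G", ("D", "0"): "E",
         ("E", "0"): "D", ("E", "1"): "G", ("F", "0"): "H", ("F", "1"): "F",
         ("G", "0"): "B", ("G", "1"): "C", ("H", "0"): "B", ("H", "1"): "C"}

classes = equivalence_classes("ABCDEFGH", "01", delta, finals={"A", "G", "H"})
print(classes)
```

Each resulting group corresponds to one row of the reduced table.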
Example 3.26 Construct the DFA that accepts the language represented by 0* • 1* • 2*.
Solution First, let us convert 0* • 1* • 2* to its equivalent NFA with ε-moves, as shown
in Fig. 3.10.
Figure 3.10 NFA with ε-moves for 0* • 1* • 2*
Table 3.3 State transition table
Figure 3.11 DFA construction from NFA with ε-moves (a) DFA conversion using
direct method (b) DFA for 0* • 1* • 2*
Using the minimized state transition table for the final minimized DFA, as shown in
Table 3.4, we can draw the final transition graph as shown in Fig. 3.11(b).
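The direct conversion can be sketched as a subset construction that takes ε-closures at every step. The ε-NFA used below is an assumed three-state machine for 0* • 1* • 2* (q0 loops on 0, q1 loops on 1, q2 loops on 2, with ε-moves q0 to q1 and q1 to q2); it is a common textbook form and may differ from Fig. 3.10. The function names are likewise illustrative.

```python
def eps_closure(states, eps):
    """All states reachable from `states` using ε-moves only."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for p in eps.get(q, ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return frozenset(seen)

def subset_construction(start, finals, alphabet, delta, eps):
    """Return (start set, DFA transition table, DFA final states)."""
    start_set = eps_closure({start}, eps)
    dfa, todo = {}, [start_set]
    while todo:
        S = todo.pop()
        if S in dfa:
            continue
        dfa[S] = {}
        for a in alphabet:
            moved = {p for q in S for p in delta.get((q, a), ())}
            T = eps_closure(moved, eps)
            if T:                       # an empty set means no transition
                dfa[S][a] = T
                todo.append(T)
    dfa_finals = {S for S in dfa if S & finals}
    return start_set, dfa, dfa_finals

def accepts(w, start_set, dfa, dfa_finals):
    S = start_set
    for ch in w:
        S = dfa[S].get(ch)
        if S is None:
            return False
    return S in dfa_finals

# assumed ε-NFA for 0* . 1* . 2*
delta = {("q0", "0"): {"q0"}, ("q1", "1"): {"q1"}, ("q2", "2"): {"q2"}}
eps = {"q0": {"q1"}, "q1": {"q2"}}
start_set, dfa, dfa_finals = subset_construction("q0", {"q2"}, "012", delta, eps)
print(len(dfa), accepts("0012", start_set, dfa, dfa_finals))
```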
Example 3.27 Construct a DFA that accepts the language represented by:
(ab | ba)* aa (ab | ba)*
We obtain the NFA with ε-moves from the given regular expression as shown in Fig. 3.12.
Figure 3.12 NFA with ε-moves for (ab | ba)* aa (ab | ba)*
Figure 3.13 Constructing DFA from NFA with ε-moves (a) DFA obtained directly from Fig. 3.12
(b) Final DFA for (ab | ba)* aa (ab | ba)*
Using the direct method of conversion, the equivalent DFA is obtained as shown in
Fig. 3.13(a). The state transition table for the DFA is shown in Table 3.5.
Table 3.5 State transition table for the DFA in Fig. 3.13(a)
Q     a     b
A     B     E
B     C     F
*C    D     G
D     -     H
E     I     -
F     B     E
G     J     -
*H    D     G
I     B     E
*J    D     G
'*': Final states
From the table, we identify the equivalent states. Hence, we can replace states H and J
with state C, and states I and F with state A. The reduced table is shown in Table 3.6.
Table 3.6 Reduced state transition table
Q     a     b
A     B     E
B     C     A•
*C    D     G
D     -     C•
E     A•    -
G     C•    -
'*': Final states
'•': Modified entries
Using Table 3.6, the final DFA that is equivalent to the given regular expression
can be drawn as shown in Fig. 3.13(b).
3.4.3 DFA to Regular Expression Conversion
In the previous section, we have seen how to construct a DFA accepting the language
denoted by a given regular expression. In this section, we shall prove the
equivalence in the reverse direction. We start with a DFA and obtain the
regular expression that denotes the language accepted by the DFA.
Let us first see some trivial examples, where we can write the regular
expression by simply observing the DFA transitions. We shall discuss the conversion
algorithms later in this section.
Example 3.28 Refer to the DFA in Fig. 3.14. Obtain the regular expression that denotes
the language accepted by the given DFA.
Figure 3.14 Example DFA
Solution From the initial state of the DFA to its final state, there are
parallel paths labelled by symbols a and b. This can be represented as (a + b).
From the final state, there is a self-loop on the parallel paths a or b. This means
we can have any number of occurrences of a or b, which is represented by (a + b)*.
Therefore, the required regular expression representing the language accepted by the
DFA in Fig. 3.14 is:
r = (a + b) • (a + b)*
Example 3.29 Find the regular expression representing the language accepted by the DFA
in Fig. 3.15.
Figure 3.15 DFA for Example 3.29
Solution In the given DFA, there is only one state, which is the initial as well as final
state. There is a self-loop from the state on parallel paths labelled by symbols a and b
to the same state. Hence, it represents zero or more occurrences of a or b, i.e., (a + b)*.
Therefore, the required regular expression is:
r = (a + b)*
Example 3.30 Find the regular expression equivalent to the DFA in Fig. 3.16.
Figure 3.16 DFA for Example 3.30
Solution State 1 is the initial state from which there is a transition
on input a to state 2, which is a final state. The other transition is on
input b to state 3, which is a non-final state. As we can see, state 3 is a
dead (or trap) state. Therefore, that path is useless. We should remove
the trap state and all the transitions that are incident on that trap state.
Thus, the DFA without state 3 represents the regular expression:
r = a • (a + b)*
Example 3.31 Find the regular expression equivalent to the DFA in Fig. 3.17.
Figure 3.17 DFA for Example 3.31
Solution Let us consider the path for the input symbol a from the initial
state 1. We have two different regular expressions to reach the final state
4 from 1 via 2 (i.e., the path 1 → 2 → 3 → 2 → 4), which are as follows:
r1 = a • (b a)* • a • (a + b)*
and
r2 = a • (b a)* • b b • (a + b)*
Similarly, let us consider the path for the input symbol b from the initial state 1. We have
the following two regular expressions to reach the final state via state 3 (i.e., the path
1 → 3 → 2 → 3 → 4):
r3 = b • (a b)* • b • (a + b)*
and
r4 = b • (a b)* • a a • (a + b)*
Therefore, the required regular expression is obtained by 'OR-ing' all the aforementioned
regular expressions:
r = r1 + r2 + r3 + r4
= [a • (b a)* • a + a • (b a)* • b b + b • (a b)* • b + b • (a b)* • a a] • (a + b)*
We note that this example is slightly non-trivial for obtaining the required regular expres-
sion. Essentially, one needs to take into account all possible paths from the initial state to
all final states (remember, there can be more than one final state).
Example 3.32 Find the regular expression denoting the language accepted by the DFA
in Fig. 3.18.
Figure 3.18 DFA for Example 3.32
Solution As the initial state is the only final state in Fig. 3.18, the other non-final
state will never be reached in any string that is accepted by the machine.
Hence, the non-final state here is a trap (or dead) state.
Thus, the only string accepted by this DFA is ε. Therefore, the required regular expression
is:
r = ε
Example 3.33 For the DFA in Fig. 3.19, find the equivalent regular expression.
Figure 3.19 DFA for Example 3.33
Solution The first part of the regular expression is obtained by
traversing the graph path q0 → q1 → q2:
r1 = 1* 0 0 (0 + 1)*
The second part is obtained from q0 → q1 → q0 → q1 → q2:
r2 = (1* 01)* 00 (0 + 1)*
Example 3.34 For the DFA in Fig. 3.20, find the equivalent regular expression.
Figure 3.20 DFA for Example 3.34
Solution This DFA accepts all strings over Σ = {0, 1} of even (but
non-zero) length.
The regular expression is:
r = (0 + 1) (0 + 1) [(0 + 1) (0 + 1)]*
Example 3.35 For the DFA in Fig. 3.21, find the equivalent regular expression.
Figure 3.21 DFA for Example 3.35
Solution The language accepted by the DFA is:
L = {0, 1, 000, 001, 010, 011, 100, 101, 110, 111, 00000, ...}
We observe that the language consists of only odd length strings over Σ = {0, 1}.
Hence, the required regular expression is:
r = (0 + 1) • [(0 + 1) • (0 + 1)]*
Example 3.36 For the DFA in Fig. 3.22, find the equivalent regular expression.
Figure 3.22 DFA for Example 3.36
Solution The language accepted by the DFA is:
L = {0, 1, 00, 01, 10, 11, 000, 001, 010, 011, 100, 101, 110, 111}
Hence, the regular expression is:
r = (0 + 1) + (0 + 1) • (0 + 1) + (0 + 1) • (0 + 1) • (0 + 1)
= (0 + 1) + (0 + 1)² + (0 + 1)³
Example 3.37 Find the regular expression that represents the language that is accepted by
the DFA in Fig. 3.23.
Figure 3.23 DFA for Example 3.37
Solution The language accepted by the DFA is:
L = {ε, ab, aa, ba}
We observe that state q4 is a dead state.
Therefore, the regular expression is:
r = ε + aa + ab + ba
Here, ε is included as the initial state is also a final state.
In the previous section, we have studied many examples, where we represented a language
accepted by given DFAs using different regular expressions. Let us now discuss an algorithm,
which obtains such a regular expression for any given DFA as input.
Theorem 3.1
If L = L(M) for some DFA M, then there is a regular expression r such that L(r) = L(M).
Proof
Let us consider any general DFA M, with Q = {1, 2, 3, 4, ..., n}. Let us also use the label
R_{ij}^{(k)}, which represents a regular expression, and whose language is the set of all strings w
such that there is a path w available from state i to state j in the transition graph for M. The
only restriction here is that the path does not traverse through any state, whose number is
greater than k. Note here that i and j may be greater than k, as they are not intermediate
states, but the end points of the path.
states, but the end points of the path.
Let us build the expression R_{ij}^{(k)} through inductive definition, where we start with k = 0
and incrementally build the expression till k = n. In this way, we achieve all possible paths
from i to j that traverse through all the possible states available in M.
As we have considered the state numbers to begin with 1, for k = 0 there will be no
intermediate state. Hence, for k = 0, we rely only on the direct transitions that are available.
Now, there can only be two possibilities about i and j in such a case: Either i = j or i ≠ j,
that is, there can be zero or more direct transitions available from i to j on some input
symbol. Let us consider each case:
Case 1
If i ≠ j, k = 0, and if:
1. There is no path from i to j; then, R_{ij}^{(0)} = φ.
2. There is a single symbol a, which takes the DFA M from i to j with single transition;
then, R_{ij}^{(0)} = a.
3. There are multiple symbols, say l in number, on which M makes transitions from i to j;
then, R_{ij}^{(0)} = a_1 + a_2 + ... + a_l.
Case 2
If i = j, k = 0, and if:
1. There is no symbol which takes M from i to j (i.e., from i to i); then, R_{ii}^{(0)} = ε. This is
always true for any state i. Hence, there is always a path ε from i to i.
2. There is a single symbol a such that δ(i, a) = i; then, R_{ii}^{(0)} = (ε + a).
3. There are multiple symbols a_1, a_2, ..., a_l such that δ(i, a_1) = i; δ(i, a_2) = i; ...; δ(i, a_l) = i;
then, R_{ii}^{(0)} = ε + a_1 + a_2 + ... + a_l.
Now, let us consider a path from state i to j that goes through intermediate states labelled
either as k or lower in number. When the path from i to j does not traverse through k at
all, then the regular expression can be represented as R_{ij}^{(k-1)}. Alternatively, when the path
traverses through k, it can be broken into three segments:
1. The first from i to k, without passing through k, that is, R_{ik}^{(k-1)}.
2. There could be multiple sub-paths going from k to k, without passing through k as an
intermediate state; this is equivalent to (R_{kk}^{(k-1)})*.
3. The third segment is from k to j, without passing through k; this is equivalent to R_{kj}^{(k-1)}.
Combining all the aforementioned expressions, that is, when there is a path from i to j that
does not pass through k; and when it passes through k at least once, we get the expression as:
R_{ij}^{(k)} = R_{ij}^{(k-1)} + R_{ik}^{(k-1)} (R_{kk}^{(k-1)})* R_{kj}^{(k-1)}
The algorithm for obtaining the regular expression from the given DFA is thus based
on the aforementioned rule, where we build the expressions in the order of increasing
superscript. Observe that the regular expression R_{ij}^{(k)} only depends upon the expressions
with smaller superscript, that is, 'k - 1', which in turn depend on 'k - 2', and so on, till
k = 0, which is nothing but the direct transition.
We see that the algorithm does follow the inductive approach, where computation of
the next step is in terms of the previous one. If we assume that state 1 is the initial state
and there are n states in DFA M, then the sum of all expressions R_{1j}^{(n)} gives the language
accepted by M, provided all the j's are final states, that is, j ∈ F, where F is the set of all
final states, as we know.
Example 3.38 Obtain the regular expression for the DFA in Fig. 3.24.
Figure 3.24 DFA for Example 3.38
Solution We see that the example DFA in Fig. 3.24 accepts the language over {0, 1},
which contains strings having at least one 0. We can easily write the required regular
expression as r = 1* 0 (0 + 1)*. However, let us apply the aforementioned algorithm
and verify whether we get the same answer as expected.
For k = 0, the following table gives all the necessary regular expression components.
Remember that for k = 0, we only consider direct transitions from i to j,
wherever available.
R_{11}^{(0)}    (ε + 1)
R_{12}^{(0)}    0
R_{21}^{(0)}    φ
R_{22}^{(0)}    (ε + 0 + 1)
For k = 1, the components are:
R_{11}^{(1)}    1*
R_{12}^{(1)}    1* 0
R_{21}^{(1)}    φ
R_{22}^{(1)}    (ε + 0 + 1)
Let us now obtain R_{ij}^{(k)} for k = 2.
As we know,
R_{ij}^{(k)} = R_{ij}^{(k-1)} + R_{ik}^{(k-1)} (R_{kk}^{(k-1)})* R_{kj}^{(k-1)}
Hence,
R_{11}^{(2)} = R_{11}^{(1)} + R_{12}^{(1)} (R_{22}^{(1)})* R_{21}^{(1)}
= 1* + 1* 0 (ε + 0 + 1)* φ
= 1* + φ
= 1*
This is obtained using the rule R • φ = φ • R = φ; since φ is the empty set, one cannot
concatenate it with a string from R to obtain R • φ or φ • R. Hence, in Example 3.38:
1* 0 (ε + 0 + 1)* φ = φ
Similarly,
R_{12}^{(2)} = R_{12}^{(1)} + R_{12}^{(1)} (R_{22}^{(1)})* R_{22}^{(1)}
= 1* 0 + 1* 0 (ε + 0 + 1)* (ε + 0 + 1)
= 1* 0 (0 + 1)*
The remaining two expressions, R_{21}^{(2)} and R_{22}^{(2)}, can also be thus obtained.
R_{11}^{(2)}    1*
R_{12}^{(2)}    1* 0 (0 + 1)*
R_{21}^{(2)}    φ
R_{22}^{(2)}    (0 + 1)*
Let us now attempt to represent the language accepted by the DFA in Fig. 3.24. Here,
1 is the start state and 2 is the final state; this means that the expression R_{12}^{(2)} represents
the language accepted by the DFA, as R_{12}^{(2)} includes all the paths, starting from the initial
state 1 to the final state 2, traversing through all the 2 states available in the DFA.
We see from the table:
R_{12}^{(2)} = 1* 0 (0 + 1)*
Hence, this is the regular expression denoting the language accepted by the DFA in Fig. 3.24.
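The inductive rule can be run mechanically. The sketch below is an illustration under stated assumptions: the function `kleene` and the two helper classes are names invented here, and the transition function is the one implied by this example (state 1 loops on 1 and moves to state 2 on 0; state 2 loops on both symbols). The same routine is run twice: once with `Text`, which builds expressions in the book's notation without simplifying them, and once with `Bounded`, which interprets every expression as the set of its strings up to a length bound so that the result can be checked.

```python
from itertools import product

def kleene(n, delta, alphabet, alg):
    """Return R[i, j] = R_ij^(n) for states 1..n, built with algebra `alg`."""
    R = {}
    for i in range(1, n + 1):                 # base cases, k = 0
        for j in range(1, n + 1):
            r = alg.eps() if i == j else alg.phi()
            for a in alphabet:
                if delta.get((i, a)) == j:
                    r = alg.union(r, alg.sym(a))
            R[i, j] = r
    for k in range(1, n + 1):                 # inductive step
        R = {(i, j): alg.union(
                 R[i, j],
                 alg.concat(alg.concat(R[i, k], alg.star(R[k, k])), R[k, j]))
             for i in range(1, n + 1) for j in range(1, n + 1)}
    return R

class Text:
    """Builds expressions as strings; phi is represented by None."""
    def phi(self): return None
    def eps(self): return "ε"
    def sym(self, a): return a
    def union(self, r, s):
        if r is None: return s
        if s is None or r == s: return r
        return f"({r} + {s})"
    def concat(self, r, s):
        if r is None or s is None: return None    # R . phi = phi
        if r == "ε": return s
        if s == "ε": return r
        return f"{r} {s}"
    def star(self, r):
        return "ε" if r in (None, "ε") else f"({r})*"

class Bounded:
    """Interprets expressions as sets of strings of length <= max_len."""
    def __init__(self, max_len): self.max_len = max_len
    def phi(self): return frozenset()
    def eps(self): return frozenset({""})
    def sym(self, a): return frozenset({a})
    def union(self, r, s): return r | s
    def concat(self, r, s):
        return frozenset(x + y for x in r for y in s
                         if len(x + y) <= self.max_len)
    def star(self, r):
        result = frontier = frozenset({""})
        while frontier:
            frontier = self.concat(frontier, r) - result
            result |= frontier
        return result

delta = {(1, "1"): 1, (1, "0"): 2, (2, "0"): 2, (2, "1"): 2}
print(kleene(2, delta, "01", Text())[1, 2])       # unsimplified expression

language = kleene(2, delta, "01", Bounded(6))
words = {"".join(t) for n in range(7) for t in product("01", repeat=n)}
print(language[1, 2] == {w for w in words if "0" in w})
```

The printed expression is longer than 1* 0 (0 + 1)* because no algebraic simplification is attempted; the bounded interpretation confirms that it denotes the same strings up to the chosen length.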
Example 3.39 Obtain the regular expression for the DFA described in Fig. 3.25.
Figure 3.25 DFA for Example 3.39
Solution We see that the DFA in Fig. 3.25 is similar to the one in the previous
example, except that here both state 1 and state 2 are final.
Since there are two final states now, the required regular expression is the summation
of R_{11}^{(2)} and R_{12}^{(2)}.
Therefore, the regular expression is given by:
R_{11}^{(2)} + R_{12}^{(2)} = 1* + 1* 0 (0 + 1)*
Note: The algorithm we discussed for obtaining RE from the given DFA can be applied to NFA
or NFA with e-moves as well. It actually does not depend on the type of FA being considered;
rather, it only uses the state transition function for the computation.
A solution here means that the right hand side of all such equations should contain only
the input symbols and not the state labels (as in Eq. 3.9).
As we know, for any DFA M, the set of final states F ⊆ Q; hence, the language accepted
by the DFA M can be represented by the regular expression (S1 + S2 + ... + Sn), where
Si are the solutions for the equations for state symbols qi ∈ F.
We use Arden's theorem to solve the state equations.
Theorem 3.2 (Arden's theorem)
If P, Q, and R are regular expressions, and:
1. If R = P + RQ, or R = RQ + P; then, R can be simplified as R = PQ*.
2. If R = P + QR, or R = QR + P; then, R can be simplified as R = Q*P.
Proof
1. Let us consider R = P + RQ.
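One common way to see why R = PQ* is the right simplification is to substitute the equation into itself repeatedly. The derivation below is a sketch of that standard argument, not necessarily the proof given in the original text; the usual side condition that Q does not contain ε (needed for the solution to be unique) is an assumption not stated in the theorem above.

```latex
\begin{align*}
R &= P + RQ \\
  &= P + (P + RQ)\,Q = P + PQ + RQ^{2} \\
  &= P + PQ + PQ^{2} + RQ^{3} \\
  &\;\;\vdots \\
  &= P\,(\varepsilon + Q + Q^{2} + \cdots + Q^{n}) + RQ^{n+1}.
\end{align*}
% Conversely, R = PQ^{*} satisfies the original equation:
\[
P + (PQ^{*})\,Q = P\,(\varepsilon + Q^{*}Q) = PQ^{*}.
\]
```

The second case, R = P + QR, follows in the same way with the factors written on the other side, giving R = Q*P.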
Example 3.41 Consider the DFA M shown in Fig. 3.28. Obtain a regular expression R such
that L(R) = L(M). Apply Arden's theorem wherever appropriate.
Solution The state equations pertaining to the three states, q1, q2, and q3, are:
q1 = q1 0 + q3 0 + ε (q1 is initial as well as final state; hence ε is added)
q2 = q1 1 + q2 1 + q3 1
q3 = q2 0
Figure 3.28 DFA for Example 3.41
Substituting for q3 in the equation for q1, we get:
q1 = q1 0 + q2 00 + ε (3.10)
Substituting for q3 in the equation for q2, we get:
q2 = q1 1 + q2 1 + (q2 0) 1
= q1 1 + q2 (1 + 01)
= q1 1 (1 + 01)* (using Arden's theorem, for R = q2, P = q1 1, Q = 1 + 01)
Now, substituting for q2 in the equation for q1, i.e., in Eq. 3.10, we get:
q1 = q1 0 + [q1 1 (1 + 01)*] 00 + ε
= q1 0 + q1 1 (1 + 01)* 00 + ε
= q1 (0 + 1 (1 + 01)* 00) + ε
q1 = ε (0 + 1 (1 + 01)* 00)*
= (0 + 1 (1 + 01)* 00)*
As q1 is the only final state of the DFA, the regular expression R denoting the language
accepted by the DFA can be written as:
R = (0 + 1 (1 + 01)* 00)*
Note: We need not solve the other two equations for q2 and q3, as these are non-final states.
However, in case there is more than one final state, then we need to solve the equations pertaining
to all the final states and add them up together. Refer to the next example, which is an illustration.
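The result of solving the state equations can be cross-checked against a direct simulation of the DFA. In the sketch below, the transition table is read off the three state equations (for instance, the term 'q3 0' in the equation for q1 means that q3 moves to q1 on 0); this reading, the Python `re` syntax, and the names used are assumptions made for illustration. The second check covers the case where q3 is final as well, using the expression for q3 obtained by substituting the solutions for q1 and q2 into q3 = q2 0.

```python
import re
from itertools import product

# transitions implied by the state equations of Example 3.41
delta = {("q1", "0"): "q1", ("q1", "1"): "q2",
         ("q2", "0"): "q3", ("q2", "1"): "q2",
         ("q3", "0"): "q1", ("q3", "1"): "q2"}

def run(w, start="q1"):
    """Return the state reached from `start` after reading w."""
    state = start
    for ch in w:
        state = delta[state, ch]
    return state

q1_expr = r"(?:0|1(?:1|01)*00)*"          # solution for q1
q3_expr = q1_expr + r"1(?:1|01)*0"        # q3 = q2 0 = q1 1 (1 + 01)* 0

only_q1 = re.compile(q1_expr)
q1_or_q3 = re.compile(f"(?:{q1_expr}|{q3_expr})")

words = ["".join(t) for n in range(11) for t in product("01", repeat=n)]
bad_single = [w for w in words
              if (only_q1.fullmatch(w) is not None) != (run(w) == "q1")]
bad_double = [w for w in words
              if (q1_or_q3.fullmatch(w) is not None) != (run(w) in {"q1", "q3"})]
print(bad_single, bad_double)
```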
Example 3.42 Consider the DFA M in Fig. 3.29. Obtain a regular expression R such that
L(R) = L(M). Apply Arden's theorem wherever appropriate.
Solution We observe that the DFA in Fig. 3.29 is similar to that in Fig. 3.28,
except that state q3 is also a final state. Since there are two final states, the
regular expression R is written as:
R = q1 + q3
q2 = q1 1 + q2 1 + q3 1
= q1 1 + q2 1 + q2 01
= q1 1 + q2 (1 + 01)
q2 = q1 1 (1 + 01)*
q3 = q2 0
= (0 + 1 (1 + 01)* 00)* 1 (1 + 01)* 0
Theorem 3.3
If L1 and L2 are regular languages, then their union L1 ∪ L2 is also a regular language. In
other words, regular languages are closed under union.
Proof
Let us consider two regular languages, L1 and L2. This means that there exist regular
expressions R1 and R2, such that L1 = L(R1) and L2 = L(R2).
From the definition of regular expressions (refer to Section 3.2), 'R1 + R2' is also a
regular expression. Therefore, L(R1 + R2) is also a regular language.
Since the regular expression 'R1 + R2' denotes all the strings that are either denoted
by R1 or R2, we can say:
L(R1 + R2) = L(R1) ∪ L(R2)
Hence, L1 ∪ L2 is also a regular language.
Theorem 3.4
If L1 and L2 are regular languages, then their concatenation L1 • L2
is also a regular language.
In other words, regular languages are closed under concatenation.
Proof
Let us consider two regular languages, L1 and L2. This means that there exist regular
expressions R1 and R2 such that:
L1 = L(R1), and L2 = L(R2).
From the definition of regular expressions (refer to Section 3.2), 'R1 • R2' is also a regular
expression. Therefore, L(R1 • R2) is also a regular language.
Since the regular expression 'R1 • R2' denotes all the strings that are denoted by R1
concatenated with the strings denoted by R2, we can say:
L(R1 • R2) = L(R1) • L(R2)
Hence, L1 • L2 is also a regular language.
Note: Please refer to the section on set concatenation in Chapter 1 for the definition of the
concatenation of two subsets A and B.
Theorem 3.5
If L is a regular language, then L* (Kleene closure of L) is also a regular language. In other
words, regular languages are closed under Kleene closure.
Proof
Let L be a regular language. This means that there exists a regular expression R such that
L = L(R).
From the definition of regular expressions (refer to Section 3.2), R* is also a regular
expression. Therefore, L(R*) is also a regular language.
Since the regular expression 'R*' denotes all the strings that are denoted by R concatenated
to itself zero or more number of times, we can say:
L(R*) = L(R) • L(R) • L(R) • L(R) • ..., zero or more number of times.
Kleene closure is thus simply a repetitive concatenation, and we know that the concatenation
of two or more regular languages yields a regular language. Hence, it follows that regular
languages are closed under Kleene closure.
Note: Please refer to the section on set closure in Chapter 1 for more details.
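The three closure properties can be illustrated on small cases. The sketch below is an assumption-laden check, not part of the proofs: it picks two arbitrary expressions R1 and R2 over {a, b}, enumerates all strings up to a length bound N, and compares the language of the combined expression with the set-level union, concatenation, and closure. The helper names language, concat, and star are illustrative only.

```python
import re
from itertools import product

def language(pattern, max_len, alphabet="ab"):
    """All strings over `alphabet` of length <= max_len that `pattern` matches in full."""
    rx = re.compile(pattern)
    return {
        "".join(p)
        for n in range(max_len + 1)
        for p in product(alphabet, repeat=n)
        if rx.fullmatch("".join(p))
    }

def concat(A, B, max_len):
    """Set concatenation A . B, truncated to strings of length <= max_len."""
    return {x + y for x in A for y in B if len(x + y) <= max_len}

def star(A, max_len):
    """Kleene closure of A, truncated to strings of length <= max_len."""
    result = {""}              # zero repetitions give the empty string
    frontier = {""}
    while frontier:
        frontier = concat(frontier, A, max_len) - result
        result |= frontier
    return result

N = 6
R1, R2 = "a(a|b)*", "b*a"      # two arbitrary example expressions
L1, L2 = language(R1, N), language(R2, N)

union_ok = language(f"({R1})|({R2})", N) == L1 | L2
concat_ok = language(f"({R1})({R2})", N) == concat(L1, L2, N)
star_ok = language(f"({R1})*", N) == star(L1, N)
print(union_ok, concat_ok, star_ok)
```

The comparison is restricted to strings of length at most N, so it illustrates the identities L(R1 + R2) = L(R1) ∪ L(R2), L(R1 • R2) = L(R1) • L(R2), and L(R*) = L(R)* rather than proving them.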
Lemma statement It states that given any sufficiently long string accepted by an
FSM, we can find a sub-string near the beginning of the string that may be repeated (or
pumped) as many times as we like, and the resulting string will still be accepted by the
same FSM.
Proof
Let L(M) be a regular language accepted by a given DFA, M = (Q, Σ, δ, q0, F), having n
number of states (nodes in the transition graph).
Let us consider an input string consisting of n or more symbols, a1, a2, a3, ..., am, m ≥ n.
Thus, the input string is of sufficiently long length.
If we consider m = n, then we require at least (n + 1) state labels along the path. As per
graph theory, in order to traverse the minimum linear path whose length is x, one needs to
traverse through 'x + 1' nodes.
Since there are only n distinct states in the DFA M, it is not possible for the (n + 1)
state labels q0, q1, q2, ..., qn along the path to be distinct. In other words, to recognize
the string of length m = n along a linear path, we would require at least (n + 1) distinct
states, which is not feasible.
Hence, there exist two integers j and k such that 0 ≤ j < k ≤ n, for which qj = qk. Let us
consider the transition diagram for the DFA M as given in Fig. 3.30.
Figure 3.30 Pumping lemma
Since j < k, the length of the string 'a(j+1) ... ak' is at least one, and since k ≤ n, its
length is not more than n; that is, 1 ≤ |a(j+1) ... ak| ≤ n.
Now, if qm ∈ F, that is, if qm is a final state, it means that 'a1 a2 ... ak ... am' is in
L(M); hence, 'a1 a2 ... aj a(k+1) a(k+2) ... am' is also in L(M), since there is a path from
q0 to qm that goes through qj but not around the loop labelled 'a(j+1) ... ak'. The maximum
length of such a linear path without a loop is (n − 1), as there are only n possible
distinct state (node) labels.
The loop 'a(j+1) ... ak' is formed over the state qj = qk, because the sufficiently long
string has length m ≥ n. We can go over the loop as many times as we like, and the resultant
string, of length m ≥ n, will still be in L(M).
Thus, we can conclude that:
a1 ... aj (a(j+1) ... ak)^i a(k+1) ... am ∈ L(M), where i ≥ 0.
Here, i = 0 denotes the case when we do not go over the loop at all; whereas i > 0 denotes
one or more occurrences of the loop, which is possible only for a string whose length is at
least n.
Formal statement of pumping lemma Let L be a regular set. Then, there exists a constant
n such that, if z is any word in L such that the length of z is at least n, that is, |z| ≥ n,
we can write z = uvw in such a way that:
1. |uv| ≤ n, which means that the sub-string near the beginning of the string is not too long.
2. |v| ≥ 1, which means that v ≠ ε; since v is the sub-string that gets pumped.
3. For all i ≥ 0, uv^i w is in L. This means that the sub-string v can be pumped as many
times as we like and the resultant string will still be a member of L.
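The proof is constructive: the split z = uvw is found at the first state that repeats along the accepting path. The sketch below illustrates this on a small three-state DFA (the transition table is an assumption chosen for illustration, with q1 as start and final state); the names run, accepts, and pump_split are hypothetical.

```python
# A three-state example DFA; q1 is the start state and the only final state.
DELTA = {
    ("q1", "0"): "q1", ("q1", "1"): "q2",
    ("q2", "0"): "q3", ("q2", "1"): "q2",
    ("q3", "0"): "q1", ("q3", "1"): "q2",
}
START, FINALS = "q1", {"q1"}

def run(word):
    """Return the states visited on `word`: one more state than there are symbols."""
    states = [START]
    for symbol in word:
        states.append(DELTA[(states[-1], symbol)])
    return states

def accepts(word):
    return run(word)[-1] in FINALS

def pump_split(word):
    """Split `word` into (u, v, w) at the first repeated state on its path."""
    seen = {}
    for k, state in enumerate(run(word)):
        if state in seen:
            j = seen[state]            # states[j] == states[k] with j < k
            return word[:j], word[j:k], word[k:]
        seen[state] = k
    raise ValueError("no state repeats along the path of this word")

u, v, w = pump_split("10100")
print(u, v, w)
print(all(accepts(u + v * i + w) for i in range(6)))
```

Because the first repetition must occur within the first n + 1 states of the path, the split automatically satisfies |uv| ≤ n and |v| ≥ 1, where n is the number of states.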
Example 3.43 Prove that the set L = {0^(i^2) | i is an integer, i ≥ 1}, which consists of all
strings of 0's whose length is a perfect square, is non-regular.
Solution Given i ≥ 1, we have:
For i = 1, 0^(i^2) = 0^(1^2) = 0 (length = 1^2)
For i = 2, 0^(i^2) = 0^(2^2) = 0000 (length = 2^2)
For i = 3, 0^(i^2) = 0^(3^2) = 000000000 (length = 3^2)
As we can see, the length of each string is a perfect square.
1. Let us assume that the language L is a regular language. Let n be the constant of the
pumping lemma.
2. Let us choose a sufficiently large string z such that z = 0^(l^2), for some large l > 0; the
length of z is given by: |z| = l^2 ≥ n.
Since we assumed that L is a regular language and from the language definition, it is
an infinite language, we can now apply pumping lemma. This means that we should
be able to write z as: z = uvw.
3. As per pumping lemma, every string 'uv^i w', for all i ≥ 0, is in L. Likewise, |v| ≥ 1,
which means that v cannot be empty and must contain one or more symbols.
Let us consider the case when v contains a single symbol:
In this case, z = uvw = 0^(l^2), which means that the number of 0's in z is a perfect square.
As per pumping lemma, we would expect 'uv^2 w' also to be a member of L; however,
this is not possible: v contains only a single symbol, so |uv^2 w| = l^2 + 1, which lies
strictly between l^2 and (l + 1)^2 and hence is not a perfect square. Thus, pumping v would
yield a string that is not in L.
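The single-symbol case extends to any admissible v by a standard argument, sketched here under the assumption that z is chosen as 0^(n^2), with n the pumping constant. Then 1 ≤ |v| ≤ |uv| ≤ n, so |uv^2 w| = n^2 + |v| satisfies n^2 < n^2 + |v| ≤ n^2 + n < (n + 1)^2 and cannot be a perfect square. The check below confirms the arithmetic for small n; the function names are illustrative.

```python
from math import isqrt

def is_square(m):
    """True if m is a perfect square."""
    return isqrt(m) ** 2 == m

def pumped_lengths_escape(n):
    """For z = 0^(n*n): every |v| in 1..n gives a length n*n + |v| that is not a square."""
    return all(not is_square(n * n + k) for k in range(1, n + 1))

print(all(pumped_lengths_escape(n) for n in range(1, 200)))
```

This is the usual textbook way to close the argument and is offered as a sketch; it may differ from the treatment intended in the original example.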
Example 3.44 Prove that the following language is non-regular, using pumping lemma:
L = {a^n b^n | n > 0}
Solution We must not confuse the n in the language definition with the constant n of
pumping lemma. Hence, we rewrite the language definition as:
L = {a^m b^m | m > 0}.
1. Let us assume that the language L is a regular language. Let n be the constant of
pumping lemma.
2. Let us choose a sufficiently large string z such that z = a^l b^l, for some large l > 0; the
length of z is given by: |z| = 2l ≥ n. Since we assumed that L is a regular language
and from the language definition, it is an infinite language, we can now apply pumping
lemma. This means that we should be able to write z as: z = uvw.
3. As per pumping lemma, every string 'uv^i w', for all i ≥ 0, is in L. Further, |v| ≥ 1, which
means that v cannot be empty, and must contain one or more symbols.
Let us consider the case when v contains a single symbol from {a, b}. Here, z = uvw
= a^l b^l, which means that the number of a's and b's in z are the same. Therefore, as per
pumping lemma, we would expect 'uv^2 w' also to be a member of L. However, this cannot
be the case, as v contains only a single symbol, and pumping v would yield different
numbers of a's and b's. Thus, 'uv^2 w' is not a member of L, contradicting our assumption
that L is regular.
Let us now consider the case when v contains both the symbols, that is, a as well as b.
The sample v could be written as 'ab', or 'aabb', and so on. When we try to pump v multiple
times, such as, for example, v^2 = abab, or v^2 = aabbaabb, and so on, we find that even
a's can follow b in the string, which is against the language definition, according
to which, a's are followed by b's, and not vice versa. Thus, 'uv^2 w' is not a member of L,
contradicting our assumption that L is regular.
Hence, language L = {a^m b^m | m > 0} is non-regular.
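The case analysis can be checked exhaustively for small strings. The sketch below tries every split z = uvw with a non-empty v and reports a split that survives pumping, if one exists. The function names are illustrative; passing n = len(z) lifts the |uv| ≤ n restriction so that every non-empty v is examined, mirroring the cases considered above.

```python
def in_L(s):
    """Membership in L = { a^m b^m | m > 0 }."""
    m = len(s) // 2
    return m > 0 and s == "a" * m + "b" * m

def survives_pumping(z, n, in_lang, max_i=3):
    """Return a split (u, v, w) with |uv| <= n and |v| >= 1 that stays in the
    language for i = 0..max_i, or None if every split fails for some i."""
    for uv_len in range(1, min(n, len(z)) + 1):
        for v_len in range(1, uv_len + 1):
            u = z[:uv_len - v_len]
            v = z[uv_len - v_len:uv_len]
            w = z[uv_len:]
            if all(in_lang(u + v * i + w) for i in range(max_i + 1)):
                return (u, v, w)
    return None

for l in range(1, 7):
    z = "a" * l + "b" * l
    print(l, survives_pumping(z, len(z), in_L))
```

For comparison, the same helper does find a surviving split for a regular language such as a*, for example with `survives_pumping("aaaa", 4, lambda s: set(s) <= {"a"})`.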
Example 3.45 Show that the following language is non-regular, using pumping lemma:
Solution Each string in the language is represented as a concatenation of two equal sub-
strings. Hence, language L can be listed as:
L = {ε, 00, 11, 0101, 0000, 1010, 1111, ...}
FSM Equivalence
We have seen Moore's algorithm in Chapter 2 (refer to Section 2.12.1), which checks
whether two given FSMs are equivalent.
A variety of software applications from different areas can be simplified using the
conversion of regular expression notation to efficient computer implementations of its
corresponding finite automata. The applications of regular expressions and finite automata
are spread from system software, such as language compilers, operating system utilities,
and program development tools, to application programs such as text editors and syntactic
pattern recognizers.
Figure 3.31 Example DFAs (a) DFA for regular expression:
letter • (letter + digit)* (b) DFA for regular expression: (digit)+
Here, 'pattern' is a regular expression that describes the set of strings to be searched for.
For example, the following 'grep' command prints all lines from the mentioned file
that contain the string 'vivek', regardless of capitalization (the -i option makes the
match case-insensitive). The string 'vivek' here is the simple pattern:
grep -i vivek peoplelist.txt
The following example is a grep command that searches for pairs of numeric digits in the file:
grep '[0-9][0-9]' file
Here, [0-9] means 'any digit from 0 to 9'. This is equivalent to writing the regular
expression:
(0 + 1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9)
The pattern [0-9] is thus a shorthand that extends the formal regular expression notation
we have seen, and the grep command makes use of such extended regular expressions.
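The same equivalence can be observed outside grep. The sketch below uses Python's re module, in which '|' plays the role of '+', to show that the bracket shorthand and the spelled-out alternation select the same lines. The sample lines and the helper name matching_lines are made up for illustration.

```python
import re

SHORTHAND = re.compile(r"[0-9][0-9]")

DIGIT = "(0|1|2|3|4|5|6|7|8|9)"          # the formal alternation, '|' standing for '+'
LONGHAND = re.compile(DIGIT + DIGIT)

def matching_lines(pattern, lines):
    """Return the lines that contain a match, in the way grep prints matching lines."""
    return [line for line in lines if pattern.search(line)]

lines = ["room 7", "flat 42b", "no digits here", "year 2024"]   # made-up sample input
print(matching_lines(SHORTHAND, lines))
print(matching_lines(SHORTHAND, lines) == matching_lines(LONGHAND, lines))
```

Only the lines containing two adjacent digits are selected; "room 7" is skipped because it has a single digit.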
There are many such application examples that can be stated for regular expressions
and finite automata.
We can now convert this NFA with ε-moves to its equivalent DFA, as shown in Fig. 3.33.
(b)
Q'    0    1
→A    B    C
B     D    C
C     B    E
*D    D    C
*E    B    E
'*': Final states
Figure 3.33 NFA with ε-moves to DFA conversion (a) Step 1 (b) Step 2
The DFA in Fig. 3.33(b) can be seen as a Mealy machine, in which any edge approaching
a final state will carry output T (true), while the others will carry output F (false).
The Mealy machine can thus be drawn as shown in Fig. 3.34.
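The reading of a DFA as a Mealy machine can be sketched as follows. The transition table is the one in Fig. 3.33(b) as far as it can be read from the figure; taking D and E as the final states is an assumption of this sketch, as are the names TABLE, FINALS, and mealy_output.

```python
# Transition table of Fig. 3.33(b); D and E are assumed to be the final states.
TABLE = {
    "A": {"0": "B", "1": "C"},
    "B": {"0": "D", "1": "C"},
    "C": {"0": "B", "1": "E"},
    "D": {"0": "D", "1": "C"},
    "E": {"0": "B", "1": "E"},
}
FINALS = {"D", "E"}

def mealy_output(word, start="A"):
    """Emit 'T' for every edge that enters a final state and 'F' for every other edge."""
    state, out = start, []
    for symbol in word:
        state = TABLE[state][symbol]
        out.append("T" if state in FINALS else "F")
    return "".join(out)

print(mealy_output("0011010"))
```

With this table, the output symbol is T exactly when the current input symbol equals the previous one, that is, when the input read so far ends in 00 or 11.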
Example 3.48 Construct a DFA for the language over {0, 1} having all strings such that
the third symbol from the right end is 0.
Solution We require that the third symbol from the right end should be 0. The two symbols
that follow it can be either 0 or 1. Similarly, all other symbols are also either 0 or 1.
Thus, the RE can be written as:
r = (0 + 1)* 0 (0 + 1) (0 + 1)
The equivalent NFA with ε-transitions can be drawn as shown in Fig. 3.36.
The equivalent DFA can be obtained as in Fig. 3.37. We can relabel the states as shown
in the figure.
Let us draw the state transition table to see if we can reduce the diagram. Refer to Table 3.9.
We see that states A and C are equivalent, as both have the same transitions and both
are non-final. Therefore, we can remove C and replace it by A. Refer to the reduced
Table 3.10.
Table 3.9
Q'    0    1
→A    D    C
C     D    C
D     F    E
E     G    I
F     H    J
*G    F    E
*H    H    J
*I    D    C
*J    G    I
'*': Final states

Table 3.10
Q'    0    1
→A    D    A
D     F    E
E     G    I
F     H    J
*G    F    E
*H    H    J
*I    D    A
*J    G    I
'*': Final states
Using the state transition table, we can draw the transition graph of the required DFA.
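The reduction step and the final DFA can be checked with a short sketch. It encodes Table 3.9 as read from the text; the row for state J is not legible in the source, and the entries used here (G on 0, I on 1) are an assumption that the language itself suggests. States with identical rows and the same final or non-final status are merged, and the reduced DFA is then compared with the condition 'third symbol from the right end is 0'. All names are illustrative.

```python
from itertools import product

# Table 3.9: state -> (next state on 0, next state on 1).
# The row for J is an assumption of this sketch.
TABLE_3_9 = {
    "A": ("D", "C"), "C": ("D", "C"), "D": ("F", "E"), "E": ("G", "I"),
    "F": ("H", "J"), "G": ("F", "E"), "H": ("H", "J"), "I": ("D", "C"),
    "J": ("G", "I"),
}
FINALS = {"G", "H", "I", "J"}

def merge_identical_rows(table, finals):
    """Repeatedly merge two states that have the same row and the same finality."""
    table = dict(table)
    changed = True
    while changed:
        changed = False
        seen = {}
        for state in sorted(table):
            key = (table[state], state in finals)
            if key in seen:
                keep = seen[key]
                del table[state]
                # Redirect every edge that pointed to the removed state.
                table = {s: tuple(keep if t == state else t for t in row)
                         for s, row in table.items()}
                changed = True
                break
            seen[key] = state
    return table

def accepts(table, finals, word, start="A"):
    state = start
    for symbol in word:
        state = table[state][int(symbol)]
    return state in finals

reduced = merge_identical_rows(TABLE_3_9, FINALS)
words = ("".join(p) for n in range(9) for p in product("01", repeat=n))
ok = all(accepts(reduced, FINALS, w) == (len(w) >= 3 and w[-3] == "0") for w in words)
print(sorted(reduced), ok)
```

Merging identical rows is only the simplest reduction; it finds the pair A and C here, but in general a full minimization algorithm may be needed to detect all equivalent states.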
Theorem statement A language L is regular if and only if the equivalence relation RL has a
finite number of equivalence classes of strings, and the number of states in the smallest
DFA recognizing L is equal to the number of equivalence classes in RL.
Alternate statement Given a regular language L and a pair of strings x and y, let the
equivalence relation RL be defined as x RL y if and only if for all z in Σ*, 'xz' is in L exactly
when 'yz' is in L. Then, RL is of finite index.
Proof
If L is a regular language, then by definition, there exists an FA M that recognizes L, with
only finite number of states. If there are n states, then we partition the set of all strings
into n subsets, where subset Si is the set of strings that, when given as input to automaton
M, causes it to end in state i.
For every two strings x and y that belong to the same state or same Si, and for every
choice of a third string z, the FA M reaches the same state on input `xz' as it reaches on
input `yz', and therefore, must either accept both the inputs `xz' and `yz', or reject both of
them. Thus, x and y are indistinguishable with respect to language L.
Thus, we can say that strings x and y are related by the equivalence relation RL, that is,
x RL y. Thus, every Si is a subset of an equivalence class of RL, and every string member of one
of these equivalence classes belongs to one of the sets Si; this gives a many-to-one relation
from the states of M to the equivalence classes, implying that the number of equivalence
classes is finite, and at most n.
Conversely, if RL has a finite number of equivalence classes, it is possible to design a DFA
that has one state for each equivalence class. The start state of such a DFA corresponds to
the equivalence class containing the empty string.
Thus, the existence of an FA recognizing L implies that the Myhill-Nerode relation
has a finite number of equivalence classes, which are, at most, equal to the number of
states of the automaton. The existence of a finite number of equivalence classes implies
the existence of an automaton with that many states. The Myhill-Nerode theorem may
be used to show that a language L is regular by showing that the number of equivalence
classes of RL is finite.
For example, let us consider a language consisting of binary numbers that are divisible
by 3. There can be three equivalence classes based on the remainder values, 0, 1, and 2.
The minimal state DFA that can be obtained for such a language also has three states (refer
to Chapter 2, Section 2.10.1, Example 2.14). Hence, the language is regular.
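The three classes can be observed experimentally. The sketch below groups short binary strings by their behaviour under all short extensions z; strings with the same behaviour cannot be told apart by any of the extensions tried. Treating the empty string as the number 0 is an assumption of this sketch, and the names in_L, words, and signature are illustrative.

```python
from itertools import product

def in_L(word):
    """Binary strings whose value is divisible by 3 (the empty string is read as 0)."""
    return (int(word, 2) if word else 0) % 3 == 0

def words(max_len):
    """All binary strings of length <= max_len, shortest first."""
    for n in range(max_len + 1):
        for p in product("01", repeat=n):
            yield "".join(p)

def signature(x, max_len=4):
    """For each extension z with |z| <= max_len, record whether xz is in L."""
    return tuple(in_L(x + z) for z in words(max_len))

classes = {}
for x in words(6):
    classes.setdefault(signature(x), []).append(x)

print(len(classes))
for members in classes.values():
    print(members[:4])
```

The bounded experiment finds exactly three groups, one per remainder, which agrees with the three-state minimal DFA mentioned above.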
Another immediate corollary of the theorem is that if a language defines an infinite set
of equivalence classes, it is not regular. It is this corollary that is frequently used to prove
that a language is not regular.
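As an illustration of the corollary, consider the language {a^m b^m | m > 0} (borrowed here as an example; the choice is an assumption of this sketch). The strings a, aa, aaa, ... are pairwise distinguishable: for i ≠ j, the suffix b^max(i, j) completes exactly one of a^i and a^j to a member of the language. Hence no two of them share an equivalence class, and the number of classes is infinite.

```python
def in_L(word):
    """Membership in { a^m b^m | m > 0 }."""
    m = len(word) // 2
    return m > 0 and word == "a" * m + "b" * m

def separates(i, j):
    """True if z = b^max(i, j) puts exactly one of a^i z and a^j z in the language."""
    z = "b" * max(i, j)
    return in_L("a" * i + z) != in_L("a" * j + z)

all_separated = all(separates(i, j) for i in range(30) for j in range(i))
print(all_separated)
```

The loop checks only finitely many pairs, but the same suffix works for every pair i ≠ j, which is what the corollary needs.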
SUMMARY
The languages accepted by finite automata (FA) are described or represented by simple
expressions called regular expressions (RE).
The class of regular expressions over Σ is defined recursively as follows:
1. Regular expressions over Σ include letters, ∅ (null set), and ε (empty string of length zero).
2. Every symbol a ∈ Σ is a regular expression over Σ.
3. If R1 and R2 are regular expressions over Σ, then so are (R1 + R2), (R1 . R2), and (R1)*,
where '+' indicates alternation (parallel path), '.' denotes concatenation (series
connection), and '*' denotes iteration (closure or repetitive concatenation).
4. Regular expressions are only those that are obtained using rules (1) to (3).
As per the definition, empty set ∅, empty string ε, and every input symbol from the input
alphabet Σ itself is a regular expression.
The expression 'R1 + R2' is like considering R1 and R2 as two parallel paths in the
equivalent DFA generated.
The expression 'R1 . R2' means R1 followed by R2.
The expression 'R1*' denotes zero or more occurrences of R1, that is, repetitive (zero or
more times) concatenation of R1 to itself. If R1 is repeated zero times, the empty string ε
gets generated as well.
The expression 'R1+' (with '+' as a superscript) is used to denote positive closure of R1,
and represents one or more occurrences of R1. This means that at least one occurrence of R1
is assured. That means, R+ = R • R*. Hence, we see that positive closure can be composed
out of other primitive operations, and so it is not considered in the formal definition of
regular expressions.
For every regular expression, there exists an FA, which is equivalent to it and accepts the
same language; the converse is also true. There are certain rules to convert a regular
expression to NFA with ε-moves, which can then be transformed into its equivalent NFA or
directly to its equivalent DFA, using the known methods.
The operator '+' is converted to parallel paths. Consider a regular expression of the form
'r1 + r2', where r1 and r2 are the complex expressions themselves. In such a case, we
introduce a new start state that connects with the initial states of the individual FAs
representing r1 and r2 using the ε-moves. Similarly, we connect the individual final states
of the two FAs into a new final state.
The operator '•' is converted to a series connection. In case of a regular expression of the
form 'r1 • r2', the final state of the FA for r1 is connected to the starting state of the FA
for r2 using an ε-move.
In case of regular expression of the form 'r*', we introduce a new start state and a new
ending state, apart from those of the FA equivalent to the regular expression r. We
introduce a path from this new start to the new end state, using ε-moves to denote zero
occurrences of r. We then connect the new start state to the start state of the FA for r
using an ε-move, and we connect the final state of the FA for r to a new end state through
ε-moves. Thus, the new FA we constructed denotes one occurrence of r. We then connect the
original final state of the FA for r to its original initial state to achieve more than
one occurrence.
We also have algorithmic solutions to obtain the regular expression from the given FA. The
first algorithm is an iterative solution, while the other depends on Arden's theorem and is
based on equation solving.
The iterative algorithm follows the formula:
R_ij^(k) = R_ij^(k-1) + R_ik^(k-1) (R_kk^(k-1))* R_kj^(k-1)
In this equation, R_ij^(k) denotes regular expression for all paths from state i to state j,
passing through all the states labelled k, or less than k. The state k = 0 is assumed as a
direct transition from i to j.
The iterative method thus progresses from k = 0 (only direct transitions) to all n states of
the FA, k = n. At each stage, it applies the aforementioned formula to obtain the next
level; thus, the method is inductive. If we have i as the initial state, j as the final
state, and k = n (all states), then R_ij^(n) denotes the regular expression for the FA. If
there is more than one final state, then the summation is considered as the sum of R_ip^(n)
over all p in F, where F is the set of all final states.
In the other method, we represent every state as an equation, and identify all incoming
edges for every state. Each state can then be expressed as a state equation of the form:
qn = q0 a0 + q1 a1 + ... + qm am,
where there are incoming edges from the states q0, q1, ..., qm on symbols labelled as
a0, a1, ..., am, respectively.
These equations are solved by substitution using Arden's theorem, which states:
If P, Q, and R are regular expressions, and:
(i) If R = P + RQ, or R = RQ + P; then, R can be simplified as, R = PQ*.
(ii) If R = P + QR, or R = QR + P; then, R can be simplified as, R = Q*P.
The class of regular sets over Σ is defined as:
1. Every finite set of words over Σ (including ∅, the empty set or null set) is a regular set.
2. If U and V are regular sets over Σ, then U ∪ V (union) and U.V (concatenation) are also
regular sets.
3. If S is a regular set over Σ, then so is its closure, that is, S*.
In other words, the class of regular sets over Σ is the smallest class containing all finite
sets of words over Σ, and is closed under union, concatenation, and closure operations.
Pumping lemma defines an important property of regular languages. It is used as a tool to
disprove the regularity of certain specific languages.
Pumping lemma: Let L be a regular set. Then, there exists a constant n such that, if z is any
word in L such that the length of z is at least n, that is, |z| ≥ n, we can write z = uvw in
such a way that:
1. |uv| ≤ n, which means that the sub-string near the beginning of the string is not too long.
2. |v| ≥ 1, which means that v ≠ ε; since v is the sub-string that gets pumped.
3. For all i ≥ 0, uv^i w is in L. This means that the sub-string v can be pumped as many
times as we like and the resultant string will still be a member of L.
The applications of regular expressions and finite automata range from system software such
as language compilers, operating system utilities, and program development tools, to
application programs such as text editors and syntactic pattern recognizers.
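The iterative formula can be sketched in code. The version below is a simplified variant and not necessarily the exact formulation used in this chapter: it keeps only the labels of non-empty paths in each R_ij^(k) and accounts for the empty string at the end, when the start state is final. The three-state DFA, the helper names, and the use of Python's regex syntax ('|' for '+', '(?:...)' for grouping) are assumptions made for illustration.

```python
import re
from itertools import product

EMPTY = None   # the empty set; every other value is a Python regex source string

def union(r, s):
    if r is EMPTY:
        return s
    if s is EMPTY:
        return r
    return r if r == s else f"(?:{r}|{s})"

def concat(r, s):
    return EMPTY if r is EMPTY or s is EMPTY else r + s

def star(r):
    return "" if r is EMPTY else f"(?:{r})*"

def dfa_to_regex(states, delta, start, finals):
    # Stage k = 0: symbols on direct edges i -> j (paths of length exactly one).
    R = {i: {j: EMPTY for j in states} for i in states}
    for (i, a), j in delta.items():
        R[i][j] = union(R[i][j], a)
    # Each stage allows one more state k as an intermediate state.
    for k in states:
        R = {i: {j: union(R[i][j],
                          concat(concat(R[i][k], star(R[k][k])), R[k][j]))
                 for j in states} for i in states}
    total = EMPTY
    for f in sorted(finals):
        total = union(total, R[start][f])
    if start in finals:          # the empty path accepts the empty string
        total = "" if total is EMPTY else f"(?:{total})?"
    return total

STATES = ["q1", "q2", "q3"]
DELTA = {
    ("q1", "0"): "q1", ("q1", "1"): "q2",
    ("q2", "0"): "q3", ("q2", "1"): "q2",
    ("q3", "0"): "q1", ("q3", "1"): "q2",
}
PATTERN = re.compile(dfa_to_regex(STATES, DELTA, "q1", {"q1"}))

def dfa_accepts(word, finals=frozenset({"q1"})):
    state = "q1"
    for symbol in word:
        state = DELTA[(state, symbol)]
    return state in finals

def agree_up_to(max_len):
    return all(dfa_accepts(w) == (PATTERN.fullmatch(w) is not None)
               for n in range(max_len + 1)
               for w in map("".join, product("01", repeat=n)))

print(agree_up_to(8))
```

The expression produced this way is correct but far from minimal, which is why the equation-solving method with Arden's theorem is often preferred for hand calculation. The sketch assumes that input symbols are plain characters with no special meaning in regex syntax.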
EXERCISES
This section lists a few unsolved problems to help the readers understand the topic better and practise examples
on regular expressions.
Objective Questions
3.1 Let P be the language represented by the regular expression p*q*, and Q be
{p^n q^n | n ≥ 0}. Then which of the following is always regular?
(a) P ∩ Q
(b) P - Q
(c) Σ* - P
(d) Σ* - Q
3.2 In a compiler, keywords of a language are recognized during
(a) parsing of the program
(b) code generation
(c) lexical analysis of the program
(d) dataflow analysis
3.3 Let S and T be languages over Σ = {a, b} represented by the regular expressions
(a + b*)* and (a + b)*, respectively. Which of the following is true?
(a) S ⊂ T
(b) T ⊂ S
(c) S = T
(d) S ∩ T = ∅
3.4 Which of the following are regular languages?
L1 = {a^n | n is odd}
L2 = {a^n | n is even}
L3 = {a^n | n is prime}
L4 = {a^n | n is a perfect square}
(b) All strings in which any occurrence of the symbol b, is in groups of odd numbers.
(c) All strings in which the total number of a's is divisible by 2.
3.6 Check the following regular expressions for equivalence and justify:
(a) R1 = (a + bb)* (b + aa)*
R2 = (a + b)*
(b) R1 = (a + b)* abab*
R2 = b* a (a + b)* ab*
3.7 Describe in English the sets denoted by the following regular expressions:
(a) (a + c) (b + ba)*
(b) (0*1*)*
3.8 Construct an NFA with ε-moves, which accepts the language defined by:
[(0 + 1)* 10 + (00)* (11)*]*
3.9 Let R1 and R2 be two regular expressions. With the help of transition diagrams,
illustrate the three operations (+, •, *) on R1 and R2.
3.10 Show that the regular expressions, (a* bbb)* a* and a* (bbba*)*, are equivalent.
3.11 Give a regular expression for representing all strings over {a, b} that do not include
the sub-strings 'bba' and 'abb'.
3.12 Consider the two regular expressions:
R1 = a* + b*
R2 = ab* + ba* + b* a + (a* b)*
(a) Find a string corresponding to R1 but not to R2.
(b) Find a string corresponding to R2 but not to R1.
(c) Find a string corresponding to both R1 and R2.
3.13 Construct an NFA for the regular expression, (a / b)* ab. Convert the NFA to its
equivalent DFA and validate the answer with suitable examples.
3.14 Define the term regular language.
3.15 Write short note on: pumping lemma for regular sets.
3.16 Construct an NFA (Q, Σ, δ, q0, F) for the following regular expression:
01 [((10)^+ + 111)* + 0]* 1
3.17 Prove that the regular expressions given here are equivalent.
(a) (a* bbb)* a*
(b) a* (bbb a*)*
3.18 Describe the language accepted by the following finite automaton.
Figure 3.38 Example DFA
3.19 Describe as simply as possible in English the language represented by: (0/1)* 0.
3.20 Construct an NFA that recognizes the regular expression (a / b)* • a • b. Convert it
to a DFA, and draw the state transition table.
3.21 Construct a regular expression corresponding to the state diagram shown here, using
Arden's theorem.
Figure 3.39 Example FA
3.22 Is the following language regular? Justify.
L = {OP 1Pp"- qlp 1, q 1}
3.23 Construct the regular expression and finite automata for: L = L1 ∩ L2 over alphabet
{a, b}, where:
L1 = all strings of even length
L2 = all strings starting with b
3.24 Which of the following are true? Explain.
(a) baa ∈ a* b* a* b*
(b) b* a* ∩ a* b* = a* ∪ b*
(c) a* b* ∩ b* c* = ∅
(d) abcd ∈ [a (cd)*
3.25 Construct the regular expressions for the following DFAs:
Figure 3.40 Example DFAs
3.26 Which of the following languages are regular sets? Justify your answer.