Formal Language

Uploaded by

soumyashaw58

4 Formal Languages

In this chapter we introduce the concepts of grammars and formal languages
and discuss the Chomsky classification of languages. We also study the
inclusion relation between the four classes of languages. Finally, we discuss the
closure properties of these classes under the various operations.

4.1 BASIC DEFINITIONS AND EXAMPLES


The theory of formal languages is an area with a number of applications in
computer science. Linguists were trying in the early 1950s to define precisely
valid sentences and give structural descriptions of sentences. They wanted to
define a formal grammar (i.e. to describe the rules of grammar in a rigorous
mathematical way) to describe English. They thought that such a description of
natural languages (the languages that we use in everyday life such as English,
Hindi, French, etc.) would make language translation using computers easy. It
was Noam Chomsky who gave a mathematical model of a grammar in 1956.
Although it was not useful for describing natural languages such as English, it
turned out to be useful for computer languages. In fact, the Backus-Naur form
used to describe ALGOL followed the definition of grammar (a context-free
grammar) given by Chomsky.
Before giving the definition of grammar, we shall study, for the sake of
simplicity, two types of sentences in English with a view to formalising the
construction of these sentences. The sentences we consider are those with a
noun and a verb, or those with a noun, a verb and an adverb (such as 'Ram ate
quickly' or 'Sam ran'). The sentence 'Ram ate quickly' has the words 'Ram',
'ate', 'quickly' written in that order. If we replace 'Ram' by 'Sam', 'Tom',
'Gita', etc., i.e. by any noun, 'ate' by 'ran', 'walked', etc., i.e. by any verb in the
past tense, and 'quickly' by 'slowly', i.e. by any adverb, we get other
grammatically correct sentences. So the structure of 'Ram ate quickly' can be
given as (noun) (verb) (adverb). For (noun) we can substitute 'Ram', 'Sam',
'Tom', 'Gita', etc. Similarly, we can substitute 'ate', 'walked', 'ran', etc. for
(verb), and 'quickly', 'slowly' for (adverb). Similarly, the structure of 'Sam ran'
can be given in the form (noun) (verb).
We have to note that (noun) (verb) is not a sentence but only the
description of a particular type of sentence. If we replace (noun), (verb) and
(adverb) by suitable words, we get actual grammatically correct sentences. Let
us call (noun), (verb), (adverb) variables. Words like 'Ram', 'Sam', 'ate',
'ran', 'quickly', 'slowly' which form sentences can be called terminals. So our
sentences turn out to be strings of terminals. Let S be a variable denoting a
sentence. Now, we can form the following rules to generate two types of
sentences:
S → (noun) (verb) (adverb)
S → (noun) (verb)
(noun) → Sam
(noun) → Ram
(noun) → Gita
(verb) → ran
(verb) → ate
(verb) → walked
(adverb) → slowly
(adverb) → quickly
(Each arrow represents a rule meaning that the word on the right side of the
arrow can replace the word on the left side of the arrow.) Let us denote the
collection of the rules given above by P.
If our vocabulary is thus restricted to 'Ram', 'Sam', 'Gita', 'ate', 'ran',
'walked', 'quickly' and 'slowly', and our sentences are of the form (noun)
(verb) (adverb) and (noun) (verb), we can describe the grammar by a 4-tuple
$(V_N, \Sigma, P, S)$, where
$V_N$ = {(noun), (verb), (adverb)},
$\Sigma$ = {Ram, Sam, Gita, ate, ran, walked, quickly, slowly},
$P$ is the collection of rules described above (the rules may be called
productions),
$S$ is the special symbol denoting a sentence.
The sentences are obtained by (i) starting with S, (ii) replacing words
using the productions, and (iii) terminating when a string of terminals is
obtained.
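The three steps above can be sketched in Python. This is only an illustration under stated assumptions: the dictionary PRODUCTIONS and the function expand are hypothetical names, not part of the text, and the recursion terminates only because no variable of this toy grammar can be rewritten into a string containing itself.

```python
from itertools import product

# Productions P of the toy grammar, keyed by variable (assumed encoding).
PRODUCTIONS = {
    "S": [["(noun)", "(verb)", "(adverb)"], ["(noun)", "(verb)"]],
    "(noun)": [["Sam"], ["Ram"], ["Gita"]],
    "(verb)": [["ran"], ["ate"], ["walked"]],
    "(adverb)": [["slowly"], ["quickly"]],
}

def expand(symbol):
    """Return every string of terminals (as a list of words) derivable from symbol."""
    if symbol not in PRODUCTIONS:      # a terminal: nothing left to replace
        return [[symbol]]
    results = []
    for rhs in PRODUCTIONS[symbol]:
        # Expand each symbol of the right-hand side, then combine the choices.
        for parts in product(*(expand(s) for s in rhs)):
            results.append([word for part in parts for word in part])
    return results

# (i) start with S, (ii) replace using productions, (iii) stop at terminals.
sentences = [" ".join(words) for words in expand("S")]
print(len(sentences))   # 3*3*2 + 3*3 = 27 sentences
```

Both 'Ram ate quickly' and 'Sam ran' appear in the resulting list, since each corresponds to one choice of productions.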
With this background we can give the definition of a grammar. As
mentioned earlier, this definition is due to Noam Chomsky.

4.1.1 DEFINITION OF A GRAMMAR

Definition 4.1 A phrase-structure grammar (or simply a grammar) is
$(V_N, \Sigma, P, S)$, where
(i) $V_N$ is a finite nonempty set whose elements are called variables,
(ii) $\Sigma$ is a finite nonempty set whose elements are called terminals,
(iii) $V_N \cap \Sigma = \emptyset$,
(iv) $S$ is a special variable (i.e. an element of $V_N$) called the start symbol,
and
(v) $P$ is a finite set whose elements are $\alpha \to \beta$, where $\alpha$ and $\beta$ are strings
on $V_N \cup \Sigma$, and $\alpha$ has at least one symbol from $V_N$. The elements of $P$ are
called productions or production rules or rewriting rules.
Note: The set of productions is the kernel of grammars and language
specification. We observe the following regarding the production rules.
(i) Reverse substitution is not permitted. For example, if $S \to AB$ is a
production, then we can replace $S$ by $AB$, but we cannot replace $AB$
by $S$.
(ii) No inversion operation is permitted. For example, if $S \to AB$ is a
production, it is not necessary that $AB \to S$ is a production.
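The conditions of Definition 4.1 can be checked mechanically. The following is a minimal sketch, assuming every symbol is a single character and every production is a pair (alpha, beta) of strings; the function name is_grammar is hypothetical. Finiteness holds automatically because Python sets and lists are finite.

```python
def is_grammar(variables, terminals, productions, start):
    """Check conditions (i)-(v) of Definition 4.1 for single-character symbols."""
    if not variables or not terminals:        # (i), (ii): nonempty sets
        return False
    if variables & terminals:                 # (iii): the two sets are disjoint
        return False
    if start not in variables:                # (iv): the start symbol is a variable
        return False
    symbols = variables | terminals
    for alpha, beta in productions:           # (v)
        if not all(c in symbols for c in alpha + beta):
            return False                      # both sides are strings over V_N u Sigma
        if not any(c in variables for c in alpha):
            return False                      # alpha needs at least one variable
    return True

ok = is_grammar({"S"}, {"0", "1"}, [("S", "0S1"), ("S", "01")], "S")
bad = is_grammar({"S"}, {"0", "1"}, [("01", "S")], "S")
```

Here ok is True, while bad is False because the left-hand side 01 contains no variable.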
EXAMPLE 4.1

$G = (V_N, \Sigma, P, S)$ is a grammar
where
$V_N$ = {(sentence), (noun), (verb), (adverb)}
$\Sigma$ = {Ram, Sam, ate, sang, well}
$S$ = (sentence)
$P$ consists of the following productions:
(sentence) → (noun) (verb)
(sentence) → (noun) (verb) (adverb)
(noun) → Ram
(noun) → Sam
(verb) → ate
(verb) → sang
(adverb) → well
NOTATION: (i) If $A$ is any set, then $A^*$ denotes the set of all strings over $A$.
$A^+$ denotes $A^* - \{\Lambda\}$, where $\Lambda$ is the empty string.
(ii) $A, B, C, A_1, A_2, \ldots$ denote the variables.
(iii) $a, b, c, \ldots$ denote the terminals.
(iv) $x, y, z, w, \ldots$ denote the strings of terminals.
(v) $\alpha, \beta, \gamma, \ldots$ denote the elements of $(V_N \cup \Sigma)^*$.
(vi) $X^0 = \Lambda$ for any symbol $X$ in $V_N \cup \Sigma$.

4.1.2 DERIVATIONS AND THE LANGUAGE GENERATED BY A GRAMMAR

Productions are used to derive one string over $V_N \cup \Sigma$ from another string.
We give a formal definition of derivation as follows:
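Definition 4.2 If $\alpha \to \beta$ is a production in a grammar $G$ and $\gamma$, $\delta$ are any
two strings on $V_N \cup \Sigma$, then we say that $\gamma\alpha\delta$ directly derives $\gamma\beta\delta$ in $G$ (we
write this as $\gamma\alpha\delta \underset{G}{\Rightarrow} \gamma\beta\delta$). This process is called one-step derivation. In
particular, if $\alpha \to \beta$ is a production, then $\alpha \underset{G}{\Rightarrow} \beta$.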
Note: If $\alpha$ is a part of a string and $\alpha \to \beta$ is a production, we can replace
$\alpha$ by $\beta$ in that string (without altering the remaining parts). In this case we
say that the string we started with directly derives the new string.
For example,
$G = (\{S\}, \{0, 1\}, \{S \to 0S1, S \to 01\}, S)$
has the production $S \to 0S1$. So, $S$ in $0^4S1^4$ can be replaced by $0S1$. The
resulting string is $0^4\,0S1\,1^4$. Thus, we have $0^4S1^4 \underset{G}{\Rightarrow} 0^4\,0S1\,1^4$.
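One-step derivation can be sketched as a function that tries every production at every position where its left-hand side occurs. This is an illustrative sketch, assuming strings over single-character symbols and productions given as pairs (alpha, beta); the name one_step is hypothetical.

```python
def one_step(string, productions):
    """Return the set of strings that `string` directly derives.

    Each occurrence of a left-hand side alpha may be replaced by beta,
    leaving the remaining parts of the string unchanged.
    """
    results = set()
    for alpha, beta in productions:
        pos = string.find(alpha)
        while pos != -1:
            results.add(string[:pos] + beta + string[pos + len(alpha):])
            pos = string.find(alpha, pos + 1)
    return results

P = [("S", "0S1"), ("S", "01")]
derived = one_step("0000S1111", P)   # the string 0^4 S 1^4
```

With the grammar of the example, derived contains 00000S11111 (from S → 0S1) and 0000011111 (from S → 01). A string without variables, such as 0011, directly derives nothing.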
Note: $\underset{G}{\Rightarrow}$ induces a relation $R$ on $(V_N \cup \Sigma)^*$, i.e. $\alpha R \beta$ if $\alpha \underset{G}{\Rightarrow} \beta$.

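Definition 4.3 If $\alpha$ and $\beta$ are strings on $V_N \cup \Sigma$, then we say that $\alpha$ derives
$\beta$ if $\alpha \overset{*}{\underset{G}{\Rightarrow}} \beta$. Here $\overset{*}{\underset{G}{\Rightarrow}}$ denotes the reflexive-transitive closure of the relation $\underset{G}{\Rightarrow}$
in $(V_N \cup \Sigma)^*$ (refer to Section 2.1.5).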

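Note: We can note in particular that $\alpha \overset{*}{\underset{G}{\Rightarrow}} \alpha$. Also, if $\alpha \overset{*}{\underset{G}{\Rightarrow}} \beta$, $\alpha \ne \beta$, then
there exist strings $\alpha_1, \alpha_2, \ldots, \alpha_n$, where $n \ge 2$, such that
$\alpha = \alpha_1 \underset{G}{\Rightarrow} \alpha_2 \underset{G}{\Rightarrow} \alpha_3 \cdots \underset{G}{\Rightarrow} \alpha_n = \beta$

When $\alpha \underset{G}{\Rightarrow} \beta$ is in $n$ steps, we write $\alpha \overset{n}{\underset{G}{\Rightarrow}} \beta$.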
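Consider, for example, $G = (\{S\}, \{0, 1\}, \{S \to 0S1, S \to 01\}, S)$.
As $S \underset{G}{\Rightarrow} 0S1 \underset{G}{\Rightarrow} 0^2S1^2 \underset{G}{\Rightarrow} 0^3S1^3$, we have $S \overset{*}{\underset{G}{\Rightarrow}} 0^3S1^3$. We also have
$0^3S1^3 \overset{*}{\underset{G}{\Rightarrow}} 0^3S1^3$ (as $\alpha \overset{*}{\underset{G}{\Rightarrow}} \alpha$).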

Definition 4.4 The language generated by a grammar $G$ (denoted by $L(G)$) is
defined as $\{w \in \Sigma^* \mid S \overset{*}{\underset{G}{\Rightarrow}} w\}$. The elements of $L(G)$ are called sentences.
Stated in another way, $L(G)$ is the set of all terminal strings derived from
the start symbol $S$.
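Definition 4.4 suggests a direct, if naive, way to list part of L(G): search through the strings reachable from S and keep those containing no variable. The sketch below is a bounded breadth-first search and not a general algorithm. It assumes single-character symbols and only explores sentential forms of length at most max_len, so it can miss sentences whose derivations pass through longer forms. The names language_up_to and max_len are hypothetical.

```python
from collections import deque

def language_up_to(variables, productions, start, max_len):
    """Terminal strings derivable from `start` through forms of length <= max_len."""
    seen = {start}
    queue = deque([start])
    sentences = set()
    while queue:
        form = queue.popleft()
        if not any(c in variables for c in form):
            sentences.add(form)        # no variable left: a sentence of L(G)
            continue
        for alpha, beta in productions:
            pos = form.find(alpha)
            while pos != -1:           # apply alpha -> beta at each occurrence
                new = form[:pos] + beta + form[pos + len(alpha):]
                if len(new) <= max_len and new not in seen:
                    seen.add(new)
                    queue.append(new)
                pos = form.find(alpha, pos + 1)
    return sentences

found = language_up_to({"S"}, [("S", "0S1"), ("S", "01")], "S", 6)
```

For the grammar with productions S → 0S1 and S → 01, found is the set {01, 0011, 000111}. The length bound is what makes the search terminate, since only finitely many strings of bounded length exist over a finite alphabet.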
Definition 4.5 If $S \overset{*}{\underset{G}{\Rightarrow}} \alpha$, then $\alpha$ is called a sentential form. We can note
that the elements of $L(G)$ are sentential forms but not vice versa.

Definition 4.6 $G_1$ and $G_2$ are equivalent if $L(G_1) = L(G_2)$.


Remarks on Derivation
1. Any derivation involves the application of productions. When the
number of times we apply productions is one, we write $\alpha \underset{G}{\Rightarrow} \beta$; when
it is more than one, we write $\alpha \overset{*}{\underset{G}{\Rightarrow}} \beta$ (Note: $\alpha \overset{*}{\underset{G}{\Rightarrow}} \alpha$).
2. The string generated by the most recent application of production is
called the working string.
3. The derivation of a string is complete when the working string cannot
be modified. If the final string does not contain any variable, it is a
sentence in the language. If the final string contains a variable, it is a
sentential form and in this case the production generator gets 'stuck'.

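NOTATION: (i) We write $\alpha \overset{*}{\underset{G}{\Rightarrow}} \beta$ simply as $\alpha \overset{*}{\Rightarrow} \beta$ if $G$ is clear from the context.
(ii) If $A \to \alpha$ is a production where $A \in V_N$, then it is called an
A-production.
(iii) If $A \to \alpha_1$, $A \to \alpha_2$, $\ldots$, $A \to \alpha_m$ are A-productions, these
productions are written as $A \to \alpha_1 \mid \alpha_2 \mid \cdots \mid \alpha_m$.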
We give several examples of grammars and languages generated by them.

EXAMPLE 4.2
If $G = (\{S\}, \{0, 1\}, \{S \to 0S1, S \to \Lambda\}, S)$, find $L(G)$.

Solution
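As $S \to \Lambda$ is a production, $S \underset{G}{\Rightarrow} \Lambda$. So $\Lambda$ is in $L(G)$. Also, for all $n \ge 1$,
$S \underset{G}{\Rightarrow} 0S1 \underset{G}{\Rightarrow} 0^2S1^2 \underset{G}{\Rightarrow} \cdots \underset{G}{\Rightarrow} 0^nS1^n \underset{G}{\Rightarrow} 0^n1^n$
Therefore,
$0^n1^n \in L(G)$ for $n \ge 0$
(Note that in the above derivation, $S \to 0S1$ is applied at every step except
in the last step. In the last step, we apply $S \to \Lambda$.) Hence, $\{0^n1^n \mid n \ge 0\} \subseteq L(G)$.
To show that $L(G) \subseteq \{0^n1^n \mid n \ge 0\}$, we start with $w$ in $L(G)$. The
derivation of $w$ starts with $S$. If $S \to \Lambda$ is applied first, we get $\Lambda$. In this case
$w = \Lambda$. Otherwise the first production to be applied is $S \to 0S1$. At any stage
if we apply $S \to \Lambda$, we get a terminal string. Also, the terminal string is
obtained only by applying $S \to \Lambda$. Thus the derivation of $w$ is of the form
$S \overset{*}{\underset{G}{\Rightarrow}} 0^nS1^n \underset{G}{\Rightarrow} 0^n1^n$ for some $n \ge 1$
i.e. $L(G) \subseteq \{0^n1^n \mid n \ge 0\}$.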
Therefore,
$L(G) = \{0^n1^n \mid n \ge 0\}$
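The derivation used in this solution can be written out step by step. The sketch below is illustrative only; derive_0n1n is a hypothetical name, and the empty string $\Lambda$ is represented by the empty Python string.

```python
def derive_0n1n(n):
    """Return the derivation S => 0S1 => ... => 0^n S 1^n => 0^n 1^n as a list."""
    steps = ["S"]
    for _ in range(n):
        steps.append(steps[-1].replace("S", "0S1"))   # apply S -> 0S1
    steps.append(steps[-1].replace("S", ""))          # apply S -> Lambda last
    return steps

print(derive_0n1n(2))   # ['S', '0S1', '00S11', '0011']
```

The list has n + 2 entries: the start symbol, n applications of S → 0S1, and the final application of S → $\Lambda$. For n = 0 it reduces to the one-step derivation of $\Lambda$.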

EXAMPLE 4.3
If $G = (\{S\}, \{a\}, \{S \to SS\}, S)$, find the language generated by $G$.

Solution
$L(G) = \emptyset$, since the only production $S \to SS$ in $G$ has no terminal on the
right-hand side.
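Emptiness of this kind can also be tested mechanically. The sketch below uses a standard fixed-point computation of the "generating" variables of a context-free grammar (those that derive at least one terminal string). It is not a method given in the text, and it assumes every left-hand side is a single variable and every symbol a single character. The name generating_variables is hypothetical.

```python
def generating_variables(variables, productions):
    """Variables of a context-free grammar that derive some terminal string."""
    generating = set()
    changed = True
    while changed:
        changed = False
        for head, body in productions:
            # head is generating once every variable in body is generating
            if head not in generating and all(
                c in generating for c in body if c in variables
            ):
                generating.add(head)
                changed = True
    return generating

empty_case = generating_variables({"S"}, [("S", "SS")])
nonempty_case = generating_variables({"S"}, [("S", "0S1"), ("S", "")])
```

For the grammar of this example, empty_case is the empty set, so S derives no terminal string and L(G) is empty. For the grammar of Example 4.2, nonempty_case contains S because of the production S → $\Lambda$.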

EXAMPLE 4.4
Let $G = (\{S, C\}, \{a, b\}, P, S)$, where $P$ consists of $S \to aCa$, $C \to aCa \mid b$.
Find $L(G)$.

Solution
$S \Rightarrow aCa \Rightarrow aba$. So $aba \in L(G)$.
$S \Rightarrow aCa$ (by application of $S \to aCa$)
$\overset{*}{\Rightarrow} a^nCa^n$ (by application of $C \to aCa$ $(n - 1)$ times)
$\Rightarrow a^nba^n$ (by application of $C \to b$)
Hence, $a^nba^n \in L(G)$, where $n \ge 1$. Therefore,
$\{a^nba^n \mid n \ge 1\} \subseteq L(G)$
As the only S-production is $S \to aCa$, this is the first production we have
to apply in the derivation of any terminal string. If we apply $C \to b$, we get $aba$.
Otherwise we have to apply only $C \to aCa$, either once or several times. So
we get $a^nCa^n$ with a single variable $C$. To get a terminal string we have to
replace $C$ by $b$, by applying $C \to b$. So any derivation is of the form
$S \overset{*}{\Rightarrow} a^nba^n$ with $n \ge 1$
Therefore,
$L(G) \subseteq \{a^nba^n \mid n \ge 1\}$
Thus,
$L(G) = \{a^nba^n \mid n \ge 1\}$
EXERCISE Construct a grammar $G$ so that $L(G) = \{a^nba^m \mid n, m \ge 1\}$.
Remark By applying the convention regarding the notation of variables,
terminals and the start symbol, it will be clear from the context whether a
symbol denotes a variable or terminal. We can specify a grammar by its
productions alone.

EXAMPLE 4.5
If $G$ is $S \to aS \mid bS \mid a \mid b$, find $L(G)$.

Solution
We show that $L(G) = \{a, b\}^+$. As we have only two terminals $a$, $b$,
$L(G) \subseteq \{a, b\}^*$. All productions are S-productions, and so $\Lambda$ can be in $L(G)$
only when $S \to \Lambda$ is a production in the grammar $G$. Thus,
$L(G) \subseteq \{a, b\}^* - \{\Lambda\} = \{a, b\}^+$

To show $\{a, b\}^+ \subseteq L(G)$, consider any string $a_1a_2 \ldots a_n$, where each $a_i$
is either $a$ or $b$. The first production in the derivation of $a_1a_2 \ldots a_n$ is $S \to aS$
or $S \to bS$ according as $a_1 = a$ or $a_1 = b$. The subsequent productions are
obtained in a similar way. The last production is $S \to a$ or $S \to b$ according
as $a_n = a$ or $a_n = b$. So $a_1a_2 \ldots a_n \in L(G)$. Thus, we have $L(G) = \{a, b\}^+$.
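The second half of this argument is constructive: it tells us which production to apply for each symbol of the target string. A minimal sketch follows, assuming the target is a Python string over the characters a and b; the function name derive is hypothetical.

```python
def derive(w):
    """Derivation of a nonempty string w over {a, b} using S -> aS | bS | a | b."""
    if not w or any(c not in "ab" for c in w):
        raise ValueError("w must be a nonempty string over {a, b}")
    steps = ["S"]
    for c in w[:-1]:
        steps.append(steps[-1].replace("S", c + "S"))   # S -> aS or S -> bS
    steps.append(steps[-1].replace("S", w[-1]))         # S -> a or S -> b
    return steps

print(derive("abb"))   # ['S', 'aS', 'abS', 'abb']
```

The empty string is rejected, which matches the observation that $\Lambda$ is not in $L(G)$ because the grammar has no production S → $\Lambda$.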

EXERCISE If $G$ is $S \to aS \mid a$, then show that $L(G) = \{a\}^+$.


Some of the following examples illustrate the method of constructing a
grammar $G$ generating a given subset of strings over $\Sigma$. The difficult part is the
construction of productions. We try to define the given set by recursion and then
develop productions generating the strings in the given subset of $\Sigma^*$.

EXAMPLE 4.6
Let $L$ be the set of all palindromes over $\{a, b\}$. Construct a grammar $G$
generating $L$.

Solution
For constructing a grammar $G$ generating the set of all palindromes, we use
the recursive definition (given in Section 2.4) to observe the following:
(i) $\Lambda$ is a palindrome.
(ii) $a$, $b$ are palindromes.
(iii) If $x$ is a palindrome, then $axa$, $bxb$ are palindromes.
So we define $P$ as the set consisting of:
(i) $S \to \Lambda$
(ii) $S \to a$ and $S \to b$
(iii) $S \to aSa$ and $S \to bSb$
Let $G = (\{S\}, \{a, b\}, P, S)$. Then
$S \Rightarrow \Lambda$, $S \Rightarrow a$, $S \Rightarrow b$
Therefore,
$\Lambda, a, b \in L(G)$

If $x$ is a palindrome of even length, then $x = a_1a_2 \ldots a_ma_m \ldots a_1$, where
each $a_i$ is either $a$ or $b$. Then $S \overset{*}{\Rightarrow} a_1a_2 \ldots a_ma_ma_{m-1} \ldots a_1$ by applying
$S \to aSa$ or $S \to bSb$ for each $a_i$ in turn and finally $S \to \Lambda$. Thus, $x \in L(G)$.
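The same construction can be sketched in Python for palindromes of either parity. This is an illustration under stated assumptions: derive_palindrome is a hypothetical name, $\Lambda$ is represented by the empty string, and for odd length the derivation ends with S → a or S → b (production group (ii)) instead of S → $\Lambda$.

```python
def derive_palindrome(x):
    """Derivation of a palindrome x over {a, b} using S -> Lambda | a | b | aSa | bSb."""
    if x != x[::-1] or any(c not in "ab" for c in x):
        raise ValueError("x must be a palindrome over {a, b}")
    steps = ["S"]
    half = len(x) // 2
    for c in x[:half]:
        steps.append(steps[-1].replace("S", c + "S" + c))   # S -> aSa or S -> bSb
    # Even length: finish with S -> Lambda. Odd length: S -> a or S -> b.
    middle = x[half] if len(x) % 2 else ""
    steps.append(steps[-1].replace("S", middle))
    return steps

print(derive_palindrome("abba"))   # ['S', 'aSa', 'abSba', 'abba']
```

Each outer pair of matching symbols costs one application of S → aSa or S → bSb, so a palindrome of length n is derived in n // 2 + 1 steps.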
