Language Grammar and Automata
Language Grammar and Automata
1
43 If ∑ is the alphabet, ∑ * is the set of strings obtained by concatenating zero or more symbols from∑. The
44 set ∑ * always contain λ and it is called star closure. ∑+ denotes it excludes λ and it is called positive
45 closure. If it is given ∑ it is to be assumed that it is star closure and contains λ. We can now define,
46 ∑+= ∑*-{ λ}
47 A language is defined very generally as of subset of ∑*. A string in a language L will be called a sentence
48 of L. Note that ∑* is countably infinite set. Hence ∑*=∑0U∑1 U∑2……. and ∑+=∑1 U∑2U∑3…….
49 For example, if ∑={a, b}, the following are the language over ∑
50 L1= {λ, a, b}
51 L2= {ab, aabb, aaabbb …}
52 L3= {w ε∑*| number of a’s and number of b’s in w is equal}
53 In the above example L1 is finite language but L2 and L3 is infinite language.
54 Grammar:
55 In 1959, Noam Chomsky tried to give a mathematical definition for grammar. He defined four types of
56 grammar viz., type 0, type 1, type2 and type 3. At the same time, the programming language ALGOL was
57 being considered. It was considered as a block-structured language and a grammar was required which
58 could describe all syntactically correct programs. The definition given was called Backus-Naur Form
59 (BNF). This definition found to be same as the definition given by Chomsky for type2 grammar. A
60 grammar for the English language tells us whether a particular sentence is well-formed or not. A typical
61 rule for grammar is “a sentence can consist of a noun phrase followed by a predicate.”
62 Consider the following two sentences and their parse tree,
Sentence
The Dog
63
64 Fig.1.1 “The dog is running”
Sentence
a beautiful city
65
66 Fig.1.2 “Kolkata is a beautiful city”
2
67 The rules of the grammar for a sentence1 can be written as follows:
68 Sentence<Noun Phrase>< Verb Phrase>
69 Noun Phrase<Article><Noun>
70 Verb Phrase<Verb><Noun Phrase>
71 The derivation of the grammar is as follows
72 Sentence<Noun Phrase>< Verb Phrase>
73 Sentence<Article><Noun><Verb Phrase>
74 SentenceThe <Noun><Verb Phrase>
75 SentenceThe dog <Verb Phrase>
76 SentenceThe dog <Verb><Noun Phrase>
77 SentenceThe dog is running
78 The rules of the grammar for a sentence2 can be written as follows:
79 Sentence<Noun Phrase>< Verb Phrase>
80 Noun Phrase<PropNoun>
81 Verb Phrase<Verb><Noun Phrase>
82 Noun Phrase<Article><Adjective><Noun>
83 Sentence<Noun Phrase>< Verb Phrase>
84 Sentence <PropNoun>< Verb Phrase>
85 SentenceKolkata < Verb Phrase>
86 SentenceKolkata <Verb><Noun Phrase>
87 SentenceKolkata is <Noun Phrase>
88 SentenceKolkata is <Article><Adjective><Noun>
89 SentenceKolkata is a <Adjective><Noun>
90 SentenceKolkata is a beautiful <Noun>
91 SentenceKolkata is a beautiful city.
92 A grammar G is defined by the quadruple
93 G= (V, T, S, P)
94 Where V is the finite set of objects called variables,
95 T is the finite set of objects called terminal symbols,
96 SεV is a special symbol called start symbol,
97 And P is a finite set of productions.
98 The production rule is the heart of the grammar; they specify how the grammar transforms one string to
99 another. This will be denoted by “”.In this book we will assume that all production rules are of the form
100 xy and yz where x, y ε (VUT)+ and z ε (VUT)*
101 Now if we have a string w1=uxv then it can be written as w2=uyv and then w3=uzv, or it can can be
102 directly written as w1=*=>w3, * indicates that some unspecified number of steps (including zero) can be
103 taken to derive w3 from w1.
104 Definition
105 Let G= (V, T, S, P) be a grammar. Then the set
106 L(G)={wεT*: S=*=>w3}
107 Is the language generated by G.
108 If wεL(G), then the sequence
3
109 S==> w1==> w2==> w3…………..==> wn==> w is a derivation of a sentence w. The strings S, w1, w2,
110 w3, wn which contains variables as well as terminals, are called sentential forms of the derivation. If w
111 contains terminal symbol then it is called sentence.
112 Now Let us discuss what should be the production rule for the language
113 Example 1:
114 L(G)={ an bn:n≥0}
115 If we consider a variable S and the production
116 SaSb
117 Sλ.
118 Then SaSbaaSbbaaaSbbbaaaλbbbaaabbb
119 So the grammar,
120 V=S
121 T=a,b
122 S=S
123 P= SaSb
124 Sλ
125 If L(G)={ an bn:n≥1}
126 SaSb
127 Sab.
128 Then SaSbaaSbbaaaSbbbaaaabbbbaaaabbbb
129 So the grammar,
130 V={S}
131 T={a,b}
132 S=S
133 P= SaSb
134 Sab
135 Example 2:
136 Consider the language,
137 L(G)={ an bn+1:n≥0}
138 The production rule is,
139 SaSb
140 Sb
141 The production rule can be written in the following manner also,
142 SAb
143 AaAb/ λ
144 In the 1st case number of production require is 2 and in the 2nd case number of production required is 3.
145 Hence we obviously choose the 1st case.
146 Example 3:
147 L(G)={ an b2n:n≥0}
148 In this problem b will appear twice than a.
149 So the production of the grammar is
150 SaSbb
151 Sλ
152 Example 3:
4
153 L(G)={ an+2 bn:n≥0}
154 In this problem two more “a” requires than equal no of a and b.
155 So the production of the grammar is
156 SaSb
157 Saa.
158 Example 4:
159 L(G)={ an bn-3:n≥3}
160 This problem is quit similar to the previous one.
161 SaSb
162 Saaa.
163 Example 5:
164 L(G)={ an bm:n≥0, m>n}
165 In this problem three condition have to be considered.
166 a)Number of a may be “0” or more than “0” but number of b appears more than “1”
167 b)Equal number a= Equal number b.
168 c)More “b” appears than “a”
169 So the production rule is,
170 SaSb/Bb
171 Bb/ λ
172 V={S, B}
173 T= {a, b}
174 S= {S}
175 P= SaSb/Bb
176 Bb/ λ
177 Example 6:
178 Find the grammars for ∑= {a,b} that generate the sets of all strings with exactly one ”a”
179 SAaA
180 AAb/ λ
181 Example 7:
182 Find the grammars for ∑= {a,b} that generate the sets of all strings with atleast one ”a”
183 SAaA
184 AAa/Ab/ λ
185 Example 8:
186 Find the grammars for ∑= {a,b} that generate the sets of all strings with no more than three “a”s.
187 SAaA/ AaAaA/AaAaAaA
188 AAb/ λ
189 Example 9:
190 Find the grammars for ∑={a,b} that generate the sets of all strings with at least three “a”s.
191 S AaAaAaA
192 AAa/Ab/ λ
193 Example 10:
194 Find grammars for the languages on ∑= {a} and L(G)={w:|w|mod3=0}
195 SaaaS/λ
196 Find grammars for the languages on ∑= {a} and L(G)={w:|w|mod3>0}
5
197 This problem can be divided into two sub problem |w|mod3=1 and |w|mod3=2
198 |w|=1 can be represented as 1+ 0X3 |w|=2 can be represented as 2+ 0X3
199 |w|=4 can be represented as 1+ 1X3 |w|=5 can be represented as 2+ 1X3
200 |w|=7 can be represented as 1+ 2X3 |w|=8 can be represented as 2+ 2X3
201 |w|=10 can be represented as 1+ 3X3 |w|=11 can be represented as 2+ 3X3
202 …………………………….. ………………………………….
203 Hence the production rule is
204 SS1/S2
205 S1aaaS1/a
206 S2aaaS1/aa
207 Example 11:
208 Give a simple description of the language generated by the grammar with the production
209 SaA
210 AbS
211 S λ
212 Let’s make an arbitrary derivation of the grammar
213 SaAabSabaAababSababλabab
214 Hence the language of the grammar is (ab)n
215 Example 12:
216 What language does the grammar with this production generate?
217 SaA
218 AB
219 BAa
220 Let us make an arbitrary derivation of the grammar
221 SaAaBaAaaBa
222 This derivation will not stop ever. Hence this type of language is called infinite language.
223 Example 13:
224 Let L(G)={ab, aa, baa}. Which of the following strings are L*: abaabaab, ababbaa, baababaa.
225 Example 14:
226 Are the two grammars with respective productions
227 SaSb/ab/λ
228 And
229 SaAb/ab
230 AaAb/ λ
231 Equivalent or not where S is the start symbol.
232 Let us make a derivation of the two production with minimum condition.
233 For the first production
234 S λ
235 For the second production
236 Sab
237 As the two grammars producing separate terminals in the minimum condition, hence they are not
238 equivalent.
239 Example 15:
240 Find the language of the grammar with the production
6
241 SSS
242 S λ
243 SaSb
244 SbSa
245 The language is L (G) = {na(w)= nb(w)}
246 Where na(w) is the number of a in w and nb(w) is the number of b in w.
247 Example 16:
248 Find the grammar of the language
249 L(G) ={ an bn am bm:n≥0, m≥0}
250 The production rule of the grammar is
251 SSS
252 SaSb/λ
253 Automata
254 The Automaton is an abstract model of any digital computer. Any smallest unit of a machine or computer
255 having a basic job to perform is called automaton and the plural form is called automata. There are
256 different varieties of such abstract models (also called models of computation) which can be
257 defined mathematically. Some of them are as powerful in principle as today's real computers,
258 while the simpler ones are less powerful. We will study the simpler machines as they are easier
259 to introduce some formalism used in theory. The simplest computational model is called a finite state
260 machine or a finite automaton. The basic block diagram of a finite state machine is shown in fig 1.3. It
261 consists of an input file, which the automaton can read but not change. The input file is divided into cells,
262 each of which can hold one symbol. The input mechanism can read the input file from left to right, one
263 symbol at a time. The input mechanism can also sense the end of the input string. The automaton can
264 produce output of the same form. It contains a temporary storage device, consisting of unlimited number
265 of cells, each capable of holding symbol corresponding to an alphabet (it is not necessary that alphabet
266 should be stored in the storage device). The automation can read and change the contents of the storage
267 cells. Finally the automation has a control unit, which can be in any of the in any of a finite number of
268 internal states, and which can change states in a defined manner.
Input File
0 0 1 0 1 0 1 0 0 0
0
0
1
Control Unit
Storage Unit
1
0
0
0
0 0 0 1 0 1 0 0 0 0
7
271 The concept of a finite automation appears to have arise in the 1943 paper. A logical calculus of the ideas
272 immanent in nervous activity, by Warren McCullock and Walter Pitts. In 1951 Kleene introduced regular
273 expressions to describe the behavior of finite automata. In 1959, Dana Scott and Michael Rabin
274 introduced non-deterministic automata and showed the surprising theorem that they are equivalent to
275 deterministic automata.
276 Let us consider a finite state machine that takes input bits serially and produce the output “1” when the
277 last four inputs are 0101 else it produce the output “0” as shown in fig 1.3. Input file contains the input
278 0010101000 and the control unit read the input from left to right and produce the output serially
279 0000101000. Storage unit store the contents of the input file, alphabets.
280 At any given time, the control unit is in some internal state, and the input mechanism is scanning a
281 particular symbol on the input file. The internal state of the control unit at the next transition is
282 determined by the next state or transition function. The transition function gives the next state in terms of
283 the current state, the current input symbol, and the information currently in the temporary input storage.
284 Automata generally of two types accepter and transducer. When the automata produces the output in
285 simple “yes” or “no” or “0” or “1” then it is called accepter. The sequence detector we discussed earlier
286 is an example of accepter. A more general automata, capable of producing strings of symbols (number or
287 character) as output, is called a transducer. Example of a transducer is counter, shift register.
288
289
290
291
292
293
294
295