0% found this document useful (0 votes)
45 views32 pages

Lecture 2 - Chapter 1 (1.2)

Uploaded by

wondimelese2012
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
45 views32 pages

Lecture 2 - Chapter 1 (1.2)

Uploaded by

wondimelese2012
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 32

Chapter 1: Introduction

Languages, Grammars and


Automata
 Language
 Symbol
 Alphabet
 String
 Length of a string, concatenation of string, …
 Grammars
 Automata
 Deterministic Automata
 Non-deterministic Automata
 The Chomsky Hierarchy
 Before we define a language lets come clear with some
important terms.
 Symbol - an arbitrary datum which has some meaning to or
effect on the machine.
 Letters and digits are examples of frequently used
symbols.
 Alphabet (Â): is a finite set of symbols
 Example:- Â = {a, b, c, d}, Â = {0, 1}
 String ( or Word) – is a finite sequence of symbols
 Example:-W = abc, V = 123, Z = 01010110
 Length a string W, denoted by |W| , is the number of symbols
composing the string.
 Example: if W = abbab, then
 |W| = 5
 |W|b = 3 (length of a word w with respect to b )

• Empty string is denoted by ε or l, |l| = 0


 If Â= {a, b}, then ab, abb, a, b, are some of the strings over Â
 Σ* =All strings of symbols from Σ (Kleene closure)
 Σ+ = Σ* - {ε} ( Positive closure)
 Σi = Set of Strings over Σ having length i.
 Example If Σ = {a, b, c} then
 Σ* = {l, a, b,c, aa, ab, bc, abc, , …}
 Σ+ = { a, aa, abc, aabc, …}
 Σ 1= Σ
 Σ 3= {aab, abc, ccc, abb, …}
 If w is a string then w0 = l
 Note that
 Σ* =

 Σ+ =
 Given a string w = abc
 Prefix of w are l, a, ab, & abc
 Suffix of w are l, c, bc, & abc
 Proper prefix or suffix is a prefix or suffix other
than the string itself
 Proper Prefix of w are l, a, & ab
 Proper Suffix of w are l, c, & bc
 The concatenation of two strings is obtained by appending the
second string to the right of the first string
 Example: -
 Given w = a1a2..an & v = a1a2..am then concatenation of w &
v , denoted as wv, is given as wv = a1a2..ana1a2..am
 a3.a2.= aaabb
 Two strings are equal if they have the same number of symbols
and these symbols match, character for character.
 Concatenation operation is associative, and it is not
commutative
 The empty string serves as the identity element
for concatenation.
 That is, for all strings x, x.l =l.x =x
 String Reversal – Let w = a1a2…an be a string,
then the string reversal of w, reverse (w) or wR, is
anan-1…a1
 E.g. If w = abc, then reverse(w) = wR =cba
 Definition:
 A language, L, is a collection of strings over a given
alphabet with some rules known as grammars. It may or
may not be finite.
 A language may be thought of as a subset of all possible
strings (L z Σ*)

 Example Let Σ = {a, b}


 The set {a, aa, abab, ba} is a finite language over Σ
 The set L = {anbn:n > 0} is an infinite language over Σ
 Example: Let Σ = {0, 1}, then
 L = {x|x is in Σ* and x contains an even number of
0’s} = {11, 1010, 00, 0000, 1001, etc}
 Language are sets: Hence, the union, intersection, &
differences of two languages are defined.
 Complement of a language L with respect to Σ* is
L = Σ*- L
 The reverse of a language is the set of all string
reversals, LR = {wR:weL}
 The concatenation of two languages L1 & L2 is the set of all
strings obtained by concatenating any element of L1 with any
element of L2:
 i.e. L1L2 = {xy: x e L1, y e L2 }
• Example: L1 = {a, b} & L2 = {c, de}, then L1L2 = {ac, ade,
bc, bde}
• Ln is L concatenated with itself n times
 L0 = {l}, L1 = L, L2 = L1 L1 , L3 = L1 L1 L1…
 L* = L0UL1UL2… (Star Closure)
 L+ = L1UL2UL3… (Positive closure)
 Example: If L = {anbn:n > 0} then
 L0 = {l}
 L1 = L = {anbn:n > 0}
 L2 = {anbnambm :n > 0, m > 0}. note that n and m
are unrelated – the string aaabbbaabb is in L2
 L3 = {anbnambm akbk:n > 0, m > 0, k > 0}, here
again n, m and k are unrelated – (that is they be the
same or different)
 LR = {bnan:n > 0}
 Definition
 A Grammar is defined as a quadruple
G = (N, T, S, P), where
 N (sometimes denoted as V) is the set of finite set of objects called
non-terminals or variables.
 They are place holders and are denoted by capital letters.
 There should exist rules in P that tells us how to replace the non-
terminals.
 T (sometimes denoted as Σ ) is a finite set of objects called
terminals.
 Used to construct strings of a language, How?
 Are usually represented by small letters and digits
 P - stands for production rules, are hearts of
grammar
 Each production rule is in the form of
(N U T)+ f (N U T)*
 The left hand side should contain at least one non-
terminal
 A production rule XfY implies replace X by Y in
some string
 Example: wxy  wyv … using production XfY
 S e N is a start symbol
 There should be at least one production that has S at
left hand side. (S f (NUT)*).
 The derivation of a string of a language should start
with S.
 Definition:
 Let G = (N, T, S, P) be a grammar. Then the set
L(G) = { w e T* : S
*
w} is the language
generated by G.
 Example1:
 Consider the grammar G = ({S}, {a, b}, S, P) with p
S f aSb ……..(1)
S f l ………..(2)
 Find the language generated by G
 Solution
 Let the language be L(G)
 Look how strings of the language are produced and the
general formula for the language produced by the
grammar is given
S l (2) So, l e L(G)
S aSb using (1) ab e L(G)
ab using (2) So,
S aSb using (1) aabb e L(G)
aaSbb Using (1) So, or
aabb using (2) a2b2 e L(G)
S aSb using (1) aaabbb e L(G)
aaSbb Using (1) So, or
aaaSbbb Using (1) a3b3 e L(G)
aaabbb Using (2)
… …
So, L(G) = {, ab, aabb, aaabbb, aaaabbbb, etc}
L(G) = {anbn : n≥0}
 Example 2:
 Find the language generated by the following grammar G =
{N, T, S, P} where
 N = {S, B}
 T = {a, b, c}
 P : S f aBSc….(1)
S f abc….(2)
Ba f aB….(3)
Bb f bb….(4)
 Answer: L(G) = {anbncn : n e N}
S  abc using (2) So abc e L(G)
S  aBSc using (1)
aBabcc using (2) aabbcc e L(G)
aaBbcc using (3) or
aabbcc using (4) So a2b2c2 e L(G)

S  aBSc using (1)


 aBaBScc using (1) aaabbbccc e L(G)
 aBaBabccc using (2) Or
 aaBaBbc3 using (3 & 3) So a3b3c3 e L(G)
 aaaBbbc3 using (3 & 4)
 a3b3c3 using (4)
… …
 Example:
 Find a grammar that generates L(G) = {anbn+1: n ≥ N}
 Answer
 G = (N, T, S, P) where
 N = {S, A}, T = {a, b}, S = S
 P: S f Ab, S f aSb, A f l
Or
 (P: S f Ab, A f aS, A f l)
1. Find a grammar that generates
a) L(G) = {an: n ≥ 0}
b) L(G) = {(ab)n: n ≥ 0}

2. Find the language generated by the following grammar


a) G = (N, T, S, P) where
N = {A, S}
T = {a, b}
P: S  Ab
A  aA | a | b | bA
 An automata is an abstract model of a digital computer
(= it is a self operating machine).
 Every automaton includes some essential features.
 Mechanism for reading input – the input is a string
over a given alphabet written on an input file
 Produce an output in some form
 Have temporary storage device – the storage cells
can be read and changed.
 Has a control unit
 Can be in any one of a finite number of internal states
 The internal state of the control unit is determined by the next-
state or transition function
input

Storage

Control Unit

Output
BATTERY

input: switch
output: light bulb
Actions (Control Unit): flip switch
States (Storage): on, off
 A particular state of the control unit, input file, and
temporary storage is known as configuration.
 A move is the transition of the automaton from one
configuration to the next.
 One automaton differs from the other in the way in
which the output can be produced and the nature of
the temporary storage.
 An automaton can be deterministic or non-
deterministic
 A deterministic automaton is the one in which
each move is uniquely determined by the current
configuration.
 i.e. if the internal state, the input and the content of
the temporary storage, we can predict the future
behavior of the automaton.
 In non-deterministic automaton there may have
several possible moves, so we can only predict a set of
possible actors.
 An automaton whose output response is limited to a
simple “yes” or “no” is called an accepter (because it
accepts or rejects)
 A more general automaton, capable of producing
strings of symbols as output is called a transducer.
 There are four categories of formal grammars in the
Chomsky Hierarchy,
 Type 0 - the most general  difficult to parse but highly expressive
 Type 1
 Type 2
 Type 3 - the most restrictive easy to parse but less expressive

 Every type 3 grammar is a type 2 grammar, and every type 2 is


a type 1 and so on.
 Each category of grammars have corresponding category of
languages and automata
The Chomsky Hierarchy
 ….
 Type 3 grammars - Regular Grammars - is a grammar
which is either left-linear or right-linear
 Left linear – all production have the form A -> Bx, A -> x
where A & B are non-terminals and x is any string of
terminals
 Right linear – all production have the form A -> xB, A ->
x where A & B are non-terminals and x is any string of
terminals

 Type 2 grammars - Context Free Grammars


 Productions are of the form X–> v where v is an arbitrary
string of symbols in V, and X is a single non-terminal.
 Type 1 Grammars - Context-sensitive grammars
 Productions are of the form uXw –> uvw where u, v and w are
arbitrary strings of symbols in V, with v non-null, and X a single
non-terminal.
 In other words, X may be replaced by v but only when it is
surrounded by u and w. (i.e. in a particular context).
 Type 0 Grammars - Free or unrestricted grammars
 These are the most general productions which have the form u –>
v where both u and v are arbitrary strings of symbols in V, with
u non-null.
 There are no restrictions on what appears on the left or right-
hand side other than the left-hand side must be non-empty.
Grammar Language Automata

Regular Grammars - Regular Languages Finite State Automata


Type 3
Context Free Context Free Push down Automata
Grammars - Type 2 Languages
Context-sensitive Context sensitive Bounded Turing
grammars – Type 1 languages Machine
Free or unrestricted Recursively Turing Machine
grammars– Type 0 Enumerable
languages

You might also like

pFad - Phonifier reborn

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.


Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy