TOC Notes Endsem

The document provides an overview of formal language theory, detailing the Chomsky hierarchy and the types of grammars, including regular, context-free, context-sensitive, and recursively enumerable languages. It explains the properties and implications of context-free grammars (CFGs), including derivations, ambiguity, and methods to address ambiguity. Additionally, it discusses the role of pushdown automata (PDAs) in recognizing context-free languages and outlines decision problems related to these languages.

Grammar

A grammar G = (V, T, P, S) is a set of rules:
V → Variables / nonterminals
T → Terminals
P → Production rules
S → Start symbol
Type 3 → Regular languages

A → σB, where A and B are variables, and σ is a terminal.

A → ϵ, where A is a variable, and ϵ represents the empty string.
Type 2 → Context-free languages
V → (V ∪ T)*
Type 1 → Context-sensitive languages

α → β, where α, β ∈ (V ∪ T)*, |β| ≥ |α|, and α contains at least one variable.

Type 0 → Recursively enumerable languages (the recursive languages are a subset of these)

(V ∪ T)* → (V ∪ T)*, where the left-hand side contains at least one variable.
Language → a set of strings formed by well-defined rules
A grammar G represents a language L(G)
Regular grammar → represents a regular language
LRG → Left regular (left-linear) grammar
RRG → Right regular (right-linear) grammar

L(any type of RG) is a regular language
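
One way to see why a right-linear grammar generates a regular language is to read its variables as NFA states. The sketch below is a minimal illustration under that assumption; the function name `accepts` and the sample grammar `ENDS_IN_B` are hypothetical and not taken from the notes.

```python
def accepts(productions, start, word):
    """Simulate a right-linear grammar as an NFA whose states are the variables.

    productions maps a variable to a list of right-hand sides; each one is
    "" (epsilon) or a terminal followed by a variable, e.g. "aB".
    """
    current = {start}
    for symbol in word:
        # Follow every rule A -> symbol B from every variable we may be in.
        current = {rhs[1]
                   for var in current
                   for rhs in productions.get(var, [])
                   if len(rhs) == 2 and rhs[0] == symbol}
    # A variable acts as an accepting state when it has an epsilon production.
    return any("" in productions.get(var, []) for var in current)


# Hypothetical grammar for strings over {a, b} that end in b:
#   S -> aS | bS | bF,   F -> epsilon
ENDS_IN_B = {"S": ["aS", "bS", "bF"], "F": [""]}

print(accepts(ENDS_IN_B, "S", "aab"))  # True
print(accepts(ENDS_IN_B, "S", "aba"))  # False
```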

Context-free grammar

The LHS of every production contains exactly one nonterminal and no terminals.
To find the language, start from the first non-recursive rule, derive the language for it, and
then substitute from bottom to top.

All RGs are context-free; the converse is not true.


Here S does not have a terminal to end, so L = ∅ (phi)
Remember to take different
No need, as (a+b)* covers a^n b^n
R = Reverse
Context-free grammars
Sentential forms
A sentential form is a string of symbols (terminals and/or nonterminals)
derived from the start symbol of a context-free grammar (CFG) through
a sequence of replacement operations, known as derivations.

Equivalently, a sentential form is any string that can be generated by applying the
production rules of a CFG to the start symbol, without any restriction on
the number of steps or the order of application.

LMD and RMD

Formal Definitions

Leftmost Derivation (LMD): A derivation in a CFG is a leftmost derivation if, at each
step, a production rule is applied to the leftmost variable in the current sentential form.

Rightmost Derivation (RMD): Similarly, a derivation is a rightmost derivation if, at each
step, a production rule is applied to the rightmost variable in the current sentential form.
Example
Consider the CFG G with the following productions (similar to the grammar in Example
4.2), numbered 1 to 4 from left to right:

S → a | S + S | S ∗ S | (S)

Suppose we want to derive the string a + ( a * a ).


Leftmost Derivation (LMD):
S (Start symbol)
S + S (Applying production 2 to the leftmost S)
a + S (Applying production 1 to the leftmost S)
a + (S) (Applying production 4 to the leftmost S)
a + (S * S) (Applying production 3 to the leftmost S)
a + (a * S) (Applying production 1 to the leftmost S)
a + (a * a) (Applying production 1 to the leftmost S)
In this LMD, at each step, we choose the leftmost S and replace it with the right-hand side of
a production rule.
Rightmost Derivation (RMD):
S (Start symbol)
S + S (Applying production 2 to the rightmost S)
S + (S) (Applying production 4 to the rightmost S)
S + (S * S) (Applying production 3 to the rightmost S)
S + (S * a) (Applying production 1 to the rightmost S)
S + (a * a) (Applying production 1 to the rightmost S)
a + (a * a) (Applying production 1 to the rightmost S)
In the RMD, we always select the rightmost S for replacement.
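
The two derivations above can be replayed mechanically. The following is a minimal sketch, assuming the production numbering 1 to 4 used in the steps above and writing strings without spaces; the helper name `derive` is hypothetical.

```python
# Productions of S, numbered as in the example: 1: a, 2: S+S, 3: S*S, 4: (S)
PRODUCTIONS = {1: "a", 2: "S+S", 3: "S*S", 4: "(S)"}


def derive(choices, leftmost=True):
    """Apply the numbered productions to the leftmost (or rightmost) S in turn."""
    form = "S"
    steps = [form]
    for number in choices:
        # Pick the position of the variable to replace.
        pos = form.find("S") if leftmost else form.rfind("S")
        form = form[:pos] + PRODUCTIONS[number] + form[pos + 1:]
        steps.append(form)
    return steps


lmd = derive([2, 1, 4, 3, 1, 1], leftmost=True)
rmd = derive([2, 4, 3, 1, 1, 1], leftmost=False)
print(" => ".join(lmd))
print(" => ".join(rmd))
```

Both runs end in the same string, a+(a*a), but pass through different sentential forms, which is exactly the difference between an LMD and an RMD.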
An ambiguous grammar is a CFG where at least one string in the language it generates has
more than one derivation tree, or equivalently, more than one leftmost derivation (LMD)
or rightmost derivation (RMD) (Definition 4.18 and Theorem 4.17).
Implications of Ambiguity:
● Multiple Interpretations: Ambiguity can lead to multiple interpretations of a string,
which can be problematic in applications like programming language parsing, where each
string should have a unique and unambiguous meaning.
● Parsing Challenges: Ambiguous grammars can make parsing more complex, as the parser
needs to consider multiple possible derivation trees for a given input string.
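
To make the ambiguity concrete, the sketch below counts derivation trees for the example grammar S → a | S + S | S ∗ S | (S). It is an illustration only; the function name `count_trees` is hypothetical, and input strings are assumed to contain no spaces.

```python
from functools import lru_cache


def count_trees(word):
    """Count derivation trees of `word` from S in S -> a | S+S | S*S | (S)."""

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of distinct trees that derive word[i:j] from S.
        total = 0
        if word[i:j] == "a":                       # S -> a
            total += 1
        if j - i >= 3 and word[i] == "(" and word[j - 1] == ")":
            total += count(i + 1, j - 1)           # S -> (S)
        for k in range(i + 1, j - 1):              # k = position of the root operator
            if word[k] in "+*":                    # S -> S+S or S -> S*S
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(word))


print(count_trees("a+a*a"))    # 2
print(count_trees("a+(a*a)"))  # 1
```

The string a+a*a has two trees (one with + at the root, one with * at the root), so the grammar is ambiguous, while the fully parenthesized a+(a*a) has only one.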

A context-free grammar is considered unambiguous if every string in the language it
generates has a unique derivation tree (equivalently, a single LMD and a single RMD).
Advantages of Unambiguity:
● Clear Interpretation: Unambiguous grammars provide a single, well-defined
interpretation for each string in the language.
● Simplified Parsing: Parsing becomes simpler with unambiguous grammars, as there is only
one correct derivation tree to consider.

Addressing Ambiguity
There are several approaches to addressing ambiguity in CFGs:

Rewriting the Grammar: The most desirable approach is often to rewrite the grammar to
eliminate ambiguity. This may involve introducing new variables and productions to enforce
the desired interpretation.

Precedence and Associativity Rules: In grammars for expressions, ambiguity can be
resolved by explicitly specifying precedence and associativity rules for operators. This
approach is illustrated in the unambiguous grammar for Expr.

Parser-Based Disambiguation: If rewriting the grammar is not feasible, the parser can be
designed to handle the ambiguity by applying specific disambiguation rules. However, this
approach can be complex and less desirable than having an unambiguous grammar.
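
As an illustration of rewriting with precedence and associativity, the sketch below parses with the standard layered grammar E → E + T | T, T → T ∗ F | F, F → (E) | a. This grammar is an assumption here and may differ in detail from the Expr grammar the notes refer to; the function name `parse` is hypothetical. The left-recursive rules are implemented as loops, which keeps + and ∗ left-associative.

```python
def parse(text):
    """Parse with E -> E+T | T,  T -> T*F | F,  F -> (E) | a; return a nested tuple."""
    tokens = list(text.replace(" ", ""))
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def eat(expected):
        nonlocal pos
        if peek() != expected:
            raise SyntaxError(f"expected {expected!r} at position {pos}")
        pos += 1

    def factor():                      # F -> (E) | a
        if peek() == "(":
            eat("(")
            tree = expr()
            eat(")")
            return tree
        eat("a")
        return "a"

    def term():                        # T -> T*F | F
        tree = factor()
        while peek() == "*":
            eat("*")
            tree = ("*", tree, factor())
        return tree

    def expr():                        # E -> E+T | T
        tree = term()
        while peek() == "+":
            eat("+")
            tree = ("+", tree, term())
        return tree

    tree = expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected {peek()!r} at position {pos}")
    return tree


print(parse("a + a * a"))  # ('+', 'a', ('*', 'a', 'a'))
print(parse("a + a + a"))  # ('+', ('+', 'a', 'a'), 'a')
```

Each input now has exactly one tree: ∗ binds tighter than +, and repeated operators group to the left.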
Definition

● Context-Free Languages (CFLs) are generated by Context-Free Grammars (CFGs).
● A CFG consists of:
o Variables (Nonterminals): Symbols replaced by other variables or terminals.
o Terminals: Symbols forming the final string.
o Start Symbol: A designated variable to start derivations.
o Production Rules: Define how variables can be replaced.
● The CFL generated by a CFG G is formally denoted L(G).

Properties

1. Non-Regular Languages:
o CFLs are strictly more powerful than regular languages: every regular language is
context-free, but some CFLs, such as {a^n b^n : n ≥ 0}, are not regular.

2. Closure Properties:
o CFLs are closed under the following operations:
▪ Union: The union of two CFLs is a CFL.
▪ Concatenation: Concatenation of two CFLs results in a CFL.
▪ Kleene Star: Repeating a CFL zero or more times gives a CFL.
3. Non-Closure Properties:
o CFLs are not closed under:
▪ Intersection: The intersection of two CFLs may not be a CFL.
▪ Complement: Complementing a CFL may not result in a CFL.
▪ Difference: Subtracting one CFL from another may not yield a CFL.

Key Theorems

1. Pumping Lemma for CFLs:
o A tool to prove that certain languages are not context-free.
o For any CFL there is a constant p such that every string z in the language with
|z| ≥ p can be split as z = uvwxy, with |vwx| ≤ p and |vx| ≥ 1, so that
u v^i w x^i y remains in the language for every i ≥ 0 (the sections v and x are “pumped”).
o If a language violates the pumping lemma, it is not a CFL.
2. Ogden's Lemma:
o A generalization of the pumping lemma with more flexibility.
o Used to prove that a language is not context-free by marking positions in a
string.
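
The pumping argument can be explored by brute force for a small, fixed pumping length. The sketch below tries every split z = uvwxy with |vwx| ≤ p and |vx| ≥ 1 and tests only i = 0 and i = 2, so a False result shows that z cannot be pumped for that p, while a True result is not a proof of anything. All names (`anbn`, `anbncn`, `can_be_pumped`) are hypothetical, and checking one value of p does not replace the general proof.

```python
def anbn(s):
    """Membership in {a^n b^n : n >= 0}."""
    n = len(s) // 2
    return s == "a" * n + "b" * n


def anbncn(s):
    """Membership in {a^n b^n c^n : n >= 0}."""
    n = len(s) // 3
    return s == "a" * n + "b" * n + "c" * n


def can_be_pumped(z, p, member):
    """True if some split z = uvwxy with |vwx| <= p and |vx| >= 1 keeps
    u v^i w x^i y in the language for i = 0 and i = 2."""
    for start in range(len(z)):
        for end in range(start + 1, min(start + p, len(z)) + 1):  # vwx = z[start:end]
            for v_end in range(start, end + 1):
                for x_start in range(v_end, end + 1):
                    u, v, w = z[:start], z[start:v_end], z[v_end:x_start]
                    x, y = z[x_start:end], z[end:]
                    if not (v or x):
                        continue  # |vx| >= 1 is required
                    if all(member(u + v * i + w + x * i + y) for i in (0, 2)):
                        return True
    return False


p = 3
print(can_be_pumped("a" * p + "b" * p, p, anbn))              # True
print(can_be_pumped("a" * p + "b" * p + "c" * p, p, anbncn))  # False
```

For a^p b^p c^p the window vwx is too short to touch both the a's and the c's, so pumping always unbalances the counts; this is the usual reason {a^n b^n c^n} is not context-free.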
Decision Problems

1. Solvable (Decidable) Problems for CFLs:
o Membership Problem: Determine if a string belongs to a CFL.
o Emptiness Problem: Check if a CFL generates no strings.
2. Undecidable Problems for CFLs:
o Nonempty Intersection: Determine if the intersection of two CFLs is nonempty.
o Generates All Strings: Check if a CFG generates all possible strings over its
alphabet.
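
The membership problem is commonly decided with the CYK algorithm, which is not named in the notes and is swapped in here as one standard technique. The sketch assumes the grammar is already in Chomsky normal form; the function name `cyk`, the rule encoding, and the sample grammar for {a^n b^n : n ≥ 1} are all hypothetical.

```python
def cyk(word, start, unit_rules, pair_rules):
    """CYK membership test for a grammar in Chomsky normal form.

    unit_rules: terminal -> set of variables A with a rule A -> terminal
    pair_rules: (B, C)   -> set of variables A with a rule A -> BC
    """
    n = len(word)
    if n == 0:
        return False  # the empty string needs separate handling in CNF
    # table[i][l] holds the variables that derive word[i:i+l]
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, ch in enumerate(word):
        table[i][1] = set(unit_rules.get(ch, ()))
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):
                for b in table[i][split]:
                    for c in table[i + split][length - split]:
                        table[i][length] |= pair_rules.get((b, c), set())
    return start in table[0][n]


# Hypothetical CNF grammar for {a^n b^n : n >= 1}:
#   S -> AB | AT,  T -> SB,  A -> a,  B -> b
UNIT = {"a": {"A"}, "b": {"B"}}
PAIR = {("A", "B"): {"S"}, ("A", "T"): {"S"}, ("S", "B"): {"T"}}

print(cyk("aabb", "S", UNIT, PAIR))  # True
print(cyk("abab", "S", UNIT, PAIR))  # False
```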

Context-Free Grammars (CFGs)

● Definition: CFGs are used to formally define CFLs.
● Derivation:
o Process of generating strings by repeatedly applying production rules.
o Types:
▪ Leftmost Derivation (LMD): Always replace the leftmost variable
first.
▪ Rightmost Derivation (RMD): Always replace the rightmost variable
first.
● Ambiguity in CFGs:
o A CFG is ambiguous if some string has more than one derivation tree
(equivalently, more than one LMD or RMD).
o Problems of Ambiguity:
▪ Leads to multiple interpretations of strings.
▪ Causes issues in applications like programming languages.
o Solutions: Resolve ambiguity by:
▪ Rewriting the grammar.
▪ Applying precedence/associativity rules.
▪ Using parser-based disambiguation techniques.

Pushdown Automata (PDAs)

● Definition: PDAs are machines used to recognize CFLs.
● How it Works: Extends finite automata with a stack to handle nested structures.
● Types of PDAs:
o Deterministic PDA (DPDA): Recognizes a subset of CFLs.
o Nondeterministic PDA (NPDA): Recognizes all CFLs.
● Note: Not all CFLs are recognized by DPDAs.
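
The role of the stack in handling nested structures can be illustrated with well-nested brackets, a language a deterministic stack machine can recognize. This is a minimal sketch, not a formal DPDA definition; the names `PAIRS` and `balanced` are hypothetical.

```python
# Closing bracket -> matching opening bracket
PAIRS = {")": "(", "]": "["}


def balanced(text):
    """Single-state deterministic stack machine for well-nested brackets."""
    stack = []
    for ch in text:
        if ch in PAIRS.values():
            stack.append(ch)               # push on an opening bracket
        elif ch in PAIRS:
            if not stack or stack.pop() != PAIRS[ch]:
                return False               # unmatched or mismatched closer
        else:
            return False                   # symbol outside the alphabet
    return not stack                       # accept only if nothing is left open


print(balanced("([])()"))  # True
print(balanced("(]"))      # False
```

A finite automaton alone cannot do this, because the nesting depth it would have to remember is unbounded.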

Applications of CFLs
1. Programming Languages: CFGs define the syntax of programming languages.
2. Compiler Design: Parsing (breaking code into components) relies on CFGs and
PDAs.
3. Natural Language Processing (NLP): Models syntactic structures of languages.
4. Formal Verification: Specifies and verifies system behavior using CFLs.

The Chomsky Hierarchy

● Context-Free Languages (CFLs) belong to Type 2 in the Chomsky Hierarchy,
which categorizes languages by their complexity:
o Type 3 (Regular Languages): Recognized by finite automata.
o Type 2 (Context-Free Languages): Recognized by PDAs.
o Type 1 (Context-Sensitive Languages): Recognized by linear bounded
automata.
o Type 0 (Recursively Enumerable Languages): Recognized by Turing
machines.

Simplification
Remove null productions: A → ϵ
Remove unit productions:
A → B, where both sides are a single nonterminal

Remove useless symbols (variables that derive no terminal string or cannot be reached from S)
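
The first simplification step, removing null productions, can be sketched as follows: find the nullable variables, then add every variant of each rule with nullable symbols dropped. The rule encoding (tuples of symbols) and the function names are hypothetical. If the start symbol itself is nullable, this sketch loses ϵ from the language, which the usual construction handles separately.

```python
from itertools import product


def nullable_variables(rules):
    """Variables that can derive the empty string."""
    nullable, changed = set(), True
    while changed:
        changed = False
        for var, bodies in rules.items():
            if var not in nullable and any(
                all(sym in nullable for sym in body) for body in bodies
            ):
                nullable.add(var)
                changed = True
    return nullable


def remove_null_productions(rules):
    """Return equivalent rules (apart from epsilon) with no A -> epsilon."""
    nullable = nullable_variables(rules)
    new_rules = {}
    for var, bodies in rules.items():
        out = set()
        for body in bodies:
            # Each nullable symbol may be kept or dropped; others are always kept.
            options = [((s,), ()) if s in nullable else ((s,),) for s in body]
            for choice in product(*options):
                new_body = tuple(sym for part in choice for sym in part)
                if new_body:               # never emit an empty body
                    out.add(new_body)
        new_rules[var] = out
    return new_rules


# Hypothetical grammar: S -> AbB,  A -> a | epsilon,  B -> b | epsilon
RULES = {"S": [("A", "b", "B")], "A": [("a",), ()], "B": [("b",), ()]}
print(remove_null_productions(RULES)["S"])
```

For this grammar the result is S → AbB | bB | Ab | b, A → a, B → b.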

Normalization of CFG
The first step is to simplify the grammar using the steps shown above.
Then you can convert it to CNF/GNF.
CNF (Chomsky normal form)
V → AB | a
If a CFG is in CNF, deriving a word of length n ≥ 1 takes exactly 2n-1
steps (n-1 applications of rules V → AB and n applications of rules V → a).
Greibach normal form

A → aα, where α ∈ V* and a is a single terminal

If a CFG is in GNF, deriving a word of length n ≥ 1 takes exactly n
steps, since each step produces exactly one terminal.
Closure properties:
Concatenation
Union
Kleene star
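
The three closure properties follow from simple grammar constructions, sketched below. The sketch assumes the two grammars use disjoint variable names and that neither uses the new start symbol "S"; the dictionary encoding and all function names are hypothetical. The enumerator `strings_up_to` is a naive helper for small examples only and is not guaranteed to terminate on every grammar.

```python
from collections import deque


def union(g1, g2):
    """New start rule S -> S1 | S2."""
    rules = {**g1["rules"], **g2["rules"], "S": [[g1["start"]], [g2["start"]]]}
    return {"start": "S", "rules": rules}


def concat(g1, g2):
    """New start rule S -> S1 S2."""
    rules = {**g1["rules"], **g2["rules"], "S": [[g1["start"], g2["start"]]]}
    return {"start": "S", "rules": rules}


def star(g):
    """New start rules S -> S1 S | epsilon."""
    rules = {**g["rules"], "S": [[g["start"], "S"], []]}
    return {"start": "S", "rules": rules}


def strings_up_to(grammar, max_len):
    """Naively enumerate generated strings with at most max_len terminals."""
    rules, found = grammar["rules"], set()
    first = (grammar["start"],)
    queue, seen = deque([first]), {first}
    while queue:
        form = queue.popleft()
        idx = next((i for i, s in enumerate(form) if s in rules), None)
        if idx is None:                     # no variables left: a terminal string
            found.add("".join(form))
            continue
        for rhs in rules[form[idx]]:        # expand the leftmost variable
            new = form[:idx] + tuple(rhs) + form[idx + 1:]
            terminals = sum(1 for s in new if s not in rules)
            if terminals <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return found


# Hypothetical grammars: A -> aAb | epsilon  and  C -> cC | epsilon
AB = {"start": "A", "rules": {"A": [["a", "A", "b"], []]}}
CS = {"start": "C", "rules": {"C": [["c", "C"], []]}}

print(sorted(strings_up_to(union(AB, CS), 3)))
print(sorted(strings_up_to(concat(AB, CS), 3)))
print(sorted(strings_up_to(star(AB), 4)))
```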
Every CFG can be converted to an equivalent PDA.
If no PDA can be constructed for a language, then the language is not context-free.
Acceptance by Final State
Definition: In acceptance by final state, a PDA M accepts a string x if, after
processing the entire input x, the PDA ends in a state that belongs to the
designated set of accepting states A.
Key points:

Focus on the final state: The primary criterion for acceptance is whether the
PDA is in an accepting state at the end of the input. The content of the stack at
this point is not considered in determining acceptance.

Default method: Unless explicitly stated otherwise, acceptance by PDA is
generally assumed to be by final state. This was the case in the formal definition
of PDA acceptance (Definition 5.2) and in the early examples of PDAs in the
sources.

Mirrors finite automata: This method of acceptance is analogous to how finite
automata (FAs) work. FAs accept a string if they end in an accepting state after
reading the entire input.
Acceptance by Empty Stack
Definition: In acceptance by empty stack, a PDA M accepts a string x if, after
processing the entire input x, the PDA's stack becomes empty. The final state of
the PDA is irrelevant in this case.
Key points:

Focus on the stack: The sole criterion for acceptance is whether the stack is
empty at the end of the input. The state of the PDA at this point does not play a
role in determining acceptance.

Explicitly defined: When a PDA is defined to accept by empty stack, it needs to
be explicitly stated (Definition 5.27).

Alternative but equivalent: While less commonly used as the default,
acceptance by empty stack is a valid and powerful method. Theorem 5.28
assures us that for any PDA accepting a language by final state, there exists
another PDA that accepts the same language by empty stack. This signifies that
both methods are equivalent in terms of the languages they can recognize.
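
The two acceptance modes can be compared on one small machine. The sketch below simulates a nondeterministic PDA by exploring configurations (state, input position, stack); it is a minimal illustration that may not terminate for PDAs whose ϵ-moves grow the stack without bound. The transition encoding, the function name `pda_accepts`, and the sample machine `ANBN` for {a^n b^n : n ≥ 0} are hypothetical.

```python
def pda_accepts(delta, word, mode="final", start="q0", bottom="Z", finals=("q2",)):
    """Simulate a nondeterministic PDA.

    delta: (state, input symbol or "", stack top) -> set of (new state, push string).
    The push string replaces the popped top; its first character is the new top.
    mode:  "final" = accept by final state, "empty" = accept by empty stack.
    """
    frontier = [(start, 0, bottom)]
    seen = set(frontier)
    while frontier:
        state, pos, stack = frontier.pop()
        if pos == len(word):               # the whole input has been read
            if mode == "final" and state in finals:
                return True
            if mode == "empty" and stack == "":
                return True
        if stack == "":
            continue                       # no move is possible without a stack top
        top, rest = stack[0], stack[1:]
        moves = [(pos, m) for m in delta.get((state, "", top), ())]
        if pos < len(word):
            moves += [(pos + 1, m) for m in delta.get((state, word[pos], top), ())]
        for new_pos, (new_state, push) in moves:
            config = (new_state, new_pos, push + rest)
            if config not in seen:
                seen.add(config)
                frontier.append(config)
    return False


# Hypothetical PDA for {a^n b^n : n >= 0}: push A per a, pop A per b,
# then pop the bottom marker Z and move to the accepting state q2.
ANBN = {
    ("q0", "a", "Z"): {("q0", "AZ")},
    ("q0", "a", "A"): {("q0", "AA")},
    ("q0", "b", "A"): {("q1", "")},
    ("q1", "b", "A"): {("q1", "")},
    ("q0", "", "Z"): {("q2", "")},
    ("q1", "", "Z"): {("q2", "")},
}

for w in ["", "aabb", "aab"]:
    print(repr(w), pda_accepts(ANBN, w, "final"), pda_accepts(ANBN, w, "empty"))
```

This particular machine empties its stack exactly when it enters q2, so both modes accept the same strings; in general the two modes need different machines for the same language.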
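To check finiteness of a CFG: convert it into CNF (with useless symbols removed) and draw the
graph of its variables, with an edge from A to B whenever B appears on the RHS of a rule for A.
If the graph contains a cycle, the language is infinite; otherwise it is finite.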
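
The cycle test can be sketched with a depth-first search over the variable graph. The sketch assumes a CNF grammar from which useless symbols have already been removed, and only the rules of the form A → BC matter for the graph; the encoding and the function name `is_infinite` are hypothetical.

```python
def is_infinite(pair_rules):
    """pair_rules: variable -> list of (B, C) bodies of its CNF rules A -> BC.

    Returns True when the variable graph has a cycle.
    """
    graph = {a: {s for body in bodies for s in body}
             for a, bodies in pair_rules.items()}
    WHITE, GREY, BLACK = 0, 1, 2           # unvisited, on current path, finished
    colour = {}

    def has_cycle(node):
        colour[node] = GREY
        for nxt in graph.get(node, ()):
            state = colour.get(nxt, WHITE)
            if state == GREY or (state == WHITE and has_cycle(nxt)):
                return True                # reached a variable on the current path
        colour[node] = BLACK
        return False

    return any(colour.get(v, WHITE) == WHITE and has_cycle(v) for v in graph)


# Hypothetical grammars (terminal rules A -> a, B -> b add no edges):
INFINITE = {"S": [("A", "T"), ("A", "B")], "T": [("S", "B")], "A": [], "B": []}
FINITE = {"S": [("A", "B")], "A": [], "B": []}

print(is_infinite(INFINITE))  # True  (S -> T -> S)
print(is_infinite(FINITE))    # False
```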

For CFGs, equality and ambiguity are undecidable (no algorithm exists).
Emptiness, non-emptiness, finiteness, infiniteness, and membership are decidable.
Church-Turing thesis
Any algorithmic procedure that can be carried out by a human can be carried out
by a Turing machine.
https://www.tutorialspoint.com/design-a-tm-that-perform-right-shift-over-0-1

Post correspondence problem


Variants of Turing machines
1) NTM
a. It is the nondeterministic Turing machine.
b. For a particular state and input symbol, the transition
can have many choices.
c. Same as the standard TM, except that the transition function returns a set of possible moves.

2) Multitape Turing machine
a. The machine can have many tapes.
b. The transition function gives the tape symbol to write and the direction to move for
all n tapes.

https://youtu.be/oCBi3g0N358?si=-eBQbPb27VO_dzeE
Recursive and recursively enumerable languages
Two's complement TM
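
The notes do not show the machine itself, so the following is only one possible design, sketched under stated assumptions: a single-tape deterministic TM that scans to the right end, copies bits unchanged while moving left up to and including the first 1, then flips every remaining bit. The state names, the blank symbol "_", and the helper `run` are all hypothetical.

```python
BLANK = "_"

# (state, symbol read) -> (next state, symbol to write, head move)
DELTA = {
    ("seek_end", "0"): ("seek_end", "0", +1),
    ("seek_end", "1"): ("seek_end", "1", +1),
    ("seek_end", BLANK): ("copy", BLANK, -1),
    ("copy", "0"): ("copy", "0", -1),      # trailing zeros stay unchanged
    ("copy", "1"): ("flip", "1", -1),      # the first 1 from the right stays unchanged
    ("copy", BLANK): ("halt", BLANK, +1),
    ("flip", "0"): ("flip", "1", -1),      # every bit to the left is inverted
    ("flip", "1"): ("flip", "0", -1),
    ("flip", BLANK): ("halt", BLANK, +1),
}


def run(tape_input, delta=DELTA, start="seek_end", halt="halt", max_steps=10_000):
    """Run the machine on tape_input and return the non-blank tape contents."""
    tape = dict(enumerate(tape_input))     # sparse tape: position -> symbol
    state, head = start, 0
    for _ in range(max_steps):
        if state == halt:
            break
        state, write, move = delta[(state, tape.get(head, BLANK))]
        tape[head] = write
        head += move
    return "".join(tape[i] for i in sorted(tape)).strip(BLANK)


print(run("0110"))  # 1010
print(run("0000"))  # 0000
```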
RL → all decidable
CFL → decidable: membership, finiteness, emptiness
CFL → undecidable: equality, non-empty intersection, whether L is regular
