Languages, Grammar and Recognizers
GROUP 1
YUSSUF MARIAM AGBEKE
ANIFOWESHE CLINTON
OGUNTIMOJU SARAH
ADEGBOYEGA OLAYINKA
Introduction to Formal Languages
Definition: A formal language is a precisely defined set of strings over an alphabet.
Key Difference: Unlike natural languages (like English), which can be ambiguous, formal
languages follow strict mathematical rules that eliminate ambiguity.
KEY COMPONENTS
•Alphabet (Σ): Finite set of symbols
EXAMPLE:
E → E + T | T
T → T * F | F
F → (E) | id
Derivations and Parse Tree
Derivation: A sequence of rule applications that transforms the start symbol into a string of terminals
Types:
Leftmost derivation: Always replace leftmost non-terminal first
Rightmost derivation: Always replace rightmost non-terminal first
Parse Tree: A graphical representation of the derivation of an input string, showing each
grammar symbol used in the derivation
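The leftmost-derivation rule can be made concrete with a short Python sketch. It replays a leftmost derivation of "id + id * id" under the example grammar (E → E + T | T, T → T * F | F, F → (E) | id). The function name `leftmost_derive` and the `STEPS` list are illustrative assumptions, not part of the slides.

```python
# Non-terminals of the example grammar; everything else is a terminal.
NONTERMINALS = {"E", "T", "F"}

def leftmost_derive(start, steps):
    """Apply each (lhs, rhs) production to the leftmost non-terminal.

    Returns the list of sentential forms, starting with the start symbol.
    """
    form = [start]
    history = [" ".join(form)]
    for lhs, rhs in steps:
        # Locate the leftmost non-terminal in the current sentential form.
        idx = next(i for i, sym in enumerate(form) if sym in NONTERMINALS)
        if form[idx] != lhs:
            raise ValueError(f"leftmost non-terminal is {form[idx]}, not {lhs}")
        form[idx:idx + 1] = rhs
        history.append(" ".join(form))
    return history

# One leftmost derivation of "id + id * id" (hand-picked for illustration).
STEPS = [
    ("E", ["E", "+", "T"]),
    ("E", ["T"]),
    ("T", ["F"]),
    ("F", ["id"]),
    ("T", ["T", "*", "F"]),
    ("T", ["F"]),
    ("F", ["id"]),
    ("F", ["id"]),
]

if __name__ == "__main__":
    for line in leftmost_derive("E", STEPS):
        print(line)
```

Each printed line is one sentential form; reading the steps top to bottom also gives the parse tree, since every step expands one tree node.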
The Chomsky Hierarchy
Type 0 (Unrestricted):
•Rules: α → β (no restrictions)
•Recognizer: Turing Machine
•Example: Valid programs that halt
Type 1 (Context-Sensitive):
•Rules: αAβ → αγβ (|γ| ≥ 1)
•Recognizer: Linear Bounded Automaton
•Example: a^n b^n c^n
Type 2 (Context-Free):
•Rules: A → γ (A is a single non-terminal)
•Recognizer: Pushdown Automaton
•Example: Balanced parentheses, a^n b^n
Type 3 (Regular):
•Rules: A → aB or A → a
•Recognizer: Finite Automaton
•Example: (ab), ab*
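To see why a^n b^n c^n sits above the regular and context-free levels, it helps to split a membership test into its two parts. The sketch below is only an illustration in Python (the function name `in_anbncn` and the choice n ≥ 1 are assumptions): the regular expression checks the a...b...c shape, and the length comparison is the step that a finite automaton or a single stack cannot perform.

```python
import re

def in_anbncn(s: str) -> bool:
    """Return True if s is a^n b^n c^n for some n >= 1 (assumed convention)."""
    # Shape check: a's, then b's, then c's. This part alone is a regular language.
    m = re.fullmatch(r"(a+)(b+)(c+)", s)
    if m is None:
        return False
    # Three-way length comparison: the part that goes beyond Type 3 and Type 2.
    return len(m.group(1)) == len(m.group(2)) == len(m.group(3))
```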
Ambiguity in Grammars
Definition: A grammar is ambiguous if there exists at least one string with more than one
leftmost derivation (equivalently, more than one parse tree).
Example: E → E + E | E * E | (E) | id
String "a + b * c" can be parsed two different ways
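The two parses can be counted mechanically. The following Python sketch counts the parse trees of a token list under E → E + E | E * E | (E) | id, treating any token that is not an operator or parenthesis as an id. The function name `count_parse_trees` is an assumption made for illustration.

```python
from functools import lru_cache

def count_parse_trees(tokens):
    """Count parse trees of tokens under E -> E+E | E*E | (E) | id."""
    toks = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of distinct ways to derive toks[i:j] from E.
        n = 0
        if j - i == 1 and toks[i] not in {"+", "*", "(", ")"}:
            n += 1                        # E -> id
        if j - i >= 3 and toks[i] == "(" and toks[j - 1] == ")":
            n += count(i + 1, j - 1)      # E -> (E)
        for k in range(i + 1, j - 1):
            if toks[k] in {"+", "*"}:     # E -> E+E or E -> E*E, split at k
                n += count(i, k) * count(k + 1, j)
        return n

    return count(0, len(toks))

if __name__ == "__main__":
    print(count_parse_trees("a + b * c".split()))
```

A count greater than 1 for any string is enough to show the grammar is ambiguous; parenthesizing the input brings the count back to 1.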
Resolving Ambiguity:
Regular Languages
Definition: A regular language is a formal language that can be recognized by a finite automaton or defined by a regular expression
Properties:
Closed under union, intersection, concatenation, complement, Kleene star
Cannot count or match balanced structures
Cannot recognize nested structures
Regular Expressions:
- Basic elements: ε (empty string), a (terminal symbols)
- Operations: Concatenation (rs), Alternation (r|s), Kleene Star (r*), Grouping ((r)), Optional (r?), One or more (r+)
Context-Free Languages
Properties:
Closed under union, concatenation, Kleene star (NOT intersection or complement)
Can recognize balanced structures
Examples of Context-Free Languages:
- Balanced parentheses: S → (S)S | ε
Applications:
XML/HTML validation
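As a rough illustration of the balanced-parentheses language listed above, the Python sketch below tracks nesting depth with a counter. With a single bracket type the counter plays the role of the stack. The function name `is_balanced` is an assumption, and any character other than a parenthesis is rejected.

```python
def is_balanced(s: str) -> bool:
    """Return True if s consists only of properly nested '(' and ')'."""
    depth = 0
    for ch in s:
        if ch == "(":
            depth += 1            # like pushing one symbol
        elif ch == ")":
            depth -= 1            # like popping one symbol
            if depth < 0:
                return False      # a ')' with no matching '('
        else:
            return False          # symbol outside the alphabet { (, ) }
    return depth == 0             # everything opened was closed
```

The depth is unbounded, which is why no finite automaton can do this job: it would need one state per possible depth.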
Finite Automata
Components:
States
Input alphabet
Transition function
Start state
Accepting states
Applications:
Lexical analysis
Pattern matching
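The five components listed above map directly onto a small data structure. The sketch below is one possible Python encoding (the class name `DFA` and the instance `AB_STAR` are assumptions) for a deterministic automaton that accepts ab*, with missing transitions treated as an implicit dead state.

```python
from dataclasses import dataclass

@dataclass
class DFA:
    states: frozenset
    alphabet: frozenset
    delta: dict           # transition function: (state, symbol) -> state
    start: str
    accepting: frozenset

    def accepts(self, word: str) -> bool:
        state = self.start
        for sym in word:
            if (state, sym) not in self.delta:
                return False          # no transition: implicit dead state
            state = self.delta[(state, sym)]
        return state in self.accepting

# Automaton for ab*: one 'a' followed by any number of 'b's.
AB_STAR = DFA(
    states=frozenset({"q0", "q1"}),
    alphabet=frozenset({"a", "b"}),
    delta={("q0", "a"): "q1", ("q1", "b"): "q1"},
    start="q0",
    accepting=frozenset({"q1"}),
)
```

A lexical analyzer works on the same principle, with one such automaton (or a combined one) per token class.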
Pushdown Automata
Definition: A pushdown automaton (PDA) is a finite automaton extended with a stack
memory, which enables it to recognize context-free languages. PDAs are used in computation
theory and are strictly more powerful than finite-state machines
Components: States, input alphabet, stack alphabet, transition function, start state, initial stack
symbol, accepting states
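A minimal Python sketch of how these components interact, assuming the language { a^n b^n : n ≥ 1 } as the target. The state names `push` and `pop`, the stack symbols `Z` and `A`, and the function name are all hypothetical choices for illustration.

```python
def pda_accepts_anbn(word: str) -> bool:
    """Simulate a deterministic PDA for { a^n b^n : n >= 1 }."""
    stack = ["Z"]         # Z is the initial stack symbol
    state = "push"        # "push" while reading a's, "pop" while reading b's
    for sym in word:
        if state == "push" and sym == "a":
            stack.append("A")             # remember one 'a'
        elif sym == "b" and stack[-1] == "A":
            stack.pop()                   # match one 'b' against one 'a'
            state = "pop"
        else:
            return False                  # no transition defined
    # Accept only after at least one b, with every 'a' matched.
    return state == "pop" and stack == ["Z"]
```

The stack is what lets the machine compare the number of a's with the number of b's, something no finite automaton can do.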
Example:
• Syntax Analysis:
Uses context-free grammars
Builds parse trees
Techniques: LL, LR parsing
Tools: Yacc/Bison, ANTLR
• Semantic Analysis:
Type checking
Symbol table management
Regular Expressions
Basic Elements
ε (empty string)
a (for any a ∈ Σ)
Operations:
Concatenation: rs
Alternation: r|s
Kleene Star: r*
Grouping: (r)
Optional: r?
One or more: r+
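The operations above can be tried out with Python's standard `re` module, as a rough sketch. Python's pattern syntax also includes features that go beyond formal regular expressions, so only the six operators listed here are used. The `PATTERNS` table and the `matches` helper are illustrative names.

```python
import re

# One small pattern per operation from the list above.
PATTERNS = {
    "concatenation": r"ab",     # rs
    "alternation": r"a|b",      # r|s
    "kleene_star": r"a*",       # r*
    "grouping": r"(ab)*",       # (r), here combined with star
    "optional": r"ab?",         # r?
    "one_or_more": r"a+",       # r+
}

def matches(name: str, text: str) -> bool:
    """Return True if the whole text matches the named pattern."""
    return re.fullmatch(PATTERNS[name], text) is not None
```

`re.fullmatch` is used so that the entire string must belong to the language, which matches the formal definition better than a substring search.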
Rules for Regular Expressions
1. Every letter of Σ is a regular expression, and the null string ε is itself a regular expression.
2. If r1 and r2 are regular expressions, then (r1), r1.r2, r1+r2, r1*, and r1⁺ are also regular expressions.
Top-Down Parsing:
•Recursive Descent
•LL(1), LL(k) Parsing
•Starts with start symbol, expands downward
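A recursive-descent parser can be sketched in Python with one function per non-terminal. The sketch assumes the expression grammar E → E + T | T, T → T * F | F, F → (E) | id with its left recursion rewritten as loops (E → T {+ T}, T → F {* F}), since a top-down parser cannot handle left recursion directly. The function names and the tuple-based tree format are illustrative choices.

```python
def parse_expression(tokens):
    """Parse a token list and return a nested-tuple parse tree."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect(tok):
        nonlocal pos
        if peek() != tok:
            raise SyntaxError(f"expected {tok!r}, got {peek()!r}")
        pos += 1

    def parse_E():                     # E -> T { + T }
        node = parse_T()
        while peek() == "+":
            expect("+")
            node = ("+", node, parse_T())
        return node

    def parse_T():                     # T -> F { * F }
        node = parse_F()
        while peek() == "*":
            expect("*")
            node = ("*", node, parse_F())
        return node

    def parse_F():                     # F -> ( E ) | id
        if peek() == "(":
            expect("(")
            node = parse_E()
            expect(")")
            return node
        expect("id")
        return "id"

    tree = parse_E()                   # start from the start symbol
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree
```

For the tokens `id + id * id` the result nests the multiplication under the addition, reflecting the higher precedence of `*` in the grammar.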
Bottom-Up Parsing:
•Shift-Reduce
•LR(0), SLR(1), LALR(1), LR(1)
•Starts with input, reduces to start symbol
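The shift and reduce actions can be illustrated with a deliberately tiny Python sketch. It assumes a simplified grammar, E → E + id | id, that is not taken from the slides, and it hard-codes the reduction decisions instead of using a generated LR table, so it shows the idea rather than a real LR parser. The function name `shift_reduce` is hypothetical.

```python
def shift_reduce(tokens):
    """Shift-reduce recognizer for E -> E + id | id.

    Returns (accepted, trace), where trace lists the actions taken.
    """
    stack, trace = [], []
    rest = list(tokens)
    while True:
        if stack[-3:] == ["E", "+", "id"]:
            stack[-3:] = ["E"]                 # handle on top of the stack
            trace.append("reduce E -> E + id")
        elif stack == ["id"]:
            stack[:] = ["E"]                   # first id becomes an E
            trace.append("reduce E -> id")
        elif rest:
            stack.append(rest.pop(0))          # move next input token to stack
            trace.append(f"shift {stack[-1]}")
        else:
            break                              # nothing left to shift or reduce
    # Accept when the input is consumed and only the start symbol remains.
    return stack == ["E"], trace
```

Running it on `["id", "+", "id"]` produces the trace shift id, reduce E → id, shift +, shift id, reduce E → E + id, ending with only the start symbol on the stack.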
Parsing Challenges:
• Other Applications:
Natural language processing
THANK YOU