Automata Theory
1. Concepts of Automata Theory
Automata theory is the study of abstract machines (automata) and the computational problems
that they can solve. It forms the foundation of formal languages and computational theory.
Key Components
● States: the finite set of configurations the machine can be in.
● Alphabet (Σ): the finite set of input symbols.
● Transition function: rules that move the machine between states on input symbols.
● Start state and a set of accepting (final) states.
Example:
● (a|b)* represents all strings over the alphabet {a, b}, of any length, including the empty string ε.
Regular languages are the simplest and can be recognized by Finite Automata (FA).
4. Regular Grammar
A Regular Grammar is a formal grammar that generates regular languages. Every production must be either right-linear (A → aB or A → a) or left-linear (A → Ba or A → a), where A, B are nonterminals and a is a terminal.
Example:
● Right-linear: S → aA | bB | ε
● Left-linear: A → Ba | ε
5. Deterministic Finite Automaton (DFA)
A DFA is a finite automaton in which, for each state and input symbol, there is exactly one transition.
Formal Definition
A DFA is a 5-tuple:
M=(Q,Σ,δ,q0,F)
where:
● Q = Finite set of states
● Σ = Input alphabet
● δ = Transition function Q × Σ → Q
● q0 = Start state
● F = Set of final (accepting) states
Example: a DFA over {a, b} accepting all strings that end in b:
Q = {q0, q1}
Σ = {a, b}
q0 → q0 on 'a'
q0 → q1 on 'b'
q1 → q0 on 'a'
q1 → q1 on 'b'
F = {q1}
(The transition q1 → q0 on 'a' is needed so that δ is defined for every state-symbol pair, as a DFA requires.)
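This DFA can be simulated directly. The sketch below encodes its transition table as a Python dict; the function name dfa_accepts is ours, and the transition q1 → q0 on 'a' is assumed so the machine is total:

```python
# Transition table of the example DFA over {a, b}.
delta = {
    ("q0", "a"): "q0",
    ("q0", "b"): "q1",
    ("q1", "a"): "q0",  # assumed: makes delta total (one move per state/symbol)
    ("q1", "b"): "q1",
}

def dfa_accepts(s, start="q0", final={"q1"}):
    """Run the DFA on string s and report acceptance."""
    state = start
    for ch in s:
        state = delta[(state, ch)]
    return state in final

print(dfa_accepts("aab"))   # True: ends in b
print(dfa_accepts("aba"))   # False: ends in a
print(dfa_accepts(""))      # False: the empty string never reaches q1
```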
6. Nondeterministic Finite Automaton (NFA)
An NFA allows zero, one, or several transitions for the same state and input, including transitions that consume no input (ε-moves).
Example:
For the regular expression ab*, an NFA has transitions:
● q0 → q1 on a
● q1 → q1 on b
with start state q0 and final states F = {q1}.
7. Conversion of NFA to DFA
Every NFA can be converted to an equivalent DFA using the subset construction: each DFA state represents a set of NFA states.
Example:
If the NFA has states {q0, q1, q2}, the DFA's states are subsets such as {q0}, {q0, q1}, {q0, q2}, etc.
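A minimal sketch of the subset construction (without ε-moves), assuming the NFA is given as a dict from (state, symbol) to a set of successor states; the function name nfa_to_dfa and this encoding are our own:

```python
from collections import deque

def nfa_to_dfa(nfa_delta, start, finals, alphabet):
    """Subset construction: DFA states are frozensets of NFA states."""
    start_set = frozenset([start])
    dfa = {}
    seen = {start_set}
    queue = deque([start_set])
    while queue:
        S = queue.popleft()
        for sym in alphabet:
            # Union of all NFA moves from any state in S on sym.
            T = frozenset(q2 for q in S for q2 in nfa_delta.get((q, sym), ()))
            dfa[(S, sym)] = T
            if T not in seen:
                seen.add(T)
                queue.append(T)
    # A DFA state is accepting if it contains any NFA accepting state.
    dfa_finals = {S for S in seen if S & finals}
    return dfa, start_set, dfa_finals

# NFA for ab* from the example: q0 --a--> q1, q1 --b--> q1.
nfa = {("q0", "a"): {"q1"}, ("q1", "b"): {"q1"}}
dfa, s0, F = nfa_to_dfa(nfa, "q0", {"q1"}, "ab")
print(sorted(dfa[(frozenset({"q0"}), "a")]))  # ['q1']
```

Note that the empty set shows up as a "dead" DFA state absorbing all undefined NFA moves.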
To remove ε-transitions:
● Compute the ε-closure of each state (all states reachable using only ε-moves).
● For each state q and input symbol a, add a direct transition to every state reachable from ε-closure(q) on a.
● Make q accepting if ε-closure(q) contains an accepting state.
8. Minimization of DFA
Minimization reduces the number of states in a DFA while preserving language recognition.
Example Comparison
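As an illustration of state merging, here is a hedged sketch of Moore-style partition refinement. The DFA is the earlier example extended with a hypothetical state q2 that behaves exactly like q0, so minimization puts them in one block:

```python
def minimize(states, alphabet, delta, finals):
    """Partition refinement: split blocks until states in a block agree
    on which block every symbol sends them to."""
    partition = [set(finals), set(states) - set(finals)]
    changed = True
    while changed:
        changed = False
        new_partition = []
        for block in partition:
            groups = {}
            for q in block:
                # Signature: index of the target block for each symbol.
                key = tuple(
                    next(i for i, b in enumerate(partition) if delta[(q, a)] in b)
                    for a in alphabet
                )
                groups.setdefault(key, set()).add(q)
            new_partition.extend(groups.values())
            if len(groups) > 1:
                changed = True
        partition = new_partition
    return partition

delta = {
    ("q0", "a"): "q0", ("q0", "b"): "q1",
    ("q1", "a"): "q0", ("q1", "b"): "q1",
    ("q2", "a"): "q0", ("q2", "b"): "q1",  # q2 duplicates q0's behavior
}
blocks = minimize({"q0", "q1", "q2"}, "ab", delta, {"q1"})
print(sorted(sorted(b) for b in blocks))  # [['q0', 'q2'], ['q1']]
```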
1. Context-Free Grammar (CFG)
Definition of CFG
A CFG is a 4-tuple:
G=(V,Σ,P,S)
where:
● V = Set of variables (nonterminals)
● Σ = Set of terminal symbols
● P = Set of production rules
● S = Start symbol
Example: the balanced-parentheses grammar:
S → (S)S | ε
2. Parse Trees
A Parse Tree (Derivation Tree) visually represents the structure of a string derived from a CFG.
For the grammar: S → aSb | ε, the string aabb has the following parse tree:
        S
      / | \
     a  S  b
      / | \
     a  S  b
        |
        ε
3. Derivation in CFG
A derivation is the sequence of rule applications to generate a string.
Types of Derivation:
● Leftmost derivation: at each step, the leftmost nonterminal is replaced.
● Rightmost derivation: at each step, the rightmost nonterminal is replaced.
Example:
Given CFG:
S → aSb | ab
For the string aabb:
S ⇒ aSb ⇒ aabb
Each sentential form here contains only one nonterminal, so the leftmost and rightmost derivations coincide.
4. Ambiguity in Grammar
A grammar is ambiguous if there is more than one parse tree or derivation for a given string.
E → E + E | E * E | id
For an input such as id + id * id, this grammar yields two different parse trees, so it is ambiguous. To remove the ambiguity:
● Use precedence rules (e.g., multiplication has higher precedence than addition).
● Modify the grammar to enforce the evaluation order.
Example:
E → E + T | T
T → T * F | F
F → id
5. Left Recursion Elimination
A left-recursive production A → Aα | β cannot be handled by top-down parsers. It is rewritten as:
A → βA′
A′ → αA′ | ε
Example: E → E + T | T becomes:
E → T E'
E' → + T E' | ε
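The rewrite can be sketched as a small transformation on productions; the tuple-based grammar encoding and the function name eliminate_left_recursion below are our own assumptions:

```python
def eliminate_left_recursion(nt, productions):
    """Remove immediate left recursion from productions of nonterminal nt.
    Each production is a tuple of symbols; () encodes ε."""
    recursive = [rhs[1:] for rhs in productions if rhs and rhs[0] == nt]  # A -> A alpha
    other = [rhs for rhs in productions if not rhs or rhs[0] != nt]       # A -> beta
    if not recursive:
        return {nt: productions}
    new_nt = nt + "'"
    return {
        nt: [beta + (new_nt,) for beta in other],                   # A  -> beta A'
        new_nt: [alpha + (new_nt,) for alpha in recursive] + [()],  # A' -> alpha A' | eps
    }

# E -> E + T | T   becomes   E -> T E',  E' -> + T E' | eps
g = eliminate_left_recursion("E", [("E", "+", "T"), ("T",)])
print(g["E"])    # [('T', "E'")]
print(g["E'"])   # [('+', 'T', "E'"), ()]
```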
6. Left Factoring
Left factoring is used when a grammar has common prefixes, making parsing difficult.
A→αβ1∣αβ2
Here, α is common, so we rewrite:
A→αA′
A′→β1∣β2
Example (the dangling-else grammar after left factoring):
S → if E then S S' | A
S' → else S | ε
7. Pushdown Automaton (PDA)
A PDA is a finite automaton extended with a stack, which lets it recognize context-free languages.
Formal Definition
A PDA is a 6-tuple:
(Q,Σ,Γ,δ,q0,F)
where:
● Q = Set of states
● Σ = Input alphabet
● Γ = Stack alphabet
● δ = Transition function: Q × (Σ ∪ {ε}) × Γ → finite subsets of Q × Γ*
● q0 = Start state
● F = Set of final states
Example: a PDA accepts the language
L = {aⁿbⁿ | n ≥ 0}
by pushing a stack symbol for each a and popping one for each b.
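A hedged sketch of this behavior as a deterministic recognizer with an explicit stack; the function name accepts_anbn and the stack symbol 'A' are our own:

```python
def accepts_anbn(s):
    """Recognize L = {a^n b^n | n >= 0} PDA-style with an explicit stack."""
    stack = []
    i = 0
    # Phase 1: push one 'A' per leading 'a'.
    while i < len(s) and s[i] == "a":
        stack.append("A")
        i += 1
    # Phase 2: pop one 'A' per 'b'.
    while i < len(s) and s[i] == "b":
        if not stack:
            return False       # more b's than a's
        stack.pop()
        i += 1
    # Accept iff all input was consumed and the stack is empty.
    return i == len(s) and not stack

print(accepts_anbn("aabb"))  # True
print(accepts_anbn("aab"))   # False
print(accepts_anbn(""))      # True (n = 0)
```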
Conclusion
● CFGs define context-free languages, used in compilers and parsers.
● Parse trees help in understanding derivation and ambiguity.
● Left recursion and left factoring improve parsing efficiency.
● PDAs extend finite automata with stacks, making them more powerful for handling
nested structures.
● DPDAs are less powerful than NPDAs but still useful for many practical languages.
3. Turing Machines
A Turing Machine (TM) is a 7-tuple:
M = (Q, Σ, Γ, δ, q0, q_accept, q_reject)
where:
● Q = Set of states
● Σ = Input alphabet
● Γ = Tape alphabet (Σ ⊆ Γ, including the blank symbol □)
● δ = Transition function: Q × Γ → Q × Γ × {L, R}
● q0 = Start state
● q_accept, q_reject = Accepting and rejecting states
Example transitions:
(q0, a) → (q1, X, R)
(q1, b) → (q2, Y, L)
(q2, X) → (q0, X, R)
(q1, □) → (q_accept, □, R)
1. Recursively Enumerable Languages (RE): A TM accepts all strings in L, but may not
halt for non-members.
2. Recursive Languages: A TM decides the language, meaning it halts on all inputs
(either accepting or rejecting).
The Halting Problem
● Undecidable: no algorithm can decide whether an arbitrary Turing Machine halts on a given input.
● Proof (by contradiction): suppose a TM H decides halting. Construct a machine D that, given a machine description M, runs H on (M, M) and does the opposite: D loops forever if H says M halts on its own description, and halts if H says it does not. Running D on its own description contradicts H's answer, so H cannot exist.
Multitape Turing Machines
Formal Definition
A k-tape TM has:
● k tapes
● k tape heads
● δ: Q × Γ^k → Q × Γ^k × {L, R}^k
Simulation by a single-tape TM:
● Encode multiple tapes into one by separating symbols with special markers (#).
● Simulate each tape’s head movements by shifting within the encoded tape.
● This transformation may introduce polynomial time overhead, but does not change
computability.
Key Conclusion: a multitape TM recognizes exactly the same class of languages as a single-tape TM.
Impact on Computability:
● Many problems in formal language theory and real-world computing are undecidable.
● Practical consequence: Compilers, AI, and software verification rely on
approximations and heuristics.
Conclusion
● Turing Machines model computation and define language classes.
● The Halting Problem and other decision problems are undecidable.
● Multitape TMs do not increase computability power but improve efficiency.
● Undecidability limits what can be algorithmically solved.
4. Compiler Design
1. Introduction to Compiler
A compiler is software that translates source code written in a high-level language into machine code (binary).
Phases of a Compiler
● Lexical Analysis (tokenization)
● Syntax Analysis (parsing)
● Semantic Analysis
● Intermediate Code Generation
● Code Optimization
● Target Code Generation
Example
For input:
int x = 10;
Token Type   Pattern
Identifier   [a-zA-Z_][a-zA-Z0-9_]*
Number       [0-9]+(\.[0-9]+)?
Keyword      if (and the other reserved words)
Operator     + (and the other operators)
The Lexical Analyzer uses Finite Automata (FA) to recognize these patterns.
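A minimal lexer sketch built from these patterns with Python's re module; the keyword set, token names, and function name tokenize are illustrative assumptions:

```python
import re

TOKEN_SPEC = [
    ("NUMBER",     r"[0-9]+(\.[0-9]+)?"),
    ("IDENTIFIER", r"[a-zA-Z_][a-zA-Z0-9_]*"),
    ("OPERATOR",   r"[+\-*/=]"),
    ("SEMICOLON",  r";"),
    ("SKIP",       r"\s+"),              # whitespace is discarded
]
KEYWORDS = {"if", "else", "while", "int", "return"}  # assumed keyword set

def tokenize(code):
    """Scan code left to right, emitting (kind, text) tokens."""
    pattern = "|".join(f"(?P<{name}>{rx})" for name, rx in TOKEN_SPEC)
    tokens = []
    for m in re.finditer(pattern, code):
        kind, text = m.lastgroup, m.group()
        if kind == "SKIP":
            continue
        # Keywords match the identifier pattern, so reclassify them here.
        if kind == "IDENTIFIER" and text in KEYWORDS:
            kind = "KEYWORD"
        tokens.append((kind, text))
    return tokens

print(tokenize("int x = 10;"))
# [('KEYWORD', 'int'), ('IDENTIFIER', 'x'), ('OPERATOR', '='),
#  ('NUMBER', '10'), ('SEMICOLON', ';')]
```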
Types of Parsing:
● Top-Down Parsing (e.g., recursive descent, LL(1))
● Bottom-Up Parsing (e.g., shift-reduce, LR)
5. Top-Down Parsing
Top-Down Parsing starts from the start symbol and applies derivation rules to match the
input.
Example Grammar:
E → T + E | T
T → F * T | F
F → (E) | id
Recursive Functions: a recursive-descent parser implements one function per nonterminal (one for E, one for T, one for F), each consuming exactly the tokens its rule derives.
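A hedged recursive-descent sketch for this grammar; tokens are assumed to be single characters, with 'i' standing for id:

```python
class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def eat(self, tok):
        if self.peek() != tok:
            raise SyntaxError(f"expected {tok!r} at position {self.pos}")
        self.pos += 1

    def parse_E(self):          # E -> T + E | T
        self.parse_T()
        if self.peek() == "+":
            self.eat("+")
            self.parse_E()

    def parse_T(self):          # T -> F * T | F
        self.parse_F()
        if self.peek() == "*":
            self.eat("*")
            self.parse_T()

    def parse_F(self):          # F -> (E) | id
        if self.peek() == "(":
            self.eat("(")
            self.parse_E()
            self.eat(")")
        else:
            self.eat("i")       # 'i' stands for the id token

def parses(s):
    """True iff s is a sentence of the grammar (all tokens consumed)."""
    p = Parser(list(s))
    try:
        p.parse_E()
        return p.pos == len(p.tokens)
    except SyntaxError:
        return False

print(parses("i+i*i"))    # True
print(parses("(i+i)*i"))  # True
print(parses("i+"))       # False
```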
6. Bottom-Up Parsing
Bottom-Up Parsing starts from tokens and reduces them to the start symbol.
Shift-Reduce Parsing
The parser shifts input tokens onto a stack and reduces when the top of the stack matches the right-hand side of a production.
Types of LR Parsers
● SLR(1), LALR(1), and Canonical LR(1), in increasing order of power.
Parser Generators
Tools such as Lex and YACC generate lexical analyzers and parsers from declarative specifications.
Conclusion
● Lexical Analysis converts source code into tokens using Regular Expressions.
● Parsing checks syntax using Top-Down (LL(1)) or Bottom-Up (LR) techniques.
● Compiler Construction Tools like Lex/YACC simplify compiler development.
● LR Parsing is powerful for automatic parser generation.
1. Syntax Directed Translation (SDT)
SDT attaches semantic actions to grammar productions so that attributes are computed during parsing.
Key Components:
● Attributes (synthesized and inherited) associated with grammar symbols.
● Semantic rules attached to productions.
2. Dependency Graphs
A Dependency Graph represents dependencies between semantic actions.
Components of Dependency Graphs:
● Nodes: attribute instances (e.g., E.val, T.val).
● Edges: an edge from attribute a to attribute b means b depends on a, so a must be evaluated first.
Example: for E → E1 + T with rules E.val = E1.val + T.val and T.val = int.lexval:
Dependency Graph:
  E1.val   int.lexval
     \         |
      \      T.val
       \      /
        E.val
Example:
E → T { E.val = T.val }
T → num { T.val = num.lexval }
4. Type Checking
Ensures semantic correctness of expressions, function calls, etc.
Example: a type checker rejects expressions such as an int variable being assigned a string value.
Expressions:
a = b + c
d = a - e
f = b + c
DAG Representation:
        (-) → d
       /    \
    (+)      e
    /  \
   b    c
The (+) node carries both labels a and f, since f = b + c recomputes the same value.
Optimization: f = b + c is redundant; it can be replaced by f = a.
Three-Address Code (TAC)
For the expression:
x = (a + b) * (c - d)
TAC Representation:
t1 = a + b
t2 = c - d
x = t1 * t2
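Generation of such TAC from an expression tree can be sketched as follows; the nested-tuple AST encoding and the temporary-naming scheme are our own assumptions (this sketch routes the final result through an extra temporary, so it emits x = t3 rather than x = t1 * t2 directly):

```python
def gen_tac(node, code, counter):
    """Emit TAC for a nested-tuple expression; return the name holding it."""
    if isinstance(node, str):        # leaf: a variable name
        return node
    op, left, right = node
    l = gen_tac(left, code, counter)
    r = gen_tac(right, code, counter)
    counter[0] += 1
    temp = f"t{counter[0]}"
    code.append(f"{temp} = {l} {op} {r}")
    return temp

# x = (a + b) * (c - d)
code, counter = [], [0]
result = gen_tac(("*", ("+", "a", "b"), ("-", "c", "d")), code, counter)
code.append(f"x = {result}")
for line in code:
    print(line)
# t1 = a + b
# t2 = c - d
# t3 = t1 * t2
# x = t3
```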
Quadruples Representation
Each instruction is a 4-tuple (operator, arg1, arg2, result):
(+, a, b, t1)
(-, c, d, t2)
(*, t1, t2, x)
Triples Representation
Each instruction is a 3-tuple (operator, arg1, arg2), and results are referenced by instruction index instead of by name:
(0) (+, a, b)
(1) (-, c, d)
(2) (*, (0), (1))
Benefits of SSA:
● Every variable is assigned exactly once, making def-use chains explicit.
● Simplifies optimizations such as constant propagation and dead-code elimination.
Conclusion
● Syntax Directed Translation (SDT) assigns actions to grammar rules.
● Dependency Graphs determine the order of evaluation.
● S-Attributed & L-Attributed Definitions affect parsing methods.
● Type Checking ensures semantic correctness.
● Intermediate Code Generation bridges parsing & machine code.
● DAGs, TAC, Quadruples, Triples, and SSA optimize code representation.
Stack Allocation
Each procedure call pushes an activation record with the following layout:
| Return Address |
| Parameters |
| Local Variables |
| Temporary Values |
Example of a basic block (a straight-line sequence of statements with no jumps in or out except at the ends):
t1 = a + b
t2 = t1 * c
x = t2 - d
Flow Graph
       B1: a = b + c
        /         \
       v           v
B3: x = a      B4: y = -a
Example (common subexpression elimination):
x = a + b;
y = a + b; // Redundant
Optimized:
x = a + b;
y = x;
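Within one basic block, this can be sketched with value numbering over textual expressions; the statement format and the function name eliminate_cse are assumptions, and the sketch deliberately ignores reassignment of operands:

```python
def eliminate_cse(stmts):
    """Replace a repeated 'op a b' computation with the variable that
    already holds it. Statements look like 'x = a + b' or 'x = y'."""
    available = {}   # (operand, op, operand) -> variable holding the value
    out = []
    for stmt in stmts:
        target, expr = [p.strip() for p in stmt.split("=")]
        parts = expr.split()
        if len(parts) == 3:                      # binary expression
            key = tuple(parts)
            if key in available:                 # seen before: reuse it
                out.append(f"{target} = {available[key]}")
                continue
            available[key] = target
        out.append(stmt)
    return out

block = ["x = a + b", "y = a + b"]
print(eliminate_cse(block))  # ['x = a + b', 'y = x']
```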
Example (constant folding):
int x = 5 * 4; // Compiler replaces with x = 20
Example (constant propagation):
int x = 10;
Example (algebraic simplification):
x = y * 1; // Simplified to x = y
y = z + 0; // Simplified to y = z
5. Simple Code Generator
A simple code generator translates three-address code (TAC) to assembly code.
TAC Input:
t1 = a + b
t2 = t1 * c
x = t2 - d
Assembly Output:
MOV R1, a
ADD R1, b
MOV R2, R1
MUL R2, c
MOV R3, R2
SUB R3, d
MOV x, R3
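A naive translator from TAC to this style of assembly can be sketched as below; the register-allocation policy (reuse the left operand's register) and the instruction names are simplifying assumptions, so the output is shorter than the hand-written listing above:

```python
OPS = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV"}

def gen_asm(tac_lines):
    """Translate 'x = a op b' and 'x = y' TAC lines to two-operand assembly."""
    asm, regs = [], {}
    next_reg = 1
    for line in tac_lines:
        target, expr = [p.strip() for p in line.split("=")]
        parts = expr.split()
        if len(parts) == 3:
            left, op, right = parts
            if left in regs:                 # value already in a register
                reg = regs[left]
            else:                            # load it into a fresh register
                reg = f"R{next_reg}"
                next_reg += 1
                asm.append(f"MOV {reg}, {left}")
            asm.append(f"{OPS[op]} {reg}, {regs.get(right, right)}")
            regs[target] = reg               # the result stays in reg
        else:                                # plain copy 'x = t'
            asm.append(f"MOV {target}, {regs.get(expr, expr)}")
    return asm

tac = ["t1 = a + b", "t2 = t1 * c", "x = t2"]
for line in gen_asm(tac):
    print(line)
# MOV R1, a
# ADD R1, b
# MUL R1, c
# MOV x, R1
```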
6. Code Optimization
Code optimization improves execution speed and reduces memory usage.
1. Constant Folding
Example:
MOV R1, 4
ADD R1, 5
Optimized:
MOV R1, 9
2. Dead Code Elimination
Example:
return 5;
x = 10; // Unreachable
Optimized:
return 5;
3. Strength Reduction
Example:
MUL R1, 2 ; Multiplication is costly
Optimized:
SHL R1, 1 ; Bitwise shift (faster)
4. Eliminating Redundant Load/Store Instructions
Example:
MOV a, R1
MOV R1, a ; redundant reload of the value just stored
Optimized:
MOV a, R1
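Two of these rules can be sketched as a single peephole pass over an instruction list; the textual two-operand instruction format is an assumption:

```python
def peephole(ins):
    """One pass applying two local rules: MUL-by-2 -> SHL-by-1, and
    dropping a reload that immediately follows a store to the same place."""
    out = []
    i = 0
    while i < len(ins):
        cur = ins[i]
        nxt = ins[i + 1] if i + 1 < len(ins) else None
        if cur.startswith("MUL") and cur.endswith(", 2"):
            # Strength reduction: multiply by 2 becomes a left shift by 1.
            out.append(cur.replace("MUL", "SHL").replace(", 2", ", 1"))
            i += 1
        elif nxt and cur.startswith("MOV") and nxt.startswith("MOV"):
            dst, src = [p.strip() for p in cur[4:].split(",")]
            dst2, src2 = [p.strip() for p in nxt[4:].split(",")]
            if dst2 == src and src2 == dst:   # store then reload: drop the reload
                out.append(cur)
                i += 2
            else:
                out.append(cur)
                i += 1
        else:
            out.append(cur)
            i += 1
    return out

code = ["MOV a, R1", "MOV R1, a", "MUL R1, 2"]
print(peephole(code))  # ['MOV a, R1', 'SHL R1, 1']
```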
Conclusion
● Code Generation converts IR into machine code.
● Basic Blocks & Flow Graphs help in control flow analysis.
● Optimizations improve speed and reduce code size.
● Peephole Optimization applies local optimizations for efficiency.