
1. Automata Theory
1. Concepts of Automata Theory
Automata theory is the study of abstract machines (automata) and the computational problems
that they can solve. It forms the foundation of formal languages and computational theory.

Key Components

●​ Alphabets (Σ): A finite set of symbols.


●​ Strings: A finite sequence of symbols from the alphabet.
●​ Languages (L): A set of strings formed using a given alphabet.
●​ Automaton: A mathematical model that processes input strings and determines if they
belong to a given language.

2. Formal Language and Regular Expressions

A formal language is a set of strings over a given alphabet that follows specific syntactic rules.

Regular Expressions (RE)

A regular expression is a symbolic representation of a regular language. Some basic rules include:

●​ Concatenation: ab represents strings where a is followed by b.


●​ Union (OR operation): a | b represents either a or b.
●​ Kleene Star (*): a* represents zero or more occurrences of a.
●​ Plus Operator (+): a+ represents one or more occurrences of a.
●​ Optional (?): a? represents zero or one occurrence of a.

Example:

●​ (a|b)* represents all strings over the alphabet {a, b}, of any length and in any order, including the empty string ε.
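
These operators map directly onto Python's re module; a quick sketch to see them in action (the test strings are illustrative):

import re

# (a|b)* : any string over the alphabet {a, b}, including the empty string
pattern = re.compile(r"(a|b)*")

for s in ["", "ab", "ba", "abba", "abc"]:
    # fullmatch requires the entire string to match the expression
    print(repr(s), "->", bool(pattern.fullmatch(s)))
# "abc" is rejected because c is not in the alphabet {a, b}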

3. Chomsky Hierarchy of Grammar


Noam Chomsky classified grammars into four types:
Grammar Type                Automaton Used              Rules Format                        Example
Type 0 (Unrestricted)       Turing Machine              α → β (α and β can be any string)   S → aSb
Type 1 (Context-Sensitive)  Linear Bounded Automaton    αAβ → αγβ                           AB → AABB
Type 2 (Context-Free)       Pushdown Automaton          A → γ                               S → aSb
Type 3 (Regular)            Finite Automaton            A → aB or A → a                     S → aS

Regular languages are the simplest and can be recognized by Finite Automata (FA).

4. Regular Grammar
A Regular Grammar is a formal grammar that generates regular languages. It follows these
rules:

●​ Right-linear grammar: Rules have the form A → aB or A → a.


●​ Left-linear grammar: Rules have the form A → Ba or A → a.

Example:

●​ Right-linear: S → aA | bB | ε
●​ Left-linear: A → Ba | ε

Regular grammars are equivalent to Finite Automata and Regular Expressions.

5. Finite Automata (FA)


5.1 Deterministic Finite Automaton (DFA)

A DFA is a finite automaton where for each state and input, there is exactly one transition.

Formal Definition

A DFA is a 5-tuple:​
M = (Q, Σ, δ, q0, F)

where:
●​ Q = Finite set of states
●​ Σ = Input alphabet
●​ δ = Transition function Q × Σ → Q
●​ q_0 = Start state
●​ F = Set of final (accepting) states

Example DFA for a*b

Q = {q0, q1}
Σ = {a, b}
δ: q0 → q0 on 'a'
   q0 → q1 on 'b'
F = {q1}

(Transitions not listed, such as q1 on 'a' or 'b', go to a non-accepting trap state, so exactly the strings of zero or more a's followed by a single b are accepted.)
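
A minimal simulator for this DFA, sketched in Python (the explicit trap state `dead` is an assumption standing in for the unlisted transitions):

# DFA for a*b: zero or more a's followed by exactly one b
delta = {
    ("q0", "a"): "q0",
    ("q0", "b"): "q1",
    # every unlisted transition falls into a non-accepting trap state
}

def accepts(s: str) -> bool:
    state = "q0"
    for ch in s:
        state = delta.get((state, ch), "dead")  # missing entry -> trap
    return state == "q1"

print([w for w in ["b", "ab", "aaab", "abb", ""] if accepts(w)])
# ['b', 'ab', 'aaab']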

5.2 Nondeterministic Finite Automaton (NFA)

An NFA allows multiple transitions for the same input, including transitions without input
(ε-moves).

Differences between DFA and NFA

Property         DFA                                      NFA
Transitions      Exactly one per input                    Multiple or none per input
ε-moves          Not allowed                              Allowed
Expressiveness   Equivalent to NFA                        Equivalent to DFA
Complexity       More complex (more states) in some cases Simpler state transitions

6. Conversions in Finite Automata


6.1 Regular Expression to NFA

To convert a Regular Expression to an NFA:

1.​ Create basic NFAs for individual symbols.


2.​ Combine them using union, concatenation, and Kleene star.
3.​ Construct the final NFA.

Example:​
For ab*, the NFA has transitions:

●​ q0 → q1 on a
●​ q1 → q1 on b

with q1 as the accepting state.

6.2 NFA to DFA Conversion (Subset Construction Method)

To convert an NFA to a DFA:

1.​ Create subsets of NFA states as DFA states.


2.​ Determine transitions based on all possible NFA transitions.
3.​ Mark final states if any subset contains an NFA final state.

Example:​
If NFA has states {q0, q1, q2}, the DFA will have states like {q0}, {q0, q1}, {q0, q2},
etc.
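
A compact sketch of the subset-construction loop in Python (the small NFA here, which accepts strings ending in b, is an assumption chosen for illustration):

from collections import deque

# NFA transitions: state -> symbol -> set of successor states
nfa = {"q0": {"a": {"q0"}, "b": {"q0", "q1"}}, "q1": {}}

def nfa_to_dfa(start, finals, trans, alphabet=("a", "b")):
    start_set = frozenset(start)
    table, queue, seen = {}, deque([start_set]), {start_set}
    while queue:
        S = queue.popleft()
        for sym in alphabet:
            # union of all NFA moves from every state in the subset
            T = frozenset(q for s in S for q in trans.get(s, {}).get(sym, set()))
            table[(S, sym)] = T
            if T not in seen:
                seen.add(T)
                queue.append(T)
    dfa_finals = {S for S in seen if S & frozenset(finals)}
    return start_set, table, dfa_finals

start, table, finals = nfa_to_dfa({"q0"}, {"q1"}, nfa)
print(len({S for (S, _) in table}), "DFA states")  # 2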

7. Finite Automata with Epsilon (ε) Transitions


An ε-NFA allows transitions without consuming input (ε-moves).

7.1 Eliminating Epsilon Transitions

To remove ε-transitions:

1.​ Find ε-closure: The set of states reachable by ε-moves.


2.​ Modify transitions: Replace ε-moves with direct transitions.
3.​ Update final states: Any state leading to a final state via ε is also final.
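
Step 1, the ε-closure, is plain graph reachability; a minimal sketch (the example ε-moves are an assumption):

def epsilon_closure(states, eps_moves):
    # All states reachable from `states` via zero or more ε-moves.
    closure, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for r in eps_moves.get(q, set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure

# Example: q0 --ε--> q1 --ε--> q2
eps = {"q0": {"q1"}, "q1": {"q2"}}
print(epsilon_closure({"q0"}, eps))  # {'q0', 'q1', 'q2'}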

8. Minimization of DFA
Minimization reduces the number of states in a DFA while preserving language recognition.

Steps for DFA Minimization

1.​ Remove unreachable states.


2.​ Merge equivalent states using equivalence classes.
3.​ Construct a new minimized DFA.
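
Step 2 is usually done by partition refinement; a small sketch, assuming the DFA is given as a total transition dictionary:

def minimize(states, alphabet, delta, finals):
    # Start from the accepting / non-accepting split, then refine.
    partition = [set(finals), set(states) - set(finals)]
    changed = True
    while changed:
        changed = False
        refined = []
        for block in partition:
            groups = {}
            for q in block:
                # Signature: which block each symbol leads to.
                key = tuple(next(i for i, b in enumerate(partition)
                                 if delta[(q, a)] in b) for a in alphabet)
                groups.setdefault(key, set()).add(q)
            refined.extend(groups.values())
            if len(groups) > 1:
                changed = True
        partition = refined
    return partition

# Tiny DFA where q1 and q2 are indistinguishable (both accept, both loop)
delta = {("q0", "a"): "q1", ("q0", "b"): "q2",
         ("q1", "a"): "q1", ("q1", "b"): "q1",
         ("q2", "a"): "q2", ("q2", "b"): "q2"}
print(minimize({"q0", "q1", "q2"}, "ab", delta, {"q1", "q2"}))
# q1 and q2 merge into a single block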
9. Finite Automata with Outputs
Finite Automata can produce output in addition to recognizing a language.

9.1 Moore Machine

●​ Output depends only on the state.


●​ Output is generated at each state.
●​ Defined as a 6-tuple (Q, Σ, O, δ, λ, q0), where the output function λ: Q → O gives the output of each state.

9.2 Mealy Machine

●​ Output depends on both state and input.


●​ Defined as a 6-tuple (Q, Σ, O, δ, λ, q0), where the output function λ: Q × Σ → O depends on both state and input.

Example Comparison

Type    Output Depends On    When Output Occurs
Moore   State                On entering each state
Mealy   State & Input        On each transition

2. Context-Free Grammars (CFGs) and Pushdown Automata (PDA)
1. Context-Free Grammar (CFG)
A Context-Free Grammar (CFG) is a formal grammar used to generate context-free languages.
It is widely used in programming languages and compilers.

Definition of CFG

A CFG is a 4-tuple:

G = (V, Σ, P, S)

where:

●​ V = Finite set of variables (non-terminals).


●​ Σ = Finite set of terminals (input symbols).
●​ P = Finite set of production rules.
●​ S = Start symbol.

Example CFG for Balanced Parentheses

S → (S)S | ε

This grammar generates strings like (), (()), ()(), etc.
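
A direct way to see the grammar at work is a recursive-descent recognizer whose two branches mirror the two productions; a minimal sketch:

def parse_S(s, i=0):
    # Match S -> (S)S | ε starting at position i; return the new position.
    if i < len(s) and s[i] == "(":        # production S -> (S)S
        i = parse_S(s, i + 1)             # inner S
        if i >= len(s) or s[i] != ")":
            raise ValueError("unbalanced")
        return parse_S(s, i + 1)          # trailing S
    return i                              # production S -> ε

def balanced(s):
    try:
        return parse_S(s) == len(s)
    except ValueError:
        return False

print([w for w in ["()", "(())", "()()", "(()", ")("] if balanced(w)])
# ['()', '(())', '()()']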

2. Parse Trees
A Parse Tree (Derivation Tree) visually represents the structure of a string derived from a CFG.

Properties of Parse Trees:

1.​ Root node: The start symbol of the grammar.


2.​ Internal nodes: Non-terminals.
3.​ Leaf nodes: Terminals or ε.
4.​ Derivation order: Leftmost or rightmost derivation.

Example for the CFG

For the grammar S → aSb | ε, the string aabb has the following parse tree:

        S
      / | \
     a  S  b
      / | \
     a  S  b
        |
        ε
3. Derivation in CFG
A derivation is the sequence of rule applications to generate a string.

Types of Derivation:

●​ Leftmost Derivation (LM): Always replace the leftmost non-terminal first.


●​ Rightmost Derivation (RM): Always replace the rightmost non-terminal first.

Example:​
Given CFG:

S → aSb | ab

For input aabb:

●​ Leftmost Derivation: S ⇒ aSb ⇒ aabb, rewriting the only non-terminal with S → ab.
●​ Rightmost Derivation: S ⇒ aSb ⇒ aabb. Here it is identical, because every sentential form contains just one non-terminal.

4. Ambiguity in Grammar
A grammar is ambiguous if there is more than one parse tree or derivation for a given string.

Example of Ambiguous Grammar:

E → E + E | E * E | id

For id + id * id, two parse trees exist:

1.​ (id + id) * id


2.​ id + (id * id)

How to Remove Ambiguity?

●​ Use precedence rules (e.g., multiplication has higher precedence than addition).
●​ Modify the grammar to enforce order.

Example:

E → E + T | T

T → T * F | F

F → id

This ensures id + id * id is parsed as id + (id * id).

5. Removal of Left Recursion


Left recursion occurs when a production rule has the form:

A → Aα | β

where the non-terminal A appears as the first (leftmost) symbol of its own right-hand side.

Example of Left Recursion:

E → E + T | T

Removing Left Recursion:

Convert to right recursion:

1.​ Rewrite the grammar as:

E → T E'
E' → + T E' | ε

This eliminates direct left recursion.
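
The rewrite can be mechanized; a sketch that removes direct left recursion for a single non-terminal (the list-of-symbol-lists grammar encoding is an assumption):

def remove_direct_left_recursion(head, productions):
    # A -> Aα | β   becomes   A -> βA' ;  A' -> αA' | ε
    recursive = [p[1:] for p in productions if p and p[0] == head]  # α parts
    others = [p for p in productions if not p or p[0] != head]      # β parts
    if not recursive:
        return {head: productions}
    new = head + "'"
    return {
        head: [beta + [new] for beta in others],
        new: [alpha + [new] for alpha in recursive] + [[]],  # [] stands for ε
    }

# E -> E + T | T
print(remove_direct_left_recursion("E", [["E", "+", "T"], ["T"]]))
# {'E': [['T', "E'"]], "E'": [['+', 'T', "E'"], []]}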

6. Left Factoring
Left factoring is used when a grammar has common prefixes, making parsing difficult.

Example of Left Factoring:

A → αβ1 | αβ2

Here, α is the common prefix, so we rewrite:

A → αA′
A′ → β1 | β2

Example Before Left Factoring:

S → if E then S else S | if E then S | A

After Left Factoring:

S → if E then S S' | A

S' → else S | ε

7. Pushdown Automata (PDA)


A Pushdown Automaton (PDA) is a finite automaton with a stack for handling context-free
languages.

Formal Definition

A PDA is a 7-tuple:

(Q, Σ, Γ, δ, q0, Z0, F)

where:

●​ Q = Set of states
●​ Σ = Input alphabet
●​ Γ = Stack alphabet
●​ δ = Transition function: Q × (Σ ∪ {ε}) × Γ → finite subsets of Q × Γ* (nondeterministic in general)
●​ q_0 = Start state
●​ Z0 = Initial stack symbol
●​ F = Set of final states

8. Languages Recognized by PDAs


●​ Deterministic PDA (DPDA): Recognizes deterministic context-free languages (e.g.,
{a^n b^n} where n ≥ 0).
●​ Nondeterministic PDA (NPDA): Recognizes all context-free languages.
Example PDA for Language L = {a^n b^n | n ≥ 0}

1.​ Push 'a' onto the stack for each a.


2.​ Pop 'a' for each b.
3.​ If stack is empty and input is consumed, accept.
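
The three steps translate almost literally into a stack-based recognizer; a minimal sketch (a direct simulation rather than a full formal PDA):

def accepts_anbn(s: str) -> bool:
    # Recognize L = {a^n b^n | n >= 0} with an explicit stack.
    stack, i = [], 0
    while i < len(s) and s[i] == "a":   # step 1: push for each a
        stack.append("a")
        i += 1
    while i < len(s) and s[i] == "b":   # step 2: pop for each b
        if not stack:
            return False                # more b's than a's
        stack.pop()
        i += 1
    # step 3: accept iff all input consumed and stack empty
    return i == len(s) and not stack

print([w for w in ["", "ab", "aabb", "aab", "ba"] if accepts_anbn(w)])
# ['', 'ab', 'aabb']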

9. Equivalence of PDA and CFG


Every PDA has an equivalent CFG and vice versa.

Conversion from CFG to PDA:

1.​ Push the start symbol onto the stack.


2.​ Expand non-terminals using production rules.
3.​ Match terminals with input symbols.

Conversion from PDA to CFG:

1.​ For each transition, create a production rule.


2.​ Match stack operations with non-terminal replacements.

10. Deterministic Pushdown Automata (DPDA)


A DPDA is a PDA where for each state and input, at most one transition is possible.

Key Differences Between DPDA and NPDA

Property         DPDA                       NPDA
Determinism      One transition per input   Multiple transitions possible
Power            Less powerful              More powerful
Language Class   Deterministic CFLs         All CFLs
Example Language Accepted by DPDA:

L = {a^n b^n | n ≥ 0}

This language can be recognized by a DPDA but not by any finite automaton.

Conclusion
●​ CFGs define context-free languages, used in compilers and parsers.
●​ Parse trees help in understanding derivation and ambiguity.
●​ Left recursion and left factoring improve parsing efficiency.
●​ PDAs extend finite automata with stacks, making them more powerful for handling
nested structures.
●​ DPDAs are less powerful than NPDAs but still useful for many practical languages.

3. Turing Machines (TM)


1. Introduction to Turing Machines
A Turing Machine (TM) is a mathematical model of computation that defines an abstract
machine capable of simulating any algorithm. It serves as the foundation of computability
theory.

Definition of a Turing Machine

A Turing Machine is a 7-tuple:

M = (Q, Σ, Γ, δ, q0, q_accept, q_reject)
where:

●​ Q = Finite set of states


●​ Σ = Input alphabet (does not include the blank symbol □)
●​ Γ = Tape alphabet (Σ ⊆ Γ, includes □)
●​ δ = Transition function: Q × Γ → Q × Γ × {L, R}
●​ q_0 = Initial state
●​ q_accept = Accept state (halting state)
●​ q_reject = Reject state (halting state)
2. Transition Diagrams for Turing Machines
A transition diagram is a graphical representation of TM states and their transitions.

Example: TM for Language L = {a^n b^n | n ≥ 1}

●​ Initial State (q0): Read a, replace it with X, move right.
●​ Move Right (q1): Skip remaining a's (and Y's), stop at the first b, replace it with Y, move left.
●​ Move Left (q2): Skip back over a's and Y's to the rightmost X, then move right and repeat.
●​ Halting (q_accept): Once every a and b has been replaced, accept.

(q0, a) → (q1, X, R)      ; mark an a
(q1, a) → (q1, a, R)      ; skip remaining a's
(q1, Y) → (q1, Y, R)      ; skip already-marked b's
(q1, b) → (q2, Y, L)      ; mark the matching b
(q2, a) → (q2, a, L)      ; walk back left
(q2, Y) → (q2, Y, L)
(q2, X) → (q0, X, R)      ; back at the last X: next round
(q0, Y) → (q3, Y, R)      ; all a's marked: verify the rest
(q3, Y) → (q3, Y, R)
(q3, □) → (q_accept, □, R)
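
A direct simulation of this transition table in Python, with the tape held as a dictionary and '_' standing for the blank symbol □ (the driver itself is an illustrative sketch):

delta = {
    ("q0", "a"): ("q1", "X", +1), ("q0", "Y"): ("q3", "Y", +1),
    ("q1", "a"): ("q1", "a", +1), ("q1", "Y"): ("q1", "Y", +1),
    ("q1", "b"): ("q2", "Y", -1),
    ("q2", "a"): ("q2", "a", -1), ("q2", "Y"): ("q2", "Y", -1),
    ("q2", "X"): ("q0", "X", +1),
    ("q3", "Y"): ("q3", "Y", +1), ("q3", "_"): ("q_accept", "_", +1),
}

def run(w: str) -> bool:
    tape, head, state = dict(enumerate(w)), 0, "q0"
    while state != "q_accept":
        key = (state, tape.get(head, "_"))
        if key not in delta:
            return False              # no applicable move: reject
        state, sym, move = delta[key]
        tape[head] = sym              # write, then move the head
        head += move
    return True

print([w for w in ["ab", "aabb", "aab", "ba", ""] if run(w)])
# ['ab', 'aabb']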

3. Language of a Turing Machine


A language L is recognized by a Turing Machine M if M accepts exactly the strings in L (and no others).

●​ A TM accepts a language if it reaches q_accept.


●​ A TM decides a language if it halts for all inputs (accepting or rejecting).

Types of Languages Recognized by TMs:

1.​ Recursively Enumerable Languages (RE): A TM accepts all strings in L, but may not
halt for non-members.
2.​ Recursive Languages: A TM decides the language, meaning it halts on all inputs
(either accepting or rejecting).

4. Turing Machines and Halting


The Halting Problem: Given a Turing Machine M and input w, determine if M halts on w.

●​ Undecidable: No algorithm can decide if every Turing Machine halts on every input.
●​ Proof (By Contradiction): Suppose a TM H could decide halting. Construct a machine D that, given a machine description M, runs H(M, M) and does the opposite: it loops forever if H says "halts" and halts if H says "loops". Then D run on its own description contradicts H's answer.

Conclusion: The Halting Problem is undecidable.
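
The diagonal machine D can be sketched in Python; H here is any *claimed* halting decider, an assumption made only to refute it (no real one can exist):

def make_D(H):
    # D does the opposite of whatever H predicts about M run on itself.
    def D(M):
        if H(M, M):          # H says "M halts on M"...
            while True:      # ...so D loops forever instead
                pass
        return               # H says "M loops on M", so D halts
    return D

# Refute one claimed decider: H that always answers "loops" (False).
H = lambda M, w: False
D = make_D(H)
D(D)  # halts, so H(D, D) should have been True
print("H predicted D(D) loops, but it halted: contradiction")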


5. Multitape Turing Machines
A Multitape Turing Machine is an extension of a standard TM with multiple tapes and
independent tape heads.

Formal Definition

A k-tape TM has:

●​ k tapes
●​ k tape heads
●​ δ: Q × Γ^k → Q × Γ^k × {L, R}^k

Example Use Case

●​ String comparison: One tape stores w, another stores w^R (reverse).


●​ Efficient computation: Simulating complex algorithms like sorting and multiplication.

6. Equivalence of One-Tape and Multitape TMs


Even though Multitape TMs appear more powerful, they can be simulated by a single-tape
TM.

Simulation of a k-Tape TM on a Single-Tape TM

●​ Encode multiple tapes into one by separating symbols with special markers (#).
●​ Simulate each tape’s head movements by shifting within the encoded tape.
●​ This transformation may introduce polynomial time overhead, but does not change
computability.

Key Conclusion:

Multitape TMs do not recognize more languages than single-tape TMs.

7. Undecidable Problems About Turing Machines


Certain problems related to TMs are undecidable, meaning no algorithm exists to solve them
for all cases.

Famous Undecidable Problems:

1.​ The Halting Problem: Given (M, w), does M halt on w?


○​ Undecidable (proved by contradiction).
2.​ The Membership Problem: Given (M, w), does M accept w?
○​ Undecidable for Recursively Enumerable Languages.
3.​ The Equivalence Problem: Given two TMs M1 and M2, do they recognize the same
language?
○​ Undecidable.
4.​ The Emptiness Problem: Given a TM M, is L(M) = ∅?
○​ Undecidable.
5.​ Post’s Correspondence Problem (PCP): Given a set of string pairs, is there a
sequence where concatenations match?
○​ Undecidable.

Impact on Computability:

●​ Many problems in formal language theory and real-world computing are undecidable.
●​ Practical consequence: Compilers, AI, and software verification rely on
approximations and heuristics.

Conclusion
●​ Turing Machines model computation and define language classes.
●​ The Halting Problem and other decision problems are undecidable.
●​ Multitape TMs do not increase computability power but improve efficiency.
●​ Undecidability limits what can be algorithmically solved.

4. Compiler Design
1. Introduction to Compiler
A compiler is software that translates source code written in a high-level language into machine code (binary).

Phases of a Compiler

A compiler operates in six main phases:

1.​ Lexical Analysis (Tokenization)


2.​ Syntax Analysis (Parsing)
3.​ Semantic Analysis (Type Checking)
4.​ Intermediate Code Generation (IR Representation)
5.​ Code Optimization (Performance Improvement)
6.​ Code Generation (Machine Code Output)

Additionally, there are two supporting phases:

●​ Symbol Table Management (Tracks variables and functions).


●​ Error Handling (Detects and reports errors).

2. Role of Lexical Analyzer


The Lexical Analyzer is the first phase of the compiler. It reads the source code and converts
it into tokens.

Responsibilities of Lexical Analyzer

●​ Tokenization: Identifies keywords, identifiers, operators, literals, etc.


●​ Eliminates Whitespace & Comments.
●​ Error Detection: Reports invalid characters.
●​ Interaction with Symbol Table: Stores variable and function names.

Example

For input:

int x = 10;

Lexical Analysis produces:

TOKEN(INT), TOKEN(IDENTIFIER(x)), TOKEN(ASSIGNMENT),
TOKEN(NUMBER(10)), TOKEN(SEMICOLON)
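
A toy lexer for this fragment using Python's re module (the token names and pattern set are assumptions matching the example output above):

import re

TOKEN_SPEC = [
    ("INT",        r"\bint\b"),
    ("NUMBER",     r"[0-9]+"),
    ("IDENTIFIER", r"[a-zA-Z_][a-zA-Z0-9_]*"),
    ("ASSIGNMENT", r"="),
    ("SEMICOLON",  r";"),
    ("SKIP",       r"\s+"),   # whitespace is discarded
]
MASTER = re.compile("|".join(f"(?P<{n}>{p})" for n, p in TOKEN_SPEC))

def tokenize(code):
    for m in MASTER.finditer(code):
        if m.lastgroup != "SKIP":
            yield m.lastgroup, m.group()

print(list(tokenize("int x = 10;")))
# [('INT', 'int'), ('IDENTIFIER', 'x'), ('ASSIGNMENT', '='),
#  ('NUMBER', '10'), ('SEMICOLON', ';')]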

3. Specification & Recognition of Tokens Using Regular Expressions

Each token type can be defined using regular expressions:

Token Type   Regular Expression
Identifier   [a-zA-Z_][a-zA-Z0-9_]*
Number       [0-9]+(\.[0-9]+)?
Keyword      if | else | while | …
Operator     + | - | * | …
The Lexical Analyzer uses Finite Automata (FA) to recognize these patterns.

4. Syntax Analysis (Parsing)


The Syntax Analyzer (Parser) checks whether the given tokens form a valid program based on
the grammar of the language.

Types of Parsing:

●​ Top-Down Parsing (Starts from the root and expands)


●​ Bottom-Up Parsing (Starts from tokens and reduces to the root)

5. Top-Down Parsing
Top-Down Parsing starts from the start symbol and applies derivation rules to match the
input.

Recursive Descent Parsing

●​ A manual implementation of top-down parsing using recursive functions.


●​ Works well for small grammars, but fails for left-recursive grammars.

Example Grammar:

E → T + E | T
T → F * T | F
F → (E) | id

Recursive Functions:

void E() { T(); if (nextToken == '+') { match('+'); E(); } }   /* E → T + E | T */
void T() { F(); if (nextToken == '*') { match('*'); T(); } }   /* T → F * T | F */
void F() {                                                     /* F → (E) | id  */
    if (nextToken == ID) match(ID);
    else if (nextToken == '(') { match('('); E(); match(')'); }
}
Predictive Parsing (LL(1))

●​ Eliminates recursion by using a lookahead symbol.


●​ Uses a Parsing Table to decide which rule to apply.

LL(1) Parsing Table Construction:

1.​ Compute First and Follow sets.


2.​ Fill table using these sets.
3.​ Use Stack-Based Parsing for efficient implementation.
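
The resulting parser is a simple stack loop driven by the table; a minimal sketch using the tiny grammar S → aSb | ε instead of the expression grammar above (the hand-computed table and token encoding are assumptions):

# LL(1) table for S -> a S b | ε:
# First(aSb) = {a}; S -> ε is chosen on Follow(S) = {b, $}
TABLE = {
    ("S", "a"): ["a", "S", "b"],
    ("S", "b"): [],   # S -> ε
    ("S", "$"): [],   # S -> ε
}

def ll1_parse(tokens):
    stack = ["$", "S"]               # end marker below the start symbol
    tokens = list(tokens) + ["$"]
    i = 0
    while stack:
        top = stack.pop()
        if top == tokens[i]:         # terminal or $: match and advance
            i += 1
        elif (top, tokens[i]) in TABLE:
            stack.extend(reversed(TABLE[(top, tokens[i])]))  # expand rule
        else:
            return False             # no table entry: syntax error
    return i == len(tokens)

print(ll1_parse("aabb"), ll1_parse("aab"))  # True False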

6. Bottom-Up Parsing
Bottom-Up Parsing starts from tokens and reduces them to the start symbol.

Shift-Reduce Parsing

●​ Shift: Read input token and push it onto the stack.


●​ Reduce: Replace symbols on the stack using grammar rules.
●​ Repeat until the start symbol is reached.

LR Parsers (Left-to-Right, Rightmost Derivation in Reverse)

LR Parsers are efficient and powerful bottom-up parsers.

Types of LR Parsers

Type      Meaning        Technique                  Parsing Table
LR(0)     Simple LR(0)   Basic shift-reduce         Large
SLR(1)    Simple LR      Uses Follow sets           Smaller
CLR(1)    Canonical LR   Full one-token lookahead   Largest
LALR(1)   Lookahead LR   Merges CLR states          Smaller

7. Compiler Construction Tools


Compiler tools help automate the development of compiler components.
Lexical Analyzer Generators

●​ Lex/Flex: Generates lexical analyzers from regular expressions.

Parser Generators

●​ YACC/Bison: Creates parsers using grammar rules.

Intermediate Code Generators

●​ LLVM: Provides intermediate representation for optimization.

Code Optimization Tools

●​ GCC: Uses various optimizations to improve performance.

Conclusion
●​ Lexical Analysis converts source code into tokens using Regular Expressions.
●​ Parsing checks syntax using Top-Down (LL(1)) or Bottom-Up (LR) techniques.
●​ Compiler Construction Tools like Lex/YACC simplify compiler development.
●​ LR Parsing is powerful for automatic parser generation.

5. Syntax Directed Translation (SDT)


1. Introduction to Syntax Directed Translation (SDT)
Syntax Directed Translation (SDT) uses syntax rules to drive the translation of a source
program into an intermediate representation.

Key Components:

1.​ Syntax-Directed Definitions (SDD)


○​ Associate semantic actions with grammar rules.
2.​ Translation Schemes
○​ SDTs with explicit order of evaluation (e.g., actions within productions).

2. Dependency Graphs
A Dependency Graph represents dependencies between semantic actions.
Components of Dependency Graphs:

●​ Nodes: Represent attributes of grammar symbols.


●​ Edges: Show dependencies (evaluation order).

Example

For the grammar rule:

E → E1 + T   { E.val = E1.val + T.val }
T → int      { T.val = int.lexval }

Dependency Graph:

int.lexval ──→ T.val ──┐
                       ├──→ E.val
           E1.val ─────┘

Evaluation Order: Compute T.val, then E1.val, then E.val.

3. S-Attributed & L-Attributed Definitions


S-Attributed Definitions

●​ Only synthesized attributes (depend on child nodes).


●​ Evaluated in bottom-up order.
●​ Works well with bottom-up parsing (LR parsing).

Example:

E → T { E.val = T.val }
T → num { T.val = num.lexval }

Evaluation order: num.lexval → T.val → E.val (Bottom-Up).


L-Attributed Definitions

●​ Synthesized & Inherited attributes (can depend on left siblings).


●​ Evaluated in left-to-right order.
●​ Works well with top-down parsing (LL parsing).

Example:

E → T E'      { E.val = T.val + E'.val }
E' → + T E'   { E'.val = T.val + E'1.val }
E' → ε        { E'.val = 0 }

Evaluation Order: Left to Right.

4. Type Checking
Ensures semantic correctness of expressions, function calls, etc.

Types of Type Checking:

1.​ Static Type Checking: Performed at compile-time (e.g., C, Java).


2.​ Dynamic Type Checking: Performed at run-time (e.g., Python, JavaScript).

Type Checking Methods:

●​ Type Inference: Deriving types from context.


●​ Symbol Table Lookup: Verifies declared types.

Example:

int x = "hello"; // Type Error: int ≠ string

5. Intermediate Code Generation (ICG)


Intermediate Code Generation bridges the gap between syntax analysis and machine code
generation.

Intermediate Code Representations:

1.​ Abstract Syntax Trees (ASTs)


○​ Compact representation of parsed input.
2.​ Directed Acyclic Graphs (DAGs)
○​ Optimizes expressions by removing redundancy.
3.​ Three-Address Code (TAC)
○​ Uses temporary variables for operations.
4.​ Quadruples & Triples
○​ Represent operations with tuples.
5.​ Static Single Assignment (SSA)
○​ Ensures each variable is assigned only once.

6. Directed Acyclic Graph (DAG)


A DAG eliminates common subexpressions, improving optimization.

Example:

Expression

a = b + c
d = a - e
f = b + c

DAG Representation:

        (-)  ← d
       /   \
     (+)    e
    /   \        (the (+) node carries both labels a and f)
   b     c

Optimization: f = b + c reuses the existing (+) node, so the second computation of b + c is eliminated.

7. Three-Address Code (TAC)


TAC uses temporary variables to break down complex expressions.
Example

For expression:

x = (a + b) * (c - d)

TAC Representation:

t1 = a + b
t2 = c - d
x = t1 * t2

This simplifies code generation and optimization.
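
TAC generation is a post-order walk of the expression tree; a sketch assuming the AST is encoded as nested (op, left, right) tuples:

import itertools

def gen_tac(node, code, temps):
    # Leaves are variable/constant names; internal nodes spill to a temp.
    if isinstance(node, str):
        return node
    op, left, right = node
    l = gen_tac(left, code, temps)
    r = gen_tac(right, code, temps)
    tmp = f"t{next(temps)}"
    code.append(f"{tmp} = {l} {op} {r}")
    return tmp

# x = (a + b) * (c - d)
code = []
result = gen_tac(("*", ("+", "a", "b"), ("-", "c", "d")), code, itertools.count(1))
code.append(f"x = {result}")
print("\n".join(code))
# t1 = a + b
# t2 = c - d
# t3 = t1 * t2
# x = t3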

8. Quadruples & Triples


Quadruples Representation

Each instruction has 4 fields:

(operator, arg1, arg2, result)

Example:

(+, a, b, t1)
(-, c, d, t2)
(*, t1, t2, x)

Triples Representation

Each instruction has 3 fields (without explicit result variable):

(operator, arg1, arg2)

Example:

(0) (+, a, b)
(1) (-, c, d)
(2) (*, (0), (1))

Here (0) and (1) refer to the results of earlier triples by position. Triples avoid explicit temporaries but are harder to modify, since reordering instructions changes the position references.

9. Static Single Assignment (SSA)


SSA ensures each variable is assigned only once by renaming variables when needed.

Example (Before SSA)


x = a + b
x = x * c

After SSA Transformation


x1 = a + b
x2 = x1 * c

Benefits of SSA:

●​ Simplifies optimization (constant propagation, dead code elimination).


●​ Used in modern compilers like LLVM.
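
For straight-line code, SSA renaming is a single pass with a version counter per variable; a minimal sketch (branching code would additionally need φ-functions, omitted here):

import re

def to_ssa(lines):
    version, out = {}, []
    for line in lines:
        target, expr = (p.strip() for p in line.split("=", 1))
        # Uses on the right-hand side refer to the current versions.
        for var, ver in version.items():
            expr = re.sub(rf"\b{var}\b", f"{var}{ver}", expr)
        version[target] = version.get(target, 0) + 1
        out.append(f"{target}{version[target]} = {expr}")
    return out

print("\n".join(to_ssa(["x = a + b", "x = x * c"])))
# x1 = a + b
# x2 = x1 * c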

Conclusion
●​ Syntax Directed Translation (SDT) assigns actions to grammar rules.
●​ Dependency Graphs determine the order of evaluation.
●​ S-Attributed & L-Attributed Definitions affect parsing methods.
●​ Type Checking ensures semantic correctness.
●​ Intermediate Code Generation bridges parsing & machine code.
●​ DAGs, TAC, Quadruples, Triples, and SSA optimize code representation.

6. Code Generation and Optimization


1. Introduction to Code Generation
Code Generation is the final phase of a compiler that translates intermediate representation
(IR) into machine code or assembly language.
Issues in Code Generation

1.​ Target Machine Dependencies


○​ Different architectures (RISC, CISC) require different code generation strategies.
2.​ Instruction Selection
○​ Choosing the most efficient machine instructions for operations.
3.​ Register Allocation
○​ Efficient use of CPU registers to minimize memory accesses.
4.​ Code Size vs. Execution Speed Trade-off
○​ Optimized code should be both fast and space-efficient.
5.​ Handling Variables and Memory Management
○​ Static vs. Stack allocation for variables.

2. Static & Stack Allocation


Static Allocation

●​ Memory is allocated at compile time.


●​ Used for global variables, constants, and static data.
●​ No recursion or dynamic memory allocation support.

Stack Allocation

●​ Memory is managed at runtime using a stack.


●​ Supports local variables, function calls, and recursion.
●​ Uses Activation Records (AR) to store function parameters, return values, and local
variables.

Example Stack Layout for a Function Call:

| Return Address    |
| Old Stack Pointer |
| Parameters        |
| Local Variables   |
| Temporary Values  |

3. Basic Blocks & Flow Graphs


Basic Block
A basic block is a sequence of instructions that:

●​ Has one entry point (no jumps into the middle).


●​ Has one exit point (no jumps except at the end).

Example Basic Block:

t1 = a + b
t2 = t1 * c
x = t2 - d

This executes sequentially without branching.

Flow Graph

A Flow Graph represents the control flow between basic blocks.

●​ Nodes → Basic Blocks.


●​ Edges → Control flow between blocks.

Example Control Flow Graph (CFG):

B1: a = b + c
        |
B2: if a > 0 goto B3 else goto B4
       /        \
      v          v
B3: x = a    B4: y = -a

Flow graphs help in optimizing control flow and detecting loops.

4. Optimization of Basic Blocks


Optimization techniques improve execution time and reduce redundant computations.
Key Optimizations:

1.​ Common Subexpression Elimination​

○​ Remove duplicate computations.

Example:
x = a + b;
y = a + b; // Redundant

Optimized:
x = a + b;
y = x;

2.​ Constant Folding​

○​ Compute constant expressions at compile time.

Example:​

int x = 5 * 4; // Compiler replaces with x = 20

3.​ Dead Code Elimination​

○​ Remove unused variables and computations.

Example:​

int x = 10;

x = 20; // The first assignment is useless

4.​ Algebraic Simplification​

○​ Apply algebraic identities to simplify expressions.

Example:​

x = y * 1; // Simplified to x = y

y = z + 0; // Simplified to y = z
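
Several of these rewrites fit in one pass over TAC lines; a toy sketch covering constant folding and two algebraic identities (the TAC string format is an assumption):

import re

def simplify(line):
    # Handle lines of the form "dst = a op b" with op in {+, *}.
    m = re.fullmatch(r"(\w+) = (\w+) ([+*]) (\w+)", line)
    if not m:
        return line
    dst, a, op, b = m.groups()
    if a.isdigit() and b.isdigit():       # constant folding
        val = int(a) + int(b) if op == "+" else int(a) * int(b)
        return f"{dst} = {val}"
    if op == "*" and b == "1":            # x = y * 1  ->  x = y
        return f"{dst} = {a}"
    if op == "+" and b == "0":            # x = y + 0  ->  x = y
        return f"{dst} = {a}"
    return line

for line in ["x = 5 * 4", "x = y * 1", "y = z + 0", "w = a + b"]:
    print(line, " -> ", simplify(line))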
5. Simple Code Generator
A simple code generator translates three-address code (TAC) to assembly code.

Example TAC → Assembly

TAC Input:

t1 = a + b
t2 = t1 * c
x = t2 - d

Generated Assembly Code (x86):

MOV R1, a
ADD R1, b
MOV R2, R1
MUL R2, c
MOV R3, R2
SUB R3, d
MOV x, R3

●​ Uses registers R1, R2, R3 to store intermediate results.

6. Code Optimization
Code optimization improves execution speed and reduces memory usage.

Principal Sources of Optimization:

1.​ Redundant Computation Removal


2.​ Loop Optimization (reducing unnecessary computations inside loops).
3.​ Register Allocation (minimizing memory access).
4.​ Control Flow Optimization (removing unnecessary jumps).
7. Peephole Optimization
Peephole optimization is a local optimization technique that improves small instruction
sequences.

Common Peephole Optimizations:

1.​ Constant Folding & Propagation​

Example:
MOV R1, 4
MOV R2, 5
ADD R1, R2 ; R1 = 9 (can be computed at compile time)

Optimized:
MOV R1, 9

2.​ Eliminating Unreachable Code​

Example:
return 5;
x = 10; // Unreachable

Optimized:
return 5;

3.​ Strength Reduction (Replacing expensive operations with cheaper ones)​

Example:
MUL R1, 2 ; Multiplication is costly

Optimized:
SHL R1, 1 ; Bitwise shift (faster)
4.​ Eliminating Redundant Load/Store Instructions​

Example:
MOV R1, a
MOV a, R1 ; Unnecessary store

Optimized:
MOV R1, a
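
A peephole pass literally slides a two-instruction window over the code; a sketch that removes the redundant store-back pattern above (the instruction syntax follows the examples and is an assumption):

def peephole(instrs):
    # Drop 'MOV x, y' when it immediately follows 'MOV y, x'.
    out, i = [], 0
    while i < len(instrs):
        if (i + 1 < len(instrs)
                and instrs[i].startswith("MOV") and instrs[i + 1].startswith("MOV")):
            ops1 = [s.strip() for s in instrs[i][4:].split(",")]
            ops2 = [s.strip() for s in instrs[i + 1][4:].split(",")]
            if ops1 == ops2[::-1]:        # second MOV just stores back
                out.append(instrs[i])
                i += 2
                continue
        out.append(instrs[i])
        i += 1
    return out

print(peephole(["MOV R1, a", "MOV a, R1", "ADD R1, b"]))
# ['MOV R1, a', 'ADD R1, b']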

Conclusion
●​ Code Generation converts IR into machine code.
●​ Basic Blocks & Flow Graphs help in control flow analysis.
●​ Optimizations improve speed and reduce code size.
●​ Peephole Optimization applies local optimizations for efficiency.
