
Implementation of Scanner: Build Understand Categorize

The document discusses implementing a scanner and constructing a symbol table. It describes building a scanner to tokenize source code into tokens, patterns, and lexemes. It also discusses constructing a symbol table to store semantic information about identifiers, with entries for names, types, and locations. The symbol table is implemented as a linked list of hash tables, one per block level, to lookup identifiers in the correct scope.

Uploaded by ADNAN

Implementation of Scanner

Objectives:
• Be able to build a simple scanner
• Be able to understand symbol table construction
• Be able to categorize source program components
Tokens, Patterns, and Lexemes
• A token is a classification of lexical units
• For example: id and num
• Lexemes are the specific character strings that make up a token
• For example: abc and 123
• Patterns are rules describing the set of lexemes belonging to a token
• For example: “letter followed by letters and digits” and “non-empty sequence of digits”
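The two example patterns above can be turned into a very small scanner. The following is a minimal sketch in C, not the course's project code: the names next_token, TOK_EOF, and TOK_ERROR are assumptions made for illustration, and only the id and num patterns from this slide are handled.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Token classes from the slide; TOK_EOF and TOK_ERROR are additions for this sketch. */
enum token { TOK_ID, TOK_NUM, TOK_EOF, TOK_ERROR };

/*
 * Scan one token starting at *src. On return, *start points at the lexeme,
 * *len holds its length, and *src is advanced past the lexeme.
 */
enum token next_token(const char **src, const char **start, size_t *len)
{
    assert(src != NULL && *src != NULL && start != NULL && len != NULL);
    const char *p = *src;

    while (isspace((unsigned char)*p))    /* skip white space between lexemes */
        p++;
    *start = p;

    enum token tok;
    if (*p == '\0') {
        tok = TOK_EOF;
    } else if (isalpha((unsigned char)*p)) {
        /* pattern for id: letter followed by letters and digits */
        while (isalnum((unsigned char)*p))
            p++;
        tok = TOK_ID;
    } else if (isdigit((unsigned char)*p)) {
        /* pattern for num: non-empty sequence of digits */
        while (isdigit((unsigned char)*p))
            p++;
        tok = TOK_NUM;
    } else {
        p++;                              /* consume one unrecognized character */
        tok = TOK_ERROR;
    }

    *len = (size_t)(p - *start);
    *src = p;
    return tok;
}
```

Calling next_token repeatedly on the input "abc 123" would yield the token id with lexeme abc, then num with lexeme 123, then the end-of-input marker.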

Regular Definitions and Grammars

Grammar
stmt → if expr then stmt
     | if expr then stmt else stmt
expr → term relop term
     | term
term → id
     | num

Regular definitions
if → if
then → then
else → else
relop → < | <= | <> | > | >= | =
id → letter ( letter | digit )*
num → digit+ (. digit+)? ( E (+|-)? digit+ )?
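The regular definition for num can be checked by hand-coding it as a matcher. This is a sketch under the assumption that the pattern is digit+ (. digit+)? ( E (+|-)? digit+ )?; the function names digits and match_num are hypothetical. It returns the length of the longest prefix that matches, which is how a scanner would decide where the lexeme ends.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Consume digit+ at p; return a pointer past the digits, or NULL if there are none. */
static const char *digits(const char *p)
{
    if (!isdigit((unsigned char)*p))
        return NULL;
    while (isdigit((unsigned char)*p))
        p++;
    return p;
}

/*
 * Return the length of the longest prefix of s that matches
 *   num -> digit+ (. digit+)? ( E (+|-)? digit+ )?
 * or 0 if s does not start with a num.
 */
size_t match_num(const char *s)
{
    assert(s != NULL);
    const char *p = digits(s);
    if (p == NULL)
        return 0;

    if (*p == '.') {                      /* optional fraction: . digit+ */
        const char *q = digits(p + 1);
        if (q != NULL)
            p = q;
    }
    if (*p == 'E') {                      /* optional exponent: E (+|-)? digit+ */
        const char *q = p + 1;
        if (*q == '+' || *q == '-')
            q++;
        q = digits(q);
        if (q != NULL)
            p = q;
    }
    return (size_t)(p - s);
}
```

An optional part is accepted only when it is complete, so for the input 12. the matcher stops after 12 and leaves the dot for the next token.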
Context Free Grammar

How to define a CFG for a mathematical expression?
Testing? Or verification?
BNF Notation

• The symbol ::= is often used for ->.
• The symbol | is used for “or”.
• It is a shorthand for a list of productions with the same left side.
• Example: S -> 0S1 | 01
• In BNF: <S> ::= 0<S>1 | 01
• S -> 0A0 | 1A1 | 0 | 1 | ε (convert it yourself)

What strings are produced by these CFGs?

• S → 1S | 0A0S | ε
• A → 1A | ε
And

• S → 1S | 0T | ε
• T → 1T | 0S
Applications of CFG’s
1- Validity of syntax (parsing)
• A -> AA | (A) | ε

• S -> SS | iS | iSeS | ε

• <if_stmt> → if <logic_expr> then <stmt> | if <logic_expr> then <stmt> else <stmt>
<stmt> → <if_stmt> | <any non-if statement>
2- Context-free grammars for mathematical expressions

Grammar for expressions (another form)


1. goal → expr
2. expr → expr op term
3.      | term
4. term → number
5.      | id
6. op → +
7.    | -

The Front End
• For this CFG
S = goal
T = { number, id, +, -}
N = { goal, expr, term, op}
P = { 1, 2, 3, 4, 5, 6, 7}

Parse
Production   Result
             goal
1            expr
2            expr op term
5            expr op y
7            expr - y
2            expr op term - y
4            expr op 2 - y
6            expr + 2 - y
3            term + 2 - y
5            x + 2 - y
Example Grammar

Context-free grammar for simple expressions:

G = <{list,digit}, {+,-,0,1,2,3,4,5,6,7,8,9}, P, list>

with productions P =

list → list + digit

list → list - digit

list → digit

digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Table generation
Name   Class   Type              Other    *Location
fun    proc    int, int → int    -        23
a      para    int               local    4
bin    para    boolean           global   35

How do we construct the table? Which data structure? Which operations?


The Symbol Table
• When identifiers are found, they will be entered into a symbol table,
which will hold all relevant information about identifiers.
• This information will be used later by the semantic analyzer and the
code generator.

[Figure: Lexical Analyzer → Syntax Analyzer → Semantic Analyzer → Code Generator, all connected to the Symbol Table]
Symbol Table

The symbol table is globally accessible (to all phases of the compiler)

Each entry in the symbol table contains a string and a token value:
struct entry
{ char *lexptr; /* lexeme (string) for this entry */
  int token;    /* token value */
};
struct entry symtable[];

insert(s, t): returns the array index of the new entry for string s with token t
lookup(s): returns the array index of the entry for string s, or 0 if s is not in the table

Possible implementations:
- simple C code as in the project
- hash tables
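The simple array variant can be sketched as follows. This is an illustrative version under stated assumptions, not the project's code: the capacity SYMMAX and the counter lastentry are hypothetical names, and slot 0 is deliberately left unused so that lookup can return 0 to mean "not found", as described above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SYMMAX 100                 /* assumed capacity for this sketch */

struct entry {
    char *lexptr;                  /* lexeme (string) for this entry */
    int   token;                   /* token value */
};

/* Slot 0 is left unused so that lookup() can return 0 for "not found". */
static struct entry symtable[SYMMAX];
static int lastentry = 0;          /* index of the last used slot */

/* Return the index of the entry for s, or 0 if s is not in the table. */
int lookup(const char *s)
{
    assert(s != NULL);
    for (int i = lastentry; i > 0; i--)
        if (strcmp(symtable[i].lexptr, s) == 0)
            return i;
    return 0;
}

/* Add an entry for string s with token t; return its index, or 0 on failure. */
int insert(const char *s, int t)
{
    assert(s != NULL);
    if (lastentry + 1 >= SYMMAX)
        return 0;                  /* table is full */
    char *copy = malloc(strlen(s) + 1);
    if (copy == NULL)
        return 0;                  /* out of memory */
    strcpy(copy, s);

    lastentry++;
    symtable[lastentry].lexptr = copy;
    symtable[lastentry].token = t;
    return lastentry;
}
```

A scanner would typically call lookup first and call insert only when the result is 0. The linear search is easy to follow but slow for large programs, which is the motivation for the hash table variant.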
Structure of the Symbol Table
• We will implement the symbol table as a linked list of hash tables, one
hash table for each block level.

Level 3: hash table of locals → Level 2: hash table of globals → Level 1: hash table of keywords → Level 0: null
Looking up a Symbol

• If an identifier is declared both globally and locally, which one will be found when it is looked up?
• If an identifier is declared only globally and we are in a function, how will it be found?
• How do we prevent the use of a keyword as a variable name?
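The questions above can be explored with a small sketch of the linked list of hash tables. Everything named here (struct scope, scope_push, scope_declare, scope_lookup, the bucket count NBUCKETS, and the hash function) is an assumption made for illustration. Lookup starts at the innermost table and follows the links outward, reporting the level at which the name was found.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 31                       /* assumed bucket count */

struct symbol {
    char *name;
    int kind;                             /* caller-defined, e.g. keyword or variable */
    struct symbol *next;                  /* chain within one bucket */
};

struct scope {
    int level;                            /* block level of this hash table */
    struct symbol *buckets[NBUCKETS];
    struct scope *outer;                  /* next table toward level 0, or NULL */
};

static unsigned hash(const char *s)
{
    unsigned h = 0;
    while (*s != '\0')
        h = h * 31u + (unsigned char)*s++;
    return h % NBUCKETS;
}

/* Open a new block: put an empty hash table in front of outer. */
struct scope *scope_push(struct scope *outer)
{
    struct scope *sc = calloc(1, sizeof *sc);
    if (sc == NULL)
        return NULL;
    sc->level = outer ? outer->level + 1 : 1;
    sc->outer = outer;
    return sc;
}

/* Declare name in scope sc only; return NULL if out of memory. */
struct symbol *scope_declare(struct scope *sc, const char *name, int kind)
{
    assert(sc != NULL && name != NULL);
    struct symbol *sym = malloc(sizeof *sym);
    if (sym == NULL)
        return NULL;
    sym->name = malloc(strlen(name) + 1);
    if (sym->name == NULL) {
        free(sym);
        return NULL;
    }
    strcpy(sym->name, name);
    sym->kind = kind;
    unsigned h = hash(name);
    sym->next = sc->buckets[h];
    sc->buckets[h] = sym;
    return sym;
}

/* Search from the innermost table outward; *level receives the level of the match. */
struct symbol *scope_lookup(struct scope *sc, const char *name, int *level)
{
    assert(name != NULL);
    unsigned h = hash(name);
    for (; sc != NULL; sc = sc->outer)
        for (struct symbol *sym = sc->buckets[h]; sym != NULL; sym = sym->next)
            if (strcmp(sym->name, name) == 0) {
                if (level != NULL)
                    *level = sc->level;
                return sym;
            }
    return NULL;
}
```

With this layout, a name declared both locally and globally is found in the local table first, and a name declared only globally is found by following the link to the next table. One way to reject a keyword as a variable name is to look the name up before declaring it and refuse the declaration when the match comes from level 1.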
Summary

• The first phase of compiler construction is lexical analysis.

• A CFG is used to design the parser.

• Leftmost and rightmost derivations are used to check ambiguity: a grammar is ambiguous if some string has more than one leftmost (or rightmost) derivation.

• There are seven phases of a compiler.
