Lexical Analysis of Compiler
Lexical Analysis
Tokens
Ad-hoc Lexer
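Token nextToken()
{
    // Test for a digit first: idChar() also accepts digits, so testing it
    // first would send every number to readId().
    if ( isDigit(next) )
        return readNumber();
    if ( idChar(next) )
        return readId();
    if ( next == '"' )
        return readString();
    . . . .
}
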
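Token readId()
{
    string id = "";
    // Assumes next is the one-character lookahead tested in nextToken():
    // it already holds the first character of the identifier.
    while ( idChar(next) )
    {
        id = id + string(next);
        next = input.read();
    }
    // next now holds the first character after the identifier, so it is not lost.
    return new Token(TID, id);
}
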
boolean idChar(char c)
{
    if ( isAlpha(c) )
        return true;
    if ( isDigit(c) )
        return true;
    if ( c == '_' )
        return true;
    return false;
}

Token readNumber()
{
    string num = "";
    // next holds the first digit on entry; append it before reading further.
    while ( isDigit(next) )
    {
        num = num + string(next);
        next = input.read();
    }
    return new Token(TNUM, num);
}
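
The fragments above can be assembled into one small lexer. The following is a minimal sketch in Java, not the slides' own implementation. The names Kind, AdHocLexer, peek and isIdStart, the '\0' end-of-input sentinel, the whitespace skipping, and the escape-free readString are all assumptions made for illustration. TID and TNUM follow the slides.

```java
// Hypothetical token kinds; TID and TNUM mirror the names used in the slides.
enum Kind { TID, TNUM, TSTRING, TEOF }

class Token {
    final Kind kind;
    final String text;
    Token(Kind kind, String text) { this.kind = kind; this.text = text; }
}

class AdHocLexer {
    private static final char EOF = '\0';   // assumed end-of-input sentinel
    private final String src;
    private int pos = 0;

    AdHocLexer(String src) { this.src = src; }

    // One-character lookahead: inspect the next character without consuming it.
    private char peek() { return pos < src.length() ? src.charAt(pos) : EOF; }

    private static boolean isIdStart(char c) { return Character.isLetter(c) || c == '_'; }
    private static boolean isIdChar(char c) { return isIdStart(c) || Character.isDigit(c); }

    // Dispatch on the first character of the token, as in the slides.
    Token nextToken() {
        while (Character.isWhitespace(peek())) pos++;   // skip blanks between tokens
        char next = peek();
        if (next == EOF) return new Token(Kind.TEOF, "");
        if (isIdStart(next)) return readId();
        if (Character.isDigit(next)) return readNumber();
        if (next == '"') return readString();
        throw new IllegalStateException("unexpected character: " + next);
    }

    private Token readId() {
        StringBuilder id = new StringBuilder();
        while (isIdChar(peek())) id.append(src.charAt(pos++));
        return new Token(Kind.TID, id.toString());
    }

    private Token readNumber() {
        StringBuilder num = new StringBuilder();
        while (Character.isDigit(peek())) num.append(src.charAt(pos++));
        return new Token(Kind.TNUM, num.toString());
    }

    // Assumed behaviour: a string runs to the next double quote, with no escapes.
    private Token readString() {
        pos++;                                          // skip the opening quote
        StringBuilder s = new StringBuilder();
        while (peek() != '"' && peek() != EOF) s.append(src.charAt(pos++));
        if (peek() == EOF) throw new IllegalStateException("unterminated string");
        pos++;                                          // skip the closing quote
        return new Token(Kind.TSTRING, s.toString());
    }
}
```

Calling nextToken() repeatedly on new AdHocLexer("count_1 42 \"hi\"") should yield a TID, a TNUM and a TSTRING token, followed by TEOF. Because peek() never consumes a character, the character that ends one token is still available as the start of the next.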
Problems:
• The first character alone does not always tell us which kind of token we are about to read.
• If a token begins with “i”, is it the identifier “i” or the keyword “if”?
• If a token begins with “=”, is it “=” or “==”?
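
One common way to handle both ambiguities in a hand-written lexer is to read the longest possible lexeme and decide its kind afterwards. The sketch below, in Java, illustrates that idea under stated assumptions. It is not taken from the slides: the names TokKind, Tok and LookaheadLexer and the keyword set are hypothetical. A whole word is read first and then looked up in a keyword table, and “=” is followed by a one-character lookahead to detect “==”.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical token kinds for this illustration only.
enum TokKind { ID, KEYWORD, ASSIGN, EQ, EOF }

class Tok {
    final TokKind kind;
    final String text;
    Tok(TokKind kind, String text) { this.kind = kind; this.text = text; }
}

class LookaheadLexer {
    // Illustrative keyword table; a real language would list all of its keywords.
    private static final Set<String> KEYWORDS =
        new HashSet<>(Arrays.asList("if", "else", "while"));

    private final String src;
    private int pos = 0;

    LookaheadLexer(String src) { this.src = src; }

    private static boolean isIdChar(char c) {
        return Character.isLetterOrDigit(c) || c == '_';
    }

    Tok nextToken() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) pos++;
        if (pos >= src.length()) return new Tok(TokKind.EOF, "");

        char c = src.charAt(pos);
        if (Character.isLetter(c) || c == '_') {
            // Read the whole word first, then classify it: "if" is a keyword,
            // while "i" and "iffy" are identifiers.
            int start = pos;
            while (pos < src.length() && isIdChar(src.charAt(pos))) pos++;
            String word = src.substring(start, pos);
            return new Tok(KEYWORDS.contains(word) ? TokKind.KEYWORD : TokKind.ID, word);
        }
        if (c == '=') {
            pos++;
            // Look one character ahead: a second '=' makes this "==".
            if (pos < src.length() && src.charAt(pos) == '=') {
                pos++;
                return new Tok(TokKind.EQ, "==");
            }
            return new Tok(TokKind.ASSIGN, "=");
        }
        throw new IllegalStateException("unexpected character: " + c);
    }
}
```

With this approach the decision is deferred until the whole lexeme has been read, so the first character only selects which reading routine to run.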