Experiment No. 9 3118013: Aim: Theory: Lexical Analyzer

The document describes the use of the FLEX tool for generating lexical analyzers. FLEX reads descriptions of lexical patterns in a file and generates C code for a lexical analyzer. The generated code defines a function that scans input for tokens defined by regular expressions. The code sample provided shows a FLEX specification file that defines tokens for a programming language and the corresponding C code generated by FLEX.

Uploaded by

Husain Gadiwala

EXPERIMENT NO. 9 3118013

AIM: To study and implement an experiment using the lexical analyzer tool FLEX.

THEORY:

LEXICAL ANALYZER
Lexical analysis or scanning is the process where the stream of characters making
up the source program is read from left-to-right and grouped into tokens. Tokens are
sequences of characters with a collective meaning. There are usually only a small number of
tokens for a programming language: constants (integer, double, char, string, etc.), operators
(arithmetic, relational, logical), punctuation, and reserved words.
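
The token categories listed above can be illustrated with a small C sketch. The
names TokenKind and classify_lexeme, and the short keyword and operator lists,
are assumptions made for this illustration only; they are not part of FLEX or of
the experiment code.

```c
#include <ctype.h>
#include <string.h>

/* Token categories named in the paragraph above (illustrative subset). */
typedef enum {
    TOK_KEYWORD, TOK_IDENTIFIER, TOK_INTEGER, TOK_OPERATOR,
    TOK_PUNCTUATION, TOK_UNKNOWN
} TokenKind;

/* Classify one already-isolated lexeme (hypothetical helper). */
TokenKind classify_lexeme(const char *lexeme)
{
    static const char *keywords[] = { "int", "if", "else", "while", "return" };
    size_t i, n = strlen(lexeme);

    if (n == 0)
        return TOK_UNKNOWN;

    /* Reserved words are checked first so they are not reported as identifiers. */
    for (i = 0; i < sizeof keywords / sizeof keywords[0]; i++)
        if (strcmp(lexeme, keywords[i]) == 0)
            return TOK_KEYWORD;

    /* Identifier: a letter or underscore followed by letters, digits, underscores. */
    if (isalpha((unsigned char)lexeme[0]) || lexeme[0] == '_') {
        for (i = 1; i < n; i++)
            if (!isalnum((unsigned char)lexeme[i]) && lexeme[i] != '_')
                return TOK_UNKNOWN;
        return TOK_IDENTIFIER;
    }

    /* Integer constant: digits only. */
    if (isdigit((unsigned char)lexeme[0])) {
        for (i = 1; i < n; i++)
            if (!isdigit((unsigned char)lexeme[i]))
                return TOK_UNKNOWN;
        return TOK_INTEGER;
    }

    /* Single-character operators and punctuation. */
    if (n == 1 && strchr("+-*/=<>", lexeme[0]))
        return TOK_OPERATOR;
    if (n == 1 && strchr(",;(){}[]", lexeme[0]))
        return TOK_PUNCTUATION;

    return TOK_UNKNOWN;
}
```

With these assumptions, "while" is classified as a keyword, "slen" as an
identifier, "100" as an integer constant and ";" as punctuation.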

The lexical analyzer is the interface between the source program and the
compiler. It reads the source program one character at a time and groups the
characters into tokens, each of which can be treated as a single entity. The
lexical analyzer (scanner) is the first phase of a compiler. Its main task is to
read the input characters and produce as output a sequence of tokens that the
parser uses for syntax analysis.

[Figure: Source Program, Lexical Analyzer, Parser, Symbol Table]

Fig 9.1 Compilation Process

The lexical analyzer is a subroutine of the parser, i.e. the scanner operates
under the control of the parser. Upon receiving a “get next token” command from
the parser, the lexical analyzer reads input characters until it can identify the
next token.
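
The “get next token” interaction can be sketched in C as follows. This is a
minimal hand-written scanner, not the code FLEX generates; Scanner, Token,
get_next_token and parse_count_tokens are hypothetical names, and the token
types are reduced to identifiers, numbers and single-character symbols.

```c
#include <ctype.h>
#include <stddef.h>

typedef enum { T_IDENT, T_NUMBER, T_SYMBOL, T_END } TokType;

typedef struct {
    TokType type;
    const char *start;  /* first character of the lexeme in the input */
    size_t length;      /* number of characters in the lexeme */
} Token;

typedef struct {
    const char *pos;    /* next unread character */
} Scanner;

/* "Get next token": consume characters until one token is identified. */
Token get_next_token(Scanner *sc)
{
    Token t;

    while (isspace((unsigned char)*sc->pos))
        sc->pos++;                          /* white space separates tokens */

    t.start = sc->pos;
    if (*sc->pos == '\0') {
        t.type = T_END;
    } else if (isalpha((unsigned char)*sc->pos) || *sc->pos == '_') {
        while (isalnum((unsigned char)*sc->pos) || *sc->pos == '_')
            sc->pos++;
        t.type = T_IDENT;
    } else if (isdigit((unsigned char)*sc->pos)) {
        while (isdigit((unsigned char)*sc->pos))
            sc->pos++;
        t.type = T_NUMBER;
    } else {
        sc->pos++;                          /* any other character is one symbol */
        t.type = T_SYMBOL;
    }
    t.length = (size_t)(sc->pos - t.start);
    return t;
}

/* Stand-in for the parser: it drives the scanner one token at a time. */
size_t parse_count_tokens(const char *source)
{
    Scanner sc;
    size_t count = 0;

    sc.pos = source;
    while (get_next_token(&sc).type != T_END)
        count++;
    return count;
}
```

Under these assumptions, parse_count_tokens("slen = strlen(str);") requests
tokens until the end of input and returns 7.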

Identifiers, keywords, constants, operators and punctuation symbols such as
commas and parentheses are typical tokens. What is called a token depends upon
the language at hand and, to some extent, on the discretion of the compiler
designer, but in general each token is a substring of the source program that
has to be treated as a single unit.

Regular expressions are used to specify the tokens. The advantage of using
regular expressions is that a recognizer for the tokens, called a finite
automaton, can be constructed easily from them.
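
As an illustration of such a recognizer, the sketch below hand-codes a
deterministic finite automaton for the identifier pattern
[a-zA-Z_][a-zA-Z0-9_]*. The state names and the functions step and
is_identifier are assumptions chosen for this example; FLEX builds comparable
automata automatically and in table form.

```c
#include <ctype.h>

/* DFA states for the pattern [a-zA-Z_][a-zA-Z0-9_]* */
typedef enum { S_START, S_IN_ID, S_REJECT } State;

/* Transition function: (current state, input character) -> next state. */
static State step(State s, char c)
{
    unsigned char u = (unsigned char)c;

    switch (s) {
    case S_START:
        return (isalpha(u) || c == '_') ? S_IN_ID : S_REJECT;
    case S_IN_ID:
        return (isalnum(u) || c == '_') ? S_IN_ID : S_REJECT;
    default:
        return S_REJECT;        /* S_REJECT is a dead state */
    }
}

/* Returns 1 if the whole string is accepted by the automaton, else 0. */
int is_identifier(const char *s)
{
    State state = S_START;

    for (; *s != '\0'; s++)
        state = step(state, *s);
    return state == S_IN_ID;    /* S_IN_ID is the only accepting state */
}
```

Here "wlen" and "_tmp1" are accepted, while "9lives" and the empty string are
rejected.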

Other functions performed by the lexical analyzer are:

• Removal of comments
• Case conversion
• Removal of white spaces
• Communication with the symbol table, i.e. storing information regarding an
identifier in the symbol table
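
Two of these functions, removal of comments and removal of white space, can be
sketched as a single pass over the input. The function name
strip_comments_and_spaces is hypothetical, and the sketch assumes C-style
comments and does not treat string or character literals specially.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Copy src to dst, dropping comments and reducing every run of white space
 * and comments between two tokens to a single space.
 * dst must have room for strlen(src) + 1 bytes.
 * Assumption: string and character literals are not treated specially.
 */
void strip_comments_and_spaces(const char *src, char *dst)
{
    size_t i = 0, out = 0;
    int pending_space = 0;   /* set when something was skipped since the last copy */

    while (src[i] != '\0') {
        if (src[i] == '/' && src[i + 1] == '/') {
            while (src[i] != '\0' && src[i] != '\n')
                i++;                        /* skip to the end of the line */
            pending_space = 1;
        } else if (src[i] == '/' && src[i + 1] == '*') {
            i += 2;
            while (src[i] != '\0' && !(src[i] == '*' && src[i + 1] == '/'))
                i++;                        /* skip the comment body */
            if (src[i] != '\0')
                i += 2;                     /* skip the closing delimiter */
            pending_space = 1;
        } else if (isspace((unsigned char)src[i])) {
            i++;
            pending_space = 1;
        } else {
            if (pending_space && out > 0)
                dst[out++] = ' ';           /* one separator for the whole gap */
            pending_space = 0;
            dst[out++] = src[i++];
        }
    }
    dst[out] = '\0';
}
```

For example, the input "a/**/b" is copied as "a b", so a comment still
separates the two tokens around it.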

Fast LEX (Flex):

Flex is a tool for generating scanners. A scanner, sometimes called a
tokenizer, is a program which recognizes lexical patterns in text. The flex program
reads user-specified input files, or its standard input if no file names are given, for
a description of a scanner to generate. The description is in the form of pairs of
regular expressions and C code, called rules. Flex generates a C source file named
"lex.yy.c", which defines the function yylex(). The file "lex.yy.c" can be compiled
and linked to produce an executable. When the executable is run, it analyzes its
input for occurrences of text matching the regular expressions for each rule.
Whenever it finds a match, it executes the corresponding C code.

Flex can only generate code for C and C++. To use the scanner code
generated by flex from other languages a language binding tool such as SWIG can
be used.

A similar lexical scanner for C++ is flex++, which is included as part of the
flex package. The generated code does not depend on any runtime or external
library except for a memory allocator (malloc or a user-supplied alternative),
unless the input also depends on it. This can be useful in embedded and similar
situations where traditional operating system or C runtime facilities may not
be available.

The flex++ classes and code require a C++ compiler to create lexical and
pattern-matching programs. The flex++ generated C++ scanner includes the header
file FlexLexer.h, which defines the interfaces of the two C++ generated classes.

How to use flex:

FLEX (Fast LEXical analyzer generator) is a tool for generating scanners.
Instead of writing a scanner from scratch, you only need to identify the vocabulary
of a certain language (e.g. Simple), write a specification of patterns using regular
expressions (e.g. DIGIT [0-9]), and FLEX will construct a scanner for you. FLEX
is generally used in the manner depicted here:

Fig 9.2 Compilation Process


First, FLEX reads a specification of a scanner either from an input file *.lex,
or from standard input, and it generates as output a C source file lex.yy.c. Then,
lex.yy.c is compiled and linked with the "-lfl" library to produce an executable
a.out. Finally, a.out analyzes its input stream and transforms it into a sequence of
tokens.

• *.lex is in the form of pairs of regular expressions and C code.
• lex.yy.c defines a routine yylex() that uses the specification to recognize tokens.
• a.out is the actual scanner.

How to write the input file:

1. Format:
definitions
%%
rules
%%
user code
2. The definitions section: "name definition"
The rules section: "pattern action"
The user code section: "yylex() routine"

How to execute with flex:

1. Try sample*.lex
2. Command Sequence:
flex sample*.lex
gcc lex.yy.c -lfl
./a.out

Code:
lex.l
%{
#include <stdio.h>

/* Running count of every token reported by the rules below. */
int total = 0;

/*
 * Simplifications kept from this experiment: "++" is reported by the
 * relational-operator rule, and any name followed by "(" (including
 * "if(" and "for(") is reported as a function.
 */
%}

%option noyywrap

%%

#.* {total++; fprintf(yyout,"This is Pre-processor directive: %s\n\n",yytext);}

[',;(){}.] {total++; fprintf(yyout,"This is Delimiter: %s\n\n",yytext);}

"["|"]" {total++; fprintf(yyout,"This is Delimiter: %s\n\n",yytext);}

"#"|"@"|"$"|"^"|"%"|"&" {total++; fprintf(yyout,"This is Special Characters: %s\n\n",yytext);}

"=" {total++; fprintf(yyout,"This is Assignment Operator: %s\n\n",yytext);}

"+"|"-"|"*"|"/" {total++; fprintf(yyout,"This is Arithmatic Operator: %s\n\n",yytext);}

"and"|"or"|"not"|"nand"|"xor"|"nor"|"xnor" {total++; fprintf(yyout,"This is Logical Operators: %s\n\n",yytext);}

"<="|">="|"++"|"!="|"=="|"<"|">" {total++; fprintf(yyout,"This is Relational Operator: %s\n\n",yytext);}

"int"|"if"|"else"|"while"|"do"|"break"|"continue"|"double"|"float"|"return"|"EOF" {total++; fprintf(yyout,"This is Keyword: %s\n\n",yytext);}

"char"|"case"|"sizeof"|"long"|"short"|"typedef"|"switch"|"unsigned"|"void"|"static"|"struct"|"goto" {total++; fprintf(yyout,"This is Keyword:%s\n",yytext);}

[a-zA-Z_][a-zA-Z0-9_]*\( {total++; fprintf(yyout,"This is Function: %s\n\n",yytext);}

[a-zA-Z_][a-zA-Z0-9_]* {total++; fprintf(yyout,"This is Identifier: %s\n\n",yytext);}

[0-9]*"."[0-9]+ {total++; fprintf(yyout,"This is Fraction : %s\n\n",yytext);}

"-"[0-9]*"."[0-9]+ {total++; fprintf(yyout,"This is Negative Fraction : %s\n\n",yytext);}

[0-9]+ {total++; fprintf(yyout,"This is Integer: %s\n\n",yytext);}

"-"[0-9]+ {total++; fprintf(yyout,"This is Negative Integer: %s\n\n",yytext);}

["]([^"\\\n]|\\.|\\\n)*["] {total++; fprintf(yyout,"this is String:%s\n\n",yytext);}

"//".* {total++; fprintf(yyout,"this is single line Commments: %s\n\n",yytext);}

"/*"([^*]|\*+[^*/])*\*+"/" {total++; fprintf(yyout,"this is multi line Comments: %s\n\n",yytext);}

[ \t\n]+ { /* skip white space */ }

. { /* ignore any other character */ }

%%

int main(void)
{
    /* yyin and yyout are declared by the generated scanner. */
    yyin = fopen("input.txt", "r");
    if (yyin == NULL) {
        perror("input.txt");
        return 1;
    }

    yyout = fopen("Output.txt", "w");
    if (yyout == NULL) {
        perror("Output.txt");
        fclose(yyin);
        return 1;
    }

    yylex();

    fprintf(yyout,"\n\n\n\n\n Total Tokens = %d",total);

    fclose(yyin);
    fclose(yyout);
    return 0;
}

input.txt
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main()
{
    char str[100],word[30];
    int i,j,found,slen,wlen,index;

    printf("Enter the String: ");
    gets(str);
    printf("Enter the Word: ");
    gets(word);

    slen=strlen(str);
    wlen=strlen(word);

    index=-1;
    for(i = 0; i < slen; i++){
        found = 1;
        for(j = 0; j < wlen; j++){
            if(str[ i + j ] != word[j]){
                found = 0;
                break;
            }
        }
        if(str[i + j] != ' ' && str[i + j] != '\t' && str[i + j] != '\n' && str[i + j] != '\0')
        {
            found = 0;
        }

        if(found == 1){
            for(j=i;j<=slen-wlen;j++){
                str[j]=str[j+wlen];
            }
        }
    }

    printf("String after removing '%s': \n%s", word, str);
}
//krishno
/*abc*/


Output.txt
This is Pre-processor directive: #include <stdio.h>

This is Pre-processor directive: #include <stdlib.h>

This is Pre-processor directive: #include <string.h>

This is Keyword: int

This is Function: main(

This is Delimiter: )

This is Delimiter: {

This is Keyword:char
This is Identifier: str

This is Delimiter: [

This is Integer: 100

This is Delimiter: ]

This is Delimiter: ,
This is Identifier: word

This is Delimiter: [

This is Integer: 30

This is Delimiter: ]

This is Delimiter: ;

This is Keyword: int

This is Identifier: i

This is Delimiter: ,

This is Identifier: j

This is Delimiter: ,

This is Identifier: found

This is Delimiter: ,

This is Identifier: slen

This is Delimiter: ,

This is Identifier: wlen

This is Delimiter: ,

This is Identifier: index

This is Delimiter: ;

This is Function: printf(

this is String:"Enter the String: "

This is Delimiter: )

This is Delimiter: ;

This is Function: gets(


This is Identifier: str

This is Delimiter: )

This is Delimiter: ;

This is Function: printf(

this is String:"Enter the Word: "

This is Delimiter: )

This is Delimiter: ;

This is Function: gets(

This is Identifier: word

This is Delimiter: )

This is Delimiter: ;

This is Identifier: slen

This is Assignment Operator: =

This is Function: strlen(

This is Identifier: str

This is Delimiter: )

This is Delimiter: ;

This is Identifier: wlen

This is Assignment Operator: =

This is Function: strlen(

This is Identifier: word

This is Delimiter: )

This is Delimiter: ;
This is Identifier: index

This is Assignment Operator: =

This is Negative Integer: -1

This is Delimiter: ;

This is Function: for(

This is Identifier: i

This is Assignment Operator: =

This is Integer: 0

This is Delimiter: ;

This is Identifier: i

This is Relational Operator: <

This is Identifier: slen

This is Delimiter: ;

This is Identifier: i

This is Relational Operator: ++

This is Delimiter: )

This is Delimiter: {

This is Identifier: found

This is Assignment Operator: =

This is Integer: 1

This is Delimiter: ;

This is Function: for(

This is Identifier: j
This is Assignment Operator: =

This is Integer: 0

This is Delimiter: ;

This is Identifier: j

This is Relational Operator: <

This is Identifier: wlen

This is Delimiter: ;

This is Identifier: j

This is Relational Operator: ++

This is Delimiter: )

This is Delimiter: {

This is Function: if(

This is Identifier: str

This is Delimiter: [

This is Identifier: i

This is Arithmatic Operator: +

This is Identifier: j

This is Delimiter: ]

This is Relational Operator: !=

This is Identifier: word

This is Delimiter: [

This is Identifier: j

This is Delimiter: ]
This is Delimiter: )

This is Delimiter: {

This is Identifier: found

This is Assignment Operator: =

This is Integer: 0

This is Delimiter: ;

This is Keyword: break

This is Delimiter: ;

This is Delimiter: }

This is Delimiter: }

This is Function: if(

This is Identifier: str

This is Delimiter: [

This is Identifier: i

This is Arithmatic Operator: +

This is Identifier: j

This is Delimiter: ]

This is Relational Operator: !=

This is Delimiter: '

This is Delimiter: '

This is Special Characters: &

This is Special Characters: &

This is Identifier: str


This is Delimiter: [

This is Identifier: i

This is Arithmatic Operator: +

This is Identifier: j

This is Delimiter: ]

This is Relational Operator: !=

This is Delimiter: '

This is Identifier: t

This is Delimiter: '

This is Special Characters: &

This is Special Characters: &

This is Identifier: str

This is Delimiter: [

This is Identifier: i

This is Arithmatic Operator: +

This is Identifier: j

This is Delimiter: ]

This is Relational Operator: !=

This is Delimiter: '

This is Identifier: n

This is Delimiter: '

This is Special Characters: &

This is Special Characters: &


This is Identifier: str

This is Delimiter: [

This is Identifier: i

This is Arithmatic Operator: +

This is Identifier: j

This is Delimiter: ]

This is Relational Operator: !=

This is Delimiter: '

This is Integer: 0

This is Delimiter: '

This is Delimiter: )

This is Delimiter: {

This is Identifier: found

This is Assignment Operator: =

This is Integer: 0

This is Delimiter: ;

This is Delimiter: }

This is Function: if(

This is Identifier: found

This is Relational Operator: ==

This is Integer: 1

This is Delimiter: )

This is Delimiter: {
This is Function: for(

This is Identifier: j

This is Assignment Operator: =

This is Identifier: i

This is Delimiter: ;

This is Identifier: j

This is Relational Operator: <=

This is Identifier: slen

This is Arithmatic Operator: -

This is Identifier: wlen

This is Delimiter: ;

This is Identifier: j

This is Relational Operator: ++

This is Delimiter: )

This is Delimiter: {

This is Identifier: str

This is Delimiter: [

This is Identifier: j

This is Delimiter: ]

This is Assignment Operator: =

This is Identifier: str

This is Delimiter: [

This is Identifier: j
This is Arithmatic Operator: +

This is Identifier: wlen

This is Delimiter: ]

This is Delimiter: ;

This is Delimiter: }

This is Delimiter: }

This is Delimiter: }

This is Function: printf(

this is String:"String after removing '%s': \n%s"

This is Delimiter: ,

This is Identifier: word

This is Delimiter: ,

This is Identifier: str

This is Delimiter: )

This is Delimiter: ;

This is Delimiter: }

this is single line Commments: //krishno

this is multi line Comments: /*abc*/

Total Tokens = 215

Command prompt:
C:\Users\Husain\OneDrive\Desktop\GnuWin32\Flex>flex lex.l

C:\Users\Husain\OneDrive\Desktop\GnuWin32\Flex>gcc lex.yy.c

C:\Users\Husain\OneDrive\Desktop\GnuWin32\Flex>a.exe
