CD Lab Kare With Solution With Header

This document contains the details of a compiler design lab manual, including a bonafide certificate, experiment evaluation summary, course introduction, and course plan. The document was prepared by R.Raja Sekar and contains information for students enrolled in the Compiler Design course (CSE18R274) at Kalasalingam Academy of Research and Education, including a list of planned experiments and evaluation criteria.


PREPARED BY R.Raja Sekar, AP/CSE. KARE.

SCHOOL OF COMPUTING
Department of Computer Science and
Engineering

Compiler Design Lab Manual

(CSE18R274)

Student Name : ……………………………………………………….

Register Number : ……………………………………………………….

Section : ………………………………………………………..

TABLE OF CONTENTS

S.No Topic

1 Bonafide Certificate
2 Experiment Evaluation Summary
3 Course Plan
4 Introduction

Experiments

5 Implementation of Symbol Table
6 Implementation of a lexical analyzer for converting a source program into tokens
7 Implement a code that converts the Regular Expression (RE) to Finite Automata (DFA/Epsilon NFA)
8 Implement a predictive Parser to Find First and Follow of a given grammar
9 Implementation of shift reduce parsing Algorithm
10 Implementation of Operator precedence Parser for the given operator grammar
11 Implement any one bottom up LR Parser for the given grammar (SLR/CLR/LALR)
12 Construct the three-address code and parse tree for the given expression
13 Implement a code that converts the given expression into Triples, Quadruples and Indirect Triples
14 Implement a simple code generator for the given intermediate code
15 Implement a code optimizer to perform possible optimization like dead code optimization, copy propagation etc.
16 Use LEX tool to implement a lexical analyzer
17 Use LEX and YACC tool to implement a desktop calculator

SCHOOL OF COMPUTING

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

BONAFIDE CERTIFICATE

Bonafide record of work done by

of III Year / VI Semester in CSE18R274 / Compiler Design during Even

semester in academic year 2022-2023

Staff In-charge Head of the Department

Submitted to the End Semester Practical Examination held at Kalasalingam

Academy of Research and Education, Krishnankoil on ----------------------------

REGISTER NUMBER

INTERNAL EXAMINER EXTERNAL EXAMINER



EXPERIMENT EVALUATION SUMMARY

S.No. Date Experiment Marks (100) Faculty Signature

1 Implementation of Symbol Table
2 Implementation of a lexical analyzer for converting a source program into tokens
3 Implement a code that converts the Regular Expression (RE) to Finite Automata (DFA/Epsilon NFA)
4 Implement a predictive Parser to Find First and Follow of a given grammar
5 Implementation of shift reduce parsing Algorithm
6 Implementation of Operator precedence Parser for the given operator grammar
7 Implement any one bottom up LR Parser for the given grammar (SLR/CLR/LALR)
8 Construct the three address code and parse tree for the given expression
9 Implement a code that converts the given expression into Triples, Quadruples and Indirect Triples
10 Implement a simple code generator for the given intermediate code
11 Implement a code optimizer to perform possible optimization like dead code optimization, copy propagation etc.
12 Use LEX tool to implement a lexical analyzer
13 Use LEX and YACC tool to implement a desktop calculator

INTRODUCTION

COMPILER

A compiler is a program that reads a program written in one language (the
source language) and translates it into an equivalent program in another language
(the target language).

ANALYSIS-SYNTHESIS MODEL OF COMPILATION

There are two parts to compilation: analysis and synthesis. The analysis part
breaks up the source program into constituent pieces and creates an intermediate
representation of the source program. The synthesis part constructs the desired target
program from the intermediate representation. Of the two parts, synthesis requires
the most specialized techniques.

PHASES OF A COMPILER

A compiler operates in six phases, each of which transforms the source
program from one representation to another. The six phases are lexical analysis,
syntax analysis, semantic analysis, intermediate code generation, code optimization
and code generation. The first three phases form the bulk of the analysis portion of
a compiler. Two other activities, symbol table management and error handling,
interact with all six phases.

LEXICAL ANALYSIS

Lexical analysis is also called linear analysis or scanning. The stream of
characters making up the source program is read from left to right and grouped
into tokens, which are sequences of characters having a collective meaning.

SYNTAX ANALYSIS

Syntax analysis is also called hierarchical analysis or parsing. It involves
grouping the tokens of the source program into grammatical phrases that are used
by the compiler to synthesize output. Usually, a parse tree represents the
grammatical phrases of the source program.

SEMANTIC ANALYSIS

The semantic analysis phase checks the source program for semantic errors
and gathers type information for the subsequent code generation phase. It uses the
hierarchical structure determined by the syntax-analysis phase to identify the operators
and operands of expressions and statements. An important component of semantic
analysis is type checking. Here the compiler checks that each operator has operands that
are permitted by the source language specification.

SYMBOL TABLE MANAGEMENT

A symbol table is a data structure containing a record for each identifier, with fields
for the attributes of the identifier. The data structure allows us to find the record for each
identifier quickly and to store or retrieve data from that record quickly. When the lexical
analyzer detects an identifier in the source program, the identifier is entered into the symbol
table. The remaining phases enter information about identifiers into the symbol table.

ERROR DETECTION

Each phase can encounter errors. The syntax and semantic analysis phases usually
handle a large fraction of the errors detectable by the compiler. The lexical phase can detect
errors where the characters remaining in the input do not form any token of the language.
Errors where the token stream violates the structure rules of the language are detected
by the syntax analysis phase.

INTERMEDIATE CODE GENERATION

After syntax and semantic analysis, some compilers generate an explicit
intermediate representation of the source program. This intermediate representation
should have two important properties: it should be easy to produce and easy to translate
into the target program.

CODE OPTIMIZATION

The code optimization phase attempts to improve the intermediate code so that
faster-running machine code results. There are simple optimizations that significantly
improve the running time of the target program without slowing down compilation too
much.

CODE GENERATION

The final phase of compilation is the generation of target code, consisting normally
of relocatable machine code or assembly code.

SCHOOL OF COMPUTING

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

COURSE PLAN

Subject with code Compiler Design – CSE18R274

Course B.Tech

Semester / Sec VI / CBCS

Course Credit 4 (Integrated Course)

Course Coordinator Dr.S.J.Subhashini

Module Coordinator Dr.R.Murugeswari

Programme Coordinator Dr.N.C.Brintha

COURSE PRE-REQUISITE : Formal Language and Automata(CSE18R252)

COURSE DESCRIPTION

This self-paced course will discuss the major ideas used today in the
implementation of programming language compilers, including lexical analysis, parsing,
syntax-directed translation, abstract syntax trees, types and type checking, intermediate
languages, dataflow analysis, program optimization, code generation, and runtime
systems. As a result, you will learn how a program written in a high-level language
designed for humans is systematically translated into a program written in low-level
assembly more suited to machines.

COURSE OBJECTIVES

To acquaint students with the functional units of a computer and
how each unit works, along with the architectural and performance issues.

COURSE OUTCOMES(COS)

CO1: Understand the different phases of compilation
CO2: Apply context free grammars to parsing and compare different parsing techniques
CO3: Analyze the various LR parsing methods and evaluate the intermediate code
representation
CO4: Create the various code generation schemes
CO5: Apply the various optimization techniques for the generated code
CO6: Create an efficient algorithm for the analysis and synthesis parts of the compiler
CO7: Implement the problem statements in programming languages, Lex and Yacc tools
efficiently

PROGRAMME SPECIFIC OUTCOMES

PSOs DESCRIPTION

PSO1 Able to develop software solutions for real world problems using core computing
technologies .

PSO2 Able to apply technologies such as AIML and data science for effective decision-
making towards sustainable development of a smart city.

PROGRAMME OUTCOMES

POs DESCRIPTION

PO1 Ability to apply knowledge of mathematics, science and computer engineering
to solve computational problems.

PO2 Identify, formulate, analyze and solve complex computing problems.

PO3 Capability to design and develop computing systems to meet the requirement
of industry and society with due consideration for public health, safety and
environment.

PO4 Ability to apply knowledge of design of experiment and data analysis to derive
solutions in complex computing problems and society with due consideration
for public health, safety and environment.

PO5 Ability to develop and apply modeling, simulation and prediction tools and
techniques to engineering problems.

PO6 Assess and understand the professional, legal, security and societal
responsibilities relevant to computer engineering practice.

PO7 Ability to understand the impact of computing solutions in economic,
environmental and societal context for sustainable development.

PO8 Applying ethical principles and commitment to ethics of IT and software
profession.

PO9 Ability to work effectively as an individual as well as in teams.

PO10 Effectively communicating with technical community and with society.

PO11 Demonstrating and applying the knowledge of computer engineering and
management principles in software project development and in
multidisciplinary areas.

PO12 Understanding the need for technological changes and engage in life-long
learning.

ABET – CSO STATEMENT


At the end of the programme, the students will be able to:
CSO1 : Analyze a complex computing problem and to apply principles of computing and
other relevant disciplines to identify solutions.

CSO2 : Design, implement, and evaluate a computing-based solution to meet a given set
of computing requirements in the context of the program’s discipline.

CSO3 : Communicate effectively in a variety of professional contexts.

CSO4 : Recognize professional responsibilities and make informed judgments in
computing practice based on legal and ethical principles.

CSO5 : Function effectively as a member or leader of a team engaged in activities
appropriate to the program’s discipline.

CSO6 : Apply Computer Science theory and software development fundamentals to
produce computing-based solutions.

ABET – ESO STATEMENT


At the end of the programme, the students will be able to:
ESO1 : Ability to identify, formulate and solve complex engineering problems by applying
principles of Engineering, Science, and Mathematics.

ESO2 : Ability to apply engineering design to produce solutions that meet specified needs
with consideration of public health, safety, and welfare, as well as global, cultural, social,
environmental, and economic factors.

ESO3 : An ability to communicate effectively with a range of audiences.

ESO4 : Ability to recognize ethical and professional responsibilities in engineering
situations and make informed judgments, which must consider the impact of engineering
solutions in global, economic, environmental, and societal contexts.

ESO5 : Ability to function effectively on a team whose members together provide
leadership, create a collaborative and inclusive environment, establish goals, plan tasks,
and meet objectives.

ESO6 : Ability to develop and conduct appropriate experimentation, analyze and interpret
data, and use engineering judgment to draw conclusions.

MAPPING OF COURSE OUTCOMES WITH PO, PSO

POs PSOs

1 2 3 4 5 6 7 8 9 10 11 12 1 2

CO1 S S

CO2 S S S S

CO3 S S S S M S S

CO4 S S S S S S

CO5 S S S S S M M M M S S

CO6 S S S S S S

CO7 S S S S S S

S- Strong Correlation M- Medium Correlation L – Low Correlation


WEB RESOURCES

S.No Topic Name Website Link

1 Compiler Construction Tools http://www.linuxgazette.com/issue39/sevenich.html
2 Derivation and Parse tree http://www.softpanorama.org/Algorithms/compilers.shtml
3 Parse tree, Lex, Yacc www.flint.cs.yale.edu
4 Context free grammar http://www.cs.nmsu.edu/~jeffery/courses/unlv/478/lecture.html
5 LL(1) http://ag-kastens.uni-paderborn.de/lehre/material/compi/aufgaben/blatt3/Blatt3.html
6 SLR, LALR, First and Follow www.pdclab.cs.ucdavis.edu
7 Parsing, LALR, LR(0), LR(1) http://www.cs.umd.edu/class/spr98/cmsc430/slides
8 LR(0), LR(1), SLR, YACC, LEX www.cs.gmu.edu
9 Semantics, Grammar www.csee.umbc.edu
10 Predictive Parsing www.ambda.uta.edu
11 Recursive descent www.userpages.umbc.edu
14 LR Parser www.cwi.nl/~jurgenv/publications/slides/cc2002.ppt
15 Intermediate Languages www.hardcoreprocessing.com/articles/presentations/tiliaoc/TheDocument.html


LIST OF EXPERIMENTS

S.No Experiment Details Number of Periods Cumulative Number of Periods

1 Implementation of Symbol Table 2 2
2 Implementation of a lexical analyzer for converting a source program into tokens 2 4
3 Implement a code that converts the Regular Expression (RE) to Finite Automata (DFA/Epsilon NFA) 2 6
4 Implement a predictive Parser to Find First and Follow of a given grammar 2 8
5 Implementation of shift reduce parsing Algorithm 2 10
6 Implementation of Operator precedence Parser for the given operator grammar 2 12
7 Implement any one bottom up LR Parser for the given grammar (SLR/CLR/LALR) 4 16
8 Construct the three address code and parse tree for the given expression 2 18
9 Implement a code that converts the given expression into Triples, Quadruples and Indirect Triples 2 20
10 Implement a simple code generator for the given intermediate code 2 22
11 Implement a code optimizer to perform possible optimization like dead code optimization, copy propagation etc. 2 24
12 Use LEX tool to implement a lexical analyzer 4 28
13 Use LEX and YACC tool to implement a desktop calculator 2 30

ADDITIONAL EXPERIMENTS:
1. Peephole optimization

2. Loops in flow graphs


ASSESSMENT METHOD:
S.No Assessment Split up

1 Internal Assessment (15 marks) Regular Lab Exercises (5), Model Lab (10)
2 External Assessment (15 marks) Record (5), End Semester program and output (10)

RUBRICS FOR INDIVIDUAL EXPERIMENTS

Each module is graded at one of four levels: Unacceptable, Fair, Acceptable or Excellent. The marks for each level are given in brackets.

Level of understanding
Unacceptable (1): Very little background information provided or information is incorrect.
Fair (4): Some introductory information, but still missing some major points.
Acceptable (7): Introduction is nearly complete, missing some minor points.
Excellent (10): Introduction complete, provides all necessary background principles for the experiment.

Algorithm
Unacceptable (2): Several major aspects of the exercise are missing, student displays a lack of understanding about how to write an algorithm.
Fair (6): Algorithm misses one or more major aspects of carrying out the exercise.
Acceptable (10): Algorithm is nearly complete, missing some minor points.
Excellent (15): Algorithm is complete and well-written; provides all necessary background principles for the exercise.

Design principles & Program Logic
Unacceptable (10): Missing several important experimental details or not written in proper logic in program.
Fair (20): Written in proper logic, still missing some important details.
Acceptable (30): Written in proper logic, important details are covered, some minor details missing.
Excellent (40): Program logic is well written, all details are covered.

Output
Unacceptable (2): Output contains errors or is poorly constructed.
Fair (4): Partial output; missing some important output features.
Acceptable (7): Output is good but has some minor problems or could still be improved.
Excellent (10): Output is excellent.

Discussion/Viva
Unacceptable (2): Answered for less than 40% of the questions, indicating a lack of understanding of results.
Fair (4): Answered for 60% of the questions, but incomplete understanding of results is still evident.
Acceptable (7): Answered for 60% of the questions. Still needs some improvements.
Excellent (10): Answered for more than 90% of the questions correctly, good understanding of results is conveyed.


Ex.No.: 1 IMPLEMENTATION OF SYMBOL TABLE


DATE:
AIM
To create and print a symbol table that contains the name, address and type for the given
expression.
THEORY
Symbol table is an important data structure created and maintained by compilers in
order to store information about the occurrence of various entities such as variable names,
function names, objects, classes, interfaces, etc. Symbol table is used by both the analysis and
the synthesis parts of a compiler.
A symbol table may serve the following purposes depending upon the language in hand:
● To store the names of all entities in a structured form at one place.
● To verify if a variable has been declared.
● To implement type checking, by verifying assignments and expressions in the source
code are semantically correct.
● To determine the scope of a name (scope resolution).
If a compiler is to handle a small amount of data, then the symbol table can be
implemented as an unordered list, which is easy to code but suitable only for
small tables. A symbol table can be implemented in one of the following ways:
● Linear (sorted or unsorted) list
● Binary Search Tree
● Hash table
Among all, symbol tables are mostly implemented as hash tables, where the source code
symbol itself is treated as a key for the hash function and the return value is the information
about the symbol.
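The hash-table organisation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the program used in this experiment: the record layout (Symbol), the bucket count TABLE_SIZE, the hash function and the names lookup and insert are all hypothetical choices made for the example.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define TABLE_SIZE 31            /* hypothetical bucket count */

/* One symbol-table record; the fields are illustrative assumptions. */
typedef struct Symbol {
    char name[32];
    char type[16];
    struct Symbol *next;         /* chain for names that hash to the same bucket */
} Symbol;

static Symbol *table[TABLE_SIZE];

/* Simple multiplicative string hash, reduced to a bucket index. */
static unsigned hash(const char *s)
{
    unsigned h = 0;
    while (*s)
        h = h * 31u + (unsigned char)*s++;
    return h % TABLE_SIZE;
}

/* Return the record for name, or NULL if it has not been inserted. */
Symbol *lookup(const char *name)
{
    Symbol *p;
    for (p = table[hash(name)]; p != NULL; p = p->next)
        if (strcmp(p->name, name) == 0)
            return p;
    return NULL;
}

/* Insert name with the given type unless it is already present. */
Symbol *insert(const char *name, const char *type)
{
    Symbol *p = lookup(name);
    unsigned h;
    if (p != NULL)
        return p;                /* keep the existing record */
    p = malloc(sizeof *p);
    if (p == NULL)
        return NULL;
    strncpy(p->name, name, sizeof p->name - 1);
    p->name[sizeof p->name - 1] = '\0';
    strncpy(p->type, type, sizeof p->type - 1);
    p->type[sizeof p->type - 1] = '\0';
    h = hash(name);
    p->next = table[h];          /* push onto the front of the bucket chain */
    table[h] = p;
    return p;
}

int main(void)
{
    Symbol *s;
    insert("sum", "int");
    insert("rate", "float");
    s = lookup("sum");
    printf("sum  -> %s\n", s ? s->type : "not found");
    s = lookup("temp");
    printf("temp -> %s\n", s ? s->type : "not found");
    return 0;
}

The symbol name is the key and the record is the value, so lookup and insert only scan one short chain instead of the whole table. Records are never freed in this sketch, which is acceptable for a short-lived demonstration only.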
Items stored in Symbol table:
• Variable names and constants
• Procedure and function names
• Literal constants and strings
• Compiler generated temporaries
• Labels in source languages
Information used by the compiler from Symbol table:
• Data type and name
• Declaring procedures
• Offset in storage
• If structure or record then, a pointer to structure table.
• For parameters, whether parameter passing by value or by reference
• Number and type of arguments passed to function
• Base Address
Operations of Symbol table – The basic operations defined on a symbol table include:

Fig 1: Organization of Symbol Table Example


ALGORITHM
1. Start the program.
2. Get the input from the user with the terminating symbol '$'.
3. Allocate memory for the variable using a dynamic memory allocation function.
4. If the next character of the symbol is an operator, memory is allocated for it as well.
5. While reading, each input symbol is inserted into the symbol table along with its memory
address.
6. The steps are repeated till '$' is reached.
7. To search for a variable, enter the variable to be searched; the symbol table is checked
for the corresponding variable, and the variable is displayed along with its address.

Program:
#include<stdio.h>
#include<ctype.h>
#include<stdlib.h>
#define MAX 15
int main(void)
{
    int x=0, n, i=0, j=0, c;
    void *mypointer, *Sym_address[MAX];
    char ch, Sym_Array2[MAX], Sym_Array3[MAX];
    printf("Input the expression ending with $ sign:");
    /* read until '$', end of input, or until the buffer is full */
    while((c=getchar())!='$' && c!=EOF && i<MAX)
    {
        Sym_Array2[i]=(char)c;
        i++;
    }
    n=i-1;
    printf("Given Expression:");
    i=0;
    while(i<=n)
    {
        printf("%c",Sym_Array2[i]);
        i++;
    }
    printf("\n Symbol Table display\n");
    printf("Symbol \t addr \t type");
    while(j<=n)
    {
        ch=Sym_Array2[j];
        if(isalpha((unsigned char)ch))
        {
            mypointer=malloc(sizeof(int)); /* storage reserved for the identifier */
            Sym_address[x]=mypointer;
            Sym_Array3[x]=ch;
            printf("\n%c \t %p \t identifier\n",ch,mypointer);
            x++;
        }
        else if(ch=='+'||ch=='-'||ch=='*'||ch=='=')
        {
            mypointer=malloc(sizeof(int));
            Sym_address[x]=mypointer;
            Sym_Array3[x]=ch;
            printf("\n %c \t %p \t operator\n",ch,mypointer);
            x++;
        }
        j++; /* always advance, so blanks and digits cannot stall the loop */
    }
    for(i=0;i<x;i++) /* release the storage recorded in the table */
        free(Sym_address[i]);
    return 0;
}
OR
#include<stdio.h>
#include<stdlib.h>
#include<ctype.h>
#define MAX 15
int main(void)
{
    int i=0,j,x=0,n,flag=0,c;
    void *add[MAX];           /* address recorded for each symbol */
    char srch,b[MAX],d[MAX];  /* b: input expression, d: symbols in the table */
    const char *type[MAX];    /* "identifier" or "operator" for each symbol */
    printf("Expression terminated by $:");
    /* read until '$', end of input, or until the buffer is full */
    while((c=getchar())!='$' && c!=EOF && i<MAX)
    {
        b[i]=(char)c; i++;
    }
    n=i;
    printf("Given expression:::");
    for(i=0;i<n;i++)
        printf("%c",b[i]);
    printf("\n.....symbol table ... \n");
    printf("symbol\taddr\ttype\n");
    for(j=0;j<n;j++)
    {
        c=b[j];
        if(isalpha((unsigned char)c))
            type[x]="identifier";
        else if(c=='+'||c=='-'||c=='*'||c=='='||c=='/')
            type[x]="operator";
        else
            continue; /* skip blanks, digits and other characters */
        add[x]=malloc(sizeof(int));
        d[x]=(char)c;
        printf("%c\t%p\t%s\n",d[x],add[x],type[x]);
        x++;
    }
    printf("the symbol is to be searched\n");
    scanf(" %c",&srch); /* the leading blank skips the newline left in the input */
    for(i=0;i<x;i++)
    {
        if(srch==d[i])
        {
            printf("symbol found...");
            printf("%c%s%p\n",srch,"@address",add[i]);
            flag=1;
        }
    }
    if(flag==0)
        printf("symbol not found\n");
    for(i=0;i<x;i++)
        free(add[i]);
    return 0;
}
SAMPLE OUTPUT

VIVA QUESTIONS

1. What is a symbol table?
2. What is the purpose of a symbol table?
3. What are the operations involved in the symbol table?
4. What are the different ways to implement the symbol table?
5. What information does the compiler use from the symbol table?

RESULT:

EVALUATION
Assessment Marks Scored

Understanding Problem statement (10)


Efficiency of understanding algorithm (15)
Efficiency of program (40)
Output (15)
Viva (20)
(Technical – 10 and Communications - 10)
Total (100)

Ex. No.: 2 IMPLEMENTATION OF A LEXICAL ANALYZER FOR


CONVERTING A SOURCE PROGRAM INTO TOKENS
DATE:
AIM
To Implement a Lexical Analyzer to separate the tokens from the given source
program.

THEORY:
Lexical analysis is the process of converting a sequence of characters (such as in a computer
program or web page) into a sequence of tokens (strings with an identified “meaning”). A
program that performs lexical analysis may be called a lexer, tokenizer or scanner.

Fig: Role of Lexical Analyzer


Token: Token is a sequence of characters that can be treated as a single logical entity.
Typical tokens are,
1) Identifiers 2) keywords 3) operators 4) special symbols 5) constants
Pattern: A set of strings in the input for which the same token is produced as output. This
set of strings is described by a rule called a pattern associated with the token.
Lexeme: A lexeme is a sequence of characters in the source program that is matched with
the pattern for a token.
The process of forming tokens from an input stream of characters is called tokenization.
Consider this expression in the C programming language:
Sum=3 + 2;
Tokenized and represented in the following table:
Lexeme Token
Sum Identifier
= Assignment Operator
3 Literal or constant
+ Addition Operator
2 Literal or constant
; End of the statement
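
The table above can be reproduced with a few lines of C. The following is only a sketch for this one statement, assuming a hypothetical helper classify that names the token class from the first character of a lexeme; it is not the full lexical analyzer of this experiment and it knows only the operators that occur in the example.

#include <stdio.h>
#include <ctype.h>

/* Name the token class of a lexeme by looking at its first character.
   Only the classes needed for "Sum=3 + 2;" are covered (an assumption). */
const char *classify(const char *lexeme)
{
    unsigned char c = (unsigned char)lexeme[0];
    if (isalpha(c) || c == '_') return "Identifier";
    if (isdigit(c))             return "Literal or constant";
    if (c == '=')               return "Assignment Operator";
    if (c == '+')               return "Addition Operator";
    if (c == ';')               return "End of the statement";
    return "Unknown";
}

int main(void)
{
    const char *p = "Sum=3 + 2;";
    char lexeme[32];

    printf("%-8s %s\n", "Lexeme", "Token");
    while (*p != '\0') {
        int n = 0;
        if (isspace((unsigned char)*p)) {      /* blanks only separate tokens */
            p++;
            continue;
        }
        if (isalnum((unsigned char)*p) || *p == '_') {
            /* identifiers and numbers: take the longest run of letters/digits */
            while ((isalnum((unsigned char)*p) || *p == '_') && n < 31)
                lexeme[n++] = *p++;
        } else {
            lexeme[n++] = *p++;                /* single-character operator */
        }
        lexeme[n] = '\0';
        printf("%-8s %s\n", lexeme, classify(lexeme));
    }
    return 0;
}

Each pass of the loop extracts one lexeme (the matched characters) and maps it to a token (the class), which is the lexeme/token distinction made in the definitions above.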

ALGORITHM:
1. Start the program.
2. Include necessary header files.
3. Declare all the variables and file pointers.
4. Open the input program (input.c) using file functions.
5. Read the input file and display the tokens.
6. Display the header files of the input program.
7. Separate the operators of the input program and display them.
8. Print the punctuation marks.
9. Print the constants that are present in the input program.
10. Print the identifiers of the input program.
11. Also count the number of each kind of token that occurs in the input file and print it.
12. Stop the program.

PROGRAM :
#include<stdio.h>
#include<ctype.h>
#include<string.h>
void keyw(char *p);
int i=0,id=0,kw=0,num=0,op=0;
char keys[32][10]={"auto","break","case","char","const","continue","default",
"do","double","else","enum","extern","float","for","goto","if","int","long","register","return",
"short","signed","sizeof","static","struct","switch","typedef","union",
"unsigned","void","volatile","while"};
int main(void)
{
    int ch; /* int rather than char, so that EOF can be told apart from data */
    char str[25],seps[]=" \t\n,;(){}[]#\"<>",oper[]="!%^&*-+=~|.<>/?";
    int j;
    char fname[50];
    FILE *f1;
    printf("enter file path (drive:\\fold\\filename)\n");
    scanf("%49s",fname);
    f1 = fopen(fname,"r");
    if(f1==NULL)
    {
        printf("file not found");
        return 1;
    }
    while((ch=fgetc(f1))!=EOF)
    {
        /* an operator ends the lexeme collected so far */
        for(j=0;j<=14;j++)
        {
            if(ch==oper[j])
            {
                printf("%c is an operator\n",ch);
                op++;
                str[i]='\0';
                keyw(str);
            }
        }
        /* a separator also ends the current lexeme */
        for(j=0;j<=14;j++)
        {
            if(i==-1)
                break;
            if(ch==seps[j])
            {
                if(ch=='#')
                {
                    /* echo a directive such as #include<stdio.h> up to '>' */
                    while(ch!='>'&&ch!='\n'&&ch!=EOF)
                    {
                        printf("%c",ch);
                        ch=fgetc(f1);
                    }
                    if(ch=='>')
                        printf("%c is a header file\n",ch);
                    else
                        printf(" is a preprocessor directive\n");
                    i=-1;
                    break;
                }
                if(ch=='"')
                {
                    /* echo the contents of a string literal */
                    while((ch=fgetc(f1))!='"'&&ch!=EOF)
                        printf("%c",ch);
                    printf(" is an argument\n");
                    i=-1;
                    break;
                }
                str[i]='\0';
                keyw(str);
            }
        }
        if(i!=-1)
        {
            if(i<24) /* keep room for the terminating '\0' */
            {
                str[i]=ch;
                i++;
            }
        }
        else
            i=0;
    }
    fclose(f1);
    printf("Keywords: %d\nIdentifiers: %d\nOperators: %d\nNumbers: %d\n",kw,id,op,num);
    return 0;
}
/* classify the collected lexeme as keyword, number or identifier */
void keyw(char *p)
{
    int k,flag=0;
    for(k=0;k<=31;k++)
    {
        if(strcmp(keys[k],p)==0)
        {
            printf("%s is a keyword\n",p);
            kw++;
            flag=1;
            break;
        }
    }
    if(flag==0)
    {
        if(isdigit((unsigned char)p[0]))
        {
            printf("%s is a number\n",p);
            num++;
        }
        else
        {
            if(p[0]!='\0')
            {
                printf("%s is an identifier\n",p);
                id++;
            }
        }
    }
    i=-1; /* tells main() to start a new lexeme */
}

INPUT: (INPUT.C)
#include<stdio.h>
#include<conio.h>
void main()
{
int a,b,c;
a=10;
b=5;
c=a+b;
printf("The sum is %d",c);
getch();
}

OUTPUT

VIVA QUESTIONS:
1. State the use of Lexical Analyzer.
2. Define Token with example.
3. What do you mean by Lexeme? Give example.
4. What is the use of pattern? Write the pattern for identifiers.
5. List the lexical errors.

RESULT:

EVALUATION
Assessment Marks Scored

Understanding Problem statement (10)


Efficiency of understanding algorithm (15)
Efficiency of program (40)
Output (15)
Viva (20)
(Technical – 10 and Communications - 10)
Total (100)

Ex.No.: 3 IMPLEMENT A CODE THAT CONVERTS THE REGULAR


EXPRESSION TO FINITE AUTOMATA (DFA/EPSILON NFA)
DATE:
AIM
To convert the given regular expression to an NFA using Thompson's construction
method.

THEORY
Regular Expression
o The languages accepted by finite automata can be described by simple expressions
called regular expressions. They are a compact way to represent regular languages.
o The languages described by regular expressions are referred to as regular languages.
o A regular expression can also be described as a pattern that defines a set of strings.
o Regular expressions are used to match character combinations in strings. String-searching
algorithms use such patterns for find and find-and-replace operations on strings.
Operations on Regular Language
The various operations on regular languages are:
Union: If L and M are two regular languages, then their union L U M is also a regular language.
L U M = {s | s is in L or s is in M}
Intersection: If L and M are two regular languages, then their intersection L ⋂ M is also a
regular language.
L ⋂ M = {s | s is in L and s is in M}
Concatenation: If L and M are two regular languages, then their concatenation LM is also a
regular language.
LM = {st | s is in L and t is in M}
Kleene closure: If L is a regular language, then its Kleene closure L* is also a regular
language.
L* = zero or more concatenated occurrences of strings of L
Finite Automata
o Finite automata are used to recognize patterns.
o It takes a string of symbols as input and changes its state accordingly. When the
expected symbol is read, a transition occurs.
o On a transition, the automaton can either move to the next state or stay in the
same state.
o A finite automaton either accepts or rejects its input. When the input string has been
processed completely and the automaton is in a final state, the string is accepted;
otherwise it is rejected.
Formal Definition of FA
A finite automaton is a 5-tuple (Q, ∑, δ, q0, F), where:
1. Q: finite set of states
2. ∑: finite set of input symbols
3. δ: transition function
4. q0: initial state
5. F: set of final states
Finite Automata Model:
Finite automata can be represented by input tape and finite control.
Input tape: It is a linear tape having some number of cells. Each input symbol is placed in
each cell.
Finite control: The finite control decides the next state on receiving particular input from
input tape. The tape reader reads the cells one by one from left to right, and at a time only
one input symbol is read.

Types of Automata:
There are two types of finite automata:
1. DFA(deterministic finite automata)
2. NFA(non-deterministic finite automata)

1. DFA
DFA refers to deterministic finite automata. Deterministic refers to the uniqueness of the
computation. In a DFA, the machine goes to exactly one state for a particular input
character. A DFA does not allow the null (ε) move.
2. NFA
NFA stands for non-deterministic finite automata. From a given state, an NFA can move to
any number of states for a particular input. It can also make the null (ε) move.

Some important points about DFA and NFA:
1. Every DFA is an NFA, but not every NFA is a DFA.
2. There can be multiple final states in both NFA and DFA.
3. DFA is used in Lexical Analysis in Compiler.
4. NFA is more of a theoretical concept.
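
Because a DFA has exactly one next state for each input symbol, it can be run from a plain lookup table. The sketch below assumes a hypothetical three-state DFA over {a, b} that accepts strings ending in "ab"; the table next_state and the function dfa_accepts are names introduced only for this illustration.

#include <stdio.h>

/* next_state[state][symbol], with symbol 0 = 'a' and 1 = 'b'.
   State 0: start, state 1: last symbol was a, state 2: input ends in "ab". */
static const int next_state[3][2] = {
    {1, 0},
    {1, 2},
    {1, 0}
};

/* Return 1 if the DFA accepts s, 0 otherwise. */
int dfa_accepts(const char *s)
{
    int state = 0;
    for (; *s != '\0'; s++) {
        if (*s != 'a' && *s != 'b')
            return 0;                       /* symbol outside the alphabet */
        state = next_state[state][*s == 'b'];
    }
    return state == 2;                      /* state 2 is the only final state */
}

int main(void)
{
    const char *tests[] = { "ab", "aab", "abb", "ba" };
    int k;
    for (k = 0; k < 4; k++)
        printf("%-4s : %s\n", tests[k],
               dfa_accepts(tests[k]) ? "accepted" : "rejected");
    return 0;
}

One table lookup per input character is all the work a DFA needs, which is why lexical analyzers are usually driven by DFAs rather than by NFAs directly.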

ALGORITHM
Thompson's construction algorithm is used to build a finite automaton from a
regular expression.
Input − A Regular Expression R
Output − NFA accepting language denoted by R
The steps are
Case i : The NFA representing the empty string is:

Case ii : If the regular expression is just a character, e.g. a, then the corresponding NFA is :

Case iii : The union operator is represented by a choice of transitions from a node; thus a|b
can be represented as:

Case iv : Concatenation simply involves connecting one NFA to the other; e.g. ab is:

Case v : The Kleene closure must allow for taking zero or more instances of the letter from
the input; thus a* looks like:
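
As an illustration of case v, the NFA for a* can be written out explicitly and simulated. The sketch below assumes one common state numbering (0 start, 3 final, with ε-moves 0→1, 0→3, 2→1, 2→3 and the move 1→2 on a); the numbering, the bitmask representation and the names eps_closure and nfa_accepts are assumptions made for this example and are independent of the program that follows.

#include <stdio.h>

#define NSTATES 4

/* eps[s]: bitmask of states reachable from s by a single epsilon move. */
static const unsigned eps[NSTATES]  = { (1u << 1) | (1u << 3), 0,
                                        (1u << 1) | (1u << 3), 0 };
/* on_a[s]: bitmask of states reachable from s on input 'a'. */
static const unsigned on_a[NSTATES] = { 0, 1u << 2, 0, 0 };

/* Add every state reachable through epsilon moves until nothing changes. */
unsigned eps_closure(unsigned set)
{
    unsigned prev;
    do {
        int s;
        prev = set;
        for (s = 0; s < NSTATES; s++)
            if (set & (1u << s))
                set |= eps[s];
    } while (set != prev);
    return set;
}

/* Track the set of states the NFA could be in after each input symbol. */
int nfa_accepts(const char *input)
{
    unsigned current = eps_closure(1u << 0);
    for (; *input != '\0'; input++) {
        unsigned moved = 0;
        int s;
        if (*input != 'a')
            return 0;                        /* only 'a' is in the alphabet */
        for (s = 0; s < NSTATES; s++)
            if (current & (1u << s))
                moved |= on_a[s];
        current = eps_closure(moved);
    }
    return (current & (1u << 3)) != 0;       /* state 3 is the final state */
}

int main(void)
{
    printf("\"\"    : %s\n", nfa_accepts("")    ? "accepted" : "rejected");
    printf("\"aaa\" : %s\n", nfa_accepts("aaa") ? "accepted" : "rejected");
    printf("\"ab\"  : %s\n", nfa_accepts("ab")  ? "accepted" : "rejected");
    return 0;
}

The ε-move from state 0 to state 3 is what allows zero occurrences of a, and the ε-move from state 2 back to state 1 is what allows repetition.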

Program:
#include<stdio.h>
#include<string.h>
int main()
{
char reg[20]; int q[20][3],i=0,j=1,len,a,b;
for(a=0;a<20;a++) for(b=0;b<3;b++) q[a][b]=0;
printf("Enter the regular expression : ");
scanf("%19s",reg); /* at most 19 characters fit in reg[20] */
printf("Given regular expression: %s\n",reg);
len=strlen(reg);
while(i<len)
{
if(reg[i]=='a'&&reg[i+1]!='|'&&reg[i+1]!='*') { q[j][0]=j+1; j++; }
if(reg[i]=='b'&&reg[i+1]!='|'&&reg[i+1]!='*') { q[j][1]=j+1; j++; }
if(reg[i]=='e'&&reg[i+1]!='|'&&reg[i+1]!='*') { q[j][2]=j+1; j++; }
if(reg[i]=='a'&&reg[i+1]=='|'&&reg[i+2]=='b')
{
q[j][2]=((j+1)*10)+(j+3); j++;
q[j][0]=j+1; j++;
q[j][2]=j+3; j++;
q[j][1]=j+1; j++;
q[j][2]=j+1; j++;
i=i+2;
}
if(reg[i]=='b'&&reg[i+1]=='|'&&reg[i+2]=='a')
{
q[j][2]=((j+1)*10)+(j+3); j++;
q[j][1]=j+1; j++;
q[j][2]=j+3; j++;
q[j][0]=j+1; j++;
q[j][2]=j+1; j++;
i=i+2;
}
if(reg[i]=='a'&&reg[i+1]=='*')
{
q[j][2]=((j+1)*10)+(j+3); j++;
q[j][0]=j+1; j++;
q[j][2]=((j+1)*10)+(j-1); j++;
}
if(reg[i]=='b'&&reg[i+1]=='*')
{
q[j][2]=((j+1)*10)+(j+3); j++;
q[j][1]=j+1; j++;
q[j][2]=((j+1)*10)+(j-1); j++;
}
if(reg[i]==')'&&reg[i+1]=='*')
{
q[0][2]=((j+1)*10)+1;
q[j][2]=((j+1)*10)+1;
j++;
}
i++;
}
printf("\n\tTransition Table \n");
printf(" \n");
printf("Current State |\tInput |\tNext State");
printf("\n \n");
for(i=0;i<=j;i++)
{
if(q[i][0]!=0) printf("\n q[%d]\t | a | q[%d]",i,q[i][0]);
if(q[i][1]!=0) printf("\n q[%d]\t | b | q[%d]",i,q[i][1]);
if(q[i][2]!=0)
{
if(q[i][2]<10) printf("\n q[%d]\t | e | q[%d]",i,q[i][2]);
else printf("\n q[%d]\t | e | q[%d] , q[%d]",i,q[i][2]/10,q[i][2]%10);
}
}
printf("\n \n");
return 0;
}

SAMPLE INPUT
Enter the regular expression : (a|b)*

SAMPLE OUTPUT

Fig: NFA transition diagram for the given regular expression (states 1 to 9 with a, b and e transitions)

OR
Transition Table

VIVA QUESTIONS
1. Define Regular Expression.
2. List the operations of regular expressions.
3. Compare NFA and DFA.
4. Write the regular expression for the strings that start and end with different symbols over
{a,b}.
5. Which is more powerful: DFA, NFA or RE? Justify.

RESULT:

EVALUATION
Assessment Marks Scored

Understanding Problem statement (10)


Efficiency of understanding algorithm (15)
Efficiency of program (40)
Output (15)
Viva (20)
(Technical – 10 and Communications - 10)
Total (100)
Ex.No.: 4 IMPLEMENT A PREDICTIVE PARSER TO FIND THE FIRST AND FOLLOW OF THE CONTEXT FREE GRAMMAR
DATE:

AIM
To find first and follow of a given context free grammar

THEORY:
Why FIRST?
To avoid backtracking in parsing, we need to calculate FIRST. Consider the grammar
S -> cAd
A -> bc|a
and the input string “cad”.
If the parser knows in advance the first character of the string produced when a
production rule is applied, it can compare it with the current character or token in the
input string and decide which production rule to apply. Here, after matching c, the current
input character is a, and only A -> a produces a string that starts with a.
Computing the Function FIRST
If X is a grammar symbol, then FIRST(X) is computed as follows −
• If X is a terminal symbol, then FIRST(X) = {X}
• If X → ε, then ε is in FIRST(X)
• If X is a non-terminal and X → a α, where a is a terminal, then a is in FIRST(X)
• If X → Y1 Y2 Y3, then FIRST(X) is computed as follows:
(a) If Y1 is a terminal, then
FIRST (X) = FIRST (Y1 Y2 Y3) = {Y1}
(b) If Y1 is a non-terminal and FIRST (Y1) does not contain ε (i.e., Y1 does not derive
the empty string), then
FIRST (X) = FIRST (Y1 Y2 Y3) = FIRST (Y1)
(c) If FIRST (Y1) contains ε, then
FIRST (X) = FIRST (Y1 Y2 Y3) = (FIRST (Y1) − {ε}) ∪ FIRST (Y2 Y3)
Similarly, FIRST (Y2 Y3) = {Y2} if Y2 is a terminal; otherwise, if Y2 is a non-terminal, then
• FIRST (Y2 Y3) = FIRST (Y2), if FIRST (Y2) does not contain ε.
• If FIRST (Y2) contains ε, then FIRST (Y2 Y3) = (FIRST (Y2) − {ε}) ∪ FIRST (Y3)
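
The FIRST rules can be sketched as a fixed-point computation: keep applying the rules to every production until no set grows. This is a minimal sketch under stated assumptions, not the lab program: non-terminals are single uppercase letters, terminals are any other ASCII character, an empty right-hand side stands for ε, and the names Prod, SAMPLE, compute_first and first_contains are hypothetical. The sample grammar adds a made-up production S -> BA to show case (c).

```c
#include <ctype.h>
#include <string.h>

typedef struct { char lhs; const char *rhs; } Prod;   /* rhs "" stands for ε */

/* Hypothetical sample grammar: S -> BA, A -> aBb, B -> c | ε */
static const Prod SAMPLE[] = { {'S', "BA"}, {'A', "aBb"}, {'B', "c"}, {'B', ""} };

/* first[X - 'A'][c] != 0 means terminal c is in FIRST(X); index 0 stands for ε. */
static unsigned char first[26][128];

static int add(unsigned char *set, int c)
{
    if (set[c]) return 0;
    set[c] = 1;
    return 1;                         /* the set grew */
}

void compute_first(const Prod *g, int n)
{
    int changed = 1;
    memset(first, 0, sizeof first);
    while (changed) {                 /* repeat until no FIRST set changes */
        changed = 0;
        for (int p = 0; p < n; p++) {
            unsigned char *fx = first[g[p].lhs - 'A'];
            int nullable_so_far = 1;  /* Y1 ... Y(k-1) can all derive ε */
            for (const char *y = g[p].rhs; *y && nullable_so_far; y++) {
                if (!isupper((unsigned char)*y)) {      /* terminal: case (a) */
                    changed |= add(fx, *y & 127);
                    nullable_so_far = 0;
                } else {                                /* non-terminal: cases (b), (c) */
                    const unsigned char *fy = first[*y - 'A'];
                    for (int c = 1; c < 128; c++)
                        if (fy[c]) changed |= add(fx, c);
                    if (!fy[0]) nullable_so_far = 0;
                }
            }
            if (nullable_so_far) changed |= add(fx, 0); /* whole RHS derives ε */
        }
    }
}

/* Recomputes FIRST for g and reports whether c is in FIRST(X); pass '\0' for ε. */
int first_contains(const Prod *g, int n, char X, char c)
{
    compute_first(g, n);
    return first[X - 'A'][(unsigned char)c];
}
```

For the sample grammar this gives FIRST(B) = {c, ε}, FIRST(A) = {a} and FIRST(S) = {c, a}; for instance, first_contains(SAMPLE, 4, 'S', 'a') is non-zero because B can vanish.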
Why FOLLOW?
The parser faces one more problem. Consider the grammar below to understand this
problem.
A -> aBb
B -> c | ε
Suppose the input string to parse is “ab”.
As the first character in the input is a, the parser applies the rule A -> aBb.
    A
  / | \
 a  B  b
Now the parser checks the second character of the input string, which is b. The
non-terminal to derive is B, but the parser cannot get any string derivable from B that has b
as its first character.
However, the grammar does contain the production rule B -> ε. If that rule is applied, B
vanishes and the parser derives the input “ab”, as shown below. The parser can apply it only
when it knows that the character that follows B in the production rule is the same as the
current character in the input.
In the RHS of A -> aBb, b follows the non-terminal B, i.e. FOLLOW(B) = {b}, and the current
input character is also b. Hence the parser applies this rule and is able to derive the string
“ab” from the given grammar.
    A                 A
  / | \             /   \
 a  B  b    =>     a     b
    |
    ε
So FOLLOW tells the parser when a non-terminal can vanish, if needed, to generate the
string from the parse tree.

Computing the Function FOLLOW


1. First, put $ (the end of input marker) in FOLLOW(S), where S is the start symbol.
2. If there is a production rule A → αBβ (where α and β can be any strings of grammar
symbols), then everything in FIRST(β) except ε is placed in FOLLOW(B).
3. If there is a production rule A → αB, then everything in FOLLOW(A) is in FOLLOW(B).
4. If there is a production rule A → αBβ, where FIRST(β) contains ε, then
everything in FOLLOW(A) is in FOLLOW(B).
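
The four FOLLOW rules can be sketched in the same fixed-point style. This is a minimal sketch under stated assumptions, not the lab program: non-terminals are single uppercase letters, an empty right-hand side stands for ε, '$' is reserved for the end marker, and the names Rule, EXAMPLE, compute_first_follow and follow_contains are hypothetical. The example extends the grammar above with a made-up start production S -> AB so that rules 3 and 4 are exercised.

```c
#include <ctype.h>
#include <string.h>

typedef struct { char lhs; const char *rhs; } Rule;   /* rhs "" stands for ε */

/* Hypothetical example: S -> AB, A -> aBb, B -> c | ε (S is the start symbol) */
static const Rule EXAMPLE[] = { {'S', "AB"}, {'A', "aBb"}, {'B', "c"}, {'B', ""} };

static unsigned char FIRST[26][128];    /* index 0 stands for ε */
static unsigned char FOLLOW[26][128];   /* index '$' is the end marker */

static int put(unsigned char *s, int c) { int was = s[c]; s[c] = 1; return !was; }

/* Adds FIRST(beta) without ε to out; returns 1 if beta can derive ε. */
static int first_of_string(const char *beta, unsigned char *out, int *changed)
{
    for (; *beta; beta++) {
        if (!isupper((unsigned char)*beta)) {
            *changed |= put(out, *beta & 127);
            return 0;
        }
        const unsigned char *f = FIRST[*beta - 'A'];
        for (int c = 1; c < 128; c++)
            if (f[c]) *changed |= put(out, c);
        if (!f[0]) return 0;
    }
    return 1;
}

void compute_first_follow(const Rule *g, int n, char start)
{
    int changed = 1;
    memset(FIRST, 0, sizeof FIRST);
    memset(FOLLOW, 0, sizeof FOLLOW);
    while (changed) {                              /* FIRST sets */
        changed = 0;
        for (int p = 0; p < n; p++) {
            unsigned char *fx = FIRST[g[p].lhs - 'A'];
            if (first_of_string(g[p].rhs, fx, &changed))
                changed |= put(fx, 0);
        }
    }
    FOLLOW[start - 'A']['$'] = 1;                  /* rule 1 */
    changed = 1;
    while (changed) {                              /* FOLLOW sets */
        changed = 0;
        for (int p = 0; p < n; p++) {
            const unsigned char *fa = FOLLOW[g[p].lhs - 'A'];
            for (const char *b = g[p].rhs; *b; b++) {
                if (!isupper((unsigned char)*b)) continue;
                unsigned char *fb = FOLLOW[*b - 'A'];
                /* rule 2: FIRST(β) without ε goes into FOLLOW(B) */
                if (first_of_string(b + 1, fb, &changed))
                    /* rules 3 and 4: β is empty or derives ε */
                    for (int c = 1; c < 128; c++)
                        if (fa[c]) changed |= put(fb, c);
            }
        }
    }
}

/* Recomputes both tables and reports whether c is in FOLLOW(X). */
int follow_contains(const Rule *g, int n, char start, char X, char c)
{
    compute_first_follow(g, n, start);
    return FOLLOW[X - 'A'][(unsigned char)c];
}
```

For the example grammar this gives FOLLOW(S) = {$}, FOLLOW(A) = {c, $} and FOLLOW(B) = {b, $}.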
Program:
#include<stdio.h>
#include<math.h>
#include<string.h>
#include<ctype.h>
#include<stdlib.h>
int n,m=0,p,i=0,j=0;
char a[10][10],f[10];
void follow(char c);
void first(char c);
int main()
{
int i,z;
char c,ch;
printf("Enter the no of productions:\n");
scanf("%d",&n);
printf("Enter the productions:\n");
for(i=0;i<n;i++)
scanf("%s%c",a[i],&ch);
do
{
m=0;
printf("Enter the element whose first & follow is to be found:");
scanf("%c",&c);
first(c);
printf("First(%c)={",c);
for(i=0;i<m;i++)
printf("%c",f[i]);
printf("}\n");
strcpy(f," ");
//flushall();
m=0;
follow(c);
printf("Follow(%c)={",c);
for(i=0;i<m;i++)
printf("%c",f[i]);
printf("}\n");
printf("Continue(0/1)?");
scanf("%d%c",&z,&ch);
}while(z==1);
return(0);
}
void first(char c)
{
int k;
if(!isupper(c))
f[m++]=c;
for(k=0;k<n;k++)
{
if(a[k][0]==c)
{
if(a[k][2]=='$')
follow(a[k][0]);
else if(islower(a[k][2]))
f[m++]=a[k][2];
else first(a[k][2]);
}
}
}
void follow(char c)
{
int i,j; /* local loop counters, so that recursive calls do not disturb the caller's loops */
if(a[0][0]==c)
f[m++]='$';
for(i=0;i<n;i++)
{
for(j=2;j<(int)strlen(a[i]);j++)
{
if(a[i][j]==c)
{
if(a[i][j+1]!='\0')
first(a[i][j+1]);
if(a[i][j+1]=='\0' && c!=a[i][0])
follow(a[i][0]);
}
}
}
}
SAMPLE INPUT

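Consider the grammar below. Enter each production without spaces in the form A=aAaB,
since the program reads the right-hand side starting from the third character:

A -> aAaB
B -> bBaB
A -> a
B -> b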

SAMPLE OUTPUT

VIVA QUESTIONS
1. What is the need of calculating FIRST?
2. What is the need for FOLLOW?
3. Compare Top Down and Bottom-up parser?
4. Does Top-Down Parser handle Left Recursive Grammar?
5. State the rule for eliminating Left Recursion.

RESULT:

EVALUATION
Assessment Marks Scored

Understanding Problem statement (10)


Efficiency of understanding algorithm (15)
Efficiency of program (40)
Output (15)
Viva (20)
(Technical – 10 and Communications - 10)
Total (100)
Ex.No.: 5 IMPLEMENT A SHIFT REDUCE PARSER
DATE:
AIM
To implement the shift reduce parsing technique using array implementation.

THEORY:
There are two types of parsing techniques in compiler design: Top Down
Parsing and Bottom Up Parsing.
• Top-Down Parsing- It is a parsing strategy that starts at the top of the parse tree and
works its way down by applying grammatical rules.
• Bottom-up Parsing- It is a parsing strategy that starts with the lowest level of the parse
tree and works its way up using grammatical rules.
Shift Reduce Parsing
Shift-reduce parsing is an example of bottom-up parsing. It attempts to construct a
parse tree for an input string beginning at the leaves and working up towards the root. In
other words, it is a process of “reducing” a string w to the start symbol of a grammar. At every
(reduction) step, a particular substring matching the RHS of a production rule is replaced by
the symbol on the LHS of the production.

Fig: Shift Reduce Parser


A general form of shift-reduce parsing is LR parsing (scanning from Left to right and using
Right-most derivation in reverse), which is used in a number of automatic parser
generators like Yacc, Bison, etc.
• In shift-reduce parsing, the input string is reduced to the starting symbol.
• This reduction is achieved by tracing, in reverse, the rightmost derivation of the input
string from the starting symbol.
• Two data structures are required to perform shift-reduce parsing-
o An input buffer to hold the input string.
o A stack to keep the grammar symbols for accessing the production rules.
Now, we will discuss the basic operations performed in shift-reduce parsing.
Basic Operations
There are four basic operations a shift-reduce parser can perform:
1. Shift- This operation moves the current symbol from the input buffer onto the
stack.
2. Reduce- When the parser finds that the right-hand side of a production (the handle) is at
the top of the stack, the reduce operation pops the RHS of the production rule from the
stack and pushes the LHS of the production rule onto the stack.
3. Accept- After repeating the shift and reduce operations, if the stack contains only the
start symbol of the grammar and the input buffer is empty, i.e., contains only the $
symbol, the input string is said to be accepted.
4. Error- If the parser can perform neither the shift nor the reduce operation and the string
is not accepted, it is said to be in the error state.
Before coming to examples, we must remember some rules.
Rule 1: If the priority of the incoming operator is higher than the operator's priority at the
top of the stack, then we perform the shift action.
Rule 2: If the priority of the incoming operator is equal to or less than the operator's
priority at the top of the stack, then we perform the reduce action.
Example:
Stack            Input                 Action
$                id1 + id2 * id3 $     Shift
$ id1            + id2 * id3 $         Reduce by E → id
$ E              + id2 * id3 $         Shift
$ E +            id2 * id3 $           Shift
$ E + id2        * id3 $               Reduce by E → id
$ E + E          * id3 $               Shift
$ E + E *        id3 $                 Shift
$ E + E * id3    $                     Reduce by E → id
$ E + E * E      $                     Reduce by E → E * E
$ E + E          $                     Reduce by E → E + E
$ E              $                     Accept
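
Rule 1 and Rule 2 can be sketched as a small shift-reduce loop for the grammar E → E+E | E*E | id. This is a minimal sketch and differs from the lab program below: each id is written as the single character i (an assumption to keep tokens one character long), * is given a higher priority than +, and the hypothetical function sr_parse returns the operators in the order in which they were reduced, or NULL when the string is rejected.

```c
#include <string.h>

/* Priority of an operator; 0 for anything else, including the end of input. */
static int prec(char op) { return op == '*' ? 2 : op == '+' ? 1 : 0; }

/* Parses strings such as "i+i*i". Returns the reduced operators in order, or NULL on error. */
const char *sr_parse(const char *input)
{
    static char order[64];
    char stack[64];
    int top = 0, n_ord = 0;              /* top = number of symbols on the stack */
    size_t ip = 0, n = strlen(input);

    if (n >= sizeof stack) return NULL;
    for (;;) {
        if (top >= 1 && stack[top - 1] == 'i') {
            stack[top - 1] = 'E';                        /* reduce by E -> id */
        } else if (top >= 3 && stack[top - 1] == 'E' && prec(stack[top - 2]) > 0
                   && stack[top - 3] == 'E'
                   && prec(input[ip]) <= prec(stack[top - 2])) {
            order[n_ord++] = stack[top - 2];             /* Rule 2: reduce E op E -> E */
            top -= 2;
        } else if (ip < n) {
            stack[top++] = input[ip++];                  /* Rule 1 (or nothing to reduce): shift */
        } else {
            break;                                       /* no action possible */
        }
    }
    order[n_ord] = '\0';
    return (top == 1 && stack[0] == 'E') ? order : NULL; /* accept only if the stack is exactly E */
}
```

For the input "i+i*i" the sketch returns "*+": the parser shifts * because its priority is higher than that of + on the stack, so E*E is reduced before E+E, as in the example table.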
Program:
#include<stdio.h>
#include<string.h>
int k=0,z=0,i=0,j=0,c=0;
char a[16],ac[20],stk[15],act[10];
void check();
int main()
{

puts("GRAMMAR is E->E+E \n E->E*E \n E->(E) \n E->id");


puts("enter input string ");
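fgets(a,sizeof a,stdin); /* gets() is unsafe and was removed in C11 */
a[strcspn(a,"\n")]='\0'; /* strip the trailing newline */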
c=strlen(a);
strcpy(act,"SHIFT->");
puts("stack \t input \t action");
for(k=0,i=0; j<c; k++,i++,j++)
{
if(a[j]=='i' && a[j+1]=='d')
{
stk[i]=a[j];
stk[i+1]=a[j+1];
stk[i+2]='\0';
a[j]=' ';
a[j+1]=' ';
printf("\n$%s\t%s$\t%sid",stk,a,act);
check();
}
else
{
stk[i]=a[j];
stk[i+1]='\0';
a[j]=' ';
printf("\n$%s\t%s$\t%ssymbols",stk,a,act);
check();
}
}
return 0;
}
void check()
{
strcpy(ac,"REDUCE TO E");
for(z=0; z<c; z++)
if(stk[z]=='i' && stk[z+1]=='d')
{
stk[z]='E';
stk[z+1]='\0';
printf("\n$%s\t%s$\t%s",stk,a,ac);
j++;
}
for(z=0; z<c; z++)
if(stk[z]=='E' && stk[z+1]=='+' && stk[z+2]=='E')
{
stk[z]='E';
stk[z+1]='\0';
stk[z+2]='\0';
printf("\n$%s\t%s$\t%s",stk,a,ac);
i=i-2;
}
for(z=0; z<c; z++)
if(stk[z]=='E' && stk[z+1]=='*' && stk[z+2]=='E')
{
stk[z]='E';
stk[z+1]='\0';
stk[z+2]='\0';
printf("\n$%s\t%s$\t%s",stk,a,ac);
i=i-2;
}
for(z=0; z<c; z++)
if(stk[z]=='(' && stk[z+1]=='E' && stk[z+2]==')')
{
stk[z]='E';
stk[z+1]='\0';
stk[z+2]='\0';
printf("\n$%s\t%s$\t%s",stk,a,ac);
i=i-2;
}
}

Output :

GRAMMAR is E->E+E
E->E*E
E->(E)
E->id
enter input string
id+id*id+id

stack input action


$id +id*id+id$ SHIFT->id
$E +id*id+id$ REDUCE TO E
$E+ id*id+id$ SHIFT->symbols
$E+id *id+id$ SHIFT->id
$E+E *id+id$ REDUCE TO E
$E *id+id$ REDUCE TO E
$E* id+id$ SHIFT->symbols
$E*id +id$ SHIFT->id
$E*E +id$ REDUCE TO E
$E +id$ REDUCE TO E
$E+ id$ SHIFT->symbols
$E+id $ SHIFT->id
$E+E $ REDUCE TO E
$E $ REDUCE TO E

VIVA QUESTIONS
1. What do you mean by a shift-reduce conflict?
2. What do you mean by a reduce-reduce conflict?
3. List the operations of Shift-Reduce parser.
4. Define LR parser and its types.
5. State the two rules of shift reduce parser.

EVALUATION
Assessment Marks Scored

Understanding Problem statement (10)


Efficiency of understanding algorithm (15)
Efficiency of program (40)
Output (15)
Viva (20)
(Technical – 10 and Communications - 10)
Total (100)
Ex.No.: 6 IMPLEMENT AN OPERATOR PRECEDENCE PARSER FOR THE GIVEN OPERATOR GRAMMAR
DATE:
AIM
To write a C program to implement operator precedence parsing

THEORY

Operator precedence parsing is a bottom-up parsing technique that interprets an
operator-precedence grammar.
Operator precedence parsing is a kind of shift-reduce parsing method. It is applied
to a small class of operator grammars.
A grammar is said to be an operator precedence grammar if it has two properties:
• No R.H.S. of any production contains ε.
• No two non-terminals are adjacent.
Operator precedence can only be established between the terminals of the grammar.
It ignores the non-terminals.
There are three operator precedence relations:
• a ⋗ b means that terminal "a" has higher precedence than terminal "b".
• a ⋖ b means that terminal "a" has lower precedence than terminal "b".
• a ≐ b means that terminals "a" and "b" have the same precedence.

Operator Precedence Table:

Parsing Action
• Add the $ symbol at both ends of the given input string.
• Scan the input string from left to right until the first ⋗ is encountered.
• Scan back towards the left over all the equal precedences until the first (leftmost) ⋖ is
encountered.
• Everything between the leftmost ⋖ and the rightmost ⋗ is a handle.
• $ on $ means parsing is successful.

Algorithm:
Initialize: Set ip to point to the first symbol of w$
Repeat:
  if $ is on the top of the stack and ip points to $ then return
  else
    Let a be the top terminal on the stack, and b the symbol pointed to by ip
    if a <· b or a =· b then
      push b onto the stack
      advance ip to the next input symbol
    else if a ·> b then
      repeat
        pop the stack
      until the top stack terminal is related by <·
            to the terminal most recently popped
    else error()
end
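
The algorithm above can be sketched in C as follows. This is a minimal sketch, separate from the lab program below: the relation table REL is an assumption copied from the precedence table for the terminals i, +, * and $ shown in the sample output, a blank entry means error, and the name op_precedence_parse is hypothetical. Like the algorithm, it keeps only terminals on the stack, so it does not check that each handle matches a production.

```c
#include <string.h>

static const char TERMS[] = "i+*$";

/* REL[a][b] is the relation between stack terminal a and input terminal b. */
static const char REL[4][5] = {
    /* i */ " >>>",
    /* + */ "<><>",
    /* * */ "<>>>",
    /* $ */ "<<< ",
};

/* Index of terminal c in TERMS, or -1 if c is not a terminal. */
static int idx(char c)
{
    const char *p = (c != '\0') ? strchr(TERMS, c) : NULL;
    return p ? (int)(p - TERMS) : -1;
}

/* Returns 1 if w (which must end with '$') is accepted, 0 otherwise. */
int op_precedence_parse(const char *w)
{
    char stack[64];
    int top = 0;
    size_t ip = 0;

    stack[0] = '$';
    for (;;) {
        int a = idx(stack[top]);
        int b = idx(w[ip]);
        if (b < 0) return 0;                        /* unknown symbol or missing $ */
        if (stack[top] == '$' && w[ip] == '$') return 1;
        char r = REL[a][b];
        if (r == '<' || r == '=') {                 /* shift */
            if (top + 1 >= (int)sizeof stack) return 0;
            stack[++top] = w[ip++];
        } else if (r == '>') {                      /* reduce: pop the handle */
            char popped;
            do {
                popped = stack[top--];
            } while (REL[idx(stack[top])][idx(popped)] != '<');
        } else {
            return 0;                               /* blank entry: error */
        }
    }
}
```

Note that the input must be terminated by $, for example op_precedence_parse("i*i$"); without the end marker the string is rejected.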

ALGORITHM FOR CONSTRUCTING PRECEDENCE GRAPH

1. Create symbols fa and ga for each grammar terminal a and for the end of string symbol $;
2. Partition the symbols into groups so that fa and gb are in the same group if a =· b (there can be
symbols in the same group even if they are not connected by this relation);
3. Create a directed graph whose nodes are the groups; then for each pair of symbols a and b do:
place an edge from the group of gb to the group of fa if a <· b, otherwise if a ·> b place an edge
from the group of fa to that of gb;

If the constructed graph has a cycle then no precedence functions exist. When there are no
cycles, let f(a) be the length of the longest path starting at the group of fa, and g(b) the
length of the longest path starting at the group of gb.
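
The graph construction can be sketched in C for a table without =· relations, so that every fa and ga forms its own group. This is a minimal sketch under that assumption: the relation table REL for the terminals i, +, * and $ is an assumption (a blank entry means no relation), and the names precedence_functions, prec_f and prec_g are hypothetical.

```c
#include <string.h>

#define NT 4
static const char TERMS[NT + 1] = "i+*$";
static const char REL[NT][NT + 1] = { " >>>", "<><>", "<>>>", "<<< " };

/* Nodes 0..NT-1 are f_a, nodes NT..2*NT-1 are g_a. */
static int adj[2 * NT][2 * NT];
static int memo[2 * NT];
static int mark[2 * NT];     /* 0 = unvisited, 1 = on the current path, 2 = finished */

/* Length of the longest path starting at u, or -1 if a cycle is reachable. */
static int longest(int u)
{
    int best = 0;
    if (mark[u] == 2) return memo[u];
    if (mark[u] == 1) return -1;
    mark[u] = 1;
    for (int v = 0; v < 2 * NT; v++) {
        if (!adj[u][v]) continue;
        int d = longest(v);
        if (d < 0) return -1;
        if (d + 1 > best) best = d + 1;
    }
    mark[u] = 2;
    memo[u] = best;
    return best;
}

/* Fills f and g; returns 0 if the graph has a cycle (no precedence functions exist). */
int precedence_functions(int f[NT], int g[NT])
{
    memset(adj, 0, sizeof adj);
    memset(mark, 0, sizeof mark);
    for (int a = 0; a < NT; a++)
        for (int b = 0; b < NT; b++) {
            if (REL[a][b] == '<') adj[NT + b][a] = 1;       /* edge g_b -> f_a */
            else if (REL[a][b] == '>') adj[a][NT + b] = 1;  /* edge f_a -> g_b */
        }
    for (int a = 0; a < NT; a++) {
        f[a] = longest(a);
        g[a] = longest(NT + a);
        if (f[a] < 0 || g[a] < 0) return 0;
    }
    return 1;
}

/* Convenience lookups by terminal; return -1 for unknown terminals or on a cycle. */
int prec_f(char t)
{
    int f[NT], g[NT];
    const char *p = (t != '\0') ? strchr(TERMS, t) : NULL;
    if (!p || !precedence_functions(f, g)) return -1;
    return f[p - TERMS];
}

int prec_g(char t)
{
    int f[NT], g[NT];
    const char *p = (t != '\0') ? strchr(TERMS, t) : NULL;
    if (!p || !precedence_functions(f, g)) return -1;
    return g[p - TERMS];
}
```

With these functions the table can be replaced by a numeric comparison: a <· b whenever f(a) < g(b) and a ·> b whenever f(a) > g(b), at the cost of losing the blank (error) entries.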

Program:
#include<stdio.h>
#include<string.h>
int main()
{
char stack[20], ip[20], opt[10][10][10], ter[10];
int i, j, k, n, top = 0, col = 0, row = 0;
for (i = 0; i < 20; i++)
{
stack[i] = '\0';
ip[i] = '\0';
}
for (i = 0; i < 10; i++)
{
for (j = 0; j < 10; j++)
{
opt[i][j][0] = '\0';
}
}
printf("Enter the no.of terminals :\n");
scanf("%d", & n);
printf("\nEnter the terminals :\n");
scanf("%9s", ter);
printf("\nEnter the table values :\n");
for (i = 0; i < n; i++)
{
for (j = 0; j < n; j++)
{
printf("Enter the value for %c %c:", ter[i], ter[j]);
scanf("%9s", opt[i][j]);
}
}
printf("\n**** OPERATOR PRECEDENCE TABLE ****\n");


for (i = 0; i < n; i++)
{
printf("\t%c", ter[i]);
}
printf("\n");
for (i = 0; i < n; i++)
{
printf("\n%c", ter[i]);
for (j = 0; j < n; j++)
{
printf("\t%c", opt[i][j][0]);
}
}
stack[top] = '$';
printf("\nEnter the input string:");
scanf("%19s", ip);
i = 0;
printf("\nSTACK\t\t\tINPUT STRING\t\t\tACTION\n");
printf("\n%s\t\t\t%s\t\t\t", stack, ip);
while (i <= strlen(ip))
{
for (k = 0; k < n; k++)
{
if (stack[top] == ter[k])
col = k;
if (ip[i] == ter[k])
row = k;
}
if ((stack[top] == '$') && (ip[i] == '$'))
{
printf("String is accepted\n");
break;
}
else if ((opt[col][row][0] == '<') || (opt[col][row][0] == '='))
{
stack[++top] = opt[col][row][0];
stack[++top] = ip[i];
printf("Shift %c", ip[i]);
i++;
}
else
{
if (opt[col][row][0] == '>')
{
while (stack[top] != '<')
{
--top;
}
top = top - 1;
printf("Reduce");
}
else
{
printf("\nString is not accepted");
break;
}
}
printf("\n");
for (k = 0; k <= top; k++)
{
printf("%c", stack[k]);
}
printf("\t\t\t");
for (k = i; k < strlen(ip); k++)
{
printf("%c", ip[k]);
}
printf("\t\t\t");
}
}

Output:
Enter the value for * *:>
Enter the value for * $:>
Enter the value for $ i:<
Enter the value for $ +:<
Enter the value for $ *:<
Enter the value for $ $:accept

**** OPERATOR PRECEDENCE TABLE ****

        i       +       *       $

i       e       >       >       >
+       <       >       <       >
*       <       >       >       >
$       <       <       <       a
Enter the input string:
i*i

STACK INPUT STRING ACTION

$ i*i Shift i
$<i *i Reduce
$ *i Shift *
$<* i Shift i
$<*<i
String is not accepted

VIVA QUESTIONS
1. What do you mean by operator grammar? Give Example.
2. State the rules of operator grammar.
3. What are the limitations of precedence table?
4. State the advantages and disadvantages of precedence graph.
5. Does operator precedence accept ambiguous grammar?

EVALUATION
Assessment Marks Scored

Understanding Problem statement (10)


Efficiency of understanding algorithm (15)
Efficiency of program (40)
Output (15)
Viva (20)
(Technical – 10 and Communications - 10)
Total (100)
Ex.No.:07 IMPLEMENT ANY ONE BOTTOM-UP PARSER FOR THE GIVEN GRAMMAR (SLR)
DATE:
AIM

To write a program for implementing any one bottom up parser for the given
grammar (SLR)

THEORY:
LR parsers:

Fig: LR Parser
LR parsing is an efficient bottom-up syntax analysis technique that can be used to parse a
large class of context-free grammars. For LR(0) parsing:
L stands for the left to right scanning
R stands for rightmost derivation in reverse
0 stands for the number of input symbols of lookahead
Advantages of LR parsing:
• It recognizes virtually all programming language constructs for which CFG can be
written
• It is able to detect syntactic errors
• It is an efficient non-backtracking shift-reduce parsing method.
Types of LR parsing methods:
1. SLR
2. CLR
3. LALR
SLR Parser:
SLR is simple LR. It handles the smallest class of grammars among the LR methods and has a
small number of states. An SLR parser is very easy to construct and is similar to an LR(0)
parser. The only difference between the SLR parser and the LR(0) parser is that in the LR(0)
parsing table there is a chance of a shift-reduce conflict, because ‘reduce’ is entered under
every terminal in a state that contains a completed item. We can solve this problem by
entering ‘reduce’ only under the terminals in FOLLOW of the LHS of the production in that
state. This gives the SLR (1) parsing table.

Steps for constructing the SLR parsing table:

1. Write the augmented grammar
2. Find the LR (0) collection of items
3. Find FOLLOW of the LHS of each production
4. Define 2 functions: action[list of terminals] and goto[list of non-terminals] in the
parsing table

EXAMPLE – Construct the SLR parsing table for the given context-free grammar

S–>AA
A–>aA|b
Solution:
STEP1 – Find augmented grammar
The augmented grammar of the given grammar is:-
S’–>.S [0th production]
S–>.AA [1st production]
A–>.aA [2nd production]
A–>.b [3rd production]

STEP2 – Find LR(0) collection of items

Below is the figure showing the LR(0) collection of items. We will understand everything one
by one.
Closure − For a context-free grammar G, if I is a set of items of grammar G, then
• Every item in I is in closure (I).
• If A → α ∙ B β is an item in closure (I) and B → γ is a production for B, then the item
B → ∙ γ is also added to closure (I).

goto (I, X) − If there is an item A → α ∙ X β in I, then goto (I, X) is defined as the closure of
the set of items A → α X ∙ β, where I is a set of items and X is a grammar symbol (terminal or
non-terminal).

STEP3 – Find FOLLOW of LHS of production

FOLLOW(S) = {$}
FOLLOW(A) = {a, b, $}
STEP 4-
Defining 2 functions: goto[list of non-terminals] and action[list of terminals] in the parsing
table. Below is the SLR parsing table.

Algorithm
Input − An Augmented Grammar G′
Output − SLR Parsing Table
Method
• Initially construct set of items
C = {I0, I1, I2 … … In} where C is the collection of sets of LR (0) items for the grammar.
• Parsing actions for state i are based on the items in Ii.
Various Actions are −
• If A → α ∙ a β is in Ii and goto (Ii, a) = Ij then set Action [i, a] = "shift j".
• If A → α ∙ is in Ii then set Action [i, a] to "reduce A → α" for all symbols a, where a ∈ FOLLOW (A).
• If S′ → S ∙ is in Ii then set Action [i, $] = "accept".
• The goto part of the SLR table is filled as follows: the goto transition for state i is considered for
non-terminals only. If goto (Ii, A) = Ij then goto [i, A] = j.
• All entries not defined by the rules above are considered to be "error".
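
Once the table is built, parsing is a simple loop over the Action and goto entries. The sketch below is a minimal illustration for the example grammar S → AA, A → aA | b and is separate from the lab program that follows (which uses a different grammar). The state numbering I0 to I6 and the table contents are assumptions worked out by hand from the LR (0) items and the FOLLOW sets given above; the names ACTION, GOTO_TBL and slr_parse are hypothetical.

```c
#include <stddef.h>

enum { ERR = 0, ACC = 100 };
#define S(j) (j)        /* shift and go to state j */
#define R(p) (-(p))     /* reduce by production p  */

/* Productions: 1: S -> AA, 2: A -> aA, 3: A -> b. Columns of ACTION: a, b, $. */
static const int ACTION[7][3] = {
    /* I0 */ { S(3), S(4), ERR  },
    /* I1 */ { ERR,  ERR,  ACC  },   /* S' -> S .             */
    /* I2 */ { S(3), S(4), ERR  },   /* S  -> A . A           */
    /* I3 */ { S(3), S(4), ERR  },   /* A  -> a . A           */
    /* I4 */ { R(3), R(3), R(3) },   /* A  -> b .  FOLLOW(A)  */
    /* I5 */ { ERR,  ERR,  R(1) },   /* S  -> AA . FOLLOW(S)  */
    /* I6 */ { R(2), R(2), R(2) },   /* A  -> aA . FOLLOW(A)  */
};

/* Columns of GOTO_TBL: S, A. -1 means no transition. */
static const int GOTO_TBL[7][2] = {
    {1, 2}, {-1, -1}, {-1, 5}, {-1, 6}, {-1, -1}, {-1, -1}, {-1, -1}
};

static const int RHS_LEN[4] = {1, 2, 2, 1};   /* index 0 is S' -> S (unused) */
static const int LHS[4]     = {0, 0, 1, 1};   /* 0 = S, 1 = A */

/* Returns 1 if w (which must end with '$') is accepted, 0 otherwise. */
int slr_parse(const char *w)
{
    int stack[64], top = 0;
    size_t ip = 0;

    stack[0] = 0;                                  /* start in state I0 */
    for (;;) {
        int col = w[ip] == 'a' ? 0 : w[ip] == 'b' ? 1 : w[ip] == '$' ? 2 : -1;
        if (col < 0) return 0;
        int act = ACTION[stack[top]][col];
        if (act == ACC) return 1;
        if (act > 0) {                             /* shift */
            if (top + 1 >= 64) return 0;
            stack[++top] = act;
            ip++;
        } else if (act < 0) {                      /* reduce by production -act */
            int p = -act;
            top -= RHS_LEN[p];                     /* pop one state per RHS symbol */
            int next = GOTO_TBL[stack[top]][LHS[p]];
            if (next < 0) return 0;
            stack[++top] = next;
        } else {
            return 0;                              /* error entry */
        }
    }
}
```

For example, slr_parse("aabb$") accepts because aab and b are each derivable from A, while slr_parse("b$") fails in state I2 where no action is defined for $.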

Program:
#include<stdio.h>
#include<string.h>
int axn[][6][2]={
{{100,5},{-1,-1},{-1,-1},{100,4},{-1,-1},{-1,-1}},
{{-1,-1},{100,6},{-1,-1},{-1,-1},{-1,-1},{102,102}},
{{-1,-1},{101,2},{100,7},{-1,-1},{101,2},{101,2}},
{{-1,-1},{101,4},{101,4},{-1,-1},{101,4},{101,4}},
{{100,5},{-1,-1},{-1,-1},{100,4},{-1,-1},{-1,-1}},
{{-1,-1},{101,6},{101,6},{-1,-1},{101,6},{101,6}},
{{100,5},{-1,-1},{-1,-1},{100,4},{-1,-1},{-1,-1}},
{{100,5},{-1,-1},{-1,-1},{100,4},{-1,-1},{-1,-1}},
{{-1,-1},{100,6},{-1,-1},{-1,-1},{100,11},{-1,-1}}, /* state 8: shift to state 11 on ')' */
{{-1,-1},{101,1},{100,7},{-1,-1},{101,1},{101,1}},
{{-1,-1},{101,3},{101,3},{-1,-1},{101,3},{101,3}},
{{-1,-1},{101,5},{101,5},{-1,-1},{101,5},{101,5}}
};//Axn Table
int gotot[12][3]={1,2,3,-1,-1,-1,-1,-1,-1,-1,-1,-1,8,2,3,-1,-1,-1,
-1,9,3,-1,-1,10,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1}; //GoTo table
int a[10];
char b[10];
int top=-1,btop=-1,i;
void push(int k)
{
if(top<9)
a[++top]=k;
}
void pushb(char k)
{
if(btop<9)
b[++btop]=k;
}
char TOS()
{
return a[top];
}
void pop()
{
if(top>=0)
top--;
}
void popb()
{
if(btop>=0)
b[btop--]='\0';
}
void display()
{
for(i=0;i<=top;i++)
printf("%d%c",a[i],b[i]);
}
void display1(char p[],int m) //Displays The Present Input String
{
int l;
printf("\t\t");
for(l=m;p[l]!='\0';l++)
printf("%c",p[l]);
printf("\n");
}
void error()
{
printf("Syntax Error");
}
void reduce(int p)
{
int len,k,ad;
char src,*dest;
switch(p)
{
case 1:dest="E+T";
src='E';
break;
case 2:dest="T";
src='E';
break;
case 3:dest="T*F";
src='T';
break;
case 4:dest="F";
src='T';
break;
case 5:dest="(E)";
src='F';
break;
case 6:dest="i";
src='F';
break;
default:dest="\0";
src='\0';
break;
}
for(k=0;k<strlen(dest);k++)
{
pop();
popb();
}
pushb(src);
switch(src)
{
case 'E':ad=0;
break;
case 'T':ad=1;
break;
case 'F':ad=2;
break;
default: ad=-1;
break;
}
push(gotot[TOS()][ad]);
}
int main()
{
int j,st,ic;
char ip[20]="\0",an;
printf("Enter any String\n");
scanf("%19s",ip);
push(0);
display();
printf("\t%s\n",ip);
for(j=0;ip[j]!='\0';)
{
st=TOS();
an=ip[j];
if(an>='a'&&an<='z') ic=0;
else if(an=='+') ic=1;
else if(an=='*') ic=2;
else if(an=='(') ic=3;
else if(an==')') ic=4;
else if(an=='$') ic=5;
else {
error();
break;
}
if(axn[st][ic][0]==100) /* shift */
{
pushb(an);
push(axn[st][ic][1]);
display();
j++;
display1(ip,j);
}
else if(axn[st][ic][0]==101) /* reduce */
{
reduce(axn[st][ic][1]);
display();
display1(ip,j);
}
else if(axn[st][ic][1]==102) /* accept */
{
printf("Given String is accepted \n");
break;
}
else /* error entry: stop instead of looping forever */
{
printf("Given String is rejected \n");
break;
}
}
return 0;
}

OUTPUT:

deepti@Inspiron-3542:~$ gcc slr.c

deepti@Inspiron-3542:~$ ./a.out

Enter any String
a+a*a$

0 a+a*a$
0a5 +a*a$
0F3 +a*a$
0T2 +a*a$
0E1 +a*a$
0E1+6 a*a$
0E1+6a5 *a$
0E1+6F3 *a$
0E1+6T9 *a$
0E1+6T9*7 a$
0E1+6T9*7a5 $
0E1+6T9*7F10 $
0E1+6T9 $
0E1 $
Given String is accepted

RESULT:

VIVA QUESTIONS
1. Define LR Parser.
2. State the advantages of LR Parser.
3. List the types of LR Parser.
4. Compare LR and LL Parser.
5. What do you mean by GOTO operation?

NOTE:
1. Even when the CLR parser for a grammar has no reduce-reduce (RR) conflict, the LALR parser for the same grammar may contain one.
2. If number of states LR(0) = n1,
number of states SLR = n2,
number of states LALR = n3,
number of states CLR = n4 then,
n1 = n2 = n3 <= n4

EVALUATION
Assessment Marks Scored

Understanding Problem statement (10)


Efficiency of understanding algorithm (15)
Efficiency of program (40)
Output (15)
Viva (20)
(Technical – 10 and Communications - 10)
Total (100)
Ex.No.:08 IMPLEMENT ANY ONE BOTTOM-UP PARSER FOR THE GIVEN GRAMMAR (CLR)
DATE:

CLR refers to canonical LR. CLR parsing uses the canonical collection of LR (1) items
to build the CLR (1) parsing table. The CLR (1) parsing table has more
states than the SLR (1) parsing table.
In CLR (1), the reduce entries are placed only under the lookahead symbols.
Various steps involved in the CLR (1) Parsing:
o For the given input string write a context free grammar
o Check the ambiguity of the grammar
o Add Augment production in the given grammar
o Create Canonical collection of LR (1) items
o Draw the DFA (deterministic finite automaton)
o Construct a CLR (1) parsing table

LR (1) item
An LR (1) item is an LR (0) item together with a lookahead symbol.
LR (1) item = LR (0) item + look ahead
The lookahead determines where the reduce entry for a completed (final) item is placed.
The lookahead of the augmented production is always the $ symbol.
Example
CLR ( 1 ) Grammar
1. S → AA
2. A → aA
3. A→b

STEP 1: Find augmented grammar

Add the augmented production, insert the '•' symbol at the first position of every production in G,
and also add the lookahead.

0. S` → •S, $
1. S → •AA, $
2. A → •aA, a/b
3. A → •b, a/b

Let’s apply the rule of lookahead to the above productions

• The initial lookahead is always $.
• The 1st production came into existence because of the ‘•’ before ‘S’ in the 0th
production. There is nothing after ‘S’, so the lookahead of the 0th production becomes the
lookahead of the 1st production, i.e. S → •AA, $
• The 2nd production came into existence because of the ‘•’ before ‘A’ in the 1st
production. After ‘A’, there is ‘A’, and FIRST(A) is {a, b}.
Therefore, the lookahead for the 2nd production becomes a/b.
• The 3rd production came into existence in the same way as the 2nd production, so its
lookahead is the same.

STEP 2 – Find LR(1) collection of items


Below is the figure showing the LR(1) collection of items. We will understand everything
one by one.
Closure
procedure closure (I)
begin
  repeat
    for each item A → α ∙ B β, a in I,
        each production B → γ and
        each terminal b ∈ FIRST (β a)
      if B → ∙ γ, b is not in I
        add B → ∙ γ, b to I
  until no more items can be added to I;
end
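
The closure procedure can be sketched in C for the example grammar. This is a minimal sketch under stated assumptions: the augmented start symbol S′ is written as the single letter Z, every lookahead is stored in its own item (so "a/b" becomes two items), and FIRST is computed by a shortcut that is valid only because this grammar has no ε-productions and no left recursion. The names Item, closure, initial_state_size and initial_state_has are hypothetical.

```c
#include <ctype.h>
#include <string.h>

typedef struct { char lhs; const char *rhs; } Prod;
typedef struct { int prod, dot; char la; } Item;   /* [production, dot position, lookahead] */

/* Augmented grammar: 0: Z -> S, 1: S -> AA, 2: A -> aA, 3: A -> b (Z stands for S') */
static const Prod G[] = { {'Z', "S"}, {'S', "AA"}, {'A', "aA"}, {'A', "b"} };
enum { NPROD = 4, MAX_ITEMS = 64 };

/* FIRST of one symbol; a shortcut that assumes no ε-productions and no left recursion. */
static void first_sym(char x, char *out, int *n)
{
    if (!isupper((unsigned char)x)) {              /* terminal (or $) */
        if (!memchr(out, x, (size_t)*n)) out[(*n)++] = x;
        return;
    }
    for (int p = 0; p < NPROD; p++)
        if (G[p].lhs == x) first_sym(G[p].rhs[0], out, n);
}

static int has_item(const Item *I, int n, Item it)
{
    for (int i = 0; i < n; i++)
        if (I[i].prod == it.prod && I[i].dot == it.dot && I[i].la == it.la) return 1;
    return 0;
}

/* Extends the n items in I to their closure; returns the new item count. */
int closure(Item *I, int n)
{
    for (int i = 0; i < n; i++) {                  /* items added below are visited too */
        const char *rhs = G[I[i].prod].rhs;
        char B = rhs[I[i].dot];
        if (!isupper((unsigned char)B)) continue;  /* dot at the end or before a terminal */
        char beta1 = rhs[I[i].dot + 1];
        char la[16];
        int nla = 0;
        first_sym(beta1 ? beta1 : I[i].la, la, &nla);   /* FIRST(β a) */
        for (int p = 0; p < NPROD; p++) {
            if (G[p].lhs != B) continue;
            for (int k = 0; k < nla; k++) {
                Item it = { p, 0, la[k] };
                if (n < MAX_ITEMS && !has_item(I, n, it)) I[n++] = it;
            }
        }
    }
    return n;
}

static Item I0[MAX_ITEMS];
static int n0;

/* Builds I0 = closure({ Z -> . S, $ }). */
static void build_I0(void)
{
    I0[0].prod = 0;
    I0[0].dot = 0;
    I0[0].la = '$';
    n0 = closure(I0, 1);
}

int initial_state_size(void) { build_I0(); return n0; }

int initial_state_has(int prod, int dot, char la)
{
    Item it = { prod, dot, la };
    build_I0();
    return has_item(I0, n0, it);
}
```

For the initial item S′ → ∙S, $ the sketch produces six items, which correspond to the four lines of STEP 1 once the a/b lookaheads are written out separately.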

goto(I, X):
If there is an item A → α ∙ X β, a in I, then goto(I, X) is the closure of the set of items
A → α X ∙ β, a.
Algorithm for Construction of LR (1) Set of Items
begin
  C = {closure({S′ → ∙ S, $})}
  repeat
    for each set of items I in C and each grammar symbol X (terminal or non-terminal)
      add goto(I, X) to C if it is not empty and not already in C
  until no more sets of items can be added to C.
end.

The terminals of this grammar are {a,b}
The non-terminals of this grammar are {S,A}

RULE-
1. If any non-terminal has ‘ . ‘ preceding it, we have to write all its productions and add ‘ . ‘
preceding each of them.
2. From each state to the next state, the ‘ . ‘ shifts one place to the right.
3. All the rules of lookahead apply here.
STEP 3- Defining 2 functions: action[list of terminals] and goto[list of non-terminals] in
the parsing table. Below is the CLR parsing table

Construction of Canonical LR Parsing Table Algorithm

Input − An Augmented Grammar G′.
Output − CLR Parsing Table
Method
• Initially construct set of items C = {I0, I1, I2 … … In} where C is the collection of sets of LR (1) items
for G′.
• Parsing actions for state i are based on the items in Ii.
Various Actions are −
• If A → α ∙ a β, b is in Ii and goto (Ii, a) = Ij then create an entry in the Action table Action [i, a] =
"shift j".
• If A → α ∙, a is in Ii then set Action [i, a] to "reduce A → α". Here, A should not
be S′.
• If S′ → S ∙, $ is in Ii then Action [i, $] = "accept".
• The goto part of the CLR table can be filled as −
If goto (Ii, A) = Ij then goto [i, A] = j
• All entries not defined by the rules above are considered to be "error".
RESULT:

VIVA QUESTIONS
1. Define CLR Parser.
2. Which is more powerful: SLR, CLR or LALR?
3. What do you mean by lookahead?
4. Compare SLR and CLR.
5. In which LR parser is the number of states smaller: SLR, CLR or LALR?

SLR Parser | LALR Parser | CLR Parser
It is very easy and cheap to implement. | It is also easy and cheap to implement. | It is expensive and difficult to implement.
SLR Parser is the smallest in size. | LALR and SLR have the same size, as they have fewer states. | CLR Parser is the largest, as the number of states is very large.
Error detection is not immediate in SLR. | Error detection is not immediate in LALR. | Error detection can be done immediately in CLR Parser.
SLR fails to produce a parsing table for a certain class of grammars. | It is intermediate in power between SLR and CLR i.e., SLR ≤ LALR ≤ CLR. | It is very powerful and works on a large class of grammars.
It requires less time and space complexity. | It requires more time and space complexity. | It also requires more time and space complexity.

EVALUATION
Assessment Marks Scored

Understanding Problem statement (10)


Efficiency of understanding algorithm (15)
Efficiency of program (40)
Output (15)
Viva (20)
(Technical – 10 and Communications - 10)
Total (100)
Ex.No.:09 CONSTRUCT THE THREE ADDRESS CODE FOR THE GIVEN EXPRESSION
DATE:
AIM

To write a program to construct the three-address code and parse tree
for the given expression

THEORY:
Three address code
o Three-address code is an intermediate code. It is used by the Code Optimizer.
o In three-address code, the given expression is broken down into several separate
instructions. These instructions can easily be translated into assembly language.
o Each three-address code instruction has at most three operands. It is a combination of
an assignment and a binary operator.
Three-address instructions use at most three addresses to represent any statement. They are
implemented as records with address fields.
General Form-
In general, Three Address instructions are represented as-

a = b op c

Here,
• a, b and c are the operands.
• Operands may be constants, names, or compiler generated temporaries.
• op represents the operator.
Examples-
Examples of Three Address instructions are-
• a=b+c
• c=axb
Common Three Address Instruction Forms-
The common forms of Three Address instructions are-
1. Assignment Statement-

x = y op z and x = op y

Here,
• x, y and z are the operands.
• op represents the operator.
It assigns the result obtained after solving the right side expression of the assignment
operator to the left side operand.
2. Copy Statement-

x=y

Here,
• x and y are the operands.
• = is an assignment operator.
It copies and assigns the value of operand y to operand x.
3. Conditional Jump-

If x relop y goto X

Here,
• x & y are the operands.
• X is the tag or label of the target statement.
• relop is a relational operator.
If the condition “x relop y” gets satisfied, then-
• The control is sent directly to the location specified by label X.
• All the statements in between are skipped.
If the condition “x relop y” fails, then-
• The control is not sent to the location specified by label X.
• The next statement appearing in the usual sequence is executed.
4. Unconditional Jump-

goto X

Here, X is the tag or label of the target statement.


On executing the statement,
• The control is sent directly to the location specified by label X.
• All the statements in between are skipped.

5. Procedure Call-

param x call p return y

Here, p is a function which takes x as a parameter and returns y.

Example:
1. Write Three Address Code for the following expression-
– (a x b) + (c + d) – (a + b + c + d)
Three Address Code for the given expression is-
(1) T1 = a x b
(2) T2 = uminus T1
(3) T3 = c + d
(4) T4 = T2 + T3
(5) T5 = a + b
(6) T6 = T3 + T5
(7) T7 = T4 – T6
2. Write Three Address Code for the following expression
If A < B and C < D then t = 1 else t = 0
(1) If (A < B) goto (3)
(2) goto (4)
(3) If (C < D) goto (6)
(4) t = 0
(5) goto (7)
(6) t = 1
(7)

Program:
#include<stdio.h>
#include<string.h>
void pm();
void plus();
void div();
int i,ch,j,l,addr=100;
char ex[10], exp[10] ,exp1[10],exp2[10],id1[5],op[5],id2[5];
void main()
{
clrscr();
while(1)
{
printf("\n1.assignment\n2.arithmetic\n3.relational\n4.Exit\nEnter the choice:");
scanf("%d",&ch);
switch(ch)
{
case 1:
printf("\nEnter the expression with assignment operator:");
scanf("%s",exp);
l=strlen(exp);
exp2[0]='\0';
i=0;
while(exp[i]!='=')
{
i++;
}
strncat(exp2,exp,i);
strrev(exp);
exp1[0]='\0';
strncat(exp1,exp,l-(i+1));
strrev(exp1);
printf("Three address code:\ntemp=%s\n%s=temp\n",exp1,exp2);
break;

case 2:
printf("\nEnter the expression with arithmetic operator:");
scanf("%s",ex);
strcpy(exp,ex);
l=strlen(exp);
exp1[0]='\0';
for(i=0;i<l;i++)
{
if(exp[i]=='+'||exp[i]=='-')
{
if(exp[i+2]=='/'||exp[i+2]=='*')
{
pm();
break;
}
else
{
plus();
break;
}
}
else if(exp[i]=='/'||exp[i]=='*')
{
div();
break;
}
}
break;

case 3:
printf("Enter the expression with relational operator");
scanf("%s%s%s",id1,op,id2);
if(((strcmp(op,"<")==0)||(strcmp(op,">")==0)||(strcmp(op,"<=")==0)||(strcmp(op,">=")
==0)||(strcmp(op,"==")==0)||(strcmp(op,"!=")==0))==0)
printf("Expression is error");
else
{
printf("\n%d\tif %s%s%s goto %d",addr,id1,op,id2,addr+3);
addr++;
printf("\n%d\t T:=0",addr);
addr++;
printf("\n%d\t goto %d",addr,addr+2);
addr++;
printf("\n%d\t T:=1",addr);
}
break;
case 4:
exit(0);
}
}
}
void pm()
{
strrev(exp);
j=l-i-1;
strncat(exp1,exp,j);
strrev(exp1);
printf("Three address code:\ntemp=%s\ntemp1=%c%ctemp\n",exp1,exp[j+1],exp[j]);
}
void div()
{
strncat(exp1,exp,i+2);
printf("Three address code:\ntemp=%s\ntemp1=temp%c%c\n",exp1,exp[i+2],exp[i+3]);
}
void plus()
{
strncat(exp1,exp,i+2);
printf("Three address code:\ntemp=%s\ntemp1=temp%c%c\n",exp1,exp[i+2],exp[i+3]);
}

Output
1. assignment
2. arithmetic
3. relational
4. Exit
Enter the choice:1
Enter the expression with assignment operator:
a=b
Three address code:
temp=b
a=temp

1.assignment
2.arithmetic
3.relational
4.Exit
Enter the choice:2
Enter the expression with arithmetic operator:
a+b-c
Three address code:
temp=a+b
temp1=temp-c

1.assignment
2.arithmetic
3.relational
4.Exit
Enter the choice:2
Enter the expression with arithmetic operator:
a-b/c
Three address code:
temp=b/c
temp1=a-temp

1.assignment
2.arithmetic
3.relational
4.Exit
Enter the choice:2
Enter the expression with arithmetic operator:
a*b-c
Three address code:
temp=a*b
temp1=temp-c

1.assignment
2.arithmetic
3.relational
4.Exit
Enter the choice:2
Enter the expression with arithmetic operator:a/b*c
Three address code:
temp=a/b
temp1=temp*c
1.assignment
2.arithmetic
3.relational
4.Exit
Enter the choice:3
Enter the expression with relational operator
a
<=
b
100 if a<=b goto 103


101 T:=0
102 goto 104
103 T:=1

1.assignment
2.arithmetic
3.relational
4.Exit
Enter the choice:4

RESULT:

VIVA QUESTIONS
1. Define Three Address Code with Example.
2. What is the need for intermediate code?
3. What are the types of three address code?
4. Is three address code machine dependent? Justify.
5. Write the three address code for the following.
If A < B then 1 else 0

Generate three address code for the following code-


c=0
do
{
if (a < b) then
x++;
else
x--;
c++;
} while (c < 5)

Solution:
Three address code for the given code is-
1. c = 0
2. if (a < b) goto (4)
3. goto (7)
4. T1 = x + 1
5. x = T1
6. goto (9)
7. T2 = x – 1
8. x = T2
9. T3 = c + 1
10. c = T3
11. if (c < 5) goto (2)

EVALUATION
Assessment Marks Scored

Understanding Problem statement (10)


Efficiency of understanding algorithm (15)
Efficiency of program (40)
Output (15)
Viva (20)
(Technical – 10 and Communications - 10)
Total (100)

Ex.No.:10 IMPLEMENT A CODE THAT CONVERTS THE GIVEN EXPRESSION TO TRIPLES, QUADRUPLES AND INDIRECT TRIPLES
DATE:
AIM
To write a program to implement a code that converts the given expression to triples,
quadruples and indirect triples

THEORY:
The commonly used representations for implementing Three Address Code are-
1. Quadruples
2. Triples
3. Indirect Triples
1. Quadruple –
It is a structure which consists of 4 fields, namely op, arg1, arg2 and result. op denotes the
operator, arg1 and arg2 denote the two operands, and result is used to store the result
of the expression.
Advantage –
• Easy to rearrange code for global optimization.
• One can quickly access the value of temporary variables using the symbol table.
Disadvantage –
• Contains a lot of temporaries.
• Temporary variable creation increases time and space complexity.
Example:
a + b x c / e ↑ f + b x a
Three Address Code for the given expression is-
T1 = e ↑ f
T2 = b x c
T3 = T2 / T1
T4 = b x a
T5 = a + T3
T6 = T5 + T4

Location   Op   Arg1   Arg2   Result
(0)        ↑    e      f      T1
(1)        x    b      c      T2
(2)        /    T2     T1     T3
(3)        x    b      a      T4
(4)        +    a      T3     T5
(5)        +    T5     T4     T6

2. Triples –
This representation doesn’t make use of an extra temporary variable to represent a single
operation; instead, when a reference to another triple’s value is needed, a pointer to that
triple is used. So, it consists of only three fields, namely op, arg1 and arg2.
Disadvantage –
• Temporaries are implicit and difficult to rearrange code.
• It is difficult to optimize because optimization involves moving intermediate code.
When a triple is moved, any other triple referring to it must be updated also. With
help of pointer one can directly access symbol table entry.

Location   Op   Arg1   Arg2
(0)        ↑    e      f
(1)        x    b      c
(2)        /    (1)    (0)
(3)        x    b      a
(4)        +    a      (2)
(5)        +    (4)    (3)

3. Indirect Triples-
This representation is an enhancement over triples representation.
• It uses an additional instruction array to list the pointers to the triples in the desired
order.
• This representation uses pointers into a listing of all the computations, which is
made and stored separately.
• Thus, pointers rather than positions are used to refer to the results.
Advantages
• It allows the optimizers to easily re-position the sub-expression for producing the
optimized code
• It is similar in utility to the quadruple representation but requires less space
than it.
• Temporaries are implicit and easier to rearrange code.

Statement
35   (0)
36   (1)
37   (2)
38   (3)
39   (4)
40   (5)

Location   Op   Arg1   Arg2
(0)        ↑    e      f
(1)        x    b      c
(2)        /    (1)    (0)
(3)        x    b      a
(4)        +    a      (2)
(5)        +    (4)    (3)



Program:

RESULT:

VIVA QUESTIONS
1. Compare Triples and Indirect Triples.
2. What are the ways of representation of Intermediate code? (Postfix notation, Syntax
tree, Three-address code)
3. State the advantages and disadvantages of Quadruples.
4. Translate the following expression to quadruple, triple and indirect triple-
a = b x – c + b x – c
5. State the advantages of Indirect Triples.

EVALUATION
Assessment Marks Scored

Understanding Problem statement (10)


Efficiency of understanding algorithm (15)
Efficiency of program (40)
Output (15)
Viva (20)
(Technical – 10 and Communications - 10)
Total (100)

Ex .No 11 IMPLEMENT A SIMPLE CODE GENERATOR FOR THE GIVEN INTERMEDIATE CODE
DATE:
AIM

To write a C program for implementing back end of the compiler which takes three
address codes as input and produces 8086 assembly language instruction.

THEORY:
A code generator generates target code for a sequence of three- address statements
and effectively uses registers to store operands of the statements.
Code Generator determines the values that are to be stored in the registers.
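• For example, consider the three-address statement a := b+c. It can have the following
sequence of codes:

ADD Rj, Ri Cost = 1 (if Ri contains b and Rj contains c)
(or)
ADD c, Ri Cost = 2 (if Ri contains b and c is in a memory location)
(or)
MOV c, Rj Cost = 3 (move c from memory into Rj, then add)
ADD Rj, Ri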

Register and Address Descriptors:


• A register descriptor is used to keep track of what is currently in each register. The register
descriptors show that initially all the registers are empty.
• An address descriptor stores the location where the current value of the name can be found
at run time.

A code-generation algorithm:

The algorithm takes as input a sequence of three-address statements constituting a
basic block. For each three-address statement of the form x := y op z, perform the following
actions:
1. Invoke a function getreg to determine the location L where the result of the computation
y op z should be stored.
2. Consult the address descriptor for y to determine y’, the current location of y. Prefer the
register for y’ if the value of y is currently both in memory and a register. If the value of y is
not already in L, generate the instruction MOV y’ , L to place a copy of y in L.
3. Generate the instruction OP z’ , L where z’ is a current location of z. Prefer a register to a
memory location if z is in both. Update the address descriptor of x to indicate that x is in
location L. If L is a register, update its descriptor to indicate that it contains the value of x,
and remove x from all other descriptors.
4. If the current values of y or z have no next uses, are not live on exit from the block, and are
in registers, alter the register descriptor to indicate that, after execution of x : = y op z , those
registers will no longer contain y or z

Example:
Generating Code for Assignment Statements:
The assignment statement d:= (a-b) + (a-c) + (a-c) can be translated into the following
sequence of three address code:
1. t:= a-b
2. u:= a-c
3. v:= t +u
4. d:= v+u

Code sequence for the example is as follows:


Statement     Code Generated   Register descriptor   Address descriptor
                               Registers empty
t := a - b    MOV a, R0        R0 contains t         t in R0
              SUB b, R0
u := a - c    MOV a, R1        R0 contains t         t in R0
              SUB c, R1        R1 contains u         u in R1
v := t + u    ADD R1, R0       R0 contains v         u in R1
                               R1 contains u         v in R0
d := v + u    ADD R1, R0       R0 contains d         d in R0
              MOV R0, d                              d in R0 and memory

Target Machine
o The target computer is a type of byte-addressable machine. It has 4 bytes to a word.
o The target machine has n general purpose registers, R0, R1,.... , Rn-1. It also has two-
address instructions of the form:
1. op source, destination
Where, op is used as an op-code and source and destination are used as a data field.
o It has the following op-codes:
ADD (add source to destination)
SUB (subtract source from destination)
MOV (move source to destination)
o The source and destination of an instruction can be specified by the combination of
registers and memory location with address modes.

MODE                FORM     ADDRESS                      EXAMPLE             ADDED COST
absolute            M        M                            ADD temp, R1        1
register            R        R                            ADD R0, R1          0
indexed             c(R)     c + contents(R)              ADD 100(R2), R1     1
indirect register   *R       contents(R)                  ADD *R2, R1         0
indirect indexed    *c(R)    contents(c + contents(R))    ADD *100(R2), R1    1
literal             #c       c                            ADD #3, R1          1

o Here, an added cost of 1 means that the operand occupies one extra word of memory in the instruction.
o Each instruction has a cost of 1 plus the added costs for the source and destination.
o Instruction cost = 1 + added cost of the source mode + added cost of the destination mode.

ALGORITHM:

1. Start the program


2. Include the necessary header files.
3. Get the number of statements from the user.
4. For each variable allocate a separate register using Load or Move Instructions LD R,a or
Mov R,a
5. If the expression contains operator “+”, then generate the assembly code as ADD
6. If the expression contains operator “-”, then generate the assembly code as SUB
7. If the expression contains operator “*”, then generate the assembly code as MUL
8. If the expression contains operator “/”, then generate the assembly code as DIV
9. The result of the operation is stored in a variable using ST x, R0.
10. Stop the program.

PROGRAM:

#include<stdio.h>
#include<string.h>
#include<ctype.h>
int main()
{
char inp[100][100];
int n,i,j,len;
int reg = 1; /* number of the next free register */
printf("Enter the no of statements");
scanf("%d",&n);
/* each statement has the form x=y op z with single-letter names, e.g. a=b+c */
for(i = 0; i < n; i++)
scanf("%99s",inp[i]);
for(i = 0; i < n; i++)
{
len = strlen(inp[i]);
for(j=2; j < len; j++)
{
/* load every operand on the right-hand side into a fresh register */
if(islower((unsigned char)inp[i][j]))
{
printf("LOAD R%d %c \n",reg++,inp[i][j]);
}
/* after the last operand, emit the operation and store the result */
if(j == len-1 && inp[i][1] =='=')
{
j=3; /* position of the operator in x=y op z */
if(inp[i][j] == '+')
{
printf("ADD R%d R%d\n",reg-2,reg-1);
printf("STORE %c R%d\n",inp[i][0],reg-2);
}
else if(inp[i][j]=='-')
{
printf("SUB R%d R%d\n",reg-2,reg-1);
printf("STORE %c R%d\n",inp[i][0],reg-2);
}
else if(inp[i][j]=='*')
{
printf("MUL R%d R%d\n",reg-2,reg-1);
printf("STORE %c R%d\n",inp[i][0],reg-2);
}
else if(inp[i][j]=='/')
{
printf("DIV R%d R%d\n",reg-2,reg-1);
printf("STORE %c R%d\n",inp[i][0],reg-2);
}
break;
}
}
}
return 0;
}

OUTPUT:

RESULT:

VIVA QUESTIONS
1. What is the purpose of code generator?
2. List the issues in code generator.
3. Compare Register and Address descriptor.
4. What do you mean by next use information?
5. Write the assembly code for the given expression and find the cost. C=a+b*6
6. Name the technique used for allocating registers efficiently.
Linear Scan Algorithm

EVALUATION
Assessment Marks Scored

Understanding Problem statement (10)


Efficiency of understanding algorithm (15)
Efficiency of program (40)
Output (15)
Viva (20)
(Technical – 10 and Communications - 10)
Total (100)

Ex .No 12 IMPLEMENT A CODE OPTIMIZER TO PERFORM POSSIBLE OPTIMIZATION LIKE ELIMINATION OF DEAD CODE, COMMON SUB EXPRESSION ETC.
DATE:
AIM

To write a C program to implement a code optimizer to perform possible optimizations like
dead code elimination, common sub expression elimination, etc.

THEORY:
Reasons for Optimizing the Code
• Code optimization is essential to enhance the execution and efficiency of a source
code.
• It is mandatory to deliver efficient target code by lowering the number of instructions
in a program.

When to Optimize?
Code optimization is an important step that is usually performed at the last stage of
development.

Role of Code Optimization


• It is the fifth stage of a compiler, and it is optional: you can choose whether or not to
optimize your code.
• It aids in reducing the storage space and increases execution speed.
• It takes intermediate code as input and attempts to produce better code.
• Performing optimization by hand is tedious; it is preferable to employ a code optimizer to
accomplish the task.

Different Types of Optimization


Optimization is classified broadly into two types:
• Machine-Independent
• Machine-Dependent

Machine-Independent Optimization
It improves the efficiency of the intermediate code by transforming parts of the code in ways that
do not depend on the target hardware. It usually optimises code by eliminating redundancies and
removing unneeded code.

Machine-Dependent Optimization
After the target code has been constructed and transformed according to the target machine
architecture, machine-dependent optimization is performed. It makes use of CPU registers
and may utilise absolute rather than relative memory addresses. Machine-dependent
optimizers try to take maximum advantage of the memory hierarchy.
Loop Optimization
• Invariant code/Code Motion or Frequency Reduction
• Induction analysis
• Strength reduction

ALGORITHM:
Step1:Start the program.
Step2: Get the coding from the user.
Step3: Find the operators, arguments and results from the coding.
Step4: Display the value in the table.
Step5:Stop the program

PROGRAM:
#include<stdio.h>
#include<conio.h>
#include<string.h>
struct op
{
char l;
char r[20];
}op[10],pr[10];
void main()
{
int a,i,k,j,n,z=0,m,q;

char *p,*l;
char temp,t;
char *tem;
clrscr();
printf("enter no of values");
scanf("%d",&n);
for(i=0;i<n;i++)
{
printf("left\t");
op[i].l=getche();
printf("right:\t");
scanf("%s",op[i].r);
}
printf("intermediate Code\n") ;
for(i=0;i<n;i++)
{
printf("%c=",op[i].l);
printf("%s\n",op[i].r);
}
for(i=0;i<n-1;i++)
{
temp=op[i].l;
for(j=0;j<n;j++)
{
p=strchr(op[j].r,temp);
if(p)
{
pr[z].l=op[i].l;
strcpy(pr[z].r,op[i].r);
z++ ;

}} }
pr[z].l=op[n-1].l;
strcpy(pr[z].r,op[n-1].r);
z++;
printf("\nafter dead code elimination\n");
for(k=0;k<z;k++)
{

printf("%c\t=",pr[k].l);
printf("%s\n",pr[k].r);
}

//sub expression elimination


for(m=0;m<z;m++)
{
tem=pr[m].r;
for(j=m+1;j<z;j++)
{
p=strstr(tem,pr[j].r);
if(p)
{
t=pr[j].l;
pr[j].l=pr[m].l ;
for(i=0;i<z;i++)
{
l=strchr(pr[i].r,t) ;
if(l)
{
a=l-pr[i].r;
//printf("pos: %d",a);
pr[i].r[a]=pr[m].l;
}}}}}
printf("eliminate common expression\n");
for(i=0;i<z;i++)
{
printf("%c\t=",pr[i].l);
printf("%s\n",pr[i].r);
}
// duplicate production elimination

for(i=0;i<z;i++)
{
for(j=i+1;j<z;j++)
{
q=strcmp(pr[i].r,pr[j].r);
if((pr[i].l==pr[j].l)&&!q)

{
pr[i].l='\0';
pr[i].r[0]='\0';
}}
}
printf("optimized code");
for(i=0;i<z;i++)
{
if(pr[i].l!='\0')
{
printf("%c=",pr[i].l);
printf("%s\n",pr[i].r);
}
}
getch();
}
OUTPUT:
enter no of values 5
left a right: 9
left b right: c+d
left e right: c+d
left f right: b+e
left r right: f
intermediate Code
a=9
b=c+d
e=c+d
f=b+e
r=f

after dead code elimination


b =c+d
e =c+d
f =b+e
r =f
eliminate common expression
b =c+d
b =c+d
f =b+b
r =f
optimized code
b=c+d
f=b+b
r=f

RESULT:

VIVA QUESTIONS
1. State the role of Optimizer
2. Compare Machine dependent and independent optimizer.
3. List the machine dependent optimization techniques.
4. List the machine independent optimization techniques.
5. What do you mean by common sub expression? Give Example.
6. What is code motion and Dead code?

EVALUATION
Assessment Marks Scored

Understanding Problem statement (10)


Efficiency of understanding algorithm (15)
Efficiency of program (40)
Output (15)
Viva (20)
(Technical – 10 and Communications - 10)
Total (100)

Ex.No.13 IMPLEMENT A LEXICAL ANALYZER USING LEX TOOL


DATE:
AIM
To write a program to identify tokens in the source program using LEX tool.

THEORY:
Introduction:
LEX is a lexical analyzer generator. It is a UNIX utility that generates scanners, i.e. programs
that recognize lexical patterns in text. These lexical patterns (or regular expressions) are
defined in a particular syntax. A matched regular expression may have an associated action,
which may include returning a token.

When the scanner receives input in the form of a file or text, it attempts to match the text with
the regular expressions. It reads input one character at a time and continues until the longest
possible pattern is matched, then performs the associated action (which may include returning a
token). If no regular expression matches, the default action copies the unmatched character to
the output.

Lex and C are tightly coupled. A lex file (files in Lex have the .l extension, e.g. first.l) is
passed through the lex utility, which produces an output file in C (lex.yy.c). The program
lex.yy.c basically consists of a transition diagram constructed from the regular expressions of
first.l. This file is then compiled into the object program a.out, and the resulting lexical
analyzer transforms an input stream into a sequence of tokens.

To generate a lexical analyzer, two things are needed: first, a precise specification of the
tokens of the language; second, a specification of the action to be performed on identifying
each token.

LEX Specifications:

The Structure of lex programs consists of three parts:

Definition Section:

The Definition Section includes declarations of variables, start conditions, regular
definitions, and manifest constants (a manifest constant is an identifier that is declared to
represent a constant, e.g. #define PIE 3.14).

C code: Any indented code between %{ and %} is copied to the C file. This is typically used
for defining file variables, and for prototypes of routines that are defined in the code segment.

Definitions: A definition is very much like a #define cpp directive. For example

letter [a-zA-Z]+

digit [0-9]+
These definitions can be used in the rules section: one could start a rule
{letter} {printf("\n Word is = %s",yytext);}
State definitions: If a rule depends on context, it's possible to introduce states and
incorporate those in the rules. A state definition looks like %s STATE, and by default a
state INITIAL is already given.

Rule Section:
Second section is for translation rules which consist of regular expression and action with
respect to it. The translation rules of a Lex program are statements of the form:
p1 {action 1}

p2 {action 2}

p3 {action 3}

... ...

... ...

pn {action n}
Where, each p is a regular expression and each action is a program fragment describing what
action the lexical analyzer should take when a pattern p matches a lexeme. In Lex the actions
are written in C.

Auxiliary Function (User Subroutines):

Third section holds whatever auxiliary procedures are needed by the actions. If the lex
program is to be used on its own, this section will contain a main program. If you leave this
section empty and link with the lex library, you will get the default main as follows:

int main()
{
yylex();
return 0;
}
This section is optional. yylex() is the scanner function generated by lex; it is called by
the default main from the lex library, or we can call it from our own main function in this
section.

2. Built - in Functions:

2. Built - in Variables:

Regular Expression:

Steps to Execute the program:


$ lex filename.l (eg: first.l)
$ cc lex.yy.c -ll or gcc lex.yy.c -ll
$ ./a.out

ALGORITHM:

1. Lex program contains three sections: definitions, rules, and user subroutines. Each
section must be separated from the others by a line containing only the delimiter, %%.

The format is as follows:

definitions
%%
rules
%%
user_subroutines
2. In definition section, the variables make up the left column, and their definitions
make up the right column. Any C statements should be enclosed in %{ ... %}. An identifier
is defined such that its first character is a letter and the remaining characters are
alphanumeric.

3. In rules section, the left column contains the pattern to be recognized in an input
file to yylex(). The right column contains the C program fragment executed when that
pattern is recognized. The various patterns are keywords, operators, new line
character, number, string, identifier, beginning and end of block, comment
statements, preprocessor directive statements etc.

4. Each pattern may have a corresponding action, that is, a fragment of C source code
to execute when the pattern is matched.

5. When yylex() matches a string in the input stream, it copies the matched text to an
external character array, yytext, before it executes any actions in the rules section.

6. In user subroutine section, main routine calls yylex(). yywrap() is used to get more
input.

7. The lex command uses the rules and actions contained in file to generate a program,
lex.yy.c, which can be compiled with the cc command. That program can then receive
input, break the input into the logical pieces defined by the rules in file, and run
program fragments contained in the actions in file.

PROGRAM: (lexid.l)
%{
#include<stdio.h>
int e,k,c,d,i,s;
%}

%%
include|void|main|int|float|double|scanf|char|printf {printf("keyword"); i++;}
[a-zA-Z][a-zA-Z0-9]* {printf("Identifier"); k++;}
[0-9]+ {printf("digit"); e++;}
[+\-*/=]+ {printf("operator"); c++;}
[;:(){}\"',\n\t]+ {printf("delimeter"); d++;}
[#<>%]+ {printf("symbols"); s++;}
%%

int main(void)
{
yyin=fopen("lexy.txt","r");
yylex();
printf("\nidentifier %d\n",k);
printf("Symbols %d\n",s);
printf("digits %d\n",e);
printf(" Operator %d\n",c);
printf(" keywords %d\n",i);
printf("delimeter %d\n",d);
return 0;
}
int yywrap()
{
return 1;
}

INPUT:
lexy.txt
int a=10;

OUTPUT:
C:/FlexWindow:/EditPlusPortable> lex lexid.l
C:/FlexWindow:/EditPlusPortable> cc lex.yy.c
C:/FlexWindow:/EditPlusPortable> a

Identifier 1
Digit 1
Keyword 1
Operator 1
Delimiter 1

PROGRAM TO RECOGNIZE A VALID ARITHMETIC EXPRESSION THAT USES OPERATORS +, -, * AND / USING YACC

ALGORITHM:

1. Declare the variables which are used for C programs in the declaration
section %{ ... %}.
2. Define the tokens, precedence and associativity of operators used in yacc.
3. Include the pattern (Context Free Grammar) in the translation rule section,
between %% and %%, for validating the expression.
4. In main function get the expression from the user for validating it.
5. Call the yyparse() function to parse the given expression. It uses the LALR
parsing table that yacc constructed from the grammar defined in the translation rules.
6. yyparse() calls the yylex() function to get the current token and store its value
in the yylval variable; this is repeated until the whole expression has been read.
7. The parser validates the expression against the grammar.
8. Print VALID if the expression given by the user is derived
by the grammar, else print INVALID.
9. Stop the program.

PROGRAM:
%{
#include<stdio.h>
#include<ctype.h>
#include<stdlib.h>
#include<string.h>
#define YYSTYPE double
int yylex(void);
int yyerror(char *s);
%}

%token num
%left '+' '-'
%left '*' '/'

%%
st: st expr '\n' {printf("VALID");}
|st '\n'
|
|error '\n' {printf("INVALID");}
;
expr: num
|expr '+' expr
|expr '-' expr
|expr '*' expr
|expr '/' expr
;
%%

int main()
{
printf(" ENTER AN EXPRESSION TO VALIDATE");
yyparse();
return 0;
}
int yylex()
{
int ch;
while((ch=getchar())==' ');
if(ch==EOF)
return 0; /* 0 tells yyparse() that the input has ended */
if(isdigit(ch)||ch=='.')
{
ungetc(ch,stdin);
scanf("%lf",&yylval);
return num;
}
return ch;
}
int yyerror(char *s)
{
printf("%s",s);
return 0;
}

OUTPUT:

C:\Flex Windows\EditPlusPortable>yacc -d arithval.y

C:\Flex Windows\EditPlusPortable>cc y.tab.c

C:\Flex Windows\EditPlusPortable>a

ENTER AN EXPRESSION TO VALIDATE 5+9

VALID

4+6

VALID

5-

INVALID

PROGRAM TO RECOGNIZE A VALID VARIABLE WHICH STARTS WITH A LETTER FOLLOWED BY ANY NUMBER OF LETTERS OR DIGITS

ALGORITHM:

1. Declare the variables which are used for C programs in the declaration section
%{ ... %}.
2. Define the tokens let, dig used in yacc.
3. Include the pattern (Context Free Grammar) in the translation rule section,
between %% and %%, for validating the variable.
4. In main function get the variable from the user for validating it.
5. Call the yyparse() function to parse the given variable. It uses the LALR
parsing table that yacc constructed from the grammar defined in the translation rules.
6. yyparse() calls the yylex() function to get the current token; this is repeated
until the whole variable has been read.
7. The parser validates the variable against the grammar.
8. Print “accepted” if the variable given by the user is derived by the
grammar, else print “rejected”.
9. Stop the program.

PROGRAM:

%{
#include<stdio.h>
#include<ctype.h>
int yylex(void);
int yyerror(char *s);
%}

%token let dig

%%
sad: let recld '\n' {printf("accepted\n"); return 0;}
| let '\n' {printf("accepted\n"); return 0;}
|
|error {yyerror("rejected\n");return 0;}
;
recld: let recld
| dig recld
| let
| dig
;
%%

int yylex()
{
int ch; /* int, so that EOF can be told apart from a character */
while((ch=getchar())==' ');
if(ch==EOF)
return 0;
if(isalpha(ch))
return let;
if(isdigit(ch))
return dig;
return ch;
}
int yyerror(char *s)
{
printf("%s",s);
return 0;
}
int main()
{
printf("ENTER A variable : ");
yyparse();
return 0;
}

OUTPUT:
C:\Flex Windows\EditPlusPortable>yacc -d valid.y

C:\Flex Windows\EditPlusPortable> cc y.tab.c

C:\Flex Windows\EditPlusPortable> a

A1
accepted
10a
rejected

VIVA QUESTIONS
1. Define LEX.
2. Give the syntax of LEX program.
3. List the built-in functions in LEX.
4. List the built-in variables in LEX.
5. Give the regular expression for the identifiers.

EVALUATION
Assessment Marks
Scored
Understanding Problem statement (10)
Efficiency of understanding algorithm (15)
Efficiency of program (40)
Output (15)
Viva (20)
(Technical – 10 and Communications - 10)
Total (100)

Ex. No. 14 IMPLEMENT A DESKTOP CALCULATOR USING LEX AND YACC TOOL
DATE:

AIM
To write a lex program to implement a desktop calculator without giving the priority
to operators.

THEORY:

A parser generator facilitates the construction of the front end of a compiler. YACC is an LALR
parser generator. It has been used to implement hundreds of compilers. YACC is a command (utility)
of the UNIX system. YACC stands for “Yet Another Compiler Compiler”.

The parser specification is written in a file with the .y extension, e.g. parser.y, which contains
the YACC specification of the translator. Once the specification is complete, the UNIX command yacc
transforms parser.y into a C program called y.tab.c using the LALR method. The program y.tab.c is
automatically generated. We can use the command with the -d option as

yacc -d parser.y

By using the -d option two files will get generated, namely y.tab.c and y.tab.h. The header file
y.tab.h will store all the token information, so you need not create y.tab.h
explicitly.

The program y.tab.c is a representation of an LALR parser written in C, along with other C
routines that the user may have prepared. By compiling y.tab.c with the ly library that
contains the LR parsing program, using the command

cc y.tab.c -ly

we obtain the desired object program a.out that performs the translation specified by the
original program.

If other procedures are needed, they can be compiled or loaded with y.tab.c, just as with any C
program.

Fig- Parser Generator Model

LEX recognizes regular expressions, whereas YACC recognizes an entire grammar. LEX
divides the input stream into tokens, while YACC uses these tokens and groups them
together logically. LEX and YACC work together to analyze the program syntactically.
YACC reports conflicts or ambiguities in the grammar, if any, in the form of error messages.

1. YACC Specifications:
A YACC program consists of three parts:

Definition Section:
The definitions and programs sections are optional. The definition section handles control
information for the YACC-generated parser and generally sets up the execution environment
in which the parser will operate.

Declaration part:
In the declaration section, the %{ and %} symbols enclose C declarations. This section is
also used to define tokens, the union, types, the start symbol, and the associativity and
precedence of operators. Tokens declared in this section can then be used in the second and
third parts of the Yacc specification.

Translation Rule Section:


In the part of the Yacc specification after the first %% pair, we put the translation rules. Each
rule consists of a grammar production and the associated semantic action. A set of
productions that we have been writing as

<left side> → <alt 1> | <alt 2> | … | <alt n>

would be written in YACC as
<left side> : <alt 1> {action 1}
| <alt 2> {action 2}
……………
| <alt n> {action n}
;
In a YACC production, unquoted strings of letters and digits not declared to be tokens are
taken to be non-terminals. A quoted single character, e.g. 'c', is taken to be the terminal
symbol c, as well as the integer code for the token represented by that character (i.e., Lex
would return the character code for 'c' to the parser, as an integer). Alternative bodies are
separated by a vertical bar, and a semicolon follows each head with its alternatives and
their semantic actions.

The first head is taken to be the start symbol.


A Yacc semantic action is a sequence of C statements. In a semantic action, the symbol $$
refers to the attribute value associated with the nonterminal of the head, while $i refers to
the value associated with the ith grammar symbol (terminal or nonterminal) of the body. The
semantic action is performed whenever we reduce by the associated production, so normally
the semantic action computes a value for $$ in terms of the $i's. In the YACC specification,
we have written the two E-productions

E → E + T | T

and their associated semantic actions as:
exp : exp '+' term {$$ = $1 + $3;}
| term
;
In the first production, exp is $1, '+' is $2 and term is $3. The semantic action associated
with the first production adds the values of exp and term and copies the result into $$, the
exp on the left-hand side. For the second production, we have omitted the semantic action
since it just copies the value. In general, {$$ = $1;} is the default semantic action.

Supporting C-Routines Section:


The third part of a Yacc specification consists of supporting C routines. YACC generates a
single function called yyparse(). This function takes no parameters and returns 0 on
success or 1 on failure, for example when a syntax error cannot be recovered. The special
function yyerror() is called when YACC encounters invalid syntax. yyerror() is passed a
single string (char *) argument. This function just prints a user-defined message, for example:
yyerror(char *err)
{
printf("Divide by zero");
}
When LEX and YACC work together, the lexical analyzer yylex() produces pairs consisting
of a token and its associated attribute value. If a token such as DIGIT is returned, the
value associated with the token is communicated to the parser through the YACC-defined
variable yylval. Tokens are returned from LEX to YACC, and they are declared in the YACC
specification. To link the two, the LEX program includes the y.tab.h file, which is generated
when YACC processes the specification with the -d option.

2. Built-in Functions:

3. Built-in Types:

4. Special Characters:

5. Steps to Execute the program

$ lex filename.l (eg: cal.l)

$ yacc -d filename.y (eg: cal.y)

$ cc lex.yy.c y.tab.c -ll -ly -lm

$ ./a.out

ALGORITHM:

1. Declare the variables and header files used by the C code in the declaration
section %{...%} of the lex and yacc files.

2. Write the patterns (regular expressions in lex and the CFG in yacc) in the translation
rules section %%..%%

3. Define the tokens and the precedence and associativity of the operators used in yacc.

4. In the main function, get the expression from the user for calculation.

5. Call the yyparse() function to parse the given expression. It uses the LALR parsing
table that yacc constructed from the grammar defined in the translation rules.

6. yyparse() calls the yylex() function, which gets the current token and stores its value in
the yylval variable. This is repeated until the value of the given expression is computed.

7. The semantic actions compute the value of the expression as the LALR parser reduces.

8. Print the value of expression.

9. Stop the program.

PROGRAM

Cal.l

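%{
#include <stdio.h>
#include "y.tab.h"
int c;
extern int yylval;
%}

%%
" " ;              /* skip blanks */
[a-z] {            /* variable name: index 0..25 into regs[] */
    c = yytext[0];
    yylval = c - 'a';
    return (LETTER);
}
[0-9] {            /* single digit: its numeric value */
    c = yytext[0];
    yylval = c - '0';
    return (DIGIT);
}
[^a-z0-9\b] {      /* operators, parentheses, newline: the character itself */
    c = yytext[0];
    return (c);
}
%%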

Cal.Y
%{
#include <stdio.h>

int yylex(void);                /* scanner generated by lex */
void yyerror(const char *s);

int regs[26];                   /* values of the variables a..z */
int base;                       /* 8 or 10, chosen by the first digit */
%}

%start list
%token DIGIT LETTER
%left '|'
%left '&'
%left '+' '-'
%left '*' '/' '%'
%left UMINUS /* supplies precedence for unary minus */

%% /* beginning of rules section */

list: /* empty */
    | list stat '\n'
    | list error '\n' { yyerrok; }
    ;
stat: expr { printf("%d\n", $1); }
    | LETTER '=' expr { regs[$1] = $3; }
    ;
expr: '(' expr ')' { $$ = $2; }
    | expr '*' expr { $$ = $1 * $3; }
    | expr '/' expr { $$ = $1 / $3; }
    | expr '%' expr { $$ = $1 % $3; }
    | expr '+' expr { $$ = $1 + $3; }
    | expr '-' expr { $$ = $1 - $3; }
    | expr '&' expr { $$ = $1 & $3; }
    | expr '|' expr { $$ = $1 | $3; }
    | '-' expr %prec UMINUS { $$ = -$2; }
    | LETTER { $$ = regs[$1]; }
    | number
    ;
number: DIGIT { $$ = $1; base = ($1 == 0) ? 8 : 10; }
    | number DIGIT { $$ = base * $1 + $2; }
    ;
%%

int main(void)
{
    return yyparse();
}

/* Called by yyparse() when it detects a syntax error. */
void yyerror(const char *s)
{
    fprintf(stderr, "%s\n", s);
}

/* Returning 1 tells the scanner that there is no further input. */
int yywrap(void)
{
    return 1;
}

(OR)

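%{
#include <stdio.h>
#include <stdlib.h>     /* atof */
int op = 0;             /* pending operator: 0 = none, 1..5 = + - * / ^ */
float a, b, base;
void digi(void);
%}

dig [0-9]+|([0-9]*)"."([0-9]+)
add "+"
sub "-"
mul "*"
div "/"
pow "^"
ln \n

%%
{dig} { digi(); }
{add} { op = 1; }
{sub} { op = 2; }
{mul} { op = 3; }
{div} { op = 4; }
{pow} { op = 5; }
{ln} { printf("\n the result:%f\n\n", a); }
%%

/* Called for every number: either start a new result or apply the
   pending operator to the running result a. */
void digi(void)
{
    if (op == 0)
        a = atof(yytext);
    else
    {
        b = atof(yytext);
        switch (op)
        {
        case 1: a = a + b;
            break;
        case 2: a = a - b;
            break;
        case 3: a = a * b;
            break;
        case 4: a = a / b;
            break;
        case 5: /* a^b for a whole-number exponent b >= 1 */
            for (base = a; b > 1; b--)
                a = a * base;
            break;
        }
        op = 0;
    }
}

int main(void)
{
    yylex();
    return 0;
}

int yywrap(void)
{
    return 1;
}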

OUTPUT:

C:\Flex Windows\EditPlusPortable>lex cal.l

C:\Flex Windows\EditPlusPortable>yacc -d cal.y

C:\Flex Windows\EditPlusPortable> cc y.tab.c lex.yy.c

C:\Flex Windows\EditPlusPortable> a

5+2

8*2

16

5-3

7/2

3

VIVA QUESTIONS
1. Define YACC.
2. Which parser is generated by YACC?
3. List the built-in functions in YACC.
4. List the built-in types in YACC.
5. Compare YACC and LEX.

EVALUATION
Assessment Marks
Scored
Understanding Problem statement (10)
Efficiency of understanding algorithm (15)
Efficiency of program (40)
Output (15)
Viva (20)
(Technical – 10 and Communications - 10)
Total (100)
