Cs3501 Compiler Design Laboratory 2021r - Lab Manual
DEPARTMENT OF
COMPUTER SCIENCE AND ENGINEERING
ACADEMIC YEAR
2023-24
REGULATION 2021
ODD SEMESTER
LAB MANUAL
V SEM
CS3501
To be world class nodal centre committed to advanced learning, research and training to serve the
nation, meeting the national / international standards
To be a premier Engineering College, much sought after by the industries and society offering
professional education and training blended with ethical values to convert student resource into
strong assets of our nation
To impart knowledge in Computer Science and Engineering with advanced learning to meet
international standards and make ethically and emotionally strong enough students to meet
the technological challenges for the welfare of the country in the coming years.
M1: To excel in teaching by offering professional education imbibing ethical and moral
values.
M2: To produce technically strong Computer engineers who could excel in research,
development and software related fields.
CS3501 COMPILER DESIGN LAB MANUAL
LIST OF EXPERIMENTS:
1. Using the LEX tool, Develop a lexical analyzer to recognize a few patterns in C. (Ex. identifiers,
constants, comments, operators etc.). Create a symbol table, while recognizing identifiers.
2. Implement a Lexical Analyzer using LEX Tool
3. Generate YACC specification for a few syntactic categories.
a. Program to recognize a valid arithmetic expression that uses operator +, -, * and /.
b. Program to recognize a valid variable which starts with a letter followed by any number of letters
or digits.
c. Program to recognize a valid control structures syntax of C language (For loop, while loop, if-else,
if-else-if, switch-case, etc.).
d. Implementation of calculator using LEX and YACC
4. Generate three address code for a simple program using LEX and YACC.
5. Implement type checking using Lex and Yacc.
6. Implement simple code optimization techniques (Constant folding, Strength reduction and
Algebraic transformation)
7. Implement back-end of the compiler for which the three address code is given as input and the 8086
assembly language code is produced as output.
COURSE OUTCOMES:
To execute the experiments, the following hardware and software are required at minimum:
HARDWARE REQUIREMENTS:
1. Intel based desktop PC with minimum of 166MHz or faster processor with at least 64 MB RAM
and 100 MB free disk space.
Compiler Lex or Flex and YACC tools ( Unix/Linux utilities )
SOFTWARE REQUIREMENTS:
Compiler: Lex or Flex (Flex 2.5.4a) and YACC (Bison 2.4.1) tools (Windows/Unix/Linux utilities)
1. Open a command prompt and switch to the working directory where you have stored your
lex file (".l") and yacc file (".y").
2. Let your lex and yacc files be "hello.l" and "hello.y". Now, follow the steps below to
compile and run your program.
A. For Compiling Lex file only:
i. flex hello.l
ii. gcc lex.yy.c
B. For Compiling Lex & Yacc file both:
i. flex hello.l
ii. bison -dy hello.y
iii. gcc lex.yy.c y.tab.c
C. For Executing the Program
i. a.exe (on Unix/Linux, run ./a.out instead)
INDEX
AIM:
ALGORITHM:
PROGRAM 1:
Program Name: cdpgm1.l
%{
#include<stdio.h>
%}
%%
bool|int|float printf("Keyword");
[-+]?[0-9]+ printf("Constants");
[,.'"]+ printf("Punctuation Chars");
[!@#$%^&*()]+ printf("Special Chars");
[a-zA-Z]+ printf("Identifiers");
%%
/* Called at end of input; returning 1 tells the scanner there is no more input */
int yywrap()
{
return 1;
}
int main(){
yylex();
return 0;
}
Output:
RESULT:
Thus the program for lexical analyzer to recognize a few patterns in C. (Ex. identifiers,
constants, comments, operators etc.) has been executed successfully.
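Experiment 1 also asks for a symbol table to be created while identifiers are recognized. The following is a minimal sketch of such a table in C, not part of the manual's listing: the names sym_lookup, sym_insert and sym_print and the limits MAX_SYMBOLS and MAX_NAME are hypothetical, and names longer than MAX_NAME - 1 characters are simply truncated.

```c
#include <stdio.h>
#include <string.h>

#define MAX_SYMBOLS 100   /* hypothetical table capacity */
#define MAX_NAME    32    /* hypothetical maximum identifier length + 1 */

static char symtab[MAX_SYMBOLS][MAX_NAME];
static int  symcount = 0;

/* Return the index of name in the table, or -1 if it is not present. */
int sym_lookup(const char *name)
{
    for (int i = 0; i < symcount; i++)
        if (strcmp(symtab[i], name) == 0)
            return i;
    return -1;
}

/* Insert name if it is new. Return its index, or -1 if the table is full. */
int sym_insert(const char *name)
{
    int idx = sym_lookup(name);
    if (idx >= 0)
        return idx;                 /* already recorded */
    if (symcount == MAX_SYMBOLS)
        return -1;                  /* no room left */
    strncpy(symtab[symcount], name, MAX_NAME - 1);
    symtab[symcount][MAX_NAME - 1] = '\0';
    return symcount++;
}

/* Print every stored identifier together with its index. */
void sym_print(void)
{
    for (int i = 0; i < symcount; i++)
        printf("%d\t%s\n", i, symtab[i]);
}
```

If these functions were placed in the user subroutines section of the Lex file, the action of the identifier rule could call sym_insert(yytext) in addition to its printf, and main() could call sym_print() after yylex() returns.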
EX.NO.2
Implement a Lexical Analyzer using LEX Tool
DATE:
AIM:
To write a program that implements a lexical analyser using the LEX tool on the Linux
platform.
ALGORITHM:
Step1: Lex program contains three sections: definitions, rules, and user subroutines. Each
section must be separated from the others by a line containing only the delimiter, %%. The
format is as follows: definitions %% rules %% user_subroutines
Step2: In the definition section, the variables make up the left column, and their definitions
make up the right column. Any C statements should be enclosed in %{ ... %}. An identifier is
defined such that its first character is a letter and the remaining characters are
alphanumeric.
Step3: In the rules section, the left column contains the pattern to be recognized in an input
file to yylex(). The right column contains the C program fragment executed when that pattern
is recognized. The various patterns are keywords, operators, new line character, number,
string, identifier, beginning and end of block, comment statements, preprocessor directive
statements etc.
Step4: Each pattern may have a corresponding action, that is, a fragment of C source code to
execute when the pattern is matched.
Step5: When yylex() matches a string in the input stream, it copies the matched text to an
external character array, yytext, before it executes any actions in the rules section.
Step6: In the user subroutine section, the main routine calls yylex(). yywrap() is called when
the end of input is reached; it returns 1 if there is no more input to process.
Step7: The lex command uses the rules and actions contained in the file to generate a
program, lex.yy.c, which can be compiled with the cc command. That program can then
receive input, break the input into the logical pieces defined by the rules in the file, and run
program fragments contained in the actions in the file.
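To make Steps 3 to 5 concrete, the sketch below shows in plain C roughly what the generated scanner does for two of the patterns: it finds the longest prefix of the input that matches a pattern. This is only an illustration under simplifying assumptions; the function names match_identifier and match_number are hypothetical, and the real lex.yy.c uses a table-driven automaton rather than hand-written loops.

```c
#include <ctype.h>

/* Length of the longest prefix of s that matches [a-zA-Z][a-zA-Z0-9]*,
   or 0 if s does not start with a letter. */
int match_identifier(const char *s)
{
    int n = 0;
    if (!isalpha((unsigned char)s[0]))
        return 0;
    while (isalnum((unsigned char)s[n]))
        n++;
    return n;
}

/* Length of the longest prefix of s that matches [0-9]+, or 0. */
int match_number(const char *s)
{
    int n = 0;
    while (isdigit((unsigned char)s[n]))
        n++;
    return n;
}
```

For the input "sum1=2", match_identifier reports a match of length 4, which corresponds to the text that yylex() would copy into yytext before running the action of the identifier rule.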
PROGRAM CODE:
Program Name:cdpgm2.l
%{
/* Implementation of Lexical Analyzer using Lex tool */
#include<stdio.h>
#include<stdlib.h>
int COMMENT=0;
%}
identifier [a-zA-Z][a-zA-Z0-9]*
%%
#.* {printf("\n%s is a preprocessor directive",yytext);}
int |
float |
char |
double |
while |
for |
struct |
typedef |
do |
if |
break |
continue |
void |
switch |
return |
else |
goto {printf("\n\t%s is a keyword",yytext);}
"/*" {COMMENT=1; printf("\n\t %s is a COMMENT",yytext);}
"*/" {COMMENT=0;}
{identifier}\( {if(!COMMENT)printf("\nFUNCTION \n\t%s",yytext);}
\{ {if(!COMMENT)printf("\n BLOCK BEGINS");}
\} {if(!COMMENT)printf("BLOCK ENDS ");}
{identifier}(\[[0-9]*\])? {if(!COMMENT) printf("\n %s IDENTIFIER",yytext);}
\".*\" {if(!COMMENT)printf("\n\t %s is a STRING",yytext);}
[0-9]+ {if(!COMMENT) printf("\n %s is a NUMBER ",yytext);}
\)(\:)? {if(!COMMENT)printf("\n\t");ECHO;printf("\n");}
\( ECHO;
= {if(!COMMENT)printf("\n\t %s is an ASSIGNMENT OPERATOR",yytext);}
\<= |
\>= |
\< |
== |
\> {if(!COMMENT) printf("\n\t%s is a RELATIONAL OPERATOR",yytext);}
%%
int main(int argc, char **argv)
{
FILE *file;
file=fopen("input.c","r");
if(!file)
{
printf("could not open the file");
exit(0);
}
yyin=file;
yylex();
printf("\n");
return(0);
}
int yywrap()
{
return(1);
}
INPUT:
File Name:input.c
#include<stdio.h>
#include<conio.h>
void main()
{
int a,b,c;
a=1;
b=2;
c=a+b;
printf("Sum:%d",c);
}
OUTPUT:
RESULT:
Thus the program for implementation of Lexical Analyzer using Lex tool has been executed
successfully.
EX.NO.3a
Generate YACC specification for a few syntactic categories.
a. Program to recognize a valid arithmetic expression that uses operator +, -, * and /.
DATE:
AIM:
ALGORITHM:
Step3: Check the validity of the given expression against the grammar rules using yacc.
Step4: Using the expression rules, print the result (valid or invalid) for the given input.
Step5: Stop the program.
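The validity check described in the steps above can also be sketched without yacc as a small hand-written recursive-descent recognizer in C. This is a swapped-in technique shown only for illustration: yacc generates a table-driven bottom-up parser instead, and the names is_valid_expression, parse_expr and parse_operand are hypothetical.

```c
#include <ctype.h>

static const char *p;   /* current position in the input string */

static int parse_expr(void);

static void skip_spaces(void)
{
    while (*p == ' ' || *p == '\t')
        p++;
}

/* operand: identifier | number | '-' operand | '(' expr ')' */
static int parse_operand(void)
{
    skip_spaces();
    if (*p == '-') {
        p++;
        return parse_operand();
    }
    if (*p == '(') {
        p++;
        if (!parse_expr())
            return 0;
        skip_spaces();
        if (*p != ')')
            return 0;
        p++;
        return 1;
    }
    if (isalpha((unsigned char)*p) || *p == '_') {
        while (isalnum((unsigned char)*p) || *p == '_')
            p++;
        return 1;
    }
    if (isdigit((unsigned char)*p)) {
        while (isdigit((unsigned char)*p))
            p++;
        return 1;
    }
    return 0;
}

/* expr: operand (('+' | '-' | '*' | '/') operand)* */
static int parse_expr(void)
{
    if (!parse_operand())
        return 0;
    for (;;) {
        skip_spaces();
        if (*p == '+' || *p == '-' || *p == '*' || *p == '/') {
            p++;
            if (!parse_operand())
                return 0;
        } else {
            return 1;
        }
    }
}

/* Return 1 if the whole string s is a valid arithmetic expression, else 0. */
int is_valid_expression(const char *s)
{
    p = s;
    if (!parse_expr())
        return 0;
    skip_spaces();
    return *p == '\0';
}
```

For example, is_valid_expression("a+b*(c-2)") accepts the input, while "a+*b" and "(a+b" are rejected because an operand or a closing parenthesis is missing.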
PROGRAM:
LEX PART:
Filename:cdpgm3a.l
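%{
#include "cdpgm3a.tab.h" /* token definitions generated by bison -d cdpgm3a.y */
%}
%%
[a-zA-Z_][a-zA-Z_0-9]* return id;
[0-9]+(\.[0-9]*)? return num;
[+*/] return op;
[ \t] ;
\n return 0;
. return yytext[0];
%%
int yywrap()
{
return 1;
}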
YACC PART:
File Name: cdpgm3a.y
%{
#include<stdio.h>
int valid=1;
int yylex(void);
int yyerror(const char *s);
%}
%token num id op
%%
s: id x
| num x
| '-' num x
| '(' s ')' x
;
x: op s
| '-' s
| /* empty: the expression may end here */
;
%%
int yyerror(const char *s)
{
valid=0;
printf("\nInvalid expression!\n");
return 0;
}
int main()
{
yyparse();
if(valid)
printf("\nValid expression!\n");
return 0;
}
OUTPUT:
RESULT:
AIM:
To write a program to recognize a valid variable which starts with a letter followed by any
number of letters or digits.
ALGORITHM:
Step3: Check the validity of the given variable name against the grammar rules using yacc.
PROGRAM:
LEX PART:
File Name: cdpgm3b.l
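%{
#include "cdpgm3b.tab.h" /* token definitions generated by bison -d cdpgm3b.y */
%}
%%
[a-zA-Z] return letter;
[0-9] return digit;
\n return 0;
. return yytext[0];
%%
int yywrap()
{
return 1;
}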
YACC PART:
File Name:cdpgm3b.y
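%{
#include<stdio.h>
int valid=1;
int yylex(void);
int yyerror(const char *s);
%}
%token letter digit
%%
start : letter s
;
s: letter s
| digit s
| /* empty: the name may end here */
;
%%
int yyerror(const char *s)
{
valid=0;
printf("\nIt is not an identifier!\n");
return 0;
}
int main()
{
yyparse();
if(valid)
printf("\nIt is an identifier!\n");
return 0;
}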
OUTPUT:
RESULT:
AIM:
To write a program to recognize a valid control structures syntax of C language
ALGORITHM:
Step3: Check the validity of the given control structure against the grammar rules using
yacc.
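PROGRAM:
LEX PART:
File Name: ifelse.l
%{
#include"y.tab.h"
%}
%%
if return IF;
else return ELSE;
[a-zA-Z][a-zA-Z0-9]* return ID;
[0-9]+ return NUM;
[ \t\n] ;
. return yytext[0];
%%
YACC PART: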
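%{
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <ctype.h>
#include <string.h>
/* The scanner is compiled separately (gcc lex.yy.c y.tab.c), so it is only declared here */
int yylex(void);
int yyerror(char *s);
%}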
%token ID
%token NUM
%token IF ELSE
%right '='
%left '+' '-'
%left '*' '/'
%left UMINUS
%%
start : IF '(' condi ')' statement ';' ELSE '{' statement ';' '}'
;
condi : var '<' expr
| var '>' expr
;
statement : var '=' expr
;
expr : expr '+' expr
| expr '-' expr
| expr '*' expr
| expr '/' expr
| '(' expr ')'
| var
| NUM
;
var : ID
;
%%
int main()
{
printf("enter the expression:");
yyparse();
printf("valid expression\n");
return 0;
}
int yywrap()
{
return 1;
}
int yyerror(char *s)
{
printf("\n %s Error!",s);
exit(0);
}
Output :
RESULT:
AIM:
To write a program for implementing a calculator for computing the given expression using
semantic rules of the YACC tool and LEX.
PROCEDURE :
iv. Define the tokens used by the parser.
v. Define the operators and their precedence.
Step3: Rules Section: The rules section defines the rules that parse the input stream. Each
rule consists of a grammar production and the associated semantic action.
Step4: Programs Section: The programs section contains the following subroutines. Because
these subroutines are included in this file, it is not necessary to use the yacc library when
processing this file.
Step5: Main- The required main program that calls the yyparse subroutine to start the
program.
Step6: yyerror(s) -This error-handling subroutine only prints a syntax error message.
Step7: yywrap -The wrap-up subroutine that returns a value of 1 when the end of input
occurs. The calc.lex file contains include statements for standard input and output, as
well as for the y.tab.h file. The yacc program generates that file from the grammar file
information if we use the -d flag with the yacc command. The y.tab.h file
contains definitions for the tokens that the parser program uses.
Step8: calc.lex contains the rules to generate these tokens from the input stream.
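As a rough illustration of what the semantic actions compute, the sketch below evaluates an integer expression directly in C, with one function per precedence level. It is a hand-written stand-in for the yacc rules, not the manual's method: the names evaluate, expr, term and factor are hypothetical, no syntax errors are reported, and a division or remainder by zero simply yields 0.

```c
#include <ctype.h>

static const char *cur;   /* hypothetical cursor into the input string */

static int expr(void);

/* factor: NUMBER | '(' expr ')' */
static int factor(void)
{
    int v = 0;
    while (*cur == ' ')
        cur++;
    if (*cur == '(') {
        cur++;
        v = expr();
        while (*cur == ' ')
            cur++;
        if (*cur == ')')
            cur++;
        return v;
    }
    while (isdigit((unsigned char)*cur))
        v = v * 10 + (*cur++ - '0');
    return v;
}

/* term: factor (('*' | '/' | '%') factor)*  -- binds tighter than + and - */
static int term(void)
{
    int v = factor();
    for (;;) {
        while (*cur == ' ')
            cur++;
        char op = *cur;
        if (op != '*' && op != '/' && op != '%')
            return v;
        cur++;
        int r = factor();
        if (op == '*')
            v *= r;
        else if (r == 0)
            v = 0;            /* sketch only: avoid dividing by zero */
        else if (op == '/')
            v /= r;
        else
            v %= r;
    }
}

/* expr: term (('+' | '-') term)*  -- left associative */
static int expr(void)
{
    int v = term();
    for (;;) {
        while (*cur == ' ')
            cur++;
        if (*cur == '+') {
            cur++;
            v += term();
        } else if (*cur == '-') {
            cur++;
            v -= term();
        } else {
            return v;
        }
    }
}

/* Evaluate the integer expression in s. */
int evaluate(const char *s)
{
    cur = s;
    return expr();
}
```

The split into expr and term plays the role that operator precedence declarations play in the yacc file: multiplication, division and remainder are grouped before addition and subtraction, so evaluate("2+3*4") gives 14 rather than 20.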
PROGRAM CODE:
LEX PART:
%{
#include<stdio.h>
#include<stdlib.h>
#include "y.tab.h"
%}
%%
[0-9]+ {
yylval=atoi(yytext);
return NUMBER;
}
[ \t] ;
[\n] return 0;
. return yytext[0];
%%
int yywrap()
{
return 1;
}
YACC PART:
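%{
#include<stdio.h>
int yylex(void);
void yyerror(const char *s);
int flag=0;
%}
%token NUMBER
%left '+' '-'
%left '*' '/' '%'
%%
ArithmeticExpression: E{
printf("\nResult=%d\n",$1);
return 0;
};
E:E'+'E {$$=$1+$3;}
|E'-'E {$$=$1-$3;}
|E'*'E {$$=$1*$3;}
|E'/'E {$$=$1/$3;}
|E'%'E {$$=$1%$3;}
|'('E')' {$$=$2;}
| NUMBER {$$=$1;}
;
%%
int main()
{
printf("\nEnter an arithmetic expression:\n");
yyparse();
if(flag==0)
printf("\nEntered arithmetic expression is valid\n");
return 0;
}
void yyerror(const char *s)
{
printf("\nEntered arithmetic expression is invalid\n");
flag=1;
}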
OUTPUT:
RESULT:
Thus the program has been executed successfully.
EX.NO.4
Generate three address code for a simple program using LEX and YACC
DATE:
AIM :
To write a program to generate three address code for a simple program using LEX
and YACC.
ALGORITHM :
LEX PART :
1. Define the tokens: Start by defining regular expressions for the tokens in the input
language. These tokens could be keywords, identifiers, constants, operators, and punctuation
symbols.
2. Write Lex rules: Create Lex rules that match these regular expressions and
associate each rule with an action. In this case, the action should generate a corresponding
token, possibly with the value associated with it.
3.Initialize and prepare for scanning: Set up any necessary data structures or variables
to help track the state of the lexical analysis. This may include maintaining a symbol table for
identifiers and constants.
4.Read and process input: Use the Lex-generated scanner to read the input program.
For each token identified by the Lex rules, apply the associated action. You can use variables
and data structures to store and manage the information associated with tokens.
5.Return tokens to Yacc: Send the generated tokens to the Yacc parser for further
processing. You may need to define a custom interface for communication between Lex and
Yacc.
YACC PART :
1. Define the grammar: Start by defining the context-free grammar for the input
language. This grammar should describe the structure of the language and how statements are
composed of expressions, operators, and other language constructs.
2.Write Yacc rules: Create Yacc rules that correspond to the grammar. Each rule
should specify how to reduce a set of symbols into a more abstract representation. In this
case, the reduction process should generate three-address code instructions.
3.Initialize and prepare for parsing: Set up any necessary data structures or variables
to track the state of the parsing process. This might include a symbol table for tracking
identifiers and constants, and a stack for managing the parse tree or intermediate code.
4.Parse the input: Use the Yacc-generated parser to process the input program based
on the defined grammar. As the parser reduces rules, generate three-address code instructions
and add them to your intermediate code representation.
5. Generate the output: After parsing is complete, you will have an intermediate code
representation of the input program. You can then perform any additional optimizations or
transformations on the intermediate code, and finally, produce the final three-address code.
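The core idea of the steps above, emitting one instruction with a fresh temporary each time an operator is reduced, can be sketched in plain C. A yacc parser performs its reductions in postfix order, so this sketch assumes the expression is already given in postfix form with single-letter operands. The function name postfix_to_tac and the fixed stack size are hypothetical.

```c
#include <stdio.h>
#include <string.h>
#include <ctype.h>

/* Translate a postfix expression over single-letter operands into
   three-address code written to out. Returns the number of temporaries
   used, or -1 if the input is malformed or out is too small. */
int postfix_to_tac(const char *postfix, char *out, size_t outsize)
{
    char stack[64][8];          /* operand names such as "a" or "t3" */
    int top = 0, ntemp = 0;
    size_t used = 0;

    if (outsize == 0)
        return -1;
    out[0] = '\0';
    for (const char *p = postfix; *p; p++) {
        if (isalpha((unsigned char)*p)) {
            if (top == 64)
                return -1;
            stack[top][0] = *p;             /* push the operand name */
            stack[top][1] = '\0';
            top++;
        } else if (strchr("+-*/", *p)) {
            char lhs[8], rhs[8];
            if (top < 2)
                return -1;
            strcpy(rhs, stack[--top]);      /* right operand is on top */
            strcpy(lhs, stack[--top]);
            ntemp++;
            int n = snprintf(out + used, outsize - used,
                             "t%d = %s %c %s\n", ntemp, lhs, *p, rhs);
            if (n < 0 || (size_t)n >= outsize - used)
                return -1;
            used += (size_t)n;
            /* the result of this instruction becomes a new operand */
            snprintf(stack[top], sizeof stack[top], "t%d", ntemp);
            top++;
        } else {
            return -1;                      /* unexpected character */
        }
    }
    return top == 1 ? ntemp : -1;
}
```

For the postfix input "ab+c*", which corresponds to (a+b)*c, the buffer receives the two instructions "t1 = a + b" and "t2 = t1 * c". In the Yacc version the same work is done inside the semantic actions, with the parser's value stack taking the place of the explicit stack.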
PROGRAM :
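LEX PART:
%{
#include "ex4.tab.h" /* token definitions generated by bison -d ex4.y */
%}
%%
[a-zA-Z] {yylval.sym=yytext[0]; return LETTER;}
[ \t] ;
\n {return 0;}
. {return yytext[0];}
%%
int yywrap()
{
return 1;
}
YACC PART:
%{
#include<stdio.h>
#include<string.h>
#include<stdlib.h>
#include<ctype.h>
int yylex(void);
int yyerror(const char *s);
char AddToTable(char opd1,char opd2,char opr);
void ThreeAddressCode();
void triple();
void quadruple();
/* One intermediate-code entry: opd1 opr opd2 */
struct incod
{
char opd1;
char opd2;
char opr;
};
struct incod code[20];
int ind=0; /* number of entries stored in code[] */
char temp='1'; /* name of the next temporary: t1, t2, ... */
%}
%union
{
char sym;
}
%token <sym> LETTER
%type <sym> expr
%left '+' '-'
%left '*' '/'
%%
statement: LETTER '=' expr ';' {AddToTable($1,$3,'=');}
| expr ';'
;
expr: expr '+' expr {$$=AddToTable($1,$3,'+');}
| expr '-' expr {$$=AddToTable($1,$3,'-');}
| expr '*' expr {$$=AddToTable($1,$3,'*');}
| expr '/' expr {$$=AddToTable($1,$3,'/');}
| '(' expr ')' {$$=$2;}
| LETTER {$$=$1;}
;
%%
int yyerror(const char *s)
{
printf("%s",s);
exit(0);
}
/* Store one instruction and return the name of the temporary holding its result */
char AddToTable(char opd1,char opd2,char opr)
{
code[ind].opd1=opd1;
code[ind].opd2=opd2;
code[ind].opr=opr;
ind++;
return temp++;
}
/* Operands that are letters are variables; any other operand is a temporary number */
void ThreeAddressCode()
{
int cnt = 0;
char temp='1';
printf("\n\n\t THREE ADDRESS CODE\n\n");
while(cnt<ind)
{
if(code[cnt].opr != '=')
printf("t%c : = \t",temp++);
if(isalpha(code[cnt].opd1))
printf(" %c\t",code[cnt].opd1);
else
printf("t%c\t",code[cnt].opd1);
printf(" %c\t",code[cnt].opr);
if(isalpha(code[cnt].opd2))
printf(" %c\n",code[cnt].opd2);
else
printf("t%c\n",code[cnt].opd2);
cnt++;
}
}
void quadruple()
{
int cnt = 0;
char temp='1';
printf("\n\n\t QUADRUPLE CODE\n\n");
while(cnt<ind)
{
printf(" %c\t",code[cnt].opr);
if(code[cnt].opr == '=')
{
if(isalpha(code[cnt].opd2))
printf(" %c\t \t",code[cnt].opd2);
else
printf("t%c\t \t",code[cnt].opd2);
printf(" %c\n",code[cnt].opd1);
cnt++;
continue;
}
if(isalpha(code[cnt].opd1))
printf(" %c\t",code[cnt].opd1);
else
printf("t%c\t",code[cnt].opd1);
if(isalpha(code[cnt].opd2))
printf(" %c\t",code[cnt].opd2);
else
printf("t%c\t",code[cnt].opd2);
printf("t%c\n",temp++);
cnt++;
}
}
void triple()
{
int cnt=0;
char temp='1';
printf("\n\n\t TRIPLE CODE\n\n");
while(cnt<ind)
{
printf("(%c) \t",temp);
printf(" %c\t",code[cnt].opr);
if(code[cnt].opr == '=')
{
printf(" %c \t",code[cnt].opd1);
if(isalpha(code[cnt].opd2))
printf(" %c \n",code[cnt].opd2);
else
printf("(%c)\n",code[cnt].opd2);
cnt++;
temp++;
continue;
}
if(isalpha(code[cnt].opd1))
printf(" %c \t",code[cnt].opd1);
else
printf("(%c)\t",code[cnt].opd1);
if(isalpha(code[cnt].opd2))
printf(" %c \n",code[cnt].opd2);
else
printf("(%c)\n",code[cnt].opd2);
cnt++;
temp++;
}
}
int main()
{
printf("Enter the expression: ");
yyparse();
ThreeAddressCode();
quadruple();
triple();
return 0;
}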
OUTPUT :
RESULT:
Thus the program has been executed successfully.
EX.NO.5
Implement type checking using Lex and Yacc.
DATE:
AIM:
ALGORITHM:
Step1: Track the global scope type information (e.g. classes and their members)
Step2: Determine the type of expressions recursively, i.e. bottom-up, passing the resulting
types upwards.
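Step2 can be illustrated with a small C sketch that computes the type of an expression bottom-up from the types of its operands. The type names, the function names literal_type, binary_type and type_name, and the rule that % accepts only integers are assumptions made for this example; they are not taken from the manual's listing.

```c
#include <string.h>

typedef enum { TYPE_INT, TYPE_REAL, TYPE_ERROR } Type;

/* A numeric literal containing a decimal point is Real, otherwise Integer. */
Type literal_type(const char *text)
{
    return strchr(text, '.') ? TYPE_REAL : TYPE_INT;
}

/* Type of "left op right", given the already computed operand types. */
Type binary_type(char op, Type left, Type right)
{
    if (left == TYPE_ERROR || right == TYPE_ERROR)
        return TYPE_ERROR;              /* errors propagate upwards */
    if (op == '%')                      /* assumed: remainder needs two integers */
        return (left == TYPE_INT && right == TYPE_INT) ? TYPE_INT : TYPE_ERROR;
    /* + - * / : the result is Real as soon as one operand is Real */
    return (left == TYPE_REAL || right == TYPE_REAL) ? TYPE_REAL : TYPE_INT;
}

/* Printable name of a type. */
const char *type_name(Type t)
{
    switch (t) {
    case TYPE_INT:  return "Integer";
    case TYPE_REAL: return "Real";
    default:        return "Error";
    }
}
```

For the expression 2 + 3.5, the leaves are typed first with literal_type and the result of binary_type('+', ...) is Real. In a Yacc grammar each call would sit in the semantic action of the corresponding production, with the operand types arriving as $1 and $3.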
PROGRAM :
typecheck.l
%{
#include <stdio.h>
#include <stdbool.h>
#include <stdlib.h>
#include "y.tab.h"
%}
%%
[0-9]+\.[0-9]+ return REAL;
[0-9]+ return NUM;
[ \t] ;
\n return 0;
. return yytext[0];
%%
int yywrap()
{
return 1;
}
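typecheck.y
%{
#include <stdio.h>
#include <stdbool.h>
#include <stdlib.h>
int yylex(void);
void yyerror(const char *s);
%}
%union {
bool is_real; /* type of an expression: true = Real, false = Integer */
}
%token NUM REAL
%type <is_real> expr
%left '+' '-'
%left '*' '/'
%%
program: /* empty */
| stmt
;
stmt: expr {
if ($1) {
printf("Type: Real\n");
} else {
printf("Type: Integer\n");
}
}
;
expr: expr '+' expr { $$ = $1 || $3; }
| expr '-' expr { $$ = $1 || $3; }
| expr '*' expr { $$ = $1 || $3; }
| expr '/' expr { $$ = $1 || $3; }
| '(' expr ')' { $$ = $2; }
| NUM { $$ = false; }
| REAL { $$ = true; }
;
%%
void yyerror(const char *s) {
printf("Error: %s\n", s);
}
int main() {
yyparse();
return 0;
}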
Output :
RESULT:
Thus the program has been executed successfully.
EX.NO.6
Implement simple code optimization techniques (Constant folding, Strength reduction and
Algebraic transformation)
DATE:
AIM:
ALGORITHM:
Step1: Generate the program for factorial program using for and do-while loop to specify
optimization technique.
Step2: In for loop variable initialization is activated first and the condition is checked next. If
the condition is true the corresponding statements are executed and specified increment /
decrement operation is performed.
Step3: The for loop operation is activated till the condition failure.
Step4: In a do-while loop the variable is initialized and the statements are executed then the
condition checking and increment / decrement operation is performed.
Step5: When comparing the for and do-while loops for optimization, the do-while loop is
preferred because the statements are executed first and the condition is checked only
afterwards. The result is therefore available during statement execution itself, without
waiting for the condition to be evaluated.
Step6: Finally, when considering code optimization in loops, do-while is best with respect to
performance.
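The three techniques named in the title of this experiment can be illustrated on a single instruction of the form a op b. The sketch below is a hypothetical helper, not part of the manual's program: the name optimize, the restriction to non-negative integer constants, and the particular rewrites chosen (x*2 to x + x, x+0 and x*1 to x) are assumptions made for the example.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>

/* Return 1 if s consists only of decimal digits. */
static int is_number(const char *s)
{
    if (*s == '\0')
        return 0;
    for (; *s; s++)
        if (!isdigit((unsigned char)*s))
            return 0;
    return 1;
}

/* Write an optimized form of "a op b" into out (at most n bytes). */
void optimize(const char *a, char op, const char *b, char *out, size_t n)
{
    int foldable = is_number(a) && is_number(b) &&
                   op != '\0' && strchr("+-*/", op) != NULL &&
                   !(op == '/' && atoi(b) == 0);

    if (foldable) {
        /* Constant folding: both operands are known at compile time */
        int x = atoi(a), y = atoi(b), r;
        if (op == '+')      r = x + y;
        else if (op == '-') r = x - y;
        else if (op == '*') r = x * y;
        else                r = x / y;
        snprintf(out, n, "%d", r);
    } else if (op == '*' && strcmp(b, "2") == 0) {
        /* Strength reduction: replace a multiplication by an addition */
        snprintf(out, n, "%s + %s", a, a);
    } else if ((op == '+' && strcmp(b, "0") == 0) ||
               (op == '*' && strcmp(b, "1") == 0)) {
        /* Algebraic transformation: x + 0 = x and x * 1 = x */
        snprintf(out, n, "%s", a);
    } else {
        /* Nothing applies: keep the instruction unchanged */
        snprintf(out, n, "%s %c %s", a, op, b);
    }
}
```

With this helper, "3 * 4" becomes "12", "x * 2" becomes "x + x", and "x + 0" becomes "x", while an instruction such as "x - y" is left as it is.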
Program :
#include<stdio.h>
#include<string.h>
struct op
{
char l;
char r[20];
}
op[10],pr[10];
int main()
{
int a,i,k,j,n,z=0,m,q;
char *p,*l;
char temp,t;
char *tem;
printf("Enter the Number of Values:");
scanf("%d",&n);
for(i=0;i<n;i++)
{
printf("left: ");
scanf(" %c",&op[i].l);
printf("right: ");
scanf(" %19s",op[i].r);
}
printf("Intermediate Code\n") ;
for(i=0;i<n;i++)
{
printf("%c=",op[i].l);
printf("%s\n",op[i].r);
}
for(i=0;i<n-1;i++)
{
temp=op[i].l;
for(j=0;j<n;j++)
{
p=strchr(op[j].r,temp);
if(p)
{
pr[z].l=op[i].l;
strcpy(pr[z].r,op[i].r);
z++;
}
}
}
pr[z].l=op[n-1].l;
strcpy(pr[z].r,op[n-1].r);
z++;
printf("\nAfter Dead Code Elimination\n");
for(k=0;k<z;k++)
{
printf("%c\t=",pr[k].l);
printf("%s\n",pr[k].r);
}
for(m=0;m<z;m++)
{
tem=pr[m].r;
for(j=m+1;j<z;j++)
{
p=strstr(tem,pr[j].r);
if(p)
{
t=pr[j].l;
pr[j].l=pr[m].l;
for(i=0;i<z;i++)
{
l=strchr(pr[i].r,t) ;
if(l)
{
a=l-pr[i].r;
printf("pos: %d\n",a);
pr[i].r[a]=pr[m].l;
}}}}}
printf("Eliminate Common Expression\n");
for(i=0;i<z;i++)
{
printf("%c\t=",pr[i].l);
printf("%s\n",pr[i].r);
}
for(i=0;i<z;i++)
{
for(j=i+1;j<z;j++)
{
q=strcmp(pr[i].r,pr[j].r);
if((pr[i].l==pr[j].l)&&!q)
{
pr[i].l='\0';
}
}
}
printf("Optimized Code\n");
for(i=0;i<z;i++)
{
if(pr[i].l!='\0')
{
printf("%c=",pr[i].l);
printf("%s\n",pr[i].r);
}
}
}
Output :
RESULT:
Thus the program has been executed successfully.
EX.NO.7
Implement back-end of the compiler for which the three address code is given as input and the
8086 assembly language code is produced as output.
DATE:
Aim :
To implement the back-end of the compiler for which the three address code is given as
input and the 8086 assembly language code is produced as output.
Algorithm :
Step 1: Parse Three-Address Code: Read the input three-address code, which consists of a
sequence of instructions.
Step 2: Generate Assembly Skeleton: Set up the basic structure of the assembly code,
including data and code segments, data definitions, and other required boilerplate code.
Step 4: Manage Memory and Registers: Allocate registers for temporary variables and
intermediate results.
Step 5: Output Assembly Code: Write the generated 8086 assembly code, including the
translated instructions and any required data definitions, to an output file or stream.
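As a hedged illustration of the translation step, the sketch below maps one three-address instruction of the form dst = lhs op rhs to 8086 instructions that use the real registers AX, BX and DX. It assumes that every operand is a word-sized memory variable with a single-letter name and that arithmetic is unsigned; the function name gen_8086 is hypothetical.

```c
#include <stdio.h>

/* Write 8086 code for "dst = lhs op rhs" into out (at most n bytes).
   Returns the number of characters produced, or -1 for an unknown operator. */
int gen_8086(char dst, char lhs, char op, char rhs, char *out, size_t n)
{
    switch (op) {
    case '+':
        return snprintf(out, n, "MOV AX, %c\nADD AX, %c\nMOV %c, AX\n",
                        lhs, rhs, dst);
    case '-':
        return snprintf(out, n, "MOV AX, %c\nSUB AX, %c\nMOV %c, AX\n",
                        lhs, rhs, dst);
    case '*':
        /* MUL BX computes DX:AX = AX * BX; the low word is kept */
        return snprintf(out, n, "MOV AX, %c\nMOV BX, %c\nMUL BX\nMOV %c, AX\n",
                        lhs, rhs, dst);
    case '/':
        /* DIV BX divides DX:AX by BX, so DX is cleared first */
        return snprintf(out, n,
                        "MOV AX, %c\nXOR DX, DX\nMOV BX, %c\nDIV BX\nMOV %c, AX\n",
                        lhs, rhs, dst);
    default:
        return -1;
    }
}
```

For the instruction a=b+c this produces MOV AX, b, then ADD AX, c, then MOV a, AX. A fuller back end would also emit the data and code segment skeleton from Step 2 and choose registers for temporaries as in Step 4.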
Program :
File Name : exp7.c
#include<stdio.h>
#include<string.h>
int main()
{
char icode[10][30],str[30],opr[10];
int i,n=0;
printf("\n Enter the set of intermediate code (terminated by exit):\n");
/* Read statements of the form a=b+c, one per line, until "exit" */
while(n<10 && scanf("%29s",icode[n])==1 && strcmp(icode[n],"exit")!=0)
n++;
printf("\n target code generation");
printf("\n************************");
for(i=0;i<n;i++)
{
strcpy(str,icode[i]);
if(strlen(str)<5)
continue; /* not of the form a=b+c */
/* str[0]: result, str[2]: first operand, str[3]: operator, str[4]: second operand */
switch(str[3])
{
case '+':
strcpy(opr,"ADD");
break;
case '-':
strcpy(opr,"SUB");
break;
case '*':
strcpy(opr,"MUL");
break;
case '/':
strcpy(opr,"DIV");
break;
default:
continue; /* unsupported operator */
}
printf("\n\tMov %c,R%d",str[2],i);
printf("\n\t%s %c,R%d",opr,str[4],i);
printf("\n\tMov R%d,%c",i,str[0]);
}
printf("\n");
return 0;
}
OUTPUT :
RESULT:
Thus the required C program to implement the back end of the compiler is done and the
required output is obtained and verified