CD Lab Kare With Solution With Header
SCHOOL OF COMPUTING
Department of Computer Science and
Engineering
(CSE18R274)
Section : ………………………………………………………..
PREPARED BY R.Raja Sekar, AP/CSE. KARE.
TABLE OF CONTENTS
1 Bonafide Certificate
3 Course Plan
4 Introduction
Experiments
Implementation of Symbol Table
BONAFIDE CERTIFICATE
REGISTER NUMBER
S.No.  Date  Experiment  Marks (100)  Faculty Signature
INTRODUCTION
COMPILER
There are two parts to compilation: Analysis and Synthesis. The analysis part
breaks up the source program into constituent pieces and creates an intermediate
representation of source program. The synthesis part constructs the desired target
program from the intermediate representation. Of the two parts, synthesis requires
the most specialized techniques.
PHASES OF A COMPILER
LEXICAL ANALYSIS
SYNTAX ANALYSIS
SEMANTIC ANALYSIS
The semantic analysis phase checks the source program for semantic errors
and gathers type information for the subsequent code generation phase. It uses the
hierarchical structure determined by the syntax-analysis phase to identify the operators
and operands of expressions and statements. An important component of semantic
analysis is type checking. Here the compiler checks that each operator has operands that
are permitted by the source language specification.
Symbol table is a data structure containing the record of each identifier, with fields
for the attributes of the identifier. The data structure allows us to find the record for each
identifier quickly and store or retrieve data from that record quickly. When the lexical
analyzer detects an identifier in the source program, the identifier is entered into symbol
table. The remaining phases enter information about identifiers into the symbol table.
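The record-per-identifier idea described above can be sketched as follows. This is a minimal illustration, not the manual's own program: the names sym_insert, sym_lookup, sym_set_type and sym_type, the fixed sizes, and the single "type" attribute are all assumptions made for the example. It uses a linear search for brevity; a hash table is the usual choice when lookups must be fast.

```c
#include <assert.h>
#include <string.h>

#define MAX_SYMBOLS 64   /* hypothetical capacity */
#define MAX_NAME    32   /* hypothetical maximum name length, including the terminator */

/* One record per identifier; "type" is the only attribute kept in this sketch. */
struct symbol {
    char name[MAX_NAME];
    char type[MAX_NAME];
};

static struct symbol table[MAX_SYMBOLS];
static int symbol_count = 0;

/* Return the index of name in the table, or -1 if it is absent. */
int sym_lookup(const char *name)
{
    for (int i = 0; i < symbol_count; i++)
        if (strcmp(table[i].name, name) == 0)
            return i;
    return -1;
}

/* Enter name if it is new, as the lexical analyzer would.
   Returns the index of the record, or -1 if the table is full. */
int sym_insert(const char *name)
{
    int idx = sym_lookup(name);
    if (idx >= 0)
        return idx;
    if (symbol_count == MAX_SYMBOLS)
        return -1;
    assert(strlen(name) < MAX_NAME);
    strcpy(table[symbol_count].name, name);
    table[symbol_count].type[0] = '\0';   /* attributes are filled in by later phases */
    return symbol_count++;
}

/* Record an attribute for an existing entry, as a later phase would. */
void sym_set_type(int idx, const char *type)
{
    assert(idx >= 0 && idx < symbol_count && strlen(type) < MAX_NAME);
    strcpy(table[idx].type, type);
}

/* Read back the attribute stored for an entry. */
const char *sym_type(int idx)
{
    assert(idx >= 0 && idx < symbol_count);
    return table[idx].type;
}
```

A lexer would call sym_insert for every identifier it meets, and the semantic phase would later call sym_set_type on the returned index.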
ERROR DETECTION
Each phase can encounter errors. The syntax and semantic analysis phases usually
handle a large fraction of the errors detectable by the compiler. The lexical phase can detect
errors where the characters remaining in the input do not form any token of the language.
Errors where the token stream violates the structure rules of the language are determined
by the syntax analysis phase.
CODE OPTIMIZATION
The code optimization phase attempts to improve the intermediate code so that
faster-running machine code results. Some simple optimizations significantly
improve the running time of the target program without slowing down compilation too
much.
CODE GENERATION
The final phase of compilation is the generation of target code, consisting normally
of relocatable machine code or assembly code.
COURSE PLAN
Course B.Tech
COURSE DESCRIPTION
This self-paced course will discuss the major ideas used today in the
implementation of programming language compilers, including lexical analysis, parsing,
syntax-directed translation, abstract syntax trees, types and type checking, intermediate
languages, dataflow analysis, program optimization, code generation, and runtime
systems. As a result, you will learn how a program written in a high-level language
designed for humans is systematically translated into a program written in low-level
assembly more suited to machines.
COURSE OBJECTIVES
To acquaint students with the functional units of a computer and
how each unit works, along with the architectural and performance issues.
COURSE OUTCOMES(COS)
CO2: Apply context free grammars to parsing and compare different parsing techniques
CO3: Analyze the various LR parsing methods and evaluate the intermediate code
representation
CO4: Create the various code generation schemes.
CO5: Apply the various optimization techniques for the generated code
CO6: Create an efficient algorithm for the analysis and synthesis parts of the compiler
CO7: Implement the problem statements in programming languages and with the Lex and Yacc tools
efficiently
PSOs DESCRIPTION
PSO1 Able to develop software solutions for real world problems using core computing
technologies.
PSO2 Able to apply technologies such as AIML and data science for effective decision-
making towards sustainable development of a smart city.
PROGRAMME OUTCOMES
POs DESCRIPTION
PO3 Capability to design and develop computing systems to meet the requirement
of industry and society with due consideration for public health, safety and
environment.
PO4 Ability to apply knowledge of design of experiment and data analysis to derive
solutions in complex computing problems and society with due consideration
for public health, safety and environment.
PO5 Ability to develop and apply modeling, simulation and prediction tools and
techniques to engineering problems.
PO6 Assess and understand the professional, legal, security and societal
responsibilities relevant to computer engineering practice.
PO12 Understanding the need for technological changes and engage in life-long
learning.
CSO2 : Design, implement, and evaluate a computing-based solution to meet a given set
of computing requirements in the context of the program’s discipline.
ESO2 : Ability to apply engineering design to produce solutions that meet specified needs
with consideration of public health, safety, and welfare, as well as global, cultural, social,
ESO6 : Ability to develop and conduct appropriate experimentation, analyze and interpret
data, and use engineering judgment to draw conclusions.
POs PSOs
1 2 3 4 5 6 7 8 9 10 11 12 1 2
CO1 S S
CO2 S S S S
CO3 S S S S M S S
CO4 S S S S S S
CO5 S S S S S M M M M S S
CO6 S S S S S S
CO7 S S S S S S
WEB RESOURCES
8 LR(0), LR(1), SLR, YACC, LEX www.cs.gmu.edu
14 LR Parser www.cwi.nl/~jurgenv/publications/slides/cc2002.ppt
15 Intermediate Languages www.hardcoreprocessing.com/articles/presentations/tiliaoc/TheDocument.html
LIST OF EXPERIMENTS
S.No  Experiment Details  Number of Periods  Cumulative Number of Periods
8  Construct the three address code and parse tree for the given expression  2  18
ADDITIONAL EXPERIMENTS:
1. Peephole optimization
ASSESSMENT METHOD:
S.No Assessment Split up
Discussion/Viva:
(2) Answered for less than 40% of the questions, indicating a lack of understanding of results.
(4) Answered for 60% of the questions, but incomplete understanding of results is still evident.
(7) Answered for 60% of the questions. Still needs some improvements.
(10) Answered for more than 90% of the questions correctly; good understanding of results is conveyed.
Program:
#include<stdio.h>
#include<math.h>
#include<string.h>
#include<ctype.h>
#include<stdlib.h>
int main(void)
{
    int x = 0, n, i = 0, j = 0;
    int c;                               /* int, so that EOF can be detected */
    void *mypointer, *Sym_address[15];   /* one slot per input character */
    char ch, Sym_Array2[15], Sym_Array3[15];
    printf("Input the expression ending with $ sign:");
    while ((c = getchar()) != '$' && c != EOF && i < 15)
    {
        Sym_Array2[i] = (char)c;
        i++;
    }
    n = i - 1;
    printf("Given Expression:");
    i = 0;
    while (i <= n)
    {
        printf("%c", Sym_Array2[i]);
        i++;
    }
    printf("\n Symbol Table display\n");
    printf("Symbol \t addr \t type");
    while (j <= n)
    {
        ch = Sym_Array2[j];
        if (isalpha((unsigned char)ch))
        {
            mypointer = malloc(1);       /* dummy allocation: only its address is displayed */
            Sym_address[x] = mypointer;
            Sym_Array3[x] = ch;
            printf("\n%c \t %p \t identifier\n", ch, mypointer);
            x++;
        }
        else if (ch == '+' || ch == '-' || ch == '*' || ch == '=')
        {
            mypointer = malloc(1);
            Sym_address[x] = mypointer;
            Sym_Array3[x] = ch;
            printf("\n %c \t %p \t operator\n", ch, mypointer);
            x++;
        }
        j++;   /* always advance, so other characters cannot cause an endless loop */
    }
    return 0;
}
OR
#include<stdio.h>
#include<stdlib.h>
#include<ctype.h>
int main(void)
{
    int i = 0, j = 0, x = 0, n, flag = 0;
    int c;                     /* int, so that EOF can be detected */
    void *p, *add[15];         /* add[k] is the address shown for symbol d[k] */
    char srch, b[15], d[15];
    printf("Expression terminated by $:");
    while ((c = getchar()) != '$' && c != EOF && i < 15)
    {
        b[i] = (char)c; i++;
    }
    n = i - 1;
    printf("Given expression:::");
    i = 0;
    while (i <= n)
    {
        printf("%c", b[i]); i++;
    }
    printf("\n.....symbol table ... \n");
    printf("symbol\taddr\ttype\n");
    while (j <= n)
    {
        c = (unsigned char)b[j];
        if (isalpha(c))
        {
            p = malloc(1); add[x] = p;
            d[x] = (char)c;
            printf("%c\t%p\tidentifier\n", c, p);
            x++;
        }
        else if (c == '+' || c == '-' || c == '*' || c == '=' || c == '/')
        {
            p = malloc(1); add[x] = p;
            d[x] = (char)c;
            printf("%c\t%p\toperator\n", c, p);
            x++;
        }
        j++;
    }
    printf("the symbol is to be searched\n");
    scanf(" %c", &srch);       /* the leading space skips the newline left in the input */
    for (i = 0; i < x; i++)
    {
        if (srch == d[i])
        {
            printf("symbol found...");
            printf("%c%s%p\n", srch, "@address", add[i]);
            flag = 1;
        }
    }
    if (flag == 0)
        printf("symbol not found\n");
    return 0;
}
SAMPLE OUTPUT
VIVA QUESTIONS
RESULT:
EVALUATION
Assessment Marks Scored
THEORY:
Lexical analysis is the process of converting a sequence of characters (such as in a computer
program or web page) into a sequence of tokens (strings with an identified “meaning”). A
program that performs lexical analysis may be called a lexer, tokenizer or scanner.
2 Literal or constant
; End of the statement
ALGORITHM:
1. Start the program.
2. Include necessary header files.
3. Declare all the variables and file pointers.
4. Open the input program (input.c) using the file functions.
5. Read the input file and display the tokens.
6. Display the header files of the input program.
7. Separate the operators of the input program and display it.
8. Print the punctuation marks.
9. Print the constants that are present in the input program.
10. Print the identifiers of the input program.
11. Also count the number of tokens of each kind that occur in the input file and print the counts.
12. Stop the program.
PROGRAM :
#include<stdio.h>
#include<ctype.h>
#include<string.h>
void keyw(char *p);
int i=0,id=0,kw=0,num=0,op=0;
char keys[32][10]={"auto","break","case","char","const","continue","default",
"do","double","else","enum","extern","float","for","goto","if","int","long","register","return"
,"short","signed","sizeof","static","struct","switch","typedef","union",
"unsigned","void","volatile","while"};
int main()
{
int ch; /* int, so that EOF is distinguishable from every character */
char str[25],seps[]=" \t\n,;(){}[]#\"<>",oper[]="!%^&*-+=~|.<>/?";
int j;
char fname[50];
FILE *f1;
//clrscr();
printf("enter file path (drive:\\fold\\filename)\n");
scanf("%s",fname);
f1 = fopen(fname,"r");
//f1 = fopen("Input","r");
if(f1==NULL)
{
goto END;
}
while((ch=fgetc(f1))!=EOF)
{
for(j=0;j<=14;j++)
{
if(ch==oper[j])
{
printf("%c is an operator\n",ch);
op++;
str[i]='\0';
keyw(str);
}
}
for(j=0;j<=14;j++)
{
if(i==-1)
break;
if(ch==seps[j])
{
if(ch=='#')
{
while(ch!='>')
{
printf("%c",ch);
ch=fgetc(f1);
}
printf("%c is a header file\n",ch);
i=-1;
break;
}
if(ch=='"')
{
do
{
ch=fgetc(f1);
printf("%c",ch);
}while(ch!='"');
printf("\b is an argument\n");
i=-1;
break;
}
str[i]='\0';
keyw(str);
}
}
if(i!=-1)
{
str[i]=ch;
i++;
}
else
i=0;
}
printf("Keywords: %d\nIdentifiers: %d\nOperators: %d\nNumbers: %d\n",kw,id,op,num);
//getch();
fclose(f1);
return 0; /* success: do not fall through to the error message */
END:
printf("file not found");
return 1;
}
void keyw(char *p)
{
int k,flag=0;
for(k=0;k<=31;k++)
{
if(strcmp(keys[k],p)==0)
{
printf("%s is a keyword\n",p);
kw++;
flag=1;
break;
}
}
if(flag==0)
{
if(isdigit(p[0]))
{
printf("%s is a number\n",p);
num++;
}
else
{
//if(p[0]!=13&&p[0]!=10)
if(p[0]!='\0')
{
printf("%s is an identifier\n",p);
id++;
}
}
}
i=-1;
}
INPUT: (INPUT.C)
#include<stdio.h>
#include<conio.h>
void main()
{
int a,b,c;
a=10;
b=5;
c=a+b;
printf("The sum is %d",c);
getch();
}
OUTPUT
VIVA QUESTIONS:
1. State the use of Lexical Analyzer.
2. Define Token with example.
3. What do you mean by Lexeme? Give example.
4. What is the use of pattern? Write the pattern for identifiers.
5. List the lexical errors.
RESULT:
EVALUATION
Assessment Marks Scored
THEORY
Regular Expression
o The language accepted by a finite automaton can be described compactly by simple expressions
called regular expressions.
o The languages described by regular expressions are referred to as regular languages.
o A regular expression can also be described as a sequence of patterns that defines a set of strings.
o Regular expressions are used to match character combinations in strings. String-searching
algorithms use these patterns for find and find-and-replace operations on strings.
Operations on Regular Language
The various operations on regular languages are:
Union: If L and M are two regular languages, then their union L U M is also a regular language.
L U M = {s | s is in L or s is in M}
Intersection: If L and M are two regular languages, then their intersection is also a regular
language.
L ⋂ M = {s | s is in L and s is in M}
Kleene closure: If L is a regular language, then its Kleene closure L* is also a regular
language.
L* = zero or more strings of L concatenated together
Finite Automata
o Finite automata are used to recognize patterns.
o It takes the string of symbol as input and changes its state accordingly. When the
desired symbol is found, then the transition occurs.
o At the time of transition, the automata can either move to the next state or stay in the
same state.
o A finite automaton ends in one of two outcomes: accept or reject. When the input string is
processed successfully and the automaton has reached a final state, the string is accepted.
Formal Definition of FA
A finite automaton is a 5-tuple (Q, ∑, δ, q0, F), where:
1. Q: finite set of states
2. ∑: finite set of input symbols
3. δ: transition function
4. q0: initial state
5. F: set of final states
Types of Automata:
There are two types of finite automata:
1. DFA(deterministic finite automata)
2. NFA(non-deterministic finite automata)
1. DFA
DFA refers to deterministic finite automata. Deterministic refers to the uniqueness of the
computation. In the DFA, the machine goes to one state only for a particular input
character. DFA does not accept the null move.
2. NFA
NFA stands for non-deterministic finite automata. For a particular input, the machine can move to
any number of states. It can accept the null move.
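The deterministic case can be illustrated with a table-driven simulation. The sketch below is only an example under stated assumptions: the DFA chosen (strings over {a, b} that end in "ab"), its three states, and the name dfa_accepts are made up for illustration and do not come from the manual.

```c
#include <assert.h>
#include <stddef.h>

/* Transition table delta[state][symbol]; column 0 is 'a', column 1 is 'b'.
   State 0: start, state 1: last symbol was 'a', state 2: last two symbols were "ab". */
static const int delta[3][2] = {
    {1, 0},
    {1, 2},
    {1, 0},
};
static const int START_STATE = 0;
static const int FINAL_STATE = 2;

/* Return 1 if the DFA accepts input, 0 otherwise (including on symbols outside {a, b}). */
int dfa_accepts(const char *input)
{
    assert(input != NULL);
    int state = START_STATE;
    for (const char *p = input; *p != '\0'; p++) {
        if (*p != 'a' && *p != 'b')
            return 0;
        /* Exactly one next state per (state, symbol) pair: this is what makes it deterministic. */
        state = delta[state][*p == 'b'];
    }
    return state == FINAL_STATE;
}
```

An NFA simulation would differ in that each table entry holds a set of states, and the simulation tracks a set of current states instead of a single one.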
ALGORITHM
Thompson's construction algorithm is used to build a finite automaton from a
regular expression.
Input − A Regular Expression R
Output − NFA accepting language denoted by R
The steps are
Case i : The NFA representing the empty string is:
Case ii : If the regular expression is just a character, e.g. a, then the corresponding NFA is:
Case iii : The union operator is represented by a choice of transitions from a node; thus a|b
can be represented as:
Case iv : Concatenation simply involves connecting one NFA to the other; e.g. ab is:
Case v : The Kleene closure must allow for taking zero or more instances of the letter from
the input; thus a* looks like:
Program:
#include<stdio.h>
#include<string.h>
int main()
{
char reg[20]; int q[20][3],i=0,j=1,len,a,b;
for(a=0;a<20;a++) for(b=0;b<3;b++) q[a][b]=0;
scanf("%s",reg);
printf("Given regular expression: %s\n",reg);
len=strlen(reg);
while(i<len)
{
if(reg[i]=='a'&&reg[i+1]!='|'&&reg[i+1]!='*') { q[j][0]=j+1; j++; }
if(reg[i]=='b'&&reg[i+1]!='|'&&reg[i+1]!='*') { q[j][1]=j+1; j++; }
if(reg[i]=='e'&&reg[i+1]!='|'&&reg[i+1]!='*') { q[j][2]=j+1; j++; }
if(reg[i]=='a'&&reg[i+1]=='|'&&reg[i+2]=='b')
{
q[j][2]=((j+1)*10)+(j+3); j++;
q[j][0]=j+1; j++;
q[j][2]=j+3; j++;
q[j][1]=j+1; j++;
q[j][2]=j+1; j++;
i=i+2;
}
if(reg[i]=='b'&&reg[i+1]=='|'&&reg[i+2]=='a')
{
q[j][2]=((j+1)*10)+(j+3); j++;
q[j][1]=j+1; j++;
q[j][2]=j+3; j++;
q[j][0]=j+1; j++;
q[j][2]=j+1; j++;
i=i+2;
}
if(reg[i]=='a'&&reg[i+1]=='*')
{
q[j][2]=((j+1)*10)+(j+3); j++;
q[j][0]=j+1; j++;
q[j][2]=((j+1)*10)+(j-1); j++;
}
if(reg[i]=='b'&&reg[i+1]=='*')
{
q[j][2]=((j+1)*10)+(j+3); j++;
q[j][1]=j+1; j++;
q[j][2]=((j+1)*10)+(j-1); j++;
}
if(reg[i]==')'&&reg[i+1]=='*')
{
q[0][2]=((j+1)*10)+1;
q[j][2]=((j+1)*10)+1;
j++;
}
i++;
}
printf("\n\tTransition Table \n");
printf(" \n");
printf("Current State |\tInput |\tNext State");
printf("\n \n");
for(i=0;i<=j;i++)
{
if(q[i][0]!=0) printf("\n q[%d]\t | a | q[%d]",i,q[i][0]);
if(q[i][1]!=0) printf("\n q[%d]\t | b | q[%d]",i,q[i][1]);
if(q[i][2]!=0)
{
if(q[i][2]<10) printf("\n q[%d]\t | e | q[%d]",i,q[i][2]);
else printf("\n q[%d]\t | e | q[%d] , q[%d]",i,q[i][2]/10,q[i][2]%10);
}
}
printf("\n \n");
return 0;
}
SAMPLE INPUT
Enter the regular expression : (a+b)*
SAMPLE OUTPUT
[Figure: NFA state diagram for the sample input, with states 1 to 9 and transitions on a, b and e, or the equivalent transition table printed by the program]
VIVA QUESTIONS
1. Define Regular Expression.
2. List the operations of regular expressions.
3. Compare NFA and DFA.
4. Write the regular expression for the strings that start and end with different symbols over
{a,b}.
5. Which is more powerful: DFA, NFA or RE? Justify.
RESULT:
EVALUATION
Assessment Marks Scored
AIM
To find first and follow of a given context free grammar
THEORY:
Why FIRST?
To avoid backtracking in parsing, we need to calculate FIRST. Consider the grammar
S -> cAd
A -> bc|a
and the input string “cad”.
If the parser knows in advance the first character of the string produced when a
production rule is applied, it can compare that character with the current character
or token in the input string and decide which production rule to apply. Here, after
matching c, the next input character is a, which is in FIRST(a) but not in FIRST(bc), so
the parser chooses A -> a.
Computing the Function FIRST
If X is a grammar symbol, then FIRST(X) is computed as follows −
• If X is a terminal symbol, then FIRST(X) = {X}
• If X → ε is a production, then ε is in FIRST(X)
• If X is a non-terminal and X → a α is a production, where a is a terminal, then a is in FIRST(X)
• If X → Y1 Y2 Y3 is a production, then
(a) If Y1 is a terminal, then
FIRST(X) = FIRST(Y1 Y2 Y3) = {Y1}
(b) If Y1 is a non-terminal and
Y1 does not derive the empty string, i.e., FIRST(Y1) does not contain ε, then
FIRST(X) = FIRST(Y1 Y2 Y3) = FIRST(Y1)
(c) If FIRST(Y1) contains ε, then
FIRST(X) = FIRST(Y1 Y2 Y3) = (FIRST(Y1) − {ε}) ∪ FIRST(Y2 Y3)
Similarly, FIRST(Y2 Y3) = {Y2} if Y2 is a terminal; otherwise, if Y2 is a non-terminal, then
• FIRST(Y2 Y3) = FIRST(Y2), if FIRST(Y2) does not contain ε.
• If FIRST(Y2) contains ε, then
• FIRST(Y2 Y3) = (FIRST(Y2) − {ε}) ∪ FIRST(Y3)
When X has several productions, FIRST(X) is the union of the sets obtained from each of them.
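The rules above can be applied repeatedly until no FIRST set grows any further. The sketch below shows that fixed-point iteration on a small grammar chosen for illustration (S → AB, A → a | ε, B → b); the grammar, the use of '#' for ε, the "X=rhs" production format, and the names compute_first and first_of are assumptions of this example, not part of the manual's program.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

#define EPS '#'   /* stands for the empty string in this sketch */

/* Productions written as "X=rhs"; upper-case letters are non-terminals. */
static const char *prods[] = { "S=AB", "A=a", "A=#", "B=b" };
static const int nprods = 4;

/* first_set[X - 'A'] is a string holding the symbols found so far in FIRST(X). */
static char first_set[26][32];

/* Add c to set if absent; return 1 if the set changed. */
static int add_symbol(char *set, char c)
{
    if (strchr(set, c) != NULL)
        return 0;
    size_t n = strlen(set);
    assert(n + 1 < 32);
    set[n] = c;
    set[n + 1] = '\0';
    return 1;
}

/* Apply the FIRST rules to every production until nothing changes. */
void compute_first(void)
{
    int changed = 1;
    memset(first_set, 0, sizeof first_set);
    while (changed) {
        changed = 0;
        for (int p = 0; p < nprods; p++) {
            char *set = first_set[prods[p][0] - 'A'];
            int all_nullable = 1;   /* stays 1 while every Yi seen so far can derive epsilon */
            for (const char *y = prods[p] + 2; *y != '\0' && all_nullable; y++) {
                if (*y == EPS)
                    break;                                   /* X -> epsilon, added below */
                if (!isupper((unsigned char)*y)) {
                    changed |= add_symbol(set, *y);          /* terminal: rule (a) */
                    all_nullable = 0;
                } else {
                    const char *fy = first_set[*y - 'A'];
                    for (const char *s = fy; *s != '\0'; s++)
                        if (*s != EPS)
                            changed |= add_symbol(set, *s);  /* FIRST(Yi) - {epsilon} */
                    if (strchr(fy, EPS) == NULL)
                        all_nullable = 0;                    /* rule (b): stop here */
                }
            }
            if (all_nullable)
                changed |= add_symbol(set, EPS);
        }
    }
}

/* Symbols of FIRST(X) as a string, valid after compute_first(). */
const char *first_of(char X)
{
    assert(isupper((unsigned char)X));
    return first_set[X - 'A'];
}
```

For this grammar, A can vanish, so FIRST(S) picks up b from B as well as a from A, which is case (c) of the rules.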
Why FOLLOW?
The parser faces one more problem. Let us consider below grammar to understand this
problem.
A -> aBb
B -> c | ε
And suppose the input string is “ab” to parse.
As the first character in the input is a, the parser applies the rule A->aBb.
A
/|\
a B b
Now the parser checks for the second character of the input string which is b, and the Non-
Terminal to derive is B, but the parser can’t get any string derivable from B that contains b
as first character.
But the Grammar does contain a production rule B -> ε, if that is applied then B will vanish,
and the parser gets the input “ab”, as shown below. But the parser can apply it only when it
knows that the character that follows B in the production rule is same as the current
character in the input.
In RHS of A -> aBb, b follows Non-Terminal B, i.e. FOLLOW(B) = {b}, and the current input
character read is also b. Hence the parser applies this rule. And it is able to get the string “ab”
from the given grammar.
A A
/ | \ / \
a B b => a b
|
ε
So FOLLOW tells the parser when a non-terminal may vanish (derive ε) in order to generate
the string from the parse tree.
Program:
#include<stdio.h>
#include<math.h>
#include<string.h>
#include<ctype.h>
#include<stdlib.h>
int n,m=0,p,i=0,j=0;
char a[10][10],f[10];
void follow(char c);
void first(char c);
int main()
{
int i,z;
char c,ch;
//clrscr();
printf("Enter the no of productions:\n");
scanf("%d",&n);
printf("Enter the productions:\n");
for(i=0;i<n;i++)
scanf("%s%c",a[i],&ch);
do
{
m=0;
printf("Enter the element whose first & follow is to be found:");
scanf("%c",&c);
first(c);
printf("First(%c)={",c);
for(i=0;i<m;i++)
printf("%c",f[i]);
printf("}\n");
strcpy(f," ");
//flushall();
m=0;
follow(c);
printf("Follow(%c)={",c);
for(i=0;i<m;i++)
printf("%c",f[i]);
printf("}\n");
printf("Continue(0/1)?");
scanf("%d%c",&z,&ch);
}while(z==1);
return(0);
}
void first(char c)
{
int k;
if(!isupper(c))
f[m++]=c;
for(k=0;k<n;k++)
{
if(a[k][0]==c)
{
if(a[k][2]=='$')
follow(a[k][0]);
else if(islower(a[k][2]))
f[m++]=a[k][2];
else first(a[k][2]);
}
}
}
void follow(char c)
{
    int i,j;   /* local counters: the recursive calls must not clobber the caller's loops */
    if(a[0][0]==c)
        f[m++]='$';
    for(i=0;i<n;i++)
    {
        for(j=2;j<(int)strlen(a[i]);j++)
        {
            if(a[i][j]==c)
            {
                if(a[i][j+1]!='\0')
                    first(a[i][j+1]);
                if(a[i][j+1]=='\0' && c!=a[i][0])
                    follow(a[i][0]);
            }
        }
    }
}
SAMPLE INPUT
SAMPLE OUTPUT
VIVA QUESTIONS
1. What is the need of calculating FIRST?
2. What is the need for FOLLOW?
3. Compare Top-Down and Bottom-Up parsers.
4. Does Top-Down Parser handle Left Recursive Grammar?
5. State the rule for eliminating Left Recursion.
RESULT:
EVALUATION
Assessment Marks Scored
THEORY:
There are two types of parsing techniques in compiler design: Top Down
Parsing and Bottom Up Parsing.
• Top-Down Parsing- It is a parsing strategy that starts at the top of the parse tree and
works its way down by applying grammatical rules.
• Bottom-up Parsing- It is a parsing strategy that starts with the lowest level of the parse
tree and works its way up using grammatical rules.
Shift Reduce Parsing
Shift-reduce parsing is an example of bottom-up parsing. It attempts to construct a
parse tree for an input string beginning at the leaves and working up towards the root. In
other words, it is a process of “reducing” a string w to the start symbol of a grammar. At every
(reduction) step, a particular substring matching the RHS of a production rule is replaced by
the symbol on the LHS of the production.
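The shift and reduce steps described above can be sketched as a small recognizer. This is an illustrative sketch only, under these assumptions: the grammar is E → E+E | E*E | (E) | i, with the single character i standing for id; the names try_reduce and shift_reduce_accepts are hypothetical; and reductions are applied eagerly, so operator precedence is ignored and the function only answers whether the string reduces to E.

```c
#include <assert.h>
#include <string.h>

#define STACK_MAX 64

/* If the top of the stack matches rhs, replace it with E and return 1. */
static int try_reduce(char *stack, int *top, const char *rhs)
{
    int len = (int)strlen(rhs);
    if (*top < len || strncmp(stack + *top - len, rhs, (size_t)len) != 0)
        return 0;
    *top -= len;                 /* pop the handle */
    stack[(*top)++] = 'E';       /* push the LHS of the production */
    stack[*top] = '\0';
    return 1;
}

/* Return 1 if input reduces to the start symbol E, 0 otherwise. */
int shift_reduce_accepts(const char *input)
{
    static const char *rhs[] = { "i", "E+E", "E*E", "(E)" };
    char stack[STACK_MAX] = "";
    int top = 0;                 /* number of symbols on the stack */

    assert(strlen(input) < STACK_MAX);
    for (const char *p = input; *p != '\0'; p++) {
        stack[top++] = *p;       /* shift the next input symbol */
        stack[top] = '\0';
        int reduced = 1;
        while (reduced) {        /* reduce while some handle is on top of the stack */
            reduced = 0;
            for (int r = 0; r < 4; r++)
                reduced |= try_reduce(stack, &top, rhs[r]);
        }
    }
    return strcmp(stack, "E") == 0;
}
```

For the input i+i*i the stack goes through i, E, E+, E+i, E+E, E, E*, E*i, E*E and finally E, at which point the input is exhausted and the string is accepted.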
Program:
#include<stdio.h>
#include<string.h>
int k=0,z=0,i=0,j=0,c=0;
char a[16],ac[20],stk[15],act[10];
void check();
int main()
{
}
void check()
{
strcpy(ac,"REDUCE TO E");
for(z=0; z<c; z++)
if(stk[z]=='i' && stk[z+1]=='d')
{
stk[z]='E';
stk[z+1]='\0';
printf("\n$%s\t%s$\t%s",stk,a,ac);
j++;
}
for(z=0; z<c; z++)
if(stk[z]=='E' && stk[z+1]=='+' && stk[z+2]=='E')
{
stk[z]='E';
stk[z+1]='\0';
stk[z+2]='\0';
printf("\n$%s\t%s$\t%s",stk,a,ac);
i=i-2;
}
for(z=0; z<c; z++)
if(stk[z]=='E' && stk[z+1]=='*' && stk[z+2]=='E')
{
stk[z]='E';
stk[z+1]='\0';
stk[z+2]='\0';
printf("\n$%s\t%s$\t%s",stk,a,ac);
i=i-2;
}
for(z=0; z<c; z++)
if(stk[z]=='(' && stk[z+1]=='E' && stk[z+2]==')')
{
stk[z]='E';
stk[z+1]='\0';
stk[z+2]='\0';
printf("\n$%s\t%s$\t%s",stk,a,ac);
i=i-2;
}
}
Output :
GRAMMAR is E->E+E
E->E*E
E->(E)
E->id
enter input string
id+id*id+id
VIVA QUESTIONS
1. What do you mean by a shift-reduce conflict?
2. What do you mean by a reduce-reduce conflict?
3. List the operations of Shift-Reduce parser.
4. Define LR parser and its types.
5. State the two rules of shift reduce parser.
EVALUATION
Assessment Marks Scored
THEORY
Parsing Action
• Add the $ symbol at both ends of the given input string.
• Scan the input string from left to right until the first ⋗ is encountered.
• Then scan towards the left over all the equal precedences until the first (leftmost) ⋖ is
encountered.
• Everything between the leftmost ⋖ and the rightmost ⋗ is a handle.
• $ on $ means parsing is successful.
Algorithm:
Initialize: Set ip to point to the first symbol of w$
Repeat:
    if $ is on the top of the stack and ip points to $ then return
    else
        Let a be the top terminal on the stack, and b the symbol pointed to by ip
        if a <· b or a =· b then
            push b onto the stack
            advance ip to the next input symbol
        else if a ·> b then
            repeat
                pop the stack
            until the top stack terminal is related by <·
            to the terminal most recently popped
        else error()
end
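The algorithm above can be sketched in C as follows. This is a minimal sketch under stated assumptions: the terminals are i, +, * and $, the precedence table is the usual one for those operators, and the names term_index and op_precedence_accepts are hypothetical. Like the algorithm as written, it tracks terminals only and does not check each popped handle against a production, so it can accept some malformed strings (a lone operator, for example); a full parser would add that check.

```c
#include <assert.h>
#include <string.h>

static const char terminals[] = "i+*$";

/* prec[a][b]: relation between stack terminal a and input terminal b; ' ' means error. */
static const char prec[4][5] = {
    /*        i+*$  */
    /* i */ " >>>",
    /* + */ "<><>",
    /* * */ "<>>>",
    /* $ */ "<<< ",
};

/* Column or row number of terminal c, or -1 if c is not a terminal. */
static int term_index(char c)
{
    const char *p = (c != '\0') ? strchr(terminals, c) : NULL;
    return p ? (int)(p - terminals) : -1;
}

/* Return 1 if w (given without the end marker) is accepted, 0 otherwise. */
int op_precedence_accepts(const char *w)
{
    char stack[64] = "$";
    int top = 0;                      /* index of the top stack symbol */
    char input[64];

    assert(strlen(w) + 2 <= sizeof input);
    strcpy(input, w);
    strcat(input, "$");               /* ip scans w$ */

    const char *ip = input;
    for (;;) {
        if (stack[top] == '$' && *ip == '$')
            return 1;
        int a = term_index(stack[top]), b = term_index(*ip);
        if (a < 0 || b < 0)
            return 0;
        char rel = prec[a][b];
        if (rel == '<' || rel == '=') {
            stack[++top] = *ip++;     /* shift */
        } else if (rel == '>') {
            char popped;
            do {                      /* reduce: pop until top <. most recently popped */
                popped = stack[top--];
            } while (prec[term_index(stack[top])][term_index(popped)] != '<');
        } else {
            return 0;                 /* no relation between a and b: error */
        }
    }
}
```

Because $ is related by <· to every other terminal, the pop loop always stops at the bottom marker at the latest.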
3. Create a directed graph whose nodes are the groups. Then, for each pair of symbols a and b:
place an edge from the group of gb to the group of fa if a <· b; otherwise, if a ·> b, place an edge
from the group of fa to that of gb.
If the constructed graph has a cycle, then no precedence functions exist. When there are no
cycles, take the lengths of the longest paths starting from the groups of fa and gb respectively.
Program:
#include<stdio.h>
#include<conio.h>
#include<string.h>
void main()
{
    char stack[20], ip[20], opt[10][10][8], ter[10];   /* each table entry holds a short string such as "<" or "accept" */
    int i, j, k, n, top = 0, col, row;
    clrscr();
    for (i = 0; i < 20; i++)
    {
        stack[i] = '\0';
        ip[i] = '\0';
    }
    for (i = 0; i < 10; i++)
    {
        for (j = 0; j < 10; j++)
        {
            opt[i][j][0] = '\0';
        }
    }
    printf("Enter the no.of terminals :\n");
    scanf("%d", &n);
    printf("\nEnter the terminals :\n");
    scanf("%9s", ter);
    printf("\nEnter the table values :\n");
    for (i = 0; i < n; i++)
    {
        for (j = 0; j < n; j++)
        {
            printf("Enter the value for %c %c:", ter[i], ter[j]);
            scanf("%7s", opt[i][j]);
        }
    }
stack[++top] = ip[i];
printf("Shift %c", ip[i]);
i++;
}
else
{
if (opt[col][row][0] == '>')
{
while (stack[top] != '<')
{
--top;
}
top = top - 1;
printf("Reduce");
}
else
{
printf("\nString is not accepted");
break;
}
}
printf("\n");
for (k = 0; k <= top; k++)
{
printf("%c", stack[k]);
}
printf("\t\t\t");
for (k = i; k < strlen(ip); k++)
{
printf("%c", ip[k]);
}
printf("\t\t\t");
}
getch();
}
Output:
Enter the value for * *:>
Enter the value for * $:>
Enter the value for $ i:<
Enter the value for $ +:<
Enter the value for $ *:<
Enter the value for $ $:accept
$ i*i Shift i
$<i *i Reduce
$ *i Shift *
$<* i Shift i
$<*<i
String is not accepted
VIVA QUESTIONS
1. What do you mean by operator grammar? Give Example.
2. State the rules of operator grammar.
3. What are the limitations of precedence table?
4. State the advantages and disadvantages of precedence graph.
5. Does operator precedence accept ambiguous grammar?
EVALUATION
Assessment Marks Scored
To write a program for implementing any one bottom up parser for the given
grammar (SLR)
THEORY:
LR parsers:
Fig: LR Parser
An efficient bottom-up syntax analysis technique that can be used to parse a large class
of context free grammars is called LR(k) parsing.
L stands for left-to-right scanning of the input
R stands for constructing a rightmost derivation in reverse
k stands for the number of input symbols of lookahead (0 in LR(0) parsing)
Advantages of LR parsing:
• It recognizes virtually all programming language constructs for which CFG can be
written
• It is able to detect syntactic errors
• It is an efficient non-backtracking shift-reduce parsing method.
Types of LR parsing methods:
1. SLR
2. CLR
3. LALR
SLR Parser:
SLR stands for simple LR. Among the LR methods it handles the smallest class of grammars and
has a small number of states. SLR is very easy to construct and is similar to LR(0) parsing. The
only difference between the SLR parser and the LR(0) parser is that the LR(0) parsing table can
contain shift-reduce conflicts, because ‘reduce’ is entered under every terminal in a state that
holds a completed item. SLR resolves many of these conflicts by entering ‘reduce’ only under
the terminals in FOLLOW of the LHS of the production.
This is called the SLR (1) collection of items.
Algorithm
Input − An Augmented Grammar G′
Output − SLR Parsing Table
Method
• Initially construct the set of items
C = {I0, I1, I2 … … In}, where C is the collection of sets of LR (0) items for the grammar.
• Parsing actions are based on each item set or state Ii.
Various Actions are −
• If A → α ∙ a β is in Ii and goto (Ii, a) = Ij, then set Action [i, a] = "shift j".
• If A → α ∙ is in Ii, then set Action [i, a] to "reduce A → α" for all symbols a, where a ∈ FOLLOW (A).
• If S′ → S ∙ is in Ii, then set Action [i, $] = "accept".
• The goto part of the SLR table is filled as follows: the goto transition for state i is considered for
non-terminals only. If goto (Ii, A) = Ij, then goto [i, A] = j.
• All entries not defined by the rules above are considered to be "error".
Program:
#include<stdio.h>
#include<string.h>
int axn[][6][2]={
{{100,5},{-1,-1},{-1,-1},{100,4},{-1,-1},{-1,-1}},
{{-1,-1},{100,6},{-1,-1},{-1,-1},{-1,-1},{102,102}},
{{-1,-1},{101,2},{100,7},{-1,-1},{101,2},{101,2}},
{{-1,-1},{101,4},{101,4},{-1,-1},{101,4},{101,4}},
{{100,5},{-1,-1},{-1,-1},{100,4},{-1,-1},{-1,-1}},
{{-1,-1},{101,6},{101,6},{-1,-1},{101,6},{101,6}},
{{100,5},{-1,-1},{-1,-1},{100,4},{-1,-1},{-1,-1}},
{{100,5},{-1,-1},{-1,-1},{100,4},{-1,-1},{-1,-1}},
{{-1,-1},{100,6},{-1,-1},{-1,-1},{100,11},{-1,-1}},
{{-1,-1},{101,1},{100,7},{-1,-1},{101,1},{101,1}},
{{-1,-1},{101,3},{101,3},{-1,-1},{101,3},{101,3}},
{{-1,-1},{101,5},{101,5},{-1,-1},{101,5},{101,5}}
};//Axn Table
int gotot[12][3]={1,2,3,-1,-1,-1,-1,-1,-1,-1,-1,-1,8,2,3,-1,-1,-1,
-1,9,3,-1,-1,10,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1}; //GoTo table
int a[10];
char b[10];
int top=-1,btop=-1,i;
void push(int k)
{
if(top<9)
a[++top]=k;
}
void pushb(char k)
{
if(btop<9)
b[++btop]=k;
}
char TOS()
{
return a[top];
}
void pop()
{
if(top>=0)
top--;
}
void popb()
{
if(btop>=0)
b[btop--]='\0';
}
void display()
{
for(i=0;i<=top;i++)
printf("%d%c",a[i],b[i]);
}
void display1(char p[],int m) //Displays The Present Input String
{
int l;
printf("\t\t");
for(l=m;p[l]!='\0';l++)
printf("%c",p[l]);
printf("\n");
}
void error()
{
printf("Syntax Error");
}
void reduce(int p)
{
int len,k,ad;
char src,*dest;
switch(p)
{
case 1:dest="E+T";
src='E';
break;
case 2:dest="T";
src='E';
break;
case 3:dest="T*F";
src='T';
break;
case 4:dest="F";
src='T';
break;
case 5:dest="(E)";
src='F';
break;
case 6:dest="i";
src='F';
break;
default:dest="\0";
src='\0';
break;
}
for(k=0;k<strlen(dest);k++)
{
pop();
popb();
}
pushb(src);
switch(src)
{
case 'E':ad=0;
break;
case 'T':ad=1;
break;
case 'F':ad=2;
break;
default: ad=-1;
break;
}
push(gotot[TOS()][ad]);
}
int main()
{
int j,st,ic;
char ip[20]="\0",an;
// clrscr();
getch();
break;
}
/* else
{
printf ("Given String is rejected \n");
break;
}*/
}
return 0;
}
OUTPUT:
deepti@Inspiron-3542:~$ ./a.out
0 a+a*a$
0a5 +a*a$
0F3 +a*a$
0T2 +a*a$
0E1 +a*a$
0E1+6 a*a$
0E1+6a5 *a$
0E1+6F3 *a$
0E1+6T9 *a$
0E1+6T9*7 a$
0E1+6T9*7a5 $
0E1+6T9*7F10 $
0E1+6T9 $
0E1 $
Given String is accepted
RESULT:
VIVA QUESTIONS
1. Define LR Parser.
2. State the advantages of LR Parser.
3. List the types of LR Parser.
4. Compare LR and LL Parser.
5. What do you mean by GOTO operation?
NOTE:
1. Even when the CLR parser has no RR (reduce-reduce) conflict, the LALR parser may contain one.
2. If number of states LR(0) = n1,
number of states SLR = n2,
number of states LALR = n3,
number of states CLR = n4 then,
n1 = n2 = n3 <= n4
EVALUATION
Assessment Marks Scored
CLR refers to canonical LR. CLR parsing uses the canonical collection of LR (1) items
to build the CLR (1) parsing table. The CLR (1) parsing table has more
states than the SLR (1) parsing table.
In CLR (1), the reduce entries are placed only under the lookahead symbols.
Various steps involved in the CLR (1) Parsing:
o For the given input string write a context free grammar
o Check the ambiguity of the grammar
o Add the augmented production to the given grammar
o Create the canonical collection of LR (1) items
o Draw the deterministic finite automaton (DFA)
o Construct a CLR (1) parsing table
LR (1) item
An LR (1) item is an LR (0) item together with a lookahead symbol.
LR (1) item = LR (0) item + look ahead
The lookahead determines where the reduce entry for a final item is placed.
The lookahead for the augmented production is always the $ symbol.
Example
CLR ( 1 ) Grammar
1. S → AA
2. A → aA
3. A→b
1. S` → •S, $
2. S → •AA, $
3. A → •aA, a/b
4. A → •b, a/b
Goto(I, X):
If there is an item [A → α ∙ X β, a] in I, then goto(I, X) is the closure of the set of items
[A → α X ∙ β, a].
Algorithm for Construction of LR (1) Set of Items
begin
C = {closure(S′→∙S,$)}
Repeat
for each set of items I in C and each grammar symbol X (terminal or non-terminal)
Add goto(I, X) to C, if it is not empty and not already in C
Until no more sets of items can be added to C.
end.
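The closure step used by this algorithm can be sketched in C for the example grammar above. This is only an illustration under stated assumptions: 'Z' stands for S', productions are stored as strings of the form "A=rhs", and the grammar has no nullable or left-recursive non-terminals, so FIRST can be read from the first symbol of each production. The names prods, item, first and closure are chosen for this sketch and are not part of the lab program.

```c
#include <ctype.h>
#include <string.h>

/* Augmented grammar of the example; 'Z' stands for S'. */
static const char *prods[] = { "Z=S", "S=AA", "A=aA", "A=b" };
enum { NPROD = 4, MAXITEMS = 64 };

/* An LR(1) item: production number, dot position in the right side, lookahead. */
struct item { int prod, dot; char la; };

/* Appends FIRST(x) to out. Assumes no nullable or left-recursive non-terminals. */
static void first(char x, char *out)
{
    int p;
    if (!isupper((unsigned char)x)) {          /* terminal or $ */
        if (!strchr(out, x)) {
            size_t n = strlen(out);
            out[n] = x;
            out[n + 1] = '\0';
        }
        return;
    }
    for (p = 0; p < NPROD; p++)
        if (prods[p][0] == x)
            first(prods[p][2], out);
}

static int has_item(const struct item *set, int n, struct item it)
{
    int k;
    for (k = 0; k < n; k++)
        if (set[k].prod == it.prod && set[k].dot == it.dot && set[k].la == it.la)
            return 1;
    return 0;
}

/* Expands set[0..n) in place to its LR(1) closure and returns the new size. */
int closure(struct item *set, int n)
{
    int k;
    for (k = 0; k < n; k++) {                  /* n grows while items are added */
        const char *rhs = prods[set[k].prod] + 2;
        char b = rhs[set[k].dot];              /* symbol right after the dot */
        char beta = b ? rhs[set[k].dot + 1] : '\0';
        char firsts[16] = "";
        int p;
        size_t f;
        if (!isupper((unsigned char)b))
            continue;                          /* dot before a terminal or at the end */
        first(beta ? beta : set[k].la, firsts);   /* FIRST(beta a) */
        for (p = 0; p < NPROD; p++) {
            if (prods[p][0] != b)
                continue;
            for (f = 0; firsts[f]; f++) {
                struct item it;
                it.prod = p;
                it.dot = 0;
                it.la = firsts[f];
                if (!has_item(set, n, it) && n < MAXITEMS)
                    set[n++] = it;
            }
        }
    }
    return n;
}
```

Starting from the single item [S' → •S, $], closure() yields six items, which correspond to the four lines of the example once the a/b lookaheads are written together.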
RULE-
1. If any non-terminal has ‘ . ‘ preceding it, write all of its productions and add ‘ . ‘
at the start of each of them.
2. From each state to the next state, the ‘ . ‘ shifts one place to the right.
3. All the rules of lookahead apply here.
STEP 3- Define the two functions of the parsing table: action [over the terminals] and
goto [over the non-terminals]. Below is the CLR parsing table
RESULT:
VIVA QUESTIONS
1. Define CLR Parser
2. Which is more powerful? SLR, CLR or LALR.
3. What do you mean by lookahead?
4. Compare SLR and CLR.
5. In which LR parser, the number of states will be less? SLR, CLR or LALR.
SLR Parser | LALR Parser | CLR Parser
It is very easy and cheap to implement. | It is also easy and cheap to implement. | It is expensive and difficult to implement.
SLR Parser is the smallest in size. | LALR and SLR have the same size, as they have fewer states. | CLR Parser is the largest, as the number of states is very large.
It requires less time and space complexity. | It requires more time and space complexity. | It also requires more time and space complexity.
EVALUATION
Assessment Marks Scored
THEORY:
Three address code
o Three-address code is an intermediate code. It is used by the Code Optimizer.
o In three-address code, the given expression is broken down into several separate
instructions. These instructions can easily be translated into assembly language.
o Each three-address code instruction has at most three operands. It is a combination of
an assignment and a binary operator.
Each instruction uses at most three addresses to represent a statement. Instructions are
implemented as records with the address fields.
General Form-
In general, Three Address instructions are represented as-
a = b op c
Here,
• a, b and c are the operands.
• Operands may be constants, names, or compiler generated temporaries.
• op represents the operator.
Examples-
Examples of Three Address instructions are-
• a=b+c
• c=axb
Common Three Address Instruction Forms-
The common forms of Three Address instructions are-
1. Assignment Statement-
x = y op z and x = op y
Here,
• x, y and z are the operands.
• op represents the operator.
It assigns the result obtained after solving the right side expression of the assignment
operator to the left side operand.
2. Copy Statement-
x=y
Here,
• x and y are the operands.
• = is an assignment operator.
It copies and assigns the value of operand y to operand x.
3. Conditional Jump-
If x relop y goto X
Here,
• x & y are the operands.
• X is the tag or label of the target statement.
• relop is a relational operator.
If the condition “x relop y” gets satisfied, then-
• The control is sent directly to the location specified by label X.
• All the statements in between are skipped.
If the condition “x relop y” fails, then-
• The control is not sent to the location specified by label X.
• The next statement appearing in the usual sequence is executed.
4. Unconditional Jump-
goto X
5. Procedure Call-
Example:
1. Write Three Address Code for the following expression-
– (a x b) + (c + d) – (a + b + c + d)
Three Address Code for the given expression is-
(1) T1 = a x b
(2) T2 = uminus T1
(3) T3 = c + d
(4) T4 = T2 + T3
(5) T5 = a + b
(6) T6 = T3 + T5
(7) T7 = T4 – T6
2. Write Three Address Code for the following expression
If A < B and C < D then t = 1 else t = 0
(1) If (A < B) goto (3)
(2) goto (4)
(3) If (C < D) goto (6)
(4) t = 0
(5) goto (7)
(6) t = 1
(7)
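The way an arithmetic expression is broken into temporaries can be sketched with a small recursive-descent translator in C. This is an illustrative sketch, not the lab program: it assumes single-letter operands, the operators + - * / and parentheses, no blanks and no unary minus, and it does no error checking. The function name gen_tac and the helper names are hypothetical.

```c
#include <stdio.h>
#include <string.h>

static const char *src;   /* next unread character of the expression */
static char *out;         /* position where generated code is appended */
static int ntemp;         /* number of temporaries created so far */

static void expr(char *place);

/* Emits "Tn = left op right" and leaves the name Tn in left. */
static void emit(char *left, char op, const char *right)
{
    char t[8];
    sprintf(t, "T%d", ++ntemp);
    out += sprintf(out, "%s = %s %c %s\n", t, left, op, right);
    strcpy(left, t);
}

/* factor -> single-letter operand | '(' expr ')' */
static void factor(char *place)
{
    if (*src == '(') {
        src++;                 /* skip '(' */
        expr(place);
        src++;                 /* skip ')' */
    } else {
        place[0] = *src++;
        place[1] = '\0';
    }
}

/* term -> factor { ('*' | '/') factor } */
static void term(char *place)
{
    char right[8];
    factor(place);
    while (*src == '*' || *src == '/') {
        char op = *src++;
        factor(right);
        emit(place, op, right);
    }
}

/* expr -> term { ('+' | '-') term } */
static void expr(char *place)
{
    char right[8];
    term(place);
    while (*src == '+' || *src == '-') {
        char op = *src++;
        term(right);
        emit(place, op, right);
    }
}

/* Translates a blank-free infix expression into three-address code in buf.
   buf must be large enough for the generated text. */
void gen_tac(const char *expression, char *buf)
{
    char place[8];
    src = expression;
    out = buf;
    ntemp = 0;
    buf[0] = '\0';
    expr(place);
}
```

For the input a+b*c the sketch produces T1 = b * c followed by T2 = a + T1, because the multiplication is translated before the addition that uses its result.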
Program:
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
void pm(void);      /* '+' or '-' followed by '*' or '/', e.g. a-b/c */
void plus(void);    /* '+' or '-' evaluated first, e.g. a+b-c */
void muldiv(void);  /* '*' or '/' evaluated first, e.g. a*b-c; the name div() clashes with <stdlib.h> */
int i,ch,j,l,addr=100;
/* the name exp clashes with the math library function exp(), so expr is used */
char ex[10],expr[10],exp1[10],exp2[10],id1[5],op[5],id2[5];
/* Reverses a string in place; portable replacement for the non-standard strrev() */
void reverse(char *s)
{
    size_t a,n=strlen(s);
    for(a=0;a<n/2;a++)
    {
        char t=s[a];
        s[a]=s[n-1-a];
        s[n-1-a]=t;
    }
}
int main(void)
{
    while(1)
    {
        printf("\n1.assignment\n2.arithmetic\n3.relational\n4.Exit\nEnter the choice:");
        if(scanf("%d",&ch)!=1)
            return 0;
        switch(ch)
        {
        case 1:
            printf("\nEnter the expression with assignment operator:");
            scanf("%9s",expr);
            l=strlen(expr);
            exp2[0]='\0';
            i=0;
            while(expr[i]!='='&&expr[i]!='\0') /* stop at '=' or at the end of the input */
            {
                i++;
            }
            if(expr[i]!='=')
            {
                printf("Expression is error\n");
                break;
            }
            strncat(exp2,expr,i); /* left-hand side */
            reverse(expr);
            exp1[0]='\0';
            strncat(exp1,expr,l-(i+1)); /* right-hand side, still reversed */
            reverse(exp1);
            printf("Three address code:\ntemp=%s\n%s=temp\n",exp1,exp2);
            break;
        case 2:
            /* expects three single-letter operands, e.g. a+b-c or a-b/c */
            printf("\nEnter the expression with arithmetic operator:");
            scanf("%9s",ex);
            strcpy(expr,ex);
            l=strlen(expr);
            exp1[0]='\0';
            for(i=0;i<l;i++)
            {
                if(expr[i]=='+'||expr[i]=='-')
                {
                    if(i+2<l&&(expr[i+2]=='/'||expr[i+2]=='*'))
                        pm();
                    else
                        plus();
                    break;
                }
                else if(expr[i]=='/'||expr[i]=='*')
                {
                    muldiv();
                    break;
                }
            }
            break;
        case 3:
            printf("Enter the expression with relational operator");
            scanf("%4s%4s%4s",id1,op,id2);
            if(strcmp(op,"<")!=0&&strcmp(op,">")!=0&&strcmp(op,"<=")!=0&&
               strcmp(op,">=")!=0&&strcmp(op,"==")!=0&&strcmp(op,"!=")!=0)
                printf("Expression is error");
            else
            {
                printf("\n%d\tif %s%s%s goto %d",addr,id1,op,id2,addr+3);
                addr++;
                printf("\n%d\t T:=0",addr);
                addr++;
                printf("\n%d\t goto %d",addr,addr+2);
                addr++;
                printf("\n%d\t T:=1",addr);
                addr++; /* the next relational expression starts at a fresh address */
            }
            break;
        case 4:
            exit(0);
        }
    }
}
void pm(void)
{
    reverse(expr);
    j=l-i-1;
    strncat(exp1,expr,j); /* higher-precedence part, reversed */
    reverse(exp1);
    printf("Three address code:\ntemp=%s\ntemp1=%c%ctemp\n",exp1,expr[j+1],expr[j]);
}
void muldiv(void)
{
    strncat(exp1,expr,i+2);
    printf("Three address code:\ntemp=%s\ntemp1=temp%c%c\n",exp1,expr[i+2],expr[i+3]);
}
void plus(void)
{
    strncat(exp1,expr,i+2);
    printf("Three address code:\ntemp=%s\ntemp1=temp%c%c\n",exp1,expr[i+2],expr[i+3]);
}
Output
1. assignment
2. arithmetic
3. relational
4. Exit
Enter the choice:1
Enter the expression with assignment operator:
a=b
Three address code:
temp=b
a=temp
1.assignment
2.arithmetic
3.relational
4.Exit
Enter the choice:2
Enter the expression with arithmetic operator:
a+b-c
Three address code:
temp=a+b
temp1=temp-c
1.assignment
2.arithmetic
3.relational
4.Exit
Enter the choice:2
Enter the expression with arithmetic operator:
a-b/c
Three address code:
temp=b/c
temp1=a-temp
1.assignment
2.arithmetic
3.relational
4.Exit
Enter the choice:2
Enter the expression with arithmetic operator:
a*b-c
Three address code:
temp=a*b
temp1=temp-c
1.assignment
2.arithmetic
3.relational
4.Exit
Enter the choice:2
Enter the expression with arithmetic operator:a/b*c
Three address code:
temp=a/b
temp1=temp*c
1.assignment
2.arithmetic
3.relational
4.Exit
Enter the choice:3
Enter the expression with relational operator
a
<=
b
1.assignment
2.arithmetic
3.relational
4.Exit
Enter the choice:4
RESULT:
VIVA QUESTIONS
1. Define Three Address Code with Example.
2. What is the need for intermediate code?
3. What are the types of three address code?
4. Is three address code machine dependent? Justify.
5. Write the three address code for the following.
If A < B then 1 else 0
Solution (for the loop c = 0; do { if (a < b) x++; else x--; c++; } while (c < 5);):
Three address code for the given code is-
1. c = 0
2. if (a < b) goto (4)
3. goto (7)
4. T1 = x + 1
5. x = T1
6. goto (9)
7. T2 = x – 1
8. x = T2
9. T3 = c + 1
10. c = T3
11. if (c < 5) goto (2)
EVALUATION
Assessment Marks Scored
THEORY:
The commonly used representations for implementing Three Address Code are-
1. Quadruples
2. Triples
3. Indirect Triples
1. Quadruple –
A quadruple is a structure that consists of 4 fields, namely op, arg1, arg2 and result. op denotes the
operator, arg1 and arg2 denote the two operands, and result is used to store the result
of the expression.
Advantage –
• Easy to rearrange code for global optimization.
• One can quickly access value of temporary variables using symbol table.
Disadvantage –
• Contain lot of temporaries.
• Temporary variable creation increases time and space complexity.
Example:
a + b x c / e ↑ f + b x a
Three Address Code for the given expression is-
T1 = e ↑ f
T2 = b x c
T3 = T2 / T1
T4 = b x a
T5 = a + T3
T6 = T5 + T4
Location Op Arg1 Arg2 Result
(0) ↑ e f T1
(1) x b c T2
(2) / T2 T1 T3
(3) x b a T4
(4) + a T3 T5
(5) + T5 T4 T6
2. Triples –
This representation doesn’t use an extra temporary variable to represent a single
operation; instead, when a reference to another triple’s value is needed, a pointer to that
triple is used. So, it consists of only three fields, namely op, arg1 and arg2.
Disadvantage –
• Temporaries are implicit, which makes it difficult to rearrange code.
• It is difficult to optimize because optimization involves moving intermediate code.
When a triple is moved, any other triple referring to it must also be updated. With the
help of a pointer, one can directly access the symbol table entry.
Location Op Arg1 Arg2
(0) ↑ e f
(1) x b c
(2) / (1) (0)
(3) x b a
(4) + a (2)
(5) + (4) (3)
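The relation between the two representations can be sketched in C by converting the quadruples of the example into triples. The struct layouts and the function names below are assumptions made for this sketch; it also assumes that every temporary is assigned exactly once, before it is used.

```c
#include <stdio.h>
#include <string.h>

struct quad   { char op[4]; char arg1[8]; char arg2[8]; char result[8]; };
struct triple { char op[4]; char arg1[8]; char arg2[8]; };

/* If name is the result of an earlier quadruple, writes "(index)" to dst;
   otherwise the name is an ordinary variable and is copied unchanged. */
static void to_ref(const struct quad *q, int upto, const char *name, char *dst)
{
    int k;
    for (k = 0; k < upto; k++)
        if (strcmp(q[k].result, name) == 0) {
            sprintf(dst, "(%d)", k);
            return;
        }
    strcpy(dst, name);
}

/* Converts n quadruples into n triples: the result field disappears and
   each temporary becomes a reference to the position that computes it. */
void quads_to_triples(const struct quad *q, int n, struct triple *t)
{
    int k;
    for (k = 0; k < n; k++) {
        strcpy(t[k].op, q[k].op);
        to_ref(q, k, q[k].arg1, t[k].arg1);
        to_ref(q, k, q[k].arg2, t[k].arg2);
    }
}
```

Applied to the six quadruples above, quadruple (2) / T2 T1 T3 becomes the triple / (1) (0), and quadruple (5) + T5 T4 T6 becomes + (4) (3).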
3. Indirect Triples-
This representation is an enhancement over triples representation.
• It uses an additional instruction array to list the pointers to the triples in the desired
order.
• This representation makes use of pointer to the listing of all references to
computations which is made separately and stored.
• Thus, instead of position, pointers are used to store the results.
Advantages
• It allows the optimizers to easily re-position the sub-expression for producing the
optimized code
• It is similar in utility to the quadruple representation but requires less space
than it.
• Temporaries are implicit and it is easier to rearrange code.
Statement
35 (0)
36 (1)
37 (2)
38 (3)
39 (4)
40 (5)
Location Op Arg1 Arg2
(0) ↑ e f
(1) x b c
(2) / (1) (0)
(3) x b a
(4) + a (2)
(5) + (4) (3)
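Why the extra statement list makes reordering easy can be sketched in C. In this hypothetical sketch the triples themselves are never touched; order[] plays the role of the statement list, and moving a statement only shifts entries of that list, so references such as (2) inside the triples stay valid. The function name move_statement is an assumption of this sketch.

```c
/* Moves the entry at position from to position to in the statement list,
   shifting the entries in between by one place. The triples that the
   entries point to are left where they are. */
void move_statement(int *order, int from, int to)
{
    int moved = order[from];
    int k;
    if (from < to)
        for (k = from; k < to; k++)
            order[k] = order[k + 1];
    else
        for (k = from; k > to; k--)
            order[k] = order[k - 1];
    order[to] = moved;
}
```

For the example, triple (3) x b a depends on no other triple, so it may be scheduled first: moving position 3 to position 0 turns the list (0)(1)(2)(3)(4)(5) into (3)(0)(1)(2)(4)(5) without renumbering anything.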
Program:
RESULT:
VIVA QUESTIONS
1. Compare Triples and Indirect Triples.
2. What are the ways of representation of Intermediate code? (Postfix notation, Syntax
tree, Three-address code)
3. State the advantages and disadvantages of Quadruples.
4. Translate the following expression to quadruple, triple and indirect triple-
a = b x – c + b x – c
5. State the advantages of Indirect Triples.
EVALUATION
Assessment Marks Scored
To write a C program for implementing back end of the compiler which takes three
address codes as input and produces 8086 assembly language instruction.
THEORY:
A code generator generates target code for a sequence of three- address statements
and effectively uses registers to store operands of the statements.
Code Generator determines the values that are to be stored in the registers.
• For example: consider the three-address statement a := b+c It can have the following
sequence of codes:
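A code-generation algorithm:
For each three-address statement of the form x := y op z, perform the following actions:
1. Invoke a function getreg to determine the location L where the result of the computation
y op z should be stored.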
2. Consult the address descriptor for y to determine y’, the current location of y. Prefer the
register for y’ if the value of y is currently both in memory and a register. If the value of y is
not already in L, generate the instruction MOV y’ , L to place a copy of y in L.
3. Generate the instruction OP z’ , L where z’ is a current location of z. Prefer a register to a
memory location if z is in both. Update the address descriptor of x to indicate that x is in
location L. If x is in L, update its descriptor and remove x from all other descriptors.
4. If the current values of y or z have no next uses, are not live on exit from the block, and are
in registers, alter the register descriptor to indicate that, after execution of x : = y op z , those
registers will no longer contain y or z
Example:
Generating Code for Assignment Statements:
The assignment statement d:= (a-b) + (a-c) + (a-c) can be translated into the following
sequence of three address code:
1. t:= a-b
2. u:= a-c
3. v:= t +u
4. d:= v+u
Target Machine
o The target computer is a type of byte-addressable machine. It has 4 bytes to a word.
o The target machine has n general purpose registers, R0, R1,.... , Rn-1. It also has two-
address instructions of the form:
1. op source, destination
Where, op is used as an op-code and source and destination are used as a data field.
o It has the following op-codes:
ADD (add source to destination)
SUB (subtract source from destination)
MOV (move source to destination)
o The source and destination of an instruction can be specified by the combination of
registers and memory location with address modes.
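For the target machine described above, the simplest translation of one statement x := y op z can be sketched in C. This is a deliberately naive sketch: it always uses register R0 and keeps no register or address descriptors, so it does not implement the getreg-based algorithm; the function names are hypothetical and operands are assumed to be single letters.

```c
#include <stdio.h>

/* Returns the op-code for an operator, or NULL if the target machine
   described above has no instruction for it. */
static const char *opcode(char op)
{
    switch (op) {
    case '+': return "ADD";
    case '-': return "SUB";
    default:  return NULL;
    }
}

/* Writes the code for x := y op z into buf as three two-address
   instructions of the form "op source, destination".
   Returns -1 for an unsupported operator. */
int gen_statement(char x, char y, char op, char z, char *buf, size_t size)
{
    const char *code = opcode(op);
    if (code == NULL)
        return -1;
    return snprintf(buf, size, "MOV %c, R0\n%s %c, R0\nMOV R0, %c\n",
                    y, code, z, x);
}
```

For a := b + c this gives MOV b, R0 then ADD c, R0 then MOV R0, a. A real code generator would consult the descriptors first, so that values already held in registers are not loaded again.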
ALGORITHM:
PROGRAM:
#include<stdio.h>
#include<string.h>
int main()
{
char inp[100][100];
int n,i,j,len;
int reg = 1;
printf("Enter the no of statements");
scanf("%d",&n);
for(i = 0; i < n; i++)
scanf("%99s",inp[i]); /* inp[i] is already a char array; the width keeps the input inside it */
for(i = 0; i < n; i++)
{
len = strlen(inp[i]);
for(j=2; j < len; j++)
{
if(inp[i][j] >= 97 && inp[i][j] <= 122)
{
printf("LOAD R%d %c \n",reg++,inp[i][j]);
}
if(j == len-1 && inp[i][len-j] =='=')
{
j=3; if(inp[i][j] == '+')
{
printf("ADD R%d R%d\n",reg-2,reg-1);
printf("STORE %c R%d\n",inp[i][0],reg-2);
}
else if(inp[i][j]=='-')
{
printf("SUB R%d R%d\n",reg-2,reg-1);
printf("STORE %c R%d\n",inp[i][0],reg-2);
}
else if(inp[i][j]=='*')
{
printf("MUL R%d R%d\n",reg-2,reg-1);
printf("STORE %c R%d\n",inp[i][0],reg-2);
}
else if(inp[i][j]=='/')
{
printf("DIV R%d R%d\n",reg-2,reg-1);
printf("STORE %c R%d\n",inp[i][0],reg-2);
}
break;
}
}
}
return 0;
}
OUTPUT:
RESULT:
VIVA QUESTIONS
1. What is the purpose of code generator?
2. List the issues in code generator.
3. Compare Register and Address descriptor.
4. What do you mean by next use information?
5. Write the assembly code for the given expression and find the cost. C=a+b*6
6. Name the technique used for allocating registers efficiently.
Linear Scan Algorithm
EVALUATION
Assessment Marks Scored
THEORY:
Reasons for Optimizing the Code
• Code optimization is essential to enhance the execution and efficiency of a source
code.
• It is mandatory to deliver efficient target code by lowering the number of instructions
in a program.
When to Optimize?
Code optimization is an important step that is usually performed at the last stage of
development.
Machine-Independent Optimization
It improves the efficiency of the intermediate code by transforming the parts of the code that
do not involve hardware details such as CPU registers or absolute memory locations. It usually
optimises code by eliminating redundancies and removing unneeded code.
Machine-Dependent Optimization
After the target code has been generated and transformed according to the target machine
architecture, machine-dependent optimization is performed. It makes use of CPU registers
and may utilise absolute rather than relative memory addresses. Machine-dependent
optimizers try to take maximum advantage of the memory hierarchy.
Loop Optimization
• Invariant code/Code Motion or Frequency Reduction
• Induction analysis
• Strength reduction
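Two of the loop optimizations listed above can be illustrated by writing the same loop before and after the transformation. The functions below are a hand-made illustration (the names and the loop body are invented for this sketch); a compiler would apply the same rewrites to the intermediate code.

```c
/* Before: a * b is recomputed and i * 4 is multiplied on every iteration. */
long sum_before(int n, int a, int b)
{
    long total = 0;
    int i;
    for (i = 0; i < n; i++)
        total += a * b + i * 4;
    return total;
}

/* After: the loop-invariant a * b is computed once outside the loop
   (code motion), and the multiplication i * 4 is replaced by a running
   addition on the induction variable (strength reduction). */
long sum_after(int n, int a, int b)
{
    long total = 0;
    int invariant = a * b;
    int step = 0;              /* always equal to i * 4 */
    int i;
    for (i = 0; i < n; i++) {
        total += invariant + step;
        step += 4;
    }
    return total;
}
```

Both functions return the same value for every input; only the amount of work done inside the loop differs.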
ALGORITHM:
Step1:Start the program.
Step2: Get the coding from the user.
Step3: Find the operators, arguments and results from the coding.
Step4: Display the value in the table.
Step5:Stop the program
PROGRAM:
#include<stdio.h>
#include<conio.h>
#include<string.h>
struct op
{
char l;
char r[20];
}op[10],pr[10];
void main()
{
int a,i,k,j,n,z=0,m,q;
char *p,*l;
char temp,t;
char *tem;
clrscr();
printf("enter no of values");
scanf("%d",&n);
for(i=0;i<n;i++)
{
printf("left\t");
op[i].l=getche();
printf("right:\t");
scanf("%s",op[i].r);
}
printf("intermediate Code\n") ;
for(i=0;i<n;i++)
{
printf("%c=",op[i].l);
printf("%s\n",op[i].r);
}
for(i=0;i<n-1;i++)
{
temp=op[i].l;
for(j=0;j<n;j++)
{
p=strchr(op[j].r,temp);
if(p)
{
pr[z].l=op[i].l;
strcpy(pr[z].r,op[i].r);
z++ ;
}} }
pr[z].l=op[n-1].l;
strcpy(pr[z].r,op[n-1].r);
z++;
printf("\nafter dead code elimination\n");
for(k=0;k<z;k++)
{
printf("%c\t=",pr[k].l);
printf("%s\n",pr[k].r);
}
for(i=0;i<z;i++)
{
for(j=i+1;j<z;j++)
{
q=strcmp(pr[i].r,pr[j].r);
if((pr[i].l==pr[j].l)&&!q)
{
pr[i].l='\0';
pr[i].r[0]='\0'; /* clear the string; strcpy() needs a string, not the character '\0' */
}}
}
printf("optimized code");
for(i=0;i<z;i++)
{
if(pr[i].l!='\0')
{
printf("%c=",pr[i].l);
printf("%s\n",pr[i].r);
}
}
getch();
}
OUTPUT:
enter no of values 5
left a right: 9
left b right: c+d
left e right: c+d
left f right: b+e
left r right: f
intermediate Code
a=9
b=c+d
e=c+d
f=b+e
r=f
RESULT:
VIVA QUESTIONS
1. State the role of Optimizer
2. Compare Machine dependent and independent optimizer.
3. List the machine dependent optimization techniques.
4. List the machine independent optimization techniques.
5. What do you mean by common sub expression? Give Example.
6. What is code motion and Dead code?
EVALUATION
Assessment Marks Scored
THEORY:
Introduction:
LEX is a lexical analyzer generator: a UNIX utility that generates scanners, which are programs
that recognize lexical patterns in text. These lexical patterns (or regular expressions) are
defined in a particular syntax. A matched regular expression may have an associated action,
and this action may include returning a token.
When the generated scanner receives input in the form of a file or text, it reads the input one
character at a time and attempts to match it against the regular expressions. If a pattern is
matched, the associated action is performed (which may include returning a token). If, on the
other hand, no regular expression matches, the default rule copies the unmatched character to
the output.
Lex and C are tightly coupled. A lex file (files in Lex have the .l extension, e.g. first.l) is passed
through the lex utility, which produces an output file in C (lex.yy.c). The program lex.yy.c
basically consists of a transition diagram constructed from the regular expressions of first.l.
This file is then compiled into the object program a.out, the lexical analyzer that transforms an
input stream into a sequence of tokens, as shown in the figure.
To generate a lexical analyzer, two important things are needed. Firstly, a precise specification
of the tokens of the language. Secondly, a specification of the action to be performed on
identifying each token.
LEX Specifications:
Definition Section:
C code: Any indented code between %{ and %} is copied to the C file. This is typically used
for defining file variables, and for prototypes of routines that are defined in the code segment.
Definitions: A definition is very much like a #define cpp directive. For example
letter [a-zA-Z]+
digit [0-9]+
These definitions can be used in the rules section: one could start a rule
{letter} {printf("\n Word is = %s",yytext);}
State definitions: If a rule depends on context, it's possible to introduce states and
incorporate those in the rules. A state definition looks like %s STATE, and by default a
state INITIAL is already given.
Rule Section:
Second section is for translation rules which consist of regular expression and action with
respect to it. The translation rules of a Lex program are statements of the form:
p1 {action 1}
p2 {action 2}
p3 {action 3}
... ...
... ...
pn {action n}
Where, each p is a regular expression and each action is a program fragment describing what
action the lexical analyzer should take when a pattern p matches a lexeme. In Lex the actions
are written in C.
Third section holds whatever auxiliary procedures are needed by the actions. If the lex
program is to be used on its own, this section will contain a main program. If you leave this
section empty, you will get the default main as follow:
int main()
{
yylex();
return 0;
}
In this section the user can optionally write subroutines. For example, yylex() is the scanner
function that lex generates automatically; it can be called from main() or from any other
routine in this section.
2. Built - in Functions:
2. Built - in Variables:
Regular Expression:
ALGORITHM:
1. Lex program contains three sections: definitions, rules, and user subroutines. Each
section must be separated from the others by a line containing only the delimiter, %%.
definitions
%%
rules
%%
user_subroutines
2. In definition section, the variables make up the left column, and their definitions
make up the right column. Any C statements should be enclosed in %{ ... %}. An identifier
is defined such that its first character is a letter and the remaining characters are
alphanumeric.
3. In rules section, the left column contains the pattern to be recognized in an input
file to yylex(). The right column contains the C program fragment executed when that
pattern is recognized. The various patterns are keywords, operators, new line
4. Each pattern may have a corresponding action, that is, a fragment of C source code
to execute when the pattern is matched.
5. When yylex() matches a string in the input stream, it copies the matched text to an
external character array, yytext, before it executes any actions in the rules section.
6. In user subroutine section, main routine calls yylex(). yywrap() is used to get more
input.
7. The lex command uses the rules and actions contained in file to generate a program,
lex.yy.c, which can be compiled with the cc command. That program can then receive
input, break the input into the logical pieces defined by the rules in file, and run
program fragments contained in the actions in file.
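What the generated yylex() does for each token can be imitated by a small hand-written scanner in C. This is only a sketch of the idea (skip white space, then take the longest run of characters that fits a pattern); the token names and the function next_token are hypothetical, keywords are not separated from identifiers, and the caller is assumed to supply a text buffer that is large enough.

```c
#include <ctype.h>

enum token { TOK_END, TOK_IDENT, TOK_NUMBER, TOK_OTHER };

/* Reads one token starting at *input, copies its text (the lexeme) to
   text, advances *input past it and returns the kind of the token. */
enum token next_token(const char **input, char *text)
{
    const char *p = *input;
    int n = 0;
    enum token kind;

    while (*p == ' ' || *p == '\t' || *p == '\n')
        p++;                                    /* skip white space */
    if (*p == '\0') {
        kind = TOK_END;
    } else if (isalpha((unsigned char)*p)) {    /* letter (letter | digit)* */
        while (isalnum((unsigned char)*p))
            text[n++] = *p++;
        kind = TOK_IDENT;
    } else if (isdigit((unsigned char)*p)) {    /* digit+ */
        while (isdigit((unsigned char)*p))
            text[n++] = *p++;
        kind = TOK_NUMBER;
    } else {
        text[n++] = *p++;                       /* any other single character */
        kind = TOK_OTHER;
    }
    text[n] = '\0';
    *input = p;
    return kind;
}
```

Calling next_token() repeatedly on the line int a=10; yields the lexemes int, a, =, 10 and ; in that order, followed by TOK_END. Here text plays the role of yytext.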
PROGRAM: (lexid.l)
%{
#include<stdio.h>
int e,k,c,d,i,s;
%}
%%
include|void|main|int|float|double|scanf|char|printf {printf("keyword"); i++;}
[a-zA-Z][a-zA-Z0-9]* {printf("Identifier"); k++;}
[0-9]+ {printf("digit"); e++;}
[-+*/=]+ {printf("operator"); c++;}
[;:(){}\"',\n\t]+ {printf("delimiter"); d++;}
[#<>%]+ {printf("symbols"); s++;}
%%
int main(void)
{
yyin=fopen("lexy.txt","r");
if(yyin==NULL) /* stop if the input file cannot be opened */
{
printf("cannot open lexy.txt\n");
return 1;
}
yylex();
printf("\nidentifier %d\n",k);
printf("Symbols %d\n",s);
printf("digits %d\n",e);
printf(" Operator %d\n",c);
printf(" keywords %d\n",i);
printf("delimiter %d\n",d);
return 0;
}
int yywrap()
{
return 1;
}
INPUT:
lexy.txt
int a=10;
OUTPUT:
C:/FlexWindow:/EditPlusPortable> lex lexid.l
C:/FlexWindow:/EditPlusPortable> cc lex.yy.c
C:/FlexWindow:/EditPlusPortable> a
Identifier 1
Digit 1
Keyword 1
Operator 1
Delimiter 1
ALGORITHM:
1. Declare the variables which are used for C programs in the declaration
section %{...%}.
2. Define the tokens, precedence and associativity of operators used in yacc.
3. Include the pattern (Context Free Grammar) in the transition rule section
for validating the expression between %%..%%
4. In main function get the expression from the user for validating it.
5. Call the yyparse() function to parse the given expression. It uses
the LALR parsing table that yacc constructed from the grammar defined in the rule section.
6. yyparse() calls the yylex() function, which gets the current token and stores its value
in the yylval variable; this is repeated until the whole expression has been
read.
7. The expression is validated with the constructed LALR parser.
8. Print the expression is VALID if the expression given by user is derived
by the grammar, else print INVALID.
9. Stop the program.
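The check that the parser performs can also be written by hand for this small grammar. The C function below is not the LALR parser that yacc generates; it is a direct left-to-right check, under the assumption that num is an unsigned sequence of digits and that the expression contains no blanks or parentheses. The name is_valid_expression is hypothetical.

```c
#include <ctype.h>

/* Returns 1 if s has the form num (op num)* with op one of + - * /,
   and 0 otherwise. */
int is_valid_expression(const char *s)
{
    for (;;) {
        if (!isdigit((unsigned char)*s))
            return 0;                  /* an operand must start here */
        while (isdigit((unsigned char)*s))
            s++;
        if (*s == '\0')
            return 1;                  /* input ended right after an operand */
        if (*s != '+' && *s != '-' && *s != '*' && *s != '/')
            return 0;
        s++;                           /* consume the operator */
    }
}
```

With this check 4+6 is reported as valid and 5- as invalid, because the operator is not followed by an operand.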
PROGRAM:
%{
#include<ctype.h>
#include<stdlib.h>
#include<string.h>
#define YYSTYPE double
%}
%token num
%left '+' '-'
%left '*' '/'
%%
st: st expr '\n' {printf("VALID");}
|st '\n'
|
|error '\n' {printf("INVALID");}
;
expr: num
|expr '+' expr
|expr '/' expr
%%
main()
{
OUTPUT:
C:\Flex Windows\EditPlusPortable>a
VALID
4+6
VALID
5-
INVALID
ALGORITHM:
1. Declare the variables which are used for C programs in the declaration section
%{...%}.
2. Define the tokens let, dig used in yacc.
3. Include the pattern (Context Free Grammar) in the transition rule section for
validating the variable between %%..%%
4. In main function get the variable from the user for validating it.
5. Call the yyparse() function to parse the given variable. It uses the
LALR parsing table that yacc constructed from the grammar defined in the rule section.
6. yyparse() calls the yylex() function, which gets the current token and stores its value in the
yylval variable; this is repeated until the whole variable has been read.
7. The variable is validated with the constructed LALR parser.
8. Print the variable is “accepted” if the variable given by user is derived by the
grammar, else print “rejected”.
9. Stop the program.
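The rule that the grammar enforces (a letter followed by any number of letters or digits) can be checked directly in C. This hand-written function is an illustration of the rule, not the yacc-generated parser, and the name is_valid_variable is hypothetical.

```c
#include <ctype.h>

/* Returns 1 if s is a letter followed only by letters and digits. */
int is_valid_variable(const char *s)
{
    if (!isalpha((unsigned char)*s))
        return 0;                      /* must begin with a letter */
    for (s++; *s != '\0'; s++)
        if (!isalnum((unsigned char)*s))
            return 0;                  /* only letters and digits may follow */
    return 1;
}
```

A1 is accepted because it starts with a letter, while 10a is rejected because it starts with a digit.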
PROGRAM:
%{
#include<stdio.h>
#include<ctype.h>
int yylex(void);
int yyerror(char *s);
%}
%token let dig
%%
sad: let recld '\n' {printf("accepted\n"); return 0;}
| let '\n' {printf("accepted\n"); return 0;}
|
|error {yyerror("rejected\n");return 0;}
;
recld: let recld
| dig recld
| let
| dig
;
%%
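int yylex(void)
{
int ch; /* int, so that EOF can be told apart from a character */
while((ch=getchar())==' ')
; /* skip blanks */
if(ch==EOF)
return 0; /* 0 tells yyparse() that the input has ended */
if(isalpha(ch))
return let;
if(isdigit(ch))
return dig;
return ch;
}
int yyerror(char *s)
{
printf("%s",s);
return 0;
}
int main(void)
{
printf("ENTER A variable : ");
yyparse();
return 0;
}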
OUTPUT:
C:\Flex Windows\EditPlusPortable>yacc -d valid.y
C:\Flex Windows\EditPlusPortable> a
A1
accepted
10a
rejected
VIVA QUESTIONS
1. Define LEX.
2. Give the syntax of LEX program.
3. List the built-in functions in LEX.
4. List the built-in variables in LEX.
5. Give the regular expression for the identifiers.
EVALUATION
Assessment Marks
Scored
Understanding Problem statement (10)
Efficiency of understanding algorithm (15)
Efficiency of program (40)
Output (15)
Viva (20)
(Technical – 10 and Communications - 10)
Total (100)
AIM
To write a lex program to implement a desktop calculator without giving the priority
to operators.
THEORY:
A parser generator facilitates the construction of the front end of a compiler. YACC is an LALR
parser generator. It is used to implement hundreds of compilers. YACC is a command (utility)
of the UNIX system. YACC stands for “Yet Another Compiler-Compiler”.
The parser specification is written in a file with the .y extension, e.g. parser.y, which contains the
YACC specification of the translator. Once the specification is complete, the UNIX command yacc
transforms parser.y into a C program called y.tab.c using the LALR method. The program y.tab.c is
generated automatically. The command can be used with the -d option as
yacc -d parser.y
With the -d option two files are generated, namely y.tab.c and y.tab.h. The header file
y.tab.h stores all the token information, so there is no need to create y.tab.h
explicitly.
The program y.tab.c is a representation of an LALR parser written in C, along with other C
routines that the user may have prepared. By compiling y.tab.c with the ly library that
contains the LR parsing program, using the command
cc y.tab.c -ly
we obtain the desired object program a.out that performs the translation specified by the
original program.
If other procedures are needed, they can be compiled or loaded with y.tab.c, just as with any C
program.
LEX recognizes regular expressions, whereas YACC recognizes entire grammar. LEX
divides the input stream into tokens, while YACC uses these tokens and groups them
together logically. LEX and YACC work together to analyze the program syntactically.
The YACC can report conflicts or ambiguities (if at all) in the form of error messages.
1. YACC Specifications:
The Structure of YACC programs consists of three parts:
Definition Section:
The definitions and programs section are optional. Definition section handles control
information for the YACC-generated parser and generally set up the execution environment
in which the parser will operate.
Declaration part:
In declaration section, %{ and %} symbol used for C declaration. This section is used for
definition of token, union, type, start, associativity and precedence of operator. Token
declared in this section can then be used in second and third parts of Yacc specification.
2. Built-in Functions:
3. Built-in Types:
4. Special Characters:
$ ./a.out
ALGORITHM:
1. Declare the variables and header files which are used for C programs in the declaration
section %{...%} of lex and yacc file.
2. Include the pattern (regular expression in lex and CFG in yacc) in the transition rule
section %%..%%
4. In main function get the expression from the user for calculation.
5. Call the yyparse() function to parse the given expression. It uses the LALR parsing
table that yacc constructed from the grammar defined in the rule section of the yacc file.
6. yyparse() calls the yylex() function, which gets the current token and stores its value in the
yylval variable; this is repeated until the value of the given expression is computed.
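What "without giving the priority to operators" means can be shown with a plain C function that evaluates an expression strictly from left to right. This is a hand-written sketch, not lex or yacc output; it assumes a blank-free expression made of numbers and the operators + - * /, relies on the standard strtod() to read each number, and does no error checking. The name calc_left_to_right is hypothetical.

```c
#include <stdlib.h>

/* Evaluates strictly from left to right, ignoring operator precedence,
   so "5+2*3" is computed as (5+2)*3. */
double calc_left_to_right(const char *s)
{
    char *end;
    double acc = strtod(s, &end);      /* first operand */

    while (*end == '+' || *end == '-' || *end == '*' || *end == '/') {
        char op = *end;
        double rhs = strtod(end + 1, &end);   /* operand after the operator */
        switch (op) {
        case '+': acc += rhs; break;
        case '-': acc -= rhs; break;
        case '*': acc *= rhs; break;
        case '/': acc /= rhs; break;
        }
    }
    return acc;
}
```

A calculator that respects precedence would give 11 for 5+2*3, whereas this left-to-right evaluation gives 21.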
PROGRAM
Cal.l
%{
#include <stdio.h>
#include "y.tab.h"
int c;
extern int yylval;
%}
%%
" " ;
[a-z] {
c = yytext[0];
yylval = c - 'a';
return (LETTER);
}
[0-9] {
c = yytext[0];
yylval = c - '0';
return (DIGIT);
}
[^a-z0-9\b] {
c = yytext[0];
return(c);
}
%%
Cal.Y
%{
#include <stdio.h>
int regs[26];
int base;
%}
%start list
%token DIGIT LETTER
%left '|'
%left '&'
%left '+' '-'
%left '*' '/' '%'
%left UMINUS /*supplies precedence for unary minus */
main()
{
return(yyparse());
}
yyerror(s)
char *s;
{
fprintf(stderr, "%s\n",s);
}
yywrap()
{
return(1);
}
(OR)
%{
#include<stdio.h>
#include<stdlib.h> /* declares atof() */
int op=0,i;
float a,b;
int digi(void);
%}
dig [0-9]+|([0-9]*)"."([0-9]+)
add "+"
sub "-"
mul "*"
div "/"
pow "^"
ln \n
%%
{dig} {digi();}
{add} {op=1;}
{sub} {op=2;}
{mul} {op=3;}
{div} {op=4;}
{pow} {op=5;}
{ln} {printf("\n the result:%f\n\n",a);}
%%
%%
int digi(void)
{
if(op==0)
a=atof(yytext); /* first operand */
else
{
b=atof(yytext); /* operand after an operator */
switch(op)
{
case 1: a=a+b;
break;
case 2: a=a-b;
break;
case 3: a=a*b;
break;
case 4: a=a/b;
break;
case 5: for(i=a;b>1;b--) /* repeated multiplication for a^b */
a=a*i;
break;
}
op=0;
}
return 0;
}
int main(void)
{
yylex();
return 0;
}
int yywrap(void)
{
return 1;
}
OUTPUT:
C:\Flex Windows\EditPlusPortable> a
5+2
8*2
16
5-3
7/2
3
VIVA QUESTIONS
1. Define YACC.
2. Which parser is generated by YACC.
3. List the built-in functions in YACC.
4. List the built-in types in YACC.
5. Compare YACC and LEX.
EVALUATION
Assessment Marks
Scored
Understanding Problem statement (10)
Efficiency of understanding algorithm (15)
Efficiency of program (40)
Output (15)
Viva (20)
(Technical – 10 and Communications - 10)
Total (100)