SPCC Lab Manual

The document is a laboratory manual for the System Programming and Compiler Construction Lab (CSL601) for the academic year 2021-2022. It outlines the department's vision, mission, program outcomes, educational objectives, specific outcomes, and course outcomes, along with a list of experiments to be conducted. The manual includes detailed descriptions of various programming tasks related to compiler construction, such as implementing a lexical analyzer and eliminating left recursion from grammars.

Uploaded by

SHREYA BHUVAD
Copyright
© All Rights Reserved

Department of Computer Engineering

Laboratory Manual

Subject Name
System Programming and Compiler Construction Lab

Subject Code:- ( CSL601 )

Year: - 2021-2022 Sem:-VI

(REV- 2019 ‘C’ Scheme)

SPCC(2021-2022) 1

VISION

To become a leading department committed to nurturing student-centric learning through outcome- and
skill-based transformative IT education, creating technocrats and leaders for the service of society.

MISSION

● To provide contemporary and cutting-edge technical education.
● To provide an ambience which nurtures research ideas in futuristic domains.
● To initiate project-based learning and practical exposure.
● To direct faculty members in research and consultancy/advisory roles.
● To establish strong linkages with well-known national and international technical institutes and industry.
● To promote a culture of imbibing environmental care.
● To aim to become an institute of aspiration and choice.


PROGRAM OUTCOMES (POs) OF UNDERGRADUATE PROGRAM


PO.1 Engineering knowledge: Apply the knowledge of mathematics, science, engineering fundamentals, and an engineering specialization to the solution of complex engineering problems.

PO.2 Problem analysis: Identify, formulate, review research literature, and analyze complex engineering problems reaching substantiated conclusions using first principles of mathematics, natural sciences, and engineering sciences.

PO.3 Design/development of solutions: Design solutions for complex engineering problems and design system components or processes that meet the specified needs with appropriate consideration for the public health and safety, and the cultural, societal, and environmental considerations.

PO.4 Conduct investigations of complex problems: Use research-based knowledge and research methods including design of experiments, analysis and interpretation of data, and synthesis of the information to provide valid conclusions.

PO.5 Modern tool usage: Create, select, and apply appropriate techniques, resources, and modern engineering and IT tools including prediction and modeling to complex engineering activities with an understanding of the limitations.

PO.6 The engineer and society: Apply reasoning informed by the contextual knowledge to assess societal, health, safety, legal and cultural issues and the consequent responsibilities relevant to the professional engineering practice.

PO.7 Environment and sustainability: Understand the impact of the professional engineering solutions in societal and environmental contexts, and demonstrate the knowledge of, and need for sustainable development.

PO.8 Ethics: Apply ethical principles and commit to professional ethics and responsibilities and norms of the engineering practice.

PO.9 Individual and team work: Function effectively as an individual, and as a member or leader in diverse teams, and in multidisciplinary settings.

PO.10 Communication: Communicate effectively on complex engineering activities with the engineering community and with society at large, such as, being able to comprehend and write effective reports and design documentation, make effective presentations, and give and receive clear instructions.

PO.11 Project management and finance: Demonstrate knowledge and understanding of the engineering and management principles and apply these to one's own work, as a member and leader in a team, to manage projects and in multidisciplinary environments.

PO.12 Life-long learning: Recognize the need for, and have the preparation and ability to engage in independent and life-long learning in the broadest context of technological change.

Program Educational Objectives (PEOs)


PEO 1. To prepare the Learner with a sound foundation in the mathematical, scientific and engineering fundamentals.

PEO 2. To motivate the Learner in the art of self-learning and to use modern tools for solving real-life problems.

PEO 3. To equip the Learner with the broad education necessary to understand the impact of Computer Science and Engineering in a global and social context.

PEO 4. To encourage, motivate and prepare the Learners for life-long learning.

PEO 5. To inculcate a professional and ethical attitude, good leadership qualities and commitment to social responsibilities in the Learner's thought process.

Program Specific Outcomes (PSOs)


1. An ability to apply the knowledge of computer engineering for the design and development of IoT systems.
2. An ability to solve complex computer engineering problems using the latest technical tools to achieve optimized solutions.


COURSE OUTCOMES:

1. Generate machine code by implementing two pass assemblers.

2. Implement a two pass macro processor.

3. Parse the given input string by constructing a Top-down/Bottom-up parser.

4. Identify and validate tokens for a given high level language and implement the synthesis phase of a compiler.

5. Explore LEX & YACC tools.


LIST OF EXPERIMENTS

Sr. No   EXPERIMENT NAME                                                                CO     PO

1        Program to implement Lexical Analyzer.                                         4,5
2        Program to eliminate Left Recursion from given grammar.                        3
3        Program to implement Parser (Any one).                                         3
4        Program to implement two pass Assembler.                                       1
5        Program to implement Intermediate code generation phase of compiler.           3
         (Quadruple)
6        Program to implement Code Optimization technique (any one).                    3
7        Program to implement code generation phase of compiler.                        3
8        Program to generate Macro Name Table.                                          2
9        Program to implement two pass Macro Processor.                                 2
10       Program to study and implement LEX and YACC tools.                             5

Date:

Program No-1

Problem Definition:

Program to implement Lexical Analyzer.

Theory:
In computer science, lexical analysis, lexing or tokenization is the process of converting a
sequence of characters into a sequence of tokens. A program that performs lexical analysis may
be termed a lexer, tokenizer, or scanner, although scanner is also a term for the first stage of
a lexer. A lexer is generally combined with a parser, which together analyze the syntax of
programming languages, web pages, and so forth.
Lexical analysis is the first phase of a compiler. It takes the modified source code from
language preprocessors, written in the form of sentences. The lexical analyzer breaks
this text into a series of tokens, removing any whitespace or comments in the source
code.
If the lexical analyzer finds a token invalid, it generates an error. The lexical analyzer
works closely with the syntax analyzer. It reads character streams from the source code,
checks for legal tokens, and passes the data to the syntax analyzer on demand.
A lexeme is the sequence of characters (alphanumeric) that makes up a token. There are
predefined rules for every lexeme to be identified as a valid token. These rules are defined by
grammar rules, by means of a pattern. A pattern explains what can be a token, and these patterns
are defined by means of regular expressions.
In a programming language, tokens can be keywords, constants, identifiers, strings,
numbers, operators and punctuation symbols.


For example:
int num=5;
contains the tokens:
int(keyword), num(identifier), =(operator), 5(constant) and ;(symbol).

Alphabets
An alphabet is any finite set of symbols: {0,1} is the binary alphabet,
{0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F} is the hexadecimal alphabet, and {a-z, A-Z} is the
English language alphabet.

Strings
Any finite sequence of symbols from an alphabet is called a string. The length of the string is the
total number of occurrences of symbols in it.

Language
A language is a set of strings over some finite alphabet. Computer languages are treated as
sets of strings, so set operations can be performed on them mathematically.
Finite languages can be described by means of regular expressions.

Example: sequence of tokens.

Lexeme      Token

int         Keyword
maximum     Identifier
(           Operator
int         Keyword
x           Identifier
,           Operator
int         Keyword
y           Identifier
)           Operator
{           Operator
if          Keyword

Algorithm:
1. Define rules to identify a particular token using bool functions.
2. Input the relevant code for analysis.
3. For every bool function that returns true, recognise tokens and move to step 4.
4. Display input and type of token.
5. Continue with the same procedure for other possible tokens.

Code:
#include <ctype.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <stdbool.h>

/* Returns true if ch is an operator character. */
bool operators(char ch)
{
    if (ch == '+' || ch == '-' || ch == '*' ||
        ch == '/' || ch == '>' || ch == '<' ||
        ch == '=')
        return (true);
    return (false);
}

/* Returns true if ch ends the current lexeme. */
bool delim(char ch)
{
    if (ch == ' ' || ch == '+' || ch == '-' || ch == '*' ||
        ch == '/' || ch == ',' || ch == ';' || ch == '>' ||
        ch == '<' || ch == '=' || ch == '(' || ch == ')' ||
        ch == '[' || ch == ']' || ch == '{' || ch == '}')
        return (true);
    return (false);
}

/* An identifier must not start with a digit or a delimiter. */
bool idenval(char* str)
{
    if (isdigit((unsigned char)str[0]) || delim(str[0]) == true)
        return (false);
    return (true);
}

/* Returns true if str is one of the recognised keywords. */
bool keyw(char* str)
{
    static const char* keywords[] = {
        "if", "else", "while", "do", "break", "continue", "int",
        "double", "float", "return", "char", "case", "sizeof", "long",
        "short", "typedef", "switch", "unsigned", "void", "static",
        "struct", "goto"
    };
    size_t i;

    for (i = 0; i < sizeof(keywords) / sizeof(keywords[0]); i++)
        if (!strcmp(str, keywords[i]))
            return (true);
    return (false);
}

/* Returns true if str consists only of digits. */
bool inte(char* str)
{
    int i, len = strlen(str);

    if (len == 0)
        return (false);
    for (i = 0; i < len; i++) {
        if (!isdigit((unsigned char)str[i]))
            return (false);
    }
    return (true);
}

/* Returns true if str consists of digits and contains a decimal point. */
bool realno(char* str)
{
    int i, len = strlen(str);
    bool hasDecimal = false;

    if (len == 0)
        return (false);
    for (i = 0; i < len; i++) {
        if (!isdigit((unsigned char)str[i]) && str[i] != '.')
            return (false);
        if (str[i] == '.')
            hasDecimal = true;
    }
    return (hasDecimal);
}

/* Returns a newly allocated copy of str[left..right]; the caller frees it. */
char* substr(char* str, int left, int right)
{
    int i;
    char* subStr = (char*)malloc(sizeof(char) * (right - left + 2));

    for (i = left; i <= right; i++)
        subStr[i - left] = str[i];
    subStr[right - left + 1] = '\0';
    return (subStr);
}

/* Scans str and prints the type of every token found. */
void inppa(char* str)
{
    int left = 0, right = 0;
    int len = strlen(str);

    while (right <= len && left <= right) {
        if (delim(str[right]) == false)
            right++;

        /* Guard the look-ahead so that nothing past the terminator is read. */
        bool atDelim = (right <= len) && delim(str[right]);

        if (atDelim && left == right) {
            if (operators(str[right]) == true)
                printf("'%c' is an operator \n", str[right]);

            right++;
            left = right;
        } else if ((atDelim && left != right)
                   || (right == len && left != right)) {
            char* subStr = substr(str, left, right - 1);

            if (keyw(subStr) == true)
                printf("'%s' is a keyword\n", subStr);

            else if (inte(subStr) == true)
                printf("'%s' is an integer\n", subStr);

            else if (realno(subStr) == true)
                printf("'%s' is a real number \n", subStr);

            else if (idenval(subStr) == true
                     && delim(str[right - 1]) == false)
                printf("'%s' is a valid identifier\n", subStr);

            else if (idenval(subStr) == false
                     && delim(str[right - 1]) == false)
                printf("'%s' is not a valid identifier \n", subStr);

            free(subStr);
            left = right;
        }
    }
    return;
}

int main()
{
    char str[300];

    printf("\nPlease enter code:\n");
    /* fgets replaces the unsafe gets() and bounds the read. */
    if (fgets(str, sizeof(str), stdin) == NULL)
        return (1);
    str[strcspn(str, "\n")] = '\0';     /* drop the trailing newline */
    inppa(str);

    return (0);
}
Output:

Program No-2

Problem Definition:

Write a program to eliminate left recursion from the given grammar.

Theory:

A production of a grammar is said to have left recursion if the leftmost variable of its RHS is the
same as the variable of its LHS. A grammar containing a production having left recursion is
called a Left Recursive Grammar.

Example:

E->E+T|T

T->T*F|F

F->(E)|id

Then,

E->TE’

E’->+TE’|Epsilon

T->FT’

T’->*FT’|Epsilon

F->(E)|id

Rule to remove Left Recursion:

The top-down parsing method cannot handle a left recursive grammar, so a transformation that
eliminates the left recursion is needed. A grammar is left recursive if it has a non-terminal A such
that there is a production of the form
   o A → Aα | β
for some string α. A left recursive pair of productions A → Aα | β can be transformed into
   o A → βA’
   o A’ → αA’ | ε
without changing the set of strings derivable from A. We can remove left recursion from any
number of A-productions by using the following technique:
   o First we group all A-productions as
        A → Aα1 | Aα2 | ... | Aαm | β1 | β2 | ... | βn
     where no βi begins with A.
   o Then we replace the A-productions with
        A → β1A’ | β2A’ | ... | βnA’
        A’ → α1A’ | α2A’ | ... | αmA’ | ε
   o Here, A’ is a non-terminal added to the grammar.

Algorithm:

Steps:

1. Start
2. Take the number of productions as input from the user, then ask them to enter
the productions. The grammar is stored in a 2D array.
3. Set the index to 3, because productions are entered in the form E->E+E, so the
right-hand side starts at position 3.
4. The non-terminal is stored in nt, taken from the production array.
5. If the production is left recursive, proceed with eliminating the left recursion. If it
can't be reduced or isn't left recursive, a message for that scenario is displayed and
the production is skipped.
6. The while loop runs until a 0 (end of string) or | is encountered.
7. The values of alpha and beta are stored in different variables.
8. The grammar is then displayed after eliminating left recursion.
9. Steps 4 to 8 are repeated for each entered production.
10. Exit

Program:
#include <stdio.h>
#include <string.h>
#define SIZE 50

int main()
{
    char nt;
    char beta, alpha;
    int num;
    char prod[10][SIZE] = {{0}};
    int index = 3;      /* productions look like E->E+T|T, so the RHS starts at 3 */

    printf("Total number of productions : ");
    if (scanf("%d", &num) != 1 || num < 1 || num > 10)
        return 1;       /* the array holds at most 10 productions */
    printf("Enter the grammar(s):\n");
    for (int i = 0; i < num; i++) {
        scanf("%49s", prod[i]);
    }
    for (int i = 0; i < num; i++) {
        printf("\nThe grammar : : : %s", prod[i]);
        nt = prod[i][0];
        if (nt == prod[i][index]) {
            /* alpha and beta are each assumed to be a single symbol */
            alpha = prod[i][index + 1];
            printf(" It is left recursive.\n");
            /* move to the '|' that separates A alpha from beta */
            while (prod[i][index] != 0 && prod[i][index] != '|')
                index++;
            if (prod[i][index] != 0) {
                beta = prod[i][index + 1];
                printf("After eliminating left recursion:\n");
                printf("%c->%c%c\'", nt, beta, nt);
                /* "epsilon" is printed in full so it is not confused with a non-terminal E */
                printf("\n%c\'->%c%c\'|epsilon\n", nt, alpha, nt);
            }
            else
                printf(" It can't be reduced\n");
        }
        else
            printf(" It is not left recursive.\n");
        index = 3;
    }
    return 0;
}

Output:

Program No-3

Problem Definition:

Write a program to implement Parser (recursive descent)

Theory:
The process of determining whether the start symbol can derive the program is known as
parsing. If parsing is successful, the program is valid; otherwise the program is
invalid.

There are generally two kinds of parsers:

1. Top-Down Parsers
2. Bottom-Up Parsers

Recursive Descent Parser:

It is a top-down method of syntax analysis in which a set of recursive procedures is executed to
process the input. There is a procedure associated with each non-terminal of the grammar.

Top-down parsing can be viewed as an attempt to find a leftmost derivation for an input string. It
attempts to construct a parse tree for the input, starting from the root and creating the nodes
of the parse tree in preorder. Recursive descent parsing may involve backtracking.

Example:

E → TE'

E' → +TE' | ε

T → FT'

T' → *FT' | ε

F → (E) | id

Recursive descent parser with backtracking for the grammar S→aSa | aa

Algorithm:

Procedure S

procedure s()
begin
    if input symbol = 'c' then
    begin
        ADVANCE();
        if A() then
            if input symbol = 'c' then
            begin
                ADVANCE();
                return true;
            end
    end
    return false;
end

Program:
//Recursive Descent Parsing
#include <stdio.h>
#include <string.h>
#include <ctype.h>

char input[200];
char prod[200][100];
int pos = -1, l, st = -1;
int error = 0;          // set when the input does not match the grammar
char id, num;

//Functions needed
void E();
void T();
void F();
void advance();
void Td();
void Ed();

//Stores a production that was used, without overflowing the table
void record(const char *p)
{
    if (st < 199)
        strcpy(prod[++st], p);
}

//Advance function: move to the next symbol and classify it
void advance()
{
    pos++;
    id = num = '\0';
    if (pos < l)
    {
        if (isdigit((unsigned char)input[pos]))
            num = input[pos];
        else if (isalpha((unsigned char)input[pos]))
            id = input[pos];
    }
}

void E()
{
    record("E->TE'");
    T();
    Ed();
}

void Ed()
{
    if (input[pos] == '+')
    {
        record("E'->+TE'");
        advance();
        T();
        Ed();
    }
    else if (input[pos] == '-')
    {
        record("E'->-TE'");
        advance();
        T();
        Ed();
    }
    else
    {
        record("E'->null");
    }
}

void T()
{
    record("T->FT'");
    F();
    Td();
}

void Td()
{
    if (input[pos] == '*')
    {
        record("T'->*FT'");
        advance();
        F();
        Td();
    }
    else if (input[pos] == '/')
    {
        record("T'->/FT'");
        advance();
        F();
        Td();
    }
    else
        record("T'->null");
}

void F()
{
    if (id != '\0' && input[pos] == id)
    {
        record("F->id");
        advance();
    }
    else if (input[pos] == '(')
    {
        record("F->(E)");
        advance();
        E();
        if (input[pos] == ')')
            advance();
        else
            error = 1;  // missing closing parenthesis
    }
    else if (num != '\0' && input[pos] == num)
    {
        record("F->num");
        advance();
    }
    else
        error = 1;      // no alternative of F matches
}

int main()
{
    int i;
    printf("Enter Input String ");
    if (scanf("%198s", input) != 1)
        return 1;
    l = strlen(input);
    input[l] = '$';     // end marker
    advance();
    E();
    if (pos == l && !error)
    {
        printf("String Accepted\n");
        for (i = 0; i <= st; i++)
        {
            printf("%s\n", prod[i]);
        }
    }
    else
    {
        printf("String rejected\n");
    }
    return 0;
}

Output:

Program No-4

Problem Definition:
Program to implement Two Pass Assembler

Theory:
Assemblers are a third type of translator, alongside compilers and interpreters. The purpose of an
assembler is to translate assembly language into object code. Whereas compilers and interpreters
generate many machine code instructions for each high-level instruction, assemblers create one
machine code instruction for each assembly instruction.

Types of assemblers

1. One-Pass Assembler

These assemblers perform the whole conversion of assembly code to machine code in one go.

2. Multi-Pass/Two-Pass Assembler

These assemblers first process the assembly code and store values in the opcode table and symbol
table. Then, in the second step, they generate the machine code using these tables.

a) Pass 1
● Defines the symbol table and opcode table.
● Keeps a record of the location counter.
● Also processes the pseudo instructions.

b) Pass 2
● Finally, converts each opcode into the corresponding numeric opcode.
● Generates machine code according to the values of literals and symbols.

Pseudo Opcode Table (POT)

● POT is a fixed-length table.
● In pass 1, POT is consulted for processing pseudo opcodes such as DS, DC, START, END, etc.
● In pass 2, POT is consulted for processing pseudo opcodes such as DS, DC, USING and DROP.

START -

● It is used to specify where execution of a program starts.

USING -

● It specifies the base table register that is being used.

DROP -

● It is used to remove the register used in the base table.

EQU -

● It is used for making programs more readable. Whenever symbols are defined
with an EQU statement no memory is allocated; only an entry is
made in the symbol table.

Declare Constant (DC) -

● It is used to declare or define the values.

Declare Storage (DS) -

● It is used to reserve storage for a value at a specified address.

Machine Opcode Table (MOT)

● MOT is a fixed-length table, i.e. no entries are added to it in either pass.
● It is used to look up an instruction and obtain its binary opcode.
● In pass 1, using the mnemonic opcode, MOT is consulted to update the location
counter (LC).
● In pass 2, using the mnemonic opcode, MOT is consulted to obtain:
   1) Binary opcode (to generate the instruction).
   2) Instruction length (to update the location counter).
   3) Instruction format (to assemble the instruction).

● L 1, data
Load the data into the register.
● A 1, data
Add the data to the register.
● ST 1, temp
Store the content of the register to temp (constant).

Instruction formats:

● RR
○ The first operand is a register and the second operand is also a register.

● RX
○ The first operand is a register and the second operand is an address.
○ The address is generated by considering base + index + displacement.

Address = (Base + index + displacement)

Symbol Table (ST)

● The symbol table is used for keeping track of symbols that are defined in the
program.
● It is used to give a location for a specified symbol.
● A symbol is said to be defined when it appears in a label field.
● In pass 1, whenever a symbol is defined, an entry is made in the symbol table.
● In pass 2, the symbol table is used for generating the address of a symbol.

Literal Table (LT)

● The literal table is used for keeping track of literals that are encountered in the
program.
● We directly specify the value; the literal table is used to give a location for the value.
● Literals are always encountered in the operand field of an instruction.
● In pass 1, whenever a literal is encountered, an entry is made in the literal table.
● In pass 2, the literal table is used for generating the address of a literal.


Flowcharts

Pass – I

Pass – II

Code:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void chk_label();
void chk_opcode();
void READ_LINE();

/* Machine opcode table: mnemonic and its object code */
struct optab
{
    char code[10], objcode[10];
}
moptb[3] = {
    {"LDA", "00"},
    {"JMP", "01"},
    {"STA", "02"}
};

/* Symbol table filled in pass 1 */
struct symtab
{
    char symbol[10];
    int addr;
}
mysymtab[10];

int startaddr, locctr, symcount = 0, length;
char line[20], label[8], opcode[8], operand[8], programname[10];

/* Reads the next line into the global buffer and strips the newline.
   Returns 0 at end of file. */
int next_line(FILE *fp)
{
    if (fgets(line, sizeof(line), fp) == NULL)
        return 0;
    line[strcspn(line, "\r\n")] = '\0';
    return 1;
}

/* Pass 1: assign addresses, build the symbol table, write inter.txt */
void P1()
{
    FILE *input, *inter;
    input = fopen("input.txt", "r");
    inter = fopen("inter.txt", "w");
    if (input == NULL || inter == NULL)
    {
        printf("Cannot open input.txt or inter.txt\n");
        exit(1);
    }
    printf("LOCATION LABEL\tOPCODE\tOPERAND\n");
    next_line(input);
    READ_LINE();
    if (!strcmp(opcode, "START"))
    {
        startaddr = atoi(operand);
        locctr = startaddr;
        strcpy(programname, label);
        fprintf(inter, "%s\n", line);
        next_line(input);
    }
    else
    {
        programname[0] = '\0';
        startaddr = 0;
        locctr = 0;
    }
    printf("\n %d\t %s\t%s\t %s", locctr, label, opcode, operand);
    while (strcmp(line, "END") != 0)
    {
        READ_LINE();
        printf("\n %d\t %s \t%s\t %s", locctr, label, opcode, operand);
        if (label[0] != '\0')
            chk_label();
        chk_opcode();
        fprintf(inter, "%s %s %s\n", label, opcode, operand);
        if (!next_line(input))
            break;              /* stop if the END line is missing */
    }
    printf("\n %d\t\t%s", locctr, line);
    fprintf(inter, "%s\n", line);
    fclose(inter);
    fclose(input);
}

/* Pass 2: read inter.txt and generate the object program */
void P2()
{
    FILE *inter, *output;
    char record[30], part[6], value[8];
    int currtxtlen = 0, foundopcode, foundoperand, chk, k, recaddr = 0;
    inter = fopen("inter.txt", "r");
    output = fopen("output.txt", "w");
    if (inter == NULL || output == NULL)
    {
        printf("Cannot open inter.txt or output.txt\n");
        exit(1);
    }

    next_line(inter);
    READ_LINE();
    if (!strcmp(opcode, "START"))
        next_line(inter);

    printf("\n\nCorresponding Object code is..\n");

    printf("\nH^ %s ^ %d ^ %d ", programname, startaddr, length);
    fprintf(output, "\nH^ %s ^ %d ^ %d ", programname, startaddr, length);
    recaddr = startaddr;
    record[0] = '\0';

    while (strcmp(line, "END") != 0)
    {
        foundoperand = foundopcode = 0;
        part[0] = value[0] = '\0';
        READ_LINE();
        for (chk = 0; chk < 3; chk++)
        {
            if (!strcmp(opcode, moptb[chk].code))
            {
                foundopcode = 1;
                strcpy(part, moptb[chk].objcode);
                if (operand[0] != '\0')
                {
                    /* a separate index k is used: reusing chk here would
                       corrupt the outer loop */
                    for (k = 0; k < symcount; k++)
                        if (!strcmp(mysymtab[k].symbol, operand))
                        {
                            /* snprintf replaces the non-standard itoa() */
                            snprintf(value, sizeof(value), "%d", mysymtab[k].addr);
                            foundoperand = 1;
                        }
                    if (!foundoperand)
                        strcpy(value, "err");
                }
                break;
            }
        }
        if (!foundopcode)
        {
            if (strcmp(opcode, "BYTE") == 0 || strcmp(opcode, "WORD") == 0
                || strcmp(opcode, "RESB") == 0)
            {
                strcpy(value, operand);
            }
        }
        if ((currtxtlen + strlen(value) + strlen(part)) <= 8)
        {
            if (record[0] != '\0')
                strcat(record, "^");
            strcat(record, part);
            strcat(record, value);
            currtxtlen += (strlen(value) + strlen(part));
        }
        else
        {
            printf("\nT^ %d ^%d ^%s", recaddr, currtxtlen, record);
            fprintf(output, "\nT^ %d ^%d ^%s", recaddr, currtxtlen, record);
            recaddr += currtxtlen;
            currtxtlen = strlen(value) + strlen(part);
            strcpy(record, part);
            strcat(record, value);
        }
        if (!next_line(inter))
            break;
    }
    printf("\nT^ %d ^%d ^%s", recaddr, currtxtlen, record);
    fprintf(output, "\nT^ %d ^%d ^%s", recaddr, currtxtlen, record);
    printf("\nE^ %d\n", startaddr);
    fprintf(output, "\nE^ %d\n", startaddr);
    fclose(inter);
    fclose(output);
}

/* Splits the global line into label, opcode and operand.
   Fields must be separated by exactly one space. */
void READ_LINE()
{
    char buff[8], word1[8], word2[8], word3[8];
    int i, j = 0, count = 0;
    label[0] = opcode[0] = operand[0] = word1[0] = word2[0] = word3[0] = '\0';

    for (i = 0; line[i] != '\0'; i++)
    {
        if (line[i] != ' ')
        {
            if (j < 7)          /* keep each field within 7 characters */
                buff[j++] = line[i];
        }
        else
        {
            buff[j] = '\0';
            strcpy(word3, word2);
            strcpy(word2, word1);
            strcpy(word1, buff);
            j = 0;
            count++;
        }
    }
    buff[j] = '\0';             /* the newline was already stripped by next_line() */
    strcpy(word3, word2);
    strcpy(word2, word1);
    strcpy(word1, buff);

    switch (count)
    {
    case 0:
        strcpy(opcode, word1);
        break;
    case 1:
        strcpy(opcode, word2);
        strcpy(operand, word1);
        break;
    case 2:
        strcpy(label, word3);
        strcpy(opcode, word2);
        strcpy(operand, word1);
        break;
    }
}

/* Adds the current label to the symbol table; a duplicate is marked with -1 */
void chk_label()
{
    int k, dupsym = 0;

    for (k = 0; k < symcount; k++)
        if (!strcmp(label, mysymtab[k].symbol))
        {
            mysymtab[k].addr = -1;
            dupsym = 1;
            break;
        }

    if (!dupsym && symcount < 10)
    {
        strcpy(mysymtab[symcount].symbol, label);
        mysymtab[symcount++].addr = locctr;
    }
}

/* Advances the location counter according to the current opcode */
void chk_opcode()
{
    int k = 0, found = 0;

    for (k = 0; k < 3; k++)
        if (!strcmp(opcode, moptb[k].code))
        {
            locctr += 3;
            found = 1;
            break;
        }
    if (!found)
    {
        if (!strcmp(opcode, "WORD"))
            locctr += 3;
        else if (!strcmp(opcode, "RESW"))
            locctr += (3 * atoi(operand));
        else if (!strcmp(opcode, "RESB"))
            locctr += atoi(operand);
    }
}

int main()
{
    P1();
    length = locctr - startaddr;
    P2();
    return 0;
}

Output with input files

Output file:

Output

Program No-5

Problem Definition:

Write a program to implement intermediate code generation (Quadruple)

Theory:

During the translation of a source program into the object code for a target machine, a compiler
may generate a middle-level language code, which is known as intermediate code.

Commonly used intermediate code representations include:

1. Syntax tree
2. Postfix Notation
3. Three-Address Code

Syntax Tree:
A syntax tree is a condensed form of a parse tree. The operator and keyword nodes of the parse
tree are moved to their parents, and a chain of single productions is replaced by a single link.
In the syntax tree, the internal nodes are operators and the child nodes are operands.
Example:

x = (a + b * c) / (a – b * c)

Postfix Notation:

If we have an expression such as x+y, then the postfix notation for the same expression places the
operator at the right end, as xy+. In general, if e1 and e2 are any postfix expressions, and + is any
binary operator, the result of applying + to the values denoted by e1 and e2 is written in postfix
notation as e1 e2 +. No parentheses are needed in postfix notation, and the operator follows its
operands.

Example: The postfix representation of the expression (a – b) * (c + d) is : ab – cd + *

Three Address Code:

Three-address code is an intermediate code. It is used by optimizing compilers. In three-address
code, the given expression is broken down into several separate instructions. These instructions
can easily be translated into assembly language. Each three-address code instruction has at most
three operands. It is a combination of an assignment and a binary operator.

Example:

For: a := (-c * b) + (-c * d)

Three address code is given as:

t1 := -c

t2 := b*t1

t3 := -c

t4 := d * t3

t5 := t2 + t4

a := t5

Quadruples

Quadruples have four fields to implement the three address code. The fields of a quadruple
contain the operator, the first source operand, the second source operand and the
result, respectively.

For example:

a := -b * c + d

The three address code is:

t1 := -b

t2 := t1 * c

t3 := t2 + d

a := t3

And these are further represented by the following quadruples:

      Operator   Arg1   Arg2   Result
(0)   uminus     b             t1
(1)   *          t1     c      t2
(2)   +          t2     d      t3
(3)   :=         t3            a

Program:

//Quadruples program

#include <stdio.h>

int main()
{
    char inp[20][20];
    int len[20];
    int temp;                   /* int, so that EOF can be detected */
    FILE *f1;
    int i = 0, j = 0, a, fa1 = 0, fres = 0;

    f1 = fopen("input1.txt", "r");
    if (f1 == NULL)
    {
        printf("Cannot open input1.txt\n");
        return 1;
    }

    /* Read the file: one statement such as x=a+b per line */
    while ((temp = fgetc(f1)) != EOF)
    {
        if (temp == '\n')
        {
            len[i] = j;
            j = 0;
            i++;
            if (i == 20)
                break;          /* at most 20 lines are stored */
        }
        else if (j < 20)
        {
            inp[i][j] = temp;
            j++;
        }
    }
    if (j > 0 && i < 20)        /* last line without a trailing newline */
    {
        len[i] = j;
        i++;
    }
    a = i;
    fclose(f1);

    printf("Input file :\n");
    for (i = 0; i < a; i++)
    {
        for (j = 0; j < len[i]; j++)
            printf("%c", inp[i][j]);
        printf("\n");
    }

    //Start of the table
    printf("Result \tArg1 \tOpt \tArg2 \n");
    for (i = 0; i < a; i++)
    {
        for (j = 0; j < len[i]; j++)
        {
            temp = inp[i][j];
            if (fres == 0)
            {
                /* everything before '=' is the result */
                if (temp == '=')
                {
                    fres = 1;
                    printf("\t");
                }
                else
                    printf("%c", temp);
            }
            else
            {
                if (fa1 == 0)
                {
                    /* everything up to the operator is the first argument */
                    if ((temp == '+') || (temp == '-') || (temp == '*') || (temp == '/'))
                    {
                        fa1 = 1;
                        printf("\t%c\t", temp);
                    }
                    else
                        printf("%c", temp);
                    if (fa1 == 0 && j == (len[i] - 1))
                    {
                        /* no operator found: a plain copy such as x=a */
                        printf("\t=\t");
                    }
                }
                else
                {
                    /* the rest is the second argument */
                    printf("%c", temp);
                }
            }
        }
        fres = fa1 = 0;
        printf("\n");
    }
    return 0;
}

Output:


Program No-6

Problem Definition:

Program to implement Code Optimization technique (any one).

Theory:
Code Optimization: The code optimization in the synthesis phase is a program transformation technique,
which tries to improve the intermediate code by making it consume fewer resources (i.e. CPU, Memory) so
that faster-running machine code will result. Compiler optimizing process should meet the following
objectives :
• • The optimization must be correct, it must not, in any way, change the meaning of the program.
• • Optimization should increase the speed and performance of the program.
• • The compilation time must be kept reasonable.
• • The optimization process should not delay the overall compiling process.

When to Optimize? Optimization of the code is often performed at the end of the development stage since
it reduces readability and adds code that is used to increase the performance.
Dead Code Elimination:
Variable propagation often turns an assignment statement into dead code. Dead code is one or more
code statements that are:
• Either never executed or unreachable,
• Or, if executed, produce output that is never used.

Thus, dead code plays no role in any program operation and therefore it can simply be eliminated.
There are some code statements whose computed values are used only under certain circumstances, i.e.,
sometimes the values are used and sometimes they are not. Such codes are known as partially dead-code.

Consider a control flow graph for a chunk of program in which variable ‘a’ is assigned the result of the
expression ‘x * y’ inside a loop. Let us assume that the value assigned to ‘a’ is never used inside the loop.
Immediately after control leaves the loop, ‘a’ is assigned the value of variable ‘z’, which is used later.
The assignment inside the loop is therefore dead code.
Here, as the name indicates, code that does not affect the program results is eliminated. This has several
benefits, including reduced program size and running time. It also simplifies the program structure.
Dead code elimination is also known as DCE, dead code removal, dead code stripping, or dead code strip.
In this technique,
• As the name suggests, the dead code is eliminated.
• Statements that are never executed, are unreachable, or whose output is never used are
eliminated.

EXAMPLE:
c=a*b
x=a
till
d=a*b+4
//After elimination :
c=a*b
till
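
The idea of removing statements whose output is never used can be sketched as a backward scan over a straight-line block. This is only an illustrative sketch, not the program given later in this experiment: the names stmt and eliminate_dead are hypothetical, variables are assumed to be single letters, and the caller is assumed to supply the variables that are needed after the block (live_out).

```c
#include <ctype.h>
#include <string.h>

struct stmt {
    char dst;      /* single-letter result variable (assumption) */
    char src[20];  /* right-hand side, e.g. "a*b" */
};

/* Hypothetical helper: copies to out[] only the statements whose result
   is used later. live_out lists the variables needed after the block.
   Returns the number of statements kept (original order preserved). */
int eliminate_dead(const struct stmt *in, int n, const char *live_out,
                   struct stmt *out)
{
    int live[256] = {0};
    int keep[64] = {0};
    int i, k = 0;
    const char *p;

    if (n > 64)
        n = 64;                                  /* sketch handles small blocks only */
    for (p = live_out; *p; p++)
        live[(unsigned char)*p] = 1;

    /* Walk backwards: a statement is dead if its result is not live. */
    for (i = n - 1; i >= 0; i--) {
        if (!live[(unsigned char)in[i].dst])
            continue;                            /* result never read later */
        keep[i] = 1;
        live[(unsigned char)in[i].dst] = 0;      /* this definition satisfies the use */
        for (p = in[i].src; *p; p++)
            if (isalpha((unsigned char)*p))
                live[(unsigned char)*p] = 1;     /* operands become live */
    }
    for (i = 0; i < n; i++)
        if (keep[i])
            out[k++] = in[i];
    return k;
}
```

For the block c=a*b, x=a, d=c+4 with only d needed afterwards, the call keeps c=a*b and d=c+4 and drops x=a.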

Common sub expression elimination:


In compiler theory, common subexpression elimination (CSE) is a compiler optimization that searches for
instances of identical expressions (i.e., they all evaluate to the same value), and analyzes whether it is
worthwhile replacing them with a single variable holding the computed value. The possibility to perform
CSE is based on available expression analysis (a data flow analysis). An expression b*c is available at a
point p in a program if:
• every path from the initial node to p evaluates b*c before reaching p,
• and there are no assignments to b or c after the evaluation but before p.

The cost/benefit analysis performed by an optimizer calculates whether the cost of storing the shared
value in a temporary variable (tmp) is less than the cost of recomputing the multiplication; in practice
other factors, such as which values are held in which registers, are also significant.
Compiler writers distinguish two kinds of CSE:
• local common subexpression elimination works within a single basic block,
• global common subexpression elimination works on an entire procedure.

Both kinds rely on data flow analysis of which expressions are available at which points in a program.
In this technique,
• As the name suggests, the common subexpressions are eliminated.
• The redundant expressions are eliminated to avoid their re-computation.
• The already computed result is reused later in the program when required.


Example-

Code Before Optimization                 Code After Optimization

S1 = 4 x i                               S1 = 4 x i
S2 = a[S1]                               S2 = a[S1]
S3 = 4 x j                               S3 = 4 x j
S4 = 4 x i // Redundant Expression       S5 = n
S5 = n                                   S6 = b[S1] + S5
S6 = b[S4] + S5
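
The replacement of S4 by S1 in the example can be sketched on quadruples as follows. This is a minimal sketch of local CSE under a stated assumption: every name is assigned at most once inside the basic block, so an earlier result is still valid wherever the same expression reappears. The names quad and local_cse are hypothetical and are not part of the program listed below.

```c
#include <string.h>

struct quad {
    char op;        /* operator, e.g. '*' or '[' for an array access */
    char arg1[8];
    char arg2[8];
    char res[8];
    int removed;    /* set to 1 when the quadruple is found redundant */
};

/* Hypothetical helper: local CSE over one basic block.
   Returns the number of quadruples marked as removed. */
int local_cse(struct quad *q, int n)
{
    int i, j, k, count = 0;

    for (j = 1; j < n; j++) {
        for (i = 0; i < j; i++) {
            if (q[i].removed || q[i].op != q[j].op ||
                strcmp(q[i].arg1, q[j].arg1) != 0 ||
                strcmp(q[i].arg2, q[j].arg2) != 0)
                continue;
            /* q[j] recomputes q[i]: drop it and redirect later uses */
            q[j].removed = 1;
            count++;
            for (k = j + 1; k < n; k++) {
                if (strcmp(q[k].arg1, q[j].res) == 0)
                    strcpy(q[k].arg1, q[i].res);
                if (strcmp(q[k].arg2, q[j].res) == 0)
                    strcpy(q[k].arg2, q[i].res);
            }
            break;
        }
    }
    return count;
}
```

Applied to S1 = 4 x i, S2 = a[S1], S3 = 4 x j, S4 = 4 x i, S6 = b[S4], the sketch marks S4 as removed and rewrites the last quadruple to use S1.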

Code:
#include <stdio.h>
#include <string.h>

/* One statement of the form l = r */
struct op
{
    char l;
    char r[20];
} op[10], pr[10];

int main(void)
{
    int a, i, k, j, n, z = 0, m, q;
    char *p, *l;
    char temp, t;
    char *tem;

    printf("Enter no of values:");
    if (scanf("%d", &n) != 1 || n < 1 || n > 10)
        return 1;
    for (i = 0; i < n; i++)
    {
        printf("\tleft\t");
        scanf(" %c", &op[i].l); /* portable replacement for getche() */
        printf("\tright:\t");
        scanf("%19s", op[i].r);
    }
    printf("Intermediate Code\n");
    for (i = 0; i < n; i++)
    {
        printf("%c=", op[i].l);
        printf("%s\n", op[i].r);
    }
    /* Dead code elimination: keep a statement only if its left-hand
       side appears on the right-hand side of some statement. */
    for (i = 0; i < n - 1; i++)
    {
        temp = op[i].l;
        for (j = 0; j < n; j++)
        {
            p = strchr(op[j].r, temp);
            if (p)
            {
                pr[z].l = op[i].l;
                strcpy(pr[z].r, op[i].r);
                z++;
                break; /* one use is enough; avoids overflowing pr[] */
            }
        }
    }
    /* The last statement is always kept as the result. */
    pr[z].l = op[n - 1].l;
    strcpy(pr[z].r, op[n - 1].r);
    z++;
    printf("\nAfter dead code elimination\n");
    for (k = 0; k < z; k++)
    {
        printf("%c\t=", pr[k].l);
        printf("%s\n", pr[k].r);
    }
    /* Common sub expression elimination */
    for (m = 0; m < z; m++)
    {
        tem = pr[m].r;
        for (j = m + 1; j < z; j++)
        {
            p = strstr(tem, pr[j].r);
            if (p)
            {
                t = pr[j].l;
                pr[j].l = pr[m].l;
                /* Replace uses of the redundant name with the earlier one */
                for (i = 0; i < z; i++)
                {
                    l = strchr(pr[i].r, t);
                    if (l)
                    {
                        a = l - pr[i].r;
                        pr[i].r[a] = pr[m].l;
                    }
                }
            }
        }
    }
    printf("Eliminate common expression\n");
    for (i = 0; i < z; i++)
    {
        printf("%c\t=", pr[i].l);
        printf("%s\n", pr[i].r);
    }
    /* Duplicate production elimination */
    for (i = 0; i < z; i++)
    {
        for (j = i + 1; j < z; j++)
        {
            q = strcmp(pr[i].r, pr[j].r);
            if ((pr[i].l == pr[j].l) && !q)
            {
                pr[i].l = '\0';
                pr[i].r[0] = '\0'; /* strcpy(pr[i].r, '\0') passed a char, not a string */
            }
        }
    }
    printf("Optimized code\n");
    for (i = 0; i < z; i++)
    {
        if (pr[i].l != '\0')
        {
            printf("%c=", pr[i].l);
            printf("%s\n", pr[i].r);
        }
    }
    return 0;
}
Output:


Program No-7

Problem Definition:

Program to implement code generation phase of compiler.

Theory:
Code generation can be considered the final phase of compilation. Optimization can still be applied after
code generation, but that can be seen as part of the code generation phase itself. The code generated by
the compiler is object code in some lower-level programming language, for example, assembly language.
Source code written in a higher-level language is transformed into lower-level object code, which should
have the following minimum properties:
• It should carry the exact meaning of the source code.
• It should be efficient in terms of CPU usage and memory management.

We will now see how the intermediate code is transformed into target object code (assembly code, in this
case).
Example:
For input C Code :
t2=c+t1;
The Machine Code is :
MOV AX,c
ADD AX,t1
MOV t2,AX
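
The mapping from one three-address statement to the three instructions above can be sketched as a small function. This is a minimal sketch, separate from the program listed below: the names mnemonic and gen_code are hypothetical, only the operators +, - and * are assumed to be supported, and an operator of '\0' is assumed to mean a plain copy such as d=t2.

```c
#include <stdio.h>

/* Map an operator to its instruction prefix; NULL if unsupported. */
static const char *mnemonic(char op)
{
    switch (op) {
    case '+': return "ADD AX,";
    case '-': return "SUB AX,";
    case '*': return "MUL ";
    default:  return NULL;
    }
}

/* Hypothetical helper: writes the code for res = arg1 op arg2 into out.
   Returns 0 on success, -1 on an unsupported operator or a short buffer. */
int gen_code(char op, const char *arg1, const char *arg2,
             const char *res, char *out, size_t cap)
{
    int n;

    if (op == '\0') {
        /* plain copy: res = arg1 */
        n = snprintf(out, cap, "MOV AX,%s\nMOV %s,AX\n", arg1, res);
    } else {
        const char *m = mnemonic(op);
        if (m == NULL)
            return -1;
        n = snprintf(out, cap, "MOV AX,%s\n%s%s\nMOV %s,AX\n",
                     arg1, m, arg2, res);
    }
    return (n >= 0 && (size_t)n < cap) ? 0 : -1;
}
```

Calling gen_code('+', "c", "t1", "t2", buf, sizeof buf) fills buf with the three lines shown in the example.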
Code:
#include<stdio.h>
int main()
{
    char c[100];
    char oper='\0',arg1[10]="",arg2[10]="",res[10]="";
    int i=0,r=0,a1p=1,rp=1;
    printf("Enter C Code ");
    scanf("%99s",c);
    while (c[i]!='\0')
    {
        if (c[i]==';')
            break;
        if (rp==1)
        {
            /* Reading the result name, up to '=' */
            if (c[i]!='=')
            {
                res[r]=c[i];
                r++;
            }
            else
            {
                rp=0;
                res[r]='\0';
                r=0;
            }
        }
        else if (a1p==1)
        {
            /* Reading the first argument, up to the operator */
            if (c[i]!='*' && c[i]!='+' && c[i]!='-')
            {
                arg1[r]=c[i];
                r++;
            }
            else
            {
                a1p=0;
                arg1[r]='\0';
                r=0;
                oper=c[i];
            }
        }
        else
        {
            /* Reading the second argument */
            arg2[r]=c[i];
            r++;
        }
        i++;
    }
    if (a1p==1)
        arg1[r]='\0'; /* no operator found (e.g. d=t2;): only arg1 was filled */
    else
        arg2[r]='\0';
    printf("Machine Code is \n");
    printf("MOV AX,%s\n",arg1);
    if (oper=='*')
        printf("MUL %s\n",arg2);
    if (oper=='+')
        printf("ADD AX,%s\n",arg2);
    if (oper=='-')
        printf("SUB AX,%s\n",arg2);
    printf("MOV %s,AX\n",res);
    return 0;
}
Output:
[User@localhost Desktop]$ gcc codegen.c
[User@localhost Desktop]$ ./a.out
Enter C Code d=t2;
Machine Code is
MOV AX,t2
MOV d,AX
[User@localhost Desktop]$ ./a.out
Enter C Code t1=a*b;
Machine Code is
MOV AX,a
MUL b
MOV t1,AX
[User@localhost Desktop]$ ./a.out
Enter C Code t2=c+t1;
Machine Code is
MOV AX,c
ADD AX,t1
MOV t2,AX

Program No-8

Problem Definition:

Program to generate Macro Name Table.

Theory:
A macro instruction is a line of computer program coding that results in one or more lines of
program coding in the target programming language, sets variables for use by other statements, etc. In the
mid 1950s, when assembly language programming was commonly used to write programs for digital
computers, macro instructions were introduced for two main purposes: to reduce the amount of
program coding that had to be written, by generating several assembly language statements from one macro
instruction, and to enforce program writing standards, e.g. specifying input/output commands in standard
ways.

A macro instruction written in the format of the target assembly language would be processed by a
macro compiler, which was a pre-processor to the assembler, to generate one or more assembly language
instructions to be processed next by the assembler program that would translate the assembly language
instructions into machine language instructions.

MACRO NAME TABLE (MNT):


The macro name table serves a function very similar to that of an assembler's machine op-code table
(MOT) and pseudo op-code table (POT). Each MNT entry consists of a character string (the macro name)
and a pointer (index) to the entry in the MDT that corresponds to the beginning of the macro definition.
The MNT entry for the INCR macro (b denotes a blank):

index    name        MDT index
  .        .            .
  .        .            .
  3      INCRbbbb       15
  .        .            .
  .        .            .
ALGORITHM:
1. Open the source code in ‘prog.txt’ file.
2. Use the file pointer to read the file.
3. Reach the end of file and then close the file.
4. Display the input program.
5. Then scan the entire array and compare each word with the macro keyword.
6. Once macro is found the next name is the macro name.
7. Print the name as output. All the macro can be displayed using the program.
8. Hence macro name table is displayed as output.
9. Stop.
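
The scanning steps above can be sketched as a function that fills a macro name table from lines already read into memory. This is a minimal sketch under stated assumptions: the names mnt_entry and build_mnt are hypothetical, the keywords are assumed to be written as MACRO and MEND in upper case, definitions are assumed not to be nested, and the MDT index is assumed to count every definition line from the prototype through MEND.

```c
#include <stdio.h>
#include <string.h>

struct mnt_entry {
    char name[32];
    int mdt_index;   /* index of the macro's first line in the MDT */
};

/* Hypothetical helper: returns the number of MNT entries filled. */
int build_mnt(const char *lines[], int n, struct mnt_entry *mnt, int max)
{
    int in_def = 0, expect_name = 0, mdt_count = 0, count = 0, i;
    char word[32];

    for (i = 0; i < n; i++) {
        if (sscanf(lines[i], "%31s", word) != 1)
            continue;                       /* blank line */
        if (!in_def) {
            if (strcmp(word, "MACRO") == 0) {
                in_def = 1;
                expect_name = 1;            /* next line holds the macro name */
            }
            continue;
        }
        if (expect_name) {
            if (count < max) {
                strcpy(mnt[count].name, word);
                mnt[count].mdt_index = mdt_count;
                count++;
            }
            expect_name = 0;
        }
        mdt_count++;                        /* every definition line goes to the MDT */
        if (strcmp(word, "MEND") == 0)
            in_def = 0;
    }
    return count;
}
```

For the sample input of this experiment the sketch would record INC1 with MDT index 0 and ADD2 with the index of the first free MDT entry after the INC1 definition.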
CODE:

#include<iostream>
#include<string.h>
#include<stdio.h>
using namespace std;

int main(int argc, char **argv) {
    char buf[100], mnt[10][100];
    int flag = 0, flag1 = 0, i = 0;
    FILE *in, *out;

    if (argc == 1) {
        cout << "Error:Enter input source code\n";
        return 1;
    }
    in = fopen(argv[1], "r");
    out = fopen("MNT", "w");
    if (in == NULL || out == NULL) {
        cout << "Error:Unable to open file\n";
        return 1;
    }
    cout << "INPUT:\n";
    while (fgets(buf, sizeof buf, in) != NULL) {
        cout << buf;
        // The line that follows MACRO holds the macro name
        if (flag == 1 && flag1 == 0) {
            // Store the first word of that line as the macro name
            if (i < 10 && sscanf(buf, "%99s", mnt[i]) == 1)
                i++;
            flag1 = 1;
        }
        if (strstr(buf, "MACRO") != NULL)
            flag = 1;
        if (strstr(buf, "MEND") != NULL) {
            flag = 0;
            flag1 = 0;
        }
    }

    cout << "Macro Name\n";
    fputs("Macro Name\n", out);
    for (int j = 0; j < i; j++) {
        fputs(mnt[j], out);
        fputs(" \n", out);
        cout << mnt[j] << "\n";
    }
    fclose(in);
    fclose(out);
    return 0;
}

OUTPUT:
INPUT:
A 1,Data
MACRO
INC1
A 1,DATA
A 2,DATA
MEND
DATA DC F'10'
M 2,Data
MACRO
ADD2 &ARG1
A 1,&ARG1
MEND
INCR
ADD

Macro Name
INC1
ADD2


Program No-9

Problem Definition:

Program to implement two pass Macro Processor.

Theory:

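Pass-I (macro definitions) data structures

1. The input macro source program.
2. Macro Definition Table (MDT), used to store the body of each macro definition.
3. Macro Name Table (MNT), used to store the names of the defined macros.
4. MDTC, the Macro Definition Table Counter, which indicates the next available entry in the MDT.
5. MNTC, the Macro Name Table Counter, which indicates the next available entry in the MNT.
6. ALA, the Argument List Array, used to substitute index markers for dummy arguments before a
definition is stored.
7. Copy of the source program (SP) for use by Pass-II.

Pass-II (macro calls and expansions) data structures

1. The input macro source program.
2. Macro Definition Table, created by Pass-I.
3. Macro Name Table, created by Pass-I.
4. MDTC
5. ALA, used to substitute the arguments of a macro call into the stored definition.
6. Copy of the expanded source program to be used by the assembler.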
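
The role of the argument list during expansion in Pass-II can be sketched as a routine that copies one line of a macro body while replacing each formal parameter with the actual argument of the call. This is a minimal sketch, not the program listed below: the name expand_line and its parameter layout are hypothetical, and formal parameters are assumed to be written with a leading & as in the sample input.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: expands one macro body line into out.
   formals[i] (e.g. "&A") is replaced by actuals[i] (e.g. "DATA1").
   Returns 0 on success, -1 if out is too small. */
int expand_line(const char *line, const char *const formals[],
                const char *const actuals[], int n, char *out, size_t cap)
{
    size_t o = 0;

    if (cap == 0)
        return -1;
    while (*line) {
        int i, hit = -1;
        size_t flen = 0;

        if (*line == '&') {
            for (i = 0; i < n; i++) {
                size_t l = strlen(formals[i]);
                /* match the whole parameter name, not just a prefix */
                if (strncmp(line, formals[i], l) == 0 &&
                    !isalnum((unsigned char)line[l])) {
                    hit = i;
                    flen = l;
                    break;
                }
            }
        }
        if (hit >= 0) {
            size_t alen = strlen(actuals[hit]);
            if (o + alen >= cap)
                return -1;
            memcpy(out + o, actuals[hit], alen);
            o += alen;
            line += flen;
        } else {
            if (o + 1 >= cap)
                return -1;
            out[o++] = *line++;
        }
    }
    out[o] = '\0';
    return 0;
}
```

With formals {"&A"} and actuals {"DATA1"}, the body line "ST 1,&A" expands to "ST 1,DATA1".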


#include<stdio.h>
#include<conio.h>
#include<string.h>
#define max 20

char f_name[15];
FILE *in;
char buff[30];
int p_index=0,m_cntr=0,mdt_cntr=-1;

struct MDT
{
char str[10][30];
}MDT_obj[max];

struct MNT
{
char m_name[10];
int no_para;
int para_index;
int MDT_index;
}MNT_obj[max];

struct PNT
{
char para_name[10],value[10];
}PNT_obj[max];

void pass1();
void pass2();

void main()
{

int z;
//int t_flag,i=0;
clrscr();
printf("\n\nINPUT: ");
scanf("%s",f_name);
for(z=0;z<strlen(f_name);z++)
{
if(f_name[z]=='.')
{
z=z+1;
if(f_name[z]=='a'||f_name[z]=='A')
{
z=z+1;
if(f_name[z]=='s'||f_name[z]=='S')
{
z=z+1;
if(f_name[z]=='m'||f_name[z]=='M')
{
z++;
if(f_name[z]=='\0')
break;
}
}
}
}
}

if((in=fopen(f_name,"r"))==NULL)
{
printf("\nUNABLE TO OPEN FILE \n");
return;
}

/* Echo the source program; looping on fgets() avoids printing the last line twice */
while(fgets(buff,30,in)!=NULL)
{
printf("%s",buff);
}
fclose(in);
getch();
pass1();
getch();
fclose(in); /* closes the handle reopened by pass1() */
pass2();
getch();
}

void pass1()
{
char buff1[35];
int k=0,z,f1,f2,f3,x,i;
// int len=0;
int l;
in=fopen(f_name,"r");
clrscr();
printf("\t PASS 1 \n");
do
{
fscanf(in,"%s",buff);
if(strcmpi(buff,"macro")==0)
{
f2=0;
f1=0;
z=0;
MNT_obj[m_cntr+1].MDT_index=m_cntr;
m_cntr++;
k=0;
mdt_cntr++;
x=0;
fgets(buff1,30,in);
fgets(buff1,30,in);
while(buff1[k]!='\0')
{
if(f2!=0)
{
if(buff1[k]=='&')
{
MNT_obj[m_cntr].no_para++;
z=0;

if(f1==0) //FOR READING FIRST PARAMETER
{
//P_INDEX KEEPS TRACK OF TOTAL NO OF PARAMETERS IN PGM
MNT_obj[m_cntr].para_index=p_index;
f1=1;
}

while(buff1[k]!=' ' && buff1[k]!='\n') //stop at the end of the parameter name
{

PNT_obj[p_index].para_name[z]=buff1[k];

k++;
z++;

if(buff1[k]=='\0')
{
break;
}
}
p_index++;
// a:

}
}
if(f2==0) //FOR READING MACRO NAME
{
while(buff1[k]!=' ' && buff1[k]!='\n') //a macro without parameters ends at the newline
{
MNT_obj[m_cntr].m_name[k]=buff1[k];
k++;
}
f2=1;
}
k++;

} //end of macro prototype stmt


// MNT_obj[m_cntr].no_para=z;
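//copy the body into the MDT up to and including MEND; stop at end of file or when the entry is full
while(x<10 && fgets(buff,30,in)!=NULL)
{
strcpy(MDT_obj[mdt_cntr].str[x],buff);
x++;
if(strcmpi(buff,"mend\n")==0)
break;
} //end of filling mdt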

}
if(strcmpi(buff,"start:")==0)
{
int z;
int para_index;
do
{

fscanf(in,"%s",buff);
for(i=0;i<=m_cntr;i++)
{
if(strcmpi(buff,MNT_obj[i].m_name)==0)
{
para_index = MNT_obj[i].para_index;
for(z=0;z<MNT_obj[i].no_para;z++)
{
fscanf(in,"%s",buff);
strcpy(PNT_obj[para_index].value,buff);
para_index++;
}
}
}
}while(!feof(in));
}

}while(!feof(in));

getch();
clrscr();
printf("\nMacro Name Table\n");
printf("\n\n NO OF PARA PARA INDEX MACRO NAME");

for(k=1;k<=m_cntr;k++)
{
printf("\n%d \t\t %d \t %s",MNT_obj[k].no_para
,MNT_obj[k].para_index,MNT_obj[k].m_name);
}
getch();
//Display PNT
clrscr();
printf("\nParameter Name Table\n");
printf("\n\n\nPARA NAME VALUE");
for(k=0;k<p_index;k++)
{
printf("\n %s\t\t%s",PNT_obj[k].para_name,PNT_obj[k].value);
}
getch();
//DISPLAY MDT
clrscr();
printf("\n\tMacro Definition Table\n");
for(k=0;k<=mdt_cntr;k++)
{
printf("\n MDT INDEX IN MNT : %d \n\n\n CONTENTS : \n",k);
for(i=0;i<10;i++)
{
printf("\n %s",MDT_obj[k].str[i]);
}
}
getch();
}

void pass2()
{
FILE *out;
int k,i,z,x,j,f1=0;
// char buff[50];
clrscr();
getch();
printf("\n\tPASS 2\n");
if((out = fopen(f_name,"r")) == NULL)
{
printf("Error");
}
else
{
do
{ fscanf(out,"%s",buff);
printf("\n%s",buff);
if(strcmpi(buff,"start:")==0)
{
do
{
for(i=1;i<=m_cntr;i++)
{
if(strcmpi(buff,MNT_obj[i].m_name)==0)
{

for(k=MNT_obj[i].para_index;k<(MNT_obj[i].para_index+MNT_obj[i].no_para);k++)
{
fscanf(out,"%s",buff);
strcpy(PNT_obj[k].value,buff);
}
}
}
fscanf(out,"%s",buff);
}while(!feof(out));
}
}while(!feof(out));
fclose(out);
for(i=0;i<=mdt_cntr;i++)
{ //i
for(k=0;k<10;k++)
{ //k
for(z=0;z<strlen(MDT_obj[i].str[k]);z++)
{ //z

if(MDT_obj[i].str[k][z]=='&')
{ //**
z=z+2;
for(x=0;x<=p_index;x++)
{ //*
for(j=0;j<strlen(PNT_obj[x].para_name);j++)
{
if(MDT_obj[i].str[k][z]==PNT_obj[x].para_name[j])
{
z=z-2;

MDT_obj[i].str[k][z]=' ';
z++;
MDT_obj[i].str[k][z]=' ';
z++;
MDT_obj[i].str[k][z]=PNT_obj[x].value[0];

}
}

} //*

} //**
} //z

} //k

} //i

if((out = fopen(f_name,"r")) == NULL)


{
printf("Error");
}
else
{
do
{ fscanf(out,"%s",buff);
//printf("\n%s",buff);
if(strcmpi(buff,"start:")==0)
{
printf("\n%s ",buff);
do
{
f1=0;
fscanf(out,"%s",buff);
for(i=0;i<=m_cntr;i++)
{
// printf("%d %s",i,MNT_obj[i].m_name);
if(strcmpi(buff,MNT_obj[i].m_name)==0)

{
f1=1;
k=0;
while((strcmpi(MDT_obj[MNT_obj[i].MDT_index].str[k],"mend\n"))!=0)
{
printf("\n + %s",MDT_obj[MNT_obj[i].MDT_index].str[k]);
k++;
}

fgets(buff,30,out);
}
}
if(f1==0)
{
printf("\n%s",buff);
if (strcmpi(buff,"end")==0)
{ break;
}
fgets(buff,30,out);
printf("%s",buff);
f1=1;
}
}while(!feof(out));
}
}while(!feof(out));
}
} //else
}

OUTPUT:

INPUT FILE NAME: arp.asm


MACRO
XYZ &A
ST 1,&A
MEND
MACRO
MIT &Z
MACRO
&Z &W
AR 4,&W
XYZ ALL
MEND
arp START
USING *,15
MIT HELLO
ST 2,3
HELLO YALE

YALE EQU 5

::PASS1::

::MNT::
NO OF PARA PARA INDEX MACRO NAME
1 0 XYZ
1 1 MIT

::PNT::
PARA NAME VALUE
&A
&Z
::MDT::
MDT INDEX IN MNT : 0
CONTENTS :
ST 1,&A
MEND

MDT INDEX IN MNT : 1


CONTENTS :
MACRO
&Z &W
AR 4,&W
XYZ ALL
MEND
::PASS 2::

MACRO
XYZ
&A
ST
1,&A
MEND
MACRO
MIT
&Z
MACRO
&Z
&W
AR
4,&W
XYZ
ALL

MEND
arp
START
USING
*,15
MIT
HELLO
ST
2,3
HELLO
YALE
YALE
EQU
5


Program No-10

Problem Definition:

A case study on Lex and YACC.


Theory:
Lex: Lex is a lexical analyzer generator developed by M. E. Lesk and E. Schmidt. It helps write programs
whose control flow is directed by instances of regular expressions in the input stream. It is well
suited for editor-script type transformations and for segmenting input in preparation for a parsing
routine.

Lex source is a table of regular expressions and corresponding program fragments. The table is translated
to a program which reads an input stream, copying it to an output stream and partitioning the input into
strings which match the given expressions. As each such string is recognized, the corresponding program
fragment is executed. The recognition of the expressions is performed by a deterministic finite automaton
generated by Lex. The program fragments written by the user are executed in the order in which the
corresponding regular expressions occur in the input stream.

Lex is a program generator designed for lexical processing of character input streams. It accepts a
high-level, problem-oriented specification for character string matching, and produces a program in a
general-purpose language which recognizes regular expressions. The regular expressions are specified by the
user in the source specifications given to Lex. The code written by Lex recognizes these expressions in an
input stream and partitions the input stream into strings matching the expressions. At the boundaries
between strings, program sections provided by the user are executed. The Lex source file associates the
regular expressions and the program fragments.


The token names are the input symbols that the parser processes. For instance, integer, Boolean, begin,
end, if, while, etc. are tokens in ExpL.
"integer" {return ID_TYPE_INTEGER;}

This example demonstrates the specification of a rule in LEX. The rule in this example specifies
that the lexical analyzer must return the token named ID_TYPE_INTEGER when the pattern
"integer" is found in the input file.

As each expression appears in the input to the program written by Lex, the corresponding fragment is
executed. The user supplies the additional code beyond expression matching needed to complete his tasks,
possibly including code written by other generators. Thus, a high-level expression language is provided to
write the string expressions to be matched while the user's freedom is unimpaired.


Structure of a Lex program: A Lex program has the following form-

declarations
%%
translation rules
%%
auxiliary functions

Declarations: This section includes declarations of variables, constants, and regular definitions.

Translation rules: This section contains regular expressions and code segments.

1. Form: Pattern {Action}
2. The pattern is a regular expression or regular definition.
3. Action refers to segments of code.

Auxiliary functions: This section holds additional functions which are used in actions. These
functions are compiled separately and loaded with the lexical analyzer.

yy Variables: The following variables are offered by LEX to aid the programmer in designing
sophisticated lexical analyzers. These variables are accessible in the LEX program and are automatically
declared by LEX in lex.yy.c.

1. yyin is a variable of the type FILE* and points to the input file. yyin is defined by LEX
automatically. If the programmer assigns an input file to yyin in the auxiliary functions
section, then yyin is set to point to that file. Otherwise, LEX assigns yyin to stdin(console
input).


2. yytext is of type char* and it contains the lexeme currently found. A lexeme is a
sequence of characters in the input stream that matches some pattern in the Rules section.
(In fact, it is the first matching sequence in the input from the position pointed to by yyin.)
Each invocation of the function yylex() results in yytext carrying a pointer to the lexeme found
in the input stream by yylex(). The value of yytext will be overwritten after the next yylex()
invocation.

3. yyleng is a variable of the type int and it stores the length of the lexeme pointed to by
yytext.

yy functions: The following functions are offered by LEX:

1. yylex() is a function of return type int. LEX automatically defines yylex() in lex.yy.c but does
not call it. The programmer must call yylex() in the Auxiliary functions section of the LEX
program. LEX generates code for the definition of yylex() according to the rules specified in the
Rules section.

2. yywrap(): LEX declares the function yywrap() of return type int in the file lex.yy.c. LEX
does not provide any definition for yywrap(). yylex() makes a call to yywrap() when it
encounters the end of input. If yywrap() returns zero (indicating false), yylex() assumes there
is more input and continues scanning from the location pointed to by yyin. If yywrap()
returns a non-zero value (indicating true), yylex() terminates the scanning process and returns 0
(i.e. "wraps up"). If the programmer wishes to scan more than one input file using the
generated lexical analyzer, this can be done simply by setting yyin to a new input file in
yywrap() and returning 0. As LEX does not define yywrap() in the lex.yy.c file but makes a call to it
in yylex(), the programmer must define it in the Auxiliary functions section or provide
%option noyywrap in the declarations section. This option removes the call to yywrap() in
the lex.yy.c file. Note that it is mandatory to either define yywrap() or indicate its
absence using the %option feature. If neither is done, building the generated lex.yy.c fails with an
undefined reference to yywrap(), unless the Lex library (-ll or -lfl) is linked in.

Disambiguation rules: Lex follows two rules to resolve ambiguities that may arise from the lex
specification. These rules are:


1. When a string of input text can match two or more rules in a specification, the first rule
in the specification is the one that is matched and the one whose action is executed.
2. If a string in the input text can match a rule in the specification, but a longer string that has the
first string as its prefix will also match a rule, then the longer string is matched.
A situation of the type addressed by the first rule could arise if a specification had patterns to match
keywords and identifiers:

%{
#include "y.tab.h"
%}
%%
START { return(STARTTOK); }
BREAK {return(BREAKTOK); }
END { return(ENDTOK); }
[a-zA-Z][a-zA-Z0-9]* { /* IDENTIFIER is an assumed token name from y.tab.h; yylex() returns an int, not yytext */ return(IDENTIFIER); }

The string "START" could be matched by both the first and the fourth rule. Because START is a reserved
word, you want only the action associated with the first rule to be executed. By placing the rule for START
and the other reserved words before the rule for identifiers, you ensure that reserved words are recognized
as such.
The second kind of ambiguity could arise, for example, if the input text was coded in a programming
language that had similar operators. Part of a lex specification for the C language might look like
this:

"+" {return(PLUS); }
"++" {return(INC); }

The lexical analyzer should recognize the increment operator "++", not the addition operator "+", when it
reads the following statement:

i++;
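
The two disambiguation rules can be imitated in a few lines of C. This is only an illustrative sketch, not how Lex is implemented (Lex uses a generated finite automaton): the name pick_rule is hypothetical and the rules are assumed to be plain literal strings rather than regular expressions.

```c
#include <string.h>

/* Hypothetical helper: returns the index of the rule that wins at the
   start of input, or -1 if none matches. The longest match wins; when
   two rules match the same length, the one listed first wins. */
int pick_rule(const char *input, const char *const rules[], int n, size_t *len)
{
    int best = -1, i;
    size_t best_len = 0;

    for (i = 0; i < n; i++) {
        size_t l = strlen(rules[i]);
        /* strictly longer only, so an earlier rule keeps a tie */
        if (l > best_len && strncmp(input, rules[i], l) == 0) {
            best = i;
            best_len = l;
        }
    }
    *len = best_len;
    return best;
}
```

With the rules "+" and "++" in that order, the input "++;" selects the second rule with a match length of 2, while the input "+j" selects the first rule.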


YACC:

YACC stands for Yet Another Compiler-Compiler and was made by Stephen C. Johnson. Computer
program input generally has some structure; in fact, every computer program that does input can be
thought of as defining an "input language" which it accepts. An input language may be as complex
as a programming language, or as simple as a sequence of numbers. Unfortunately, usual input
facilities are limited, difficult to use, and often are lax about checking their inputs for validity.

Yacc provides a general tool for describing the input to a computer program. The Yacc user specifies the
structures of his input, together with code to be invoked as each such structure is recognized. Yacc turns
such a specification into a subroutine that handles the input process; frequently, it is convenient and
appropriate to have most of the flow of control in the user's application handled by this subroutine.

Yacc provides a general tool for imposing structure on the input to a computer program. The Yacc user
prepares a specification of the input process; this includes rules describing the input structure, code to be
invoked when these rules are recognized, and a low-level routine to do the basic input. Yacc then
generates a function to control the input process. This function, called a parser, calls the user-supplied
low-level input routine (the lexical analyzer) to pick up the basic items (called tokens) from the input
stream. These tokens are organized according to the input structure rules, called grammar rules; when
one of these rules has been recognized, the user code supplied for this rule, an action, is invoked;
actions have the ability to return values and make use of the values of other actions.


Rules: The rules section of the grammar file contains one or more grammar rules. Each rule
describes a structure and gives it a name. A rule in a YACC program consists of two parts-
1. The production part.
2. The action part.
A rule in YACC is of the form:
production_head : production_body {action in C } ;

Actions: With each grammar rule, you can specify actions to be performed each time the parser
recognizes the rule in the input stream. An action is a C language statement that can do input and
output, call subprograms, and alter external vectors and variables. Actions return values and
obtain the values returned by previous actions. The lexical analyzer can also return values for tokens.
Specify an action in the grammar file with one or more statements enclosed in {} (braces). The
following example is a grammar rule with an action:
A : '('B')'
{
hello(1, "abc" );
}

Declarations: The declarations section of the yacc grammar file contains the following:
1. Declarations for any variables or constants used in other parts of the grammar file
2. #include statements to use other files as part of this file (used for library header files)
3. Statements that define processing conditions for the generated parser
4. You can keep semantic information associated with the tokens that are currently on the
parse stack in a user-defined C language union if the members of the union are associated
with the various names in the grammar file.

A declaration for a variable or constant uses the following syntax of the C programming language:
TypeSpecifier Declarator ;

Sample program-

LEX

%{
#include <stdio.h>
#define NUMBER 1
#define IDENTIFIER 2
#define HEADER 3
#define SYMBOL 4
#define OPERATOR 5
#define KEYWORD 6
%}
%option noyywrap
%%
[0-9]+ {return NUMBER;}
[a-zA-Z]+\.h {return HEADER;}
if|else|then|int|while|switch|do {return KEYWORD;}
[@$,&%] {return SYMBOL;}
[-+*/] {return OPERATOR;}
[a-zA-Z][a-zA-Z0-9]* {return IDENTIFIER;}
[ \t\n]+ { /* skip white space */ }
%%

int main(int argc,char *argv[])
{
int ch;

while((ch=yylex())!=0)
{
switch(ch)
{
case NUMBER:
printf("number: %s\n",yytext);
break;

case IDENTIFIER:
printf("identifier: %s\n",yytext);
break;

case HEADER:
printf("header: %s\n",yytext);
break;

case SYMBOL:
printf("symbol: %s\n",yytext);
break;

/* cases 5 and 6 follow the same pattern as cases 1 to 4 */
case OPERATOR:
printf("operator: %s\n",yytext);
break;

case KEYWORD:
printf("keyword: %s\n",yytext);
break;
}
}
return 0;
}