Unit I QB
compiler design question bank

Unit- I

Part- A
1. What is the usage of sentinel in lexical analyzer? List its advantages. (Nov/Dec 2020)
A sentinel reduces the two tests normally needed each time the forward pointer is
advanced (end of the first half? end of the second half?) to a single test, by
extending each buffer half to hold a sentinel character at its end. The sentinel is a
special character that cannot be part of the source program; EOF is the natural choice.
Each time the forward pointer is advanced, it is compared with the sentinel. If the
sentinel at the end of one buffer half is reached, the other half is reloaded; an EOF
anywhere other than at the end of a half marks the end of the input.

2. Construct regular expression for the binary string that starts with 0 and has odd length
or that starts with 1 and has even length. (Nov/Dec 2020)
0((0+1)(0+1))* + 1(0+1)((0+1)(0+1))*
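A quick sanity check of this expression (an illustrative sketch using Python's re module; Python writes union as '|' rather than '+'):

import re

# 0((0+1)(0+1))* + 1(0+1)((0+1)(0+1))*  written in Python's regex syntax
pattern = re.compile(r"0((0|1)(0|1))*|1(0|1)((0|1)(0|1))*")

def in_language(s):
    # starts with 0 and has odd length, or starts with 1 and has even length
    return pattern.fullmatch(s) is not None

assert in_language("0") and in_language("001") and in_language("10") and in_language("1011")
assert not in_language("1") and not in_language("01") and not in_language("101")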

3. Define tokens, patterns and lexemes. (Nov/Dec 2020) (Nov/Dec 2018) (April/May
2019) (Nov/Dec 2021)
Token: A token is a sequence of characters treated as a single unit, since it cannot be
broken down further.
Pattern: A pattern is a rule describing how the lexemes of a token are formed from
input characters.
Lexeme: A lexeme is a sequence of characters in the source program that is matched by
the pattern for a token.

4. Mention the issues in a lexical analyzer. (Nov/Dec 2020)


 Simpler design is the most important consideration.
 The separation of lexical analysis from syntax analysis often allows us to
simplify one or the other of these phases.
 Compiler efficiency is improved.
 Compiler portability is enhanced.

5. Recall the basic two parts of compilation process. (Nov/Dec 2018)


Analysis and Synthesis are the two parts of compilation. The analysis part
breaks up the source program into constituent pieces and creates an intermediate
representation of the source program. The synthesis part constructs the desired target
program from the intermediate representation.

6. How a source code is translated into machine code? (Nov/Dec 2018)


A compiler takes the source code as a whole and translates it into machine code
all in one go. Once converted, the machine code can be run at any time. This process is
called compilation.
 Compiling the program (Compiler)
 Linking the program (Linker)
 Executing the program (Loader)

7. State the rules to define regular expression. (Nov/Dec 2018)


ε (the empty string) is a regular expression, and every symbol of Σ is a regular
expression denoting itself.
If r1 and r2 are regular expressions, then (r1), r1.r2 (concatenation), r1+r2 (union),
r1* (Kleene closure) and r1+ (positive closure) are also regular expressions.

8. Construct RE for the language L = {w ∈ {a,b}* | w ends in abb}. (Nov/Dec 2018)


(a+b)*abb

9. What advantages are there to a language-processing system in which the compiler
produces assembly language rather than machine language? (Nov/Dec 2020)
The compiler may produce an assembly-language program as its output,
because assembly language is easier to produce as output and is easier to debug. The
assembly language is then processed by a program called an assembler that produces
relocatable machine code as its output.

10. With a neat block diagram specify the interactions between the lexical analyzer and the
parser. (Nov/Dec 2020)
The parser repeatedly calls getNextToken; on each call the lexical analyzer reads
characters from the source program until it recognizes the next lexeme and returns the
corresponding token to the parser. Both phases also interact with the symbol table
while doing so.

11. State the various error recovery strategies used in a parser to correct the errors.
(Nov/Dec 2020) (Nov/Dec 2021)
 Panic mode.
 Statement mode.
 Error productions.
 Global correction.
 Abstract Syntax Trees.

12. List out the phases included in analysis phases of the compiler. (April/May 2022)
 Lexical analysis
 Syntax analysis
 Semantic analysis

13. Write down the CFG for the set of odd length strings in {a,b}* whose first, middle and
last symbols are the same. (April/May 2022)
S -> a | b | aAa | bBb
A -> aAa | aAb | bAa | bAb | a
B -> aBa | aBb | bBa | bBb | b
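As an illustrative sketch, the following Python membership test mirrors the grammar's recursion: derives_mid plays the role of A (middle symbol a) or B (middle symbol b) by peeling one symbol from each end, and derives_S additionally requires the first and last symbols to equal the middle one.

def derives_mid(s, mid):
    # corresponds to A (mid='a') or B (mid='b'): strip one symbol from each end
    if len(s) == 1:
        return s == mid
    return len(s) % 2 == 1 and derives_mid(s[1:-1], mid)

def derives_S(s):
    if len(s) == 1:
        return s in ("a", "b")
    return len(s) % 2 == 1 and s[0] == s[-1] and derives_mid(s[1:-1], s[0])

assert derives_S("a") and derives_S("aaa") and derives_S("ababa") and derives_S("babab")
assert not derives_S("ab") and not derives_S("aba") and not derives_S("abb")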

14. What are the compiler construction tools? (Nov/Dec 2020)


 Parser Generator
 Scanner Generator
 Syntax Directed Translation Engines
 Automatic Code Generators
 Data-Flow Analysis Engines
 Compiler Construction Toolkits

15. Define sentinels. (Nov/Dec 2020)


The sentinel is a special character that cannot be part of the source program, and
a natural choice is the character EOF. Note that EOF retains its use as a marker for the
end of the entire input. Any EOF that appears other than at the end of a buffer means
that the input is at an end.

16. Name a compiler construction tool used to design lexical analyser and parser. (Nov/Dec
2019)
 Parser Generator
This tool produces a syntax analyzer (parser) from an input that describes the
syntax of a programming language, usually as a context-free grammar.
 Scanner Generator
This tool generates a lexical analyzer from regular-expression descriptions of the
tokens of a language; the generated lexical analyzer is essentially a finite
automaton.

17. Which phases of the compiler (Nov/Dec 2019)


(i) is/are considered as backend?
Synthesis phase
(ii) access the symbol table?
Both the analysis and synthesis phases
(iii) Check for type mismatches?
Semantic analysis
(iv) is independent of underlying machine?
Analysis phase

18. Consider the language of all strings from the alphabet{a,b,c} containing the substring
“abcabb”. Write a regular expression that describes this language. (Nov/Dec 2019)
(a+b+c)*abcabb(a+b+c)*

19. Give any two reasons for keeping lexical analyser a separate phase instead of making
it an integral part of syntax analysis. (Nov/Dec 2019)
1) Simpler design. Separating the two phases allows each of them to be simplified.
2) Compiler efficiency is improved. The lexical analyzer can be optimized on its own,
which matters because a large amount of compilation time is spent reading the source
program and partitioning it into tokens.

20. List the attributes stored in symbol table (April/May 2019)


 Variable names and constants.
 Procedure and function names.
 Literal constants and strings.
 Compiler generated temporaries.
 Labels in source languages.

21. Why is compiler essential? (April/May 2019)


Compilers are an essential part of software development. They allow developers to
write programs in high-level languages that humans can understand, and then convert
those programs into a form that the machine can execute.

22. List the cousins of the compiler. (Nov/Dec 2021)


These essential tasks are performed by the preprocessor, assembler, linker,
and loader. They are known as the cousins of the compiler.

23. What is the difference between compiler and interpreter? (April/May 2023)
A compiler takes a program as a whole, while an interpreter processes it one line at a
time. A compiler generates intermediate machine code (object code); an interpreter does
not generate any intermediate machine code.

24. Define Compiler. (Nov/Dec 2023)


A compiler is a software program that translates source code written in a high-level
programming language into machine code or an intermediate form that a computer's
processor can execute. The process typically involves several stages, including lexical
analysis, syntax analysis, semantic analysis, optimization, and code generation.

25. What is automata? (Nov/Dec 2023)


Automata refers to a mathematical model of computation that describes an
abstract machine or system capable of performing tasks automatically through a
sequence of states.
It is used to study the behavior of systems in computer science and formal
language theory. Automata theory involves various types of automata, such as finite
automata, pushdown automata, and Turing machines, each with different capabilities
and applications in recognizing patterns, processing languages, and solving
computational problems.
Part- B
1. What are the different error recovery strategies in phases of a compiler? (Nov/Dec
2020) (Nov/Dec 2018) (Nov/Dec 2021)
1. Panic mode
When the parser encounters an error anywhere in a statement, it ignores the rest of
the statement, discarding input from the point of error up to a delimiter such as a
semicolon. This is the easiest form of error recovery, and it also prevents the parser
from entering an infinite loop. (A small sketch of this strategy is given after the
list of strategies below.)

2. Statement mode
When the parser encounters an error, it tries to take corrective measures so that
the rest of the statement can be parsed, for example inserting a missing semicolon or
replacing a comma with a semicolon. Parser designers have to be careful here, because
one wrong correction may lead to an infinite loop.

3. Error productions
Some common errors that may occur in the code are known to the compiler designers in
advance. The designers can augment the grammar with error productions that generate
these erroneous constructs, so that the errors are recognized when those productions
are used.

4. Global correction
The parser considers the program in hand as a whole, tries to figure out what the
program is intended to do, and finds the closest error-free match for it. When an
erroneous input (statement) X is fed, it creates a parse tree for some closest
error-free statement Y. This may allow the parser to make minimal changes to the
source code, but due to the time and space complexity of this strategy, it has not
been implemented in practice.

5. Abstract Syntax Trees


Parse trees are not easy for later compiler phases to work with, as they contain more
detail than is actually needed. In a parse tree for a typical expression, most of the
leaf nodes are the only child of their parent node; this information can be eliminated
before the tree is fed to the next phase. By hiding such extra information we obtain an
abstract syntax tree (AST).
ASTs are important data structures in a compiler and carry the least unnecessary
information. They are more compact than a parse tree and can be used easily by the
compiler.
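As an illustration of strategy 1 (panic mode), here is a minimal self-contained Python sketch for a toy grammar of statements of the form id = id ; (every name, token format and message in it is hypothetical, chosen only to show the skip-to-delimiter idea):

def is_id(tok):
    # anything that is not an operator or the delimiter counts as an identifier here
    return tok not in ("=", ";")

def parse_program(tokens):
    errors, i = [], 0
    while i < len(tokens):
        stmt = tokens[i:i + 4]
        if len(stmt) == 4 and is_id(stmt[0]) and stmt[1] == "=" and is_id(stmt[2]) and stmt[3] == ";":
            i += 4                                   # statement parsed successfully
        else:
            errors.append("syntax error near token %d" % i)
            while i < len(tokens) and tokens[i] != ";":
                i += 1                               # panic mode: discard up to the delimiter
            i += 1                                   # consume the ';' itself and resume
    return errors

# the malformed second statement is skipped; parsing resumes at the third one
print(parse_program(["a", "=", "b", ";", "c", "=", ";", "d", "=", "e", ";"]))
# ['syntax error near token 4']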

2. What are the tools used for constructing a compiler? (Nov/Dec 2020) (Nov/Dec 2018)
(April/May 2019) (Nov/Dec 2021)
Compiler Construction Tools are specialized tools that help in the
implementation of various phases of a compiler. These tools help in the creation of an
entire compiler or its parts.
The compiler construction tools are:
 Parser Generator
 Scanner Generator
 Syntax Directed Translation Engines
 Automatic Code Generators
 Data-Flow Analysis Engines
 Compiler Construction Toolkits
Parser Generator
A parser generator produces a syntax analyzer (parser) from an input that describes the
syntax of a programming language as a context-free grammar. It is helpful because the
syntax analysis phase is quite complex and would otherwise take considerable manual
effort and compilation time.
Scanner Generator
A scanner generator produces a lexical analyzer from an input consisting of
regular-expression descriptions of the tokens of a language. It generates a finite
automaton that recognizes those regular expressions. Example: LEX is a scanner
generator provided with UNIX systems.

Syntax Directed Translation Engines


Syntax-directed translation engines take a parse tree as input and generate
intermediate code in three-address format. These engines contain routines to traverse
the parse tree and generate intermediate code; each parse-tree node has one or more
translations associated with it.

Automatic Code Generators


Automatic Code Generators take intermediate code as input and convert it into
machine language. Each intermediate language operation is translated using a set of
rules and then sent into the code generator as an input. A template matching process is
used, and by using the templates, an intermediate language statement is replaced by its
machine language equivalent.

Data-Flow Analysis Engines


Data-flow analysis engines are used for code optimization and can generate optimized
code. Data-flow analysis, an essential part of code optimization, gathers information
about how values flow from one part of a program to another.

Compiler Construction Toolkits


Compiler Construction Toolkits provide an integrated set of routines that helps
in creating compiler components or in the construction of various phases of a compiler.

3. What do you mean by passes and phases of a compiler? Explain the various phases of
a compilation with neat diagram. (Nov/Dec 2020) (Nov/Dec 2021) (Nov/Dec 2023)
A phase is a logically cohesive operation that takes one representation of the source
program and produces another; a pass is one complete scan over the source program (or
its current representation), and several phases may be grouped into a single pass. The
phases of compilation are:
1. Lexical Analysis:
Lexical analyzer phase is the first phase of compilation process. It takes
source code as input. It reads the source program one character at a time and
converts it into meaningful lexemes. Lexical analyzer represents these lexemes
in the form of tokens.
2. Syntax Analysis
Syntax analysis is the second phase of compilation process. It takes tokens as
input and generates a parse tree as output. In syntax analysis phase, the parser
checks whether the expression made by the tokens is syntactically correct.
3. Semantic Analysis
Semantic analysis is the third phase of compilation process. It checks whether the
parse tree follows the rules of language. Semantic analyzer keeps track of identifiers,
their types and expressions. The output of the semantic analysis phase is the annotated
syntax tree.
4. Intermediate Code Generation
In intermediate code generation, the compiler translates the source program into an
intermediate code, which lies between the high-level language and the machine language.
The intermediate code should be generated in such a way that it can easily be translated
into the target machine code.
5. Code Optimization
Code optimization is an optional phase. It is used to improve the intermediate
code so that the target program runs faster and takes less space. It removes
unnecessary lines of code and rearranges the sequence of statements to speed up
program execution.

6. Code Generation
Code generation is the final stage of the compilation process. It takes the
optimized intermediate code as input and maps it to the target machine language.
Code generator translates the intermediate code into the machine code of the
specified computer.
i. Translate the statement pos := init + rate * 60. (Nov/Dec 2020) (Nov/Dec 2019) (A worked sketch for this statement is given after item v below.)

ii. Translate the statement a=(b+c)*(b+c)*2. (Nov/Dec 2018)

iii. Translate the statement SI=(p*n*r)/100. (April/May 2022)


iv. Translate the statement c=a+b*12. (April/May 2019)

v. Translate the statement amount=principle+rate*36.0 (April/May 2023)
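For item (i), assuming pos, init and rate are declared as floating-point identifiers (as in the standard textbook treatment), the front end would typically produce three-address intermediate code along these lines (the temporary names t1-t3 are illustrative):

t1 = inttofloat(60)
t2 = rate * t1
t3 = init + t2
pos = t3

The lexical analyzer first turns the statement into the token stream <id,1> <:=> <id,2> <+> <id,3> <*> <60>, syntax and semantic analysis build the annotated syntax tree and insert the int-to-float conversion, and code optimization may simplify the code to t1 = rate * 60.0 followed by pos = init + t1. Items (ii)-(v) can be worked out in the same way.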


3. Describe in detail about input buffering. (Nov/Dec 2020)
Without buffering, the lexical analyzer would have to access secondary memory each time
it identifies a token, which is time-consuming and costly. So the input characters are
read into a buffer and then scanned by the lexical analyzer.
The lexical analyzer scans the input string from left to right, one character at a time,
to identify tokens. It uses two pointers to scan tokens −
 Begin Pointer (bptr) − It points to the beginning of the string to be read.
 Look Ahead Pointer (lptr) − It moves ahead to search for the end of the token.
Example − For statement int a, b;
 Both pointers start at the beginning of the string, which is stored in the buffer.

 Look Ahead Pointer scans buffer until the token is found.

 The character ("blank space") beyond the token ("int") have to be examined before the
token ("int") will be determined.
 After processing token ("int") both pointers will set to the next token ('a'), & this process
will be repeated for the whole program.

A buffer is divided into two halves. When the look-ahead pointer reaches the end of
the first half, the second half is filled with new characters to be read. When the
look-ahead pointer reaches the right end of the second half, the first half is refilled
with new characters, and so on.

Sentinels − A sentinel is a special character (typically EOF) placed at the end of each
buffer half. Each time the forward pointer is advanced, a single comparison with the
sentinel tells whether the end of a buffer half has been reached; if it has, the other
half is reloaded.
Buffer Pairs − A specialized buffering technique can decrease the amount of overhead,
which is needed to process an input character in transferring characters. It includes two
buffers, each includes N-character size which is reloaded alternatively.
Two pointers, lexeme begin and forward, are maintained. Lexeme begin points to the start
of the current lexeme being discovered, while forward scans ahead until a match for a
pattern is found. Once a lexeme is found, forward is set to the character at its right
end, and after the lexeme is processed, lexeme begin is set to the character immediately
following the lexeme just found.
Preliminary Scanning − Certain processes are best performed as characters are moved
from the source file to the buffer. For example, comments can be deleted, and languages
like FORTRAN, which ignore blanks, can have them removed from the character stream;
strings of several blanks can also be collapsed into one. Pre-processing the character
stream before lexical analysis saves the trouble of moving the look-ahead pointer back
and forth over a string of blanks.
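The following is a small Python simulation of the buffer-pair scheme with sentinels (purely illustrative; a production lexer would normally manage the buffers in place in C, and the half-size N here is deliberately tiny):

import io

EOF = "\0"   # sentinel character, assumed never to occur in the source program
N = 8        # size of each buffer half (tiny, for demonstration only)

class InputBuffer:
    def __init__(self, stream):
        self.stream = stream
        self.buf = [EOF] * (2 * (N + 1))   # two halves, each followed by a sentinel slot
        self.forward = 2 * N               # first advance hits the half-1 sentinel and reloads half 0

    def _reload(self, half):
        data = self.stream.read(N)
        start = half * (N + 1)
        for i, ch in enumerate(data):
            self.buf[start + i] = ch
        self.buf[start + len(data)] = EOF  # sentinel marks end of this half (or of the input)
        return len(data)

    def next_char(self):
        self.forward += 1
        c = self.buf[self.forward]
        if c != EOF:
            return c                       # common case: a single comparison
        if self.forward == N:              # sentinel ending half 0: reload half 1
            if self._reload(1) == 0:
                return None
            self.forward = N + 1
            return self.buf[self.forward]
        if self.forward == 2 * N + 1:      # sentinel ending half 1: reload half 0
            if self._reload(0) == 0:
                return None
            self.forward = 0
            return self.buf[self.forward]
        return None                        # EOF inside a half: real end of input

ib = InputBuffer(io.StringIO("int a, b;"))
chars = []
while (c := ib.next_char()) is not None:
    chars.append(c)
print("".join(chars))                      # prints: int a, b;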
4. Convert NFA to DFA (April/May 2022)

NFA Transition Table

State     0        1
q0        q3       {q1,q2}
q1        qf       {}
q2        {}       q3
q3        q3       qf
qf        {}       {}

DFA Transition Table

State     0        1
q0        q3       q1q2
q3        q3       qf
q1q2      qf       q3
qf        {}       {}
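A sketch of the subset construction in Python, driven by the NFA transition table above (illustrative only; sets of NFA states such as {q1, q2} become single DFA states whose labels are formed by concatenation):

from itertools import chain

nfa = {
    ("q0", "0"): {"q3"},   ("q0", "1"): {"q1", "q2"},
    ("q1", "0"): {"qf"},   ("q1", "1"): set(),
    ("q2", "0"): set(),    ("q2", "1"): {"q3"},
    ("q3", "0"): {"q3"},   ("q3", "1"): {"qf"},
    ("qf", "0"): set(),    ("qf", "1"): set(),
}
alphabet = ("0", "1")

def subset_construction(nfa, alphabet, start_state):
    start = frozenset({start_state})
    dfa, worklist = {}, [start]
    while worklist:
        S = worklist.pop()
        if S in dfa:
            continue
        dfa[S] = {}
        for a in alphabet:
            # the DFA state reached from S on input a is the union of the NFA moves
            T = frozenset(chain.from_iterable(nfa[(q, a)] for q in S))
            dfa[S][a] = T
            if T and T not in dfa:
                worklist.append(T)
    return dfa

for S, moves in subset_construction(nfa, alphabet, "q0").items():
    row = {a: "".join(sorted(T)) or "{}" for a, T in moves.items()}
    print("".join(sorted(S)), row)
# prints the rows of the DFA table above: q0, q1q2, q3 and qf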

6. Briefly discuss about the role of lexical analyzer. (Nov/Dec 2023)


 It produces stream of tokens.
 It eliminates comments and whitespace.
 It keeps track of line numbers.
 It reports the error encountered while generating tokens.
 It stores information about identifiers, keywords, constants and so on into symbol
table.
Lexical analyzers are divided into two processes:
a) Scanning consists of the simple processes that do not require tokenization of the
input, such as deletion of comments and compaction of consecutive whitespace
characters into one.
b) Lexical analysis is the more complex portion, where the scanner produces the
sequence of tokens as output.
Lexical Analysis versus Parsing / Issues in Lexical analysis
1. Simplicity of design: This is the most important consideration. The separation of
lexical and syntactic analysis often allows us to simplify one or both tasks; for
example, whitespace and comments are removed by the lexical analyzer, so the parser
never has to deal with them.
2. Compiler efficiency is improved. A separate lexical analyzer allows us to apply
specialized techniques that serve only the lexical task, not the job of parsing. In
addition, specialized buffering techniques for reading input characters can speed
up the compiler significantly.
3. Compiler portability is enhanced. Input-device-specific peculiarities can be
restricted to the lexical analyzer.
Tokens, Patterns, and Lexemes
A token is a pair consisting of a token name and an optional attribute value. The
token name is an abstract symbol representing a kind of single lexical unit, e.g., a
particular keyword, or a sequence of input characters denoting an identifier. Operators,
special symbols and constants are also typical tokens.
A pattern is a description of the form that the lexemes of a token may take; in other
words, it is the set of rules that describe the token. A lexeme is a sequence of
characters in the source program that matches the pattern for a token.
Table 1: Tokens and Lexemes

TOKEN        INFORMAL DESCRIPTION (PATTERN)              SAMPLE LEXEMES
if           characters i, f                             if
else         characters e, l, s, e                       else
comparison   < or > or <= or >= or == or !=              <=, !=
id           letter, followed by letters and digits      pi, score, D2, sum, id_1, AVG
number       any numeric constant                        35, 14159, 0, 6.02e23
literal      anything surrounded by “ ”                  “Core”, “Design”, “Appasami”
In many programming languages, the following classes cover most or all of the tokens:
1. One token for each keyword. The pattern for a keyword is the same as the keyword
itself.
2. Tokens for the operators, either individually or in classes such as the token
comparison mentioned in table
3. One token representing all identifiers.
4. One or more tokens representing constants, such as numbers and literal strings.
5. Tokens for each punctuation symbol, such as left and right parentheses, comma,
and semicolon

Attributes for Tokens


When more than one lexeme can match a pattern, the lexical analyzer must provide
the subsequent compiler phases additional information about the particular lexeme that
matched.
The lexical analyzer returns to the parser not only a token name, but an attribute value
that describes the lexeme represented by the token.
The token name influences parsing decisions, while the attribute value influences
translation of tokens after the parse.
Information about an identifier - e.g., its lexeme, its type, and the location at which
it is first found (in case an error message must be issued) - is kept in the symbol table.
Thus, the appropriate attribute value for an identifier is a pointer to the symbol-
table entry for that identifier.
Example: The token names and associated attribute values for the Fortran statement
E = M * C ** 2 are written below as a sequence of pairs.
<id, pointer to symbol-table entry for E>
< assign_op >
<id, pointer to symbol-table entry for M>
<mult_op>
<id, pointer to symbol-table entry for C>
<exp_op>
<number, integer value 2 >
Note that in certain pairs, especially operators, punctuation, and keywords, there
is no need for an attribute value. In this example, the token number has been given an
integer-valued attribute.
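To make the idea concrete, here is a sketch of a small lexer in Python (modelled on the tokenizer recipe in the documentation of Python's re module) that emits such <token-name, attribute> pairs for E = M * C ** 2; the symbol-table "pointer" is modelled as an index into a dictionary, and the token names follow the example above:

import re

token_spec = [
    ("number",    r"\d+(?:\.\d+)?"),
    ("id",        r"[A-Za-z_][A-Za-z_0-9]*"),
    ("exp_op",    r"\*\*"),
    ("mult_op",   r"\*"),
    ("assign_op", r"="),
    ("skip",      r"\s+"),
]
master = re.compile("|".join("(?P<%s>%s)" % (name, pat) for name, pat in token_spec))

def tokenize(source, symtab):
    for m in master.finditer(source):
        kind, lexeme = m.lastgroup, m.group()
        if kind == "skip":
            continue                                              # drop whitespace
        if kind == "id":
            yield ("id", symtab.setdefault(lexeme, len(symtab)))  # index stands in for a pointer
        elif kind == "number":
            yield ("number", int(lexeme) if lexeme.isdigit() else float(lexeme))
        else:
            yield (kind, None)                                    # operators carry no attribute

symtab = {}
print(list(tokenize("E = M * C ** 2", symtab)), symtab)
# [('id', 0), ('assign_op', None), ('id', 1), ('mult_op', None),
#  ('id', 2), ('exp_op', None), ('number', 2)]  {'E': 0, 'M': 1, 'C': 2}

The exact token names and attribute representations vary from compiler to compiler; the sketch only illustrates the <token-name, attribute> pairing described above.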
