Unit I QB
Part- A
1. What is the usage of sentinel in lexical analyzer? List its advantages. (Nov/Dec 2020)
A sentinel reduces the two tests otherwise required on each advance of the forward
pointer to a single test, by extending each buffer half to hold a sentinel character at
its end. The sentinel is a special character (typically eof) that cannot be part of the
source program.
Advantage: each time the forward pointer is advanced, one check against the sentinel
is enough to detect whether one half of the buffer has been exhausted; if it has, the
other half is reloaded.
2. Construct regular expression for the binary string that starts with 0 and has odd length
or that starts with 1 and has even length. (Nov/Dec 2020)
0((0+1)(0+1))* + 1(0+1)((0+1)(0+1))*
(The first term generates the strings that start with 0 and have odd length; the
second, the strings that start with 1 and have even length.)
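A minimal C sketch that tests membership directly from the two defining conditions
(the function name in_language is illustrative):

    #include <stdio.h>
    #include <string.h>

    /* 1 if s is in the language: starts with 0 and has odd length,
       or starts with 1 and has even length. */
    static int in_language(const char *s) {
        size_t n = strlen(s);
        if (n == 0) return 0;
        if (s[0] == '0') return n % 2 == 1;   /* 0((0+1)(0+1))*       */
        if (s[0] == '1') return n % 2 == 0;   /* 1(0+1)((0+1)(0+1))*  */
        return 0;
    }

    int main(void) {
        const char *tests[] = { "0", "001", "10", "1101", "00", "1" };
        for (size_t i = 0; i < sizeof tests / sizeof tests[0]; i++)
            printf("%-5s -> %s\n", tests[i],
                   in_language(tests[i]) ? "in" : "out");
        return 0;
    }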
3. Define tokens, patterns and lexemes. (Nov/Dec 2020) (Nov/Dec 2018) (April/May
2019) (Nov/Dec 2021)
Token: A token is a sequence of characters that is treated as a single unit and cannot
be further broken down.
Pattern: A rule describing how a token may be formed from input characters.
Lexeme: A sequence of characters in the source program matched by the pattern for a
token. For example, in int count = 10; the lexeme count matches the pattern for the
token identifier.
10. With a neat block diagram specify the interactions between the lexical analyzer and the
parser. (Nov/Dec 2020)
The parser drives the lexical analyzer: whenever the parser needs the next token, it
issues a getNextToken request; the lexical analyzer reads characters from the source
program until it identifies the next lexeme, and returns the corresponding token to the
parser. Both phases also consult and update the symbol table, as sketched below.
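A minimal C sketch of this interaction, assuming a hypothetical Token type and
getNextToken() interface (real names vary from compiler to compiler):

    #include <stdio.h>

    typedef enum { TOK_ID, TOK_NUM, TOK_EOF } TokenType;

    typedef struct {
        TokenType type;
        const char *lexeme;          /* text matched in the source */
    } Token;

    /* Lexical analyzer: returns the next token on each call.
       Stubbed here with a fixed two-token stream. */
    static Token getNextToken(void) {
        static int i = 0;
        static const Token stream[] = {
            { TOK_ID,  "count" },
            { TOK_NUM, "10"    },
            { TOK_EOF, ""      },
        };
        return stream[i < 2 ? i++ : 2];
    }

    /* The parser drives the lexer: it requests one token at a time. */
    int main(void) {
        for (Token t = getNextToken(); t.type != TOK_EOF; t = getNextToken())
            printf("parser received token %d (\"%s\")\n", (int)t.type, t.lexeme);
        return 0;
    }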
11. State the various error recovery strategies used in a parser to correct the errors.
(Nov/Dec 2020) (Nov/Dec 2021)
Panic mode.
Statement mode.
Error productions.
Global correction.
12. List out the phases included in analysis phases of the compiler. (April/May 2022)
Lexical analysis
Syntax analysis
Semantic analysis
13. Write down the CFG for the set of odd length strings in {a,b}* whose first, middle and
last symbols are the same. (April/May 2022)
S -> a | b | aAa | bBb
A -> aAa | aAb | bAa | bAb | a
B -> aBa | aBb | bBa | bBb | b
For example, S => aAa => abAaa => abaaa, an odd-length string whose first, middle and
last symbols are all a.
16. Name a compiler construction tool used to design lexical analyser and parser. (Nov/Dec
2019)
Parser Generator
This software produces syntax analyzers (parsers), which take as input the
syntax of a programming language specified by a context-free grammar.
Scanner Generator
These generators create lexical analyzers. The underlying lexical analyzer is
a finite automaton constructed from regular-expression descriptions of the
tokens.
18. Consider the language of all strings from the alphabet{a,b,c} containing the substring
“abcabb”. Write a regular expression that describes this language. (Nov/Dec 2019)
(a+b+c)*abcabb(a+b+c)*
19. Give any two reasons for keeping lexical analyser a separate phase instead of making
it an integral part of syntax analysis. (Nov/Dec 2019)
1) Simpler design. Separating the two phases allows each to be simplified independently.
2) Improved compiler efficiency. A separate lexical analyzer can use specialized
buffering techniques, which matters because a large amount of compilation time is spent
reading the source program and partitioning it into tokens.
23. What is the difference between compiler and interpreter? (April/May 2023)
A compiler translates the program as a whole, whereas an interpreter processes
a program one statement at a time. Compilers generate intermediate machine code;
interpreters never generate any intermediate machine code.
1. Panic mode
When a parser encounters an error anywhere in the statement, it ignores the rest
of the statement by skipping the input up to a delimiter such as a semicolon. This
is the easiest way of error recovery, and it also prevents the parser from
developing infinite loops.
2. Statement mode
When a parser encounters an error, it tries to take corrective measures so that
the rest of the statement allows the parser to parse ahead; for example,
inserting a missing semicolon or replacing a comma with a semicolon. Parser
designers have to be careful here, because one wrong correction may lead to an
infinite loop.
3. Error productions
Some common errors that may occur in the code are known to compiler designers.
The grammar can be augmented with productions that generate these erroneous
constructs; when such a production is used during parsing, the compiler can issue
a precise error message for that specific mistake.
4. Global correction
The parser considers the program as a whole, tries to figure out what it is
intended to do, and finds the closest error-free match for it. When an erroneous
input (statement) X is fed, the parser creates a parse tree for some closest
error-free statement Y. This allows the parser to make minimal changes to the
source code, but due to the time and space complexity of this strategy, it has
not been implemented in practice.
Abstract Syntax Trees
If watched closely, we find that in a parse tree most of the leaf nodes are the
single child of their parent node. This information can be eliminated before
feeding the tree to the next phase. By hiding the extra information we obtain the
abstract syntax tree, in which interior nodes represent operators and leaves
represent operands.
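As an illustration, here is a minimal C sketch of such an abstract syntax tree for
the assignment pos := init + rate * 60 (the node layout and the helper mk() are
illustrative, not a fixed representation):

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    typedef struct Node {
        char label[8];               /* operator, identifier or constant */
        struct Node *left, *right;   /* NULL for leaves */
    } Node;

    static Node *mk(const char *label, Node *l, Node *r) {
        Node *n = malloc(sizeof *n);
        strncpy(n->label, label, sizeof n->label - 1);
        n->label[sizeof n->label - 1] = '\0';
        n->left = l;
        n->right = r;
        return n;
    }

    int main(void) {
        /* Interior nodes are operators, leaves are operands; the
           single-child chain nodes of the parse tree are gone. */
        Node *ast = mk(":=", mk("pos", NULL, NULL),
                             mk("+", mk("init", NULL, NULL),
                                     mk("*", mk("rate", NULL, NULL),
                                             mk("60", NULL, NULL))));
        printf("root operator: %s\n", ast->label);
        return 0;
    }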
2. What are the tools used for constructing a compiler? (Nov/Dec 2020) (Nov/Dec 2018)
(April/May 2019) (Nov/Dec 2021)
Compiler Construction Tools are specialized tools that help in the
implementation of various phases of a compiler. These tools help in the creation of an
entire compiler or its parts.
The compiler construction tools are:
Parser Generator
Scanner Generator
Syntax Directed Translation Engines
Automatic Code Generators
Data-Flow Analysis Engines
Compiler Construction Toolkits
Parser Generator
A parser generator produces syntax analyzers (parsers) from input in the form of a
context-free grammar for the syntax of a programming language. It is helpful
because the syntax-analysis phase is quite complex and consumes a great deal of
time when implemented manually. Example: YACC is a parser generator provided by
UNIX systems.
Scanner Generator
A scanner generator generates lexical analyzers from input that consists of
regular-expression descriptions of the tokens of a language. It generates a finite
automaton to recognize the regular expressions. Example: LEX is a scanner generator
provided by UNIX systems.
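A minimal LEX specification sketch (the token set is illustrative; the actions are
ordinary C). Running lex on this file produces lex.yy.c, which is then compiled into
a scanner:

    %{
    #include <stdio.h>
    %}
    %%
    [0-9]+                   { printf("NUMBER(%s)\n", yytext); }
    [a-zA-Z_][a-zA-Z0-9_]*   { printf("ID(%s)\n", yytext); }
    [ \t\n]+                 { /* skip whitespace */ }
    .                        { printf("OTHER(%s)\n", yytext); }
    %%
    int yywrap(void) { return 1; }
    int main(void)   { yylex(); return 0; }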
3. What do you mean by passes and phases of a compiler? Explain the various phases of
a compilation with neat diagram. (Nov/Dec 2020) (Nov/Dec 2021) (Nov/Dec 2023)
A phase is a logically cohesive stage of the compilation process, while a pass is one
complete reading of the source program or of an intermediate representation; several
phases may be grouped into a single pass.
1. Lexical Analysis:
The lexical analysis phase is the first phase of the compilation process. It
takes source code as input, reads the source program one character at a time,
and converts it into meaningful lexemes, which the lexical analyzer represents
in the form of tokens.
2. Syntax Analysis
Syntax analysis is the second phase of the compilation process. It takes tokens as
input and generates a parse tree as output. In the syntax analysis phase, the parser
checks whether the expression made by the tokens is syntactically correct.
3. Semantic Analysis
Semantic analysis is the third phase of the compilation process. It checks whether the
parse tree follows the rules of the language. The semantic analyzer keeps track of
identifiers, their types and expressions. The output of the semantic analysis phase
is the annotated syntax tree.
4. Intermediate Code Generation
In intermediate code generation, the compiler translates the source code into an
intermediate code. Intermediate code lies between the high-level language and the
machine language, and it should be generated in such a way that it can easily be
translated into the target machine code.
5. Code Optimization
Code optimization is an optional phase. It is used to improve the intermediate
code so that the program runs faster and takes less space. It removes unnecessary
lines of code and rearranges the sequence of statements to speed up program
execution.
6. Code Generation
Code generation is the final stage of the compilation process. It takes the
optimized intermediate code as input and maps it to the target machine language.
Code generator translates the intermediate code into the machine code of the
specified computer.
i. Translate the statement pos := init + rate * 60. (Nov/Dec 2020) (Nov/Dec 2019)
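The phase-by-phase translation, following the standard textbook walkthrough (id1,
id2, id3 denote the symbol-table entries for pos, init and rate; the target code at
the end is an illustrative two-register floating-point sketch):

    Lexical analysis:   id1 := id2 + id3 * 60
    Syntax analysis:    parse tree for id1 := id2 + ( id3 * 60 )
    Semantic analysis:  id1 := id2 + id3 * inttofloat(60)
    Intermediate code:
        t1 := inttofloat(60)
        t2 := id3 * t1
        t3 := id2 + t2
        id1 := t3
    Code optimization:
        t1 := id3 * 60.0
        id1 := id2 + t1
    Code generation:
        MOVF id3, R2
        MULF #60.0, R2
        MOVF id2, R1
        ADDF R2, R1
        MOVF R1, id1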
The character ("blank space") beyond the token ("int") have to be examined before the
token ("int") will be determined.
After processing token ("int") both pointers will set to the next token ('a'), & this process
will be repeated for the whole program.
A buffer is divided into two halves. If the lookahead pointer moves past the end
of the first half, the second half is filled with new characters to be read. If the
lookahead pointer moves past the right end of the second half, the first half is
refilled with new characters, and so on.
Sentinels − Sentinels reduce this checking: each time the forward pointer is
advanced, a single test against the sentinel detects whether one half of the buffer
has been exhausted. If it has, the other half is reloaded.
Buffer Pairs − A specialized buffering technique can decrease the overhead needed to
process an input character. It uses two buffers of N characters each, which are
reloaded alternately.
Two pointers, lexemeBegin and forward, are maintained. lexemeBegin points to the
start of the current lexeme being discovered, while forward scans ahead until a
match for a pattern is found. Once the next lexeme is determined, forward is set to
the character at its right end; after the lexeme is recorded, lexemeBegin is set to
the character immediately following the lexeme just found.
Preliminary Scanning − Certain processes are best performed as characters are moved
from the source file to the buffer. For example, comments can be deleted. In
languages like FORTRAN, which ignore blanks, blanks can be deleted from the
character stream, and strings of several blanks can be collapsed into one.
Pre-processing the character stream in this way saves the trouble of moving the
lookahead pointer back and forth over a string of blanks during lexical analysis.
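A minimal C sketch of the buffer-pair scheme with sentinels (assuming '\0' as the
sentinel and a NUL-free source; the buffer layout and names are illustrative):

    #include <stdio.h>

    #define N 4096                      /* size of each buffer half */

    static char buf[2 * N + 2];         /* two halves plus a sentinel slot each */
    static char *forward = buf;

    /* Fill one half from the source and plant the sentinel ('\0')
       right after the data just read. Assumes a NUL-free source. */
    static void reload(char *half, FILE *src) {
        size_t n = fread(half, 1, N, src);
        half[n] = '\0';
    }

    /* Advance the forward pointer; the common case costs a single
       comparison against the sentinel. */
    static int next_char(FILE *src) {
        char c = *forward++;
        if (c != '\0')
            return (unsigned char)c;
        if (forward - 1 == buf + N) {            /* sentinel ends first half */
            reload(buf + N + 1, src);
            forward = buf + N + 1;
            return next_char(src);
        }
        if (forward - 1 == buf + 2 * N + 1) {    /* sentinel ends second half */
            reload(buf, src);
            forward = buf;
            return next_char(src);
        }
        return EOF;                     /* sentinel inside a half: end of input */
    }

    int main(void) {
        reload(buf, stdin);             /* prime the first half before scanning */
        long count = 0;
        while (next_char(stdin) != EOF)
            count++;
        printf("%ld characters scanned\n", count);
        return 0;
    }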
4. Convert NFA to DFA (April/May 2022)
The given NFA is converted by the subset construction: the start state of the DFA is
the ε-closure of the NFA's start state; then, for each DFA state D and input symbol
a, compute ε-closure(move(D, a)), adding the result as a new DFA state if it has not
been seen before, until no new states appear. Any DFA state containing an NFA
accepting state is accepting.
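Since the NFA itself is given as a figure in the original question, here is a compact
C sketch of the subset construction applied to an assumed example NFA for (a|b)*ab
(the states and transition tables are illustrative):

    #include <stdio.h>

    #define NSTATES 3   /* NFA states 0..2 for (a|b)*ab -- assumed example */
    #define NSYMS   2   /* input symbols: 0 = 'a', 1 = 'b' */

    /* trans[s][x] and eps[s] are bitmasks of NFA target states */
    static unsigned trans[NSTATES][NSYMS];
    static unsigned eps[NSTATES];

    /* epsilon-closure of a set of NFA states (iterate to a fixed point) */
    static unsigned eps_closure(unsigned set) {
        unsigned old;
        do {
            old = set;
            for (int s = 0; s < NSTATES; s++)
                if (set & (1u << s)) set |= eps[s];
        } while (set != old);
        return set;
    }

    /* move: states reachable from 'set' on symbol x */
    static unsigned move(unsigned set, int x) {
        unsigned out = 0;
        for (int s = 0; s < NSTATES; s++)
            if (set & (1u << s)) out |= trans[s][x];
        return out;
    }

    int main(void) {
        /* NFA for (a|b)*ab: 0 --a--> {0,1}, 0 --b--> {0}, 1 --b--> {2} */
        trans[0][0] = (1u << 0) | (1u << 1);
        trans[0][1] = (1u << 0);
        trans[1][1] = (1u << 2);

        unsigned dstates[1u << NSTATES];
        int n = 0;
        dstates[n++] = eps_closure(1u << 0);      /* DFA start state */

        for (int i = 0; i < n; i++) {             /* worklist loop */
            for (int x = 0; x < NSYMS; x++) {
                unsigned t = eps_closure(move(dstates[i], x));
                if (!t) continue;                 /* dead transition */
                int j = 0;
                while (j < n && dstates[j] != t) j++;
                if (j == n) dstates[n++] = t;     /* new DFA state found */
                printf("D%d --%c--> D%d\n", i, "ab"[x], j);
            }
        }
        return 0;
    }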
The tokens of the statement E = M * C ** 2 are written below as a sequence of pairs.
<id, pointer to symbol-table entry for E>
< assign_op >
<id, pointer to symbol-table entry for M>
<mult_op>
<id, pointer to symbol-table entry for C>
<exp_op>
<number, integer value 2 >
Note that for certain pairs, especially operators, punctuation, and keywords, there
is no need for an attribute value. In this example, the token number has been given
an integer-valued attribute.