
Reference To An Entity That Precedes Its Definition in The Program Is Called Forward Reference.) Handling The Forward Reference in A

1. Compilers convert source code to executable object code. Interpreters directly execute instructions without compilation. Assemblers convert assembly language to machine code. Linkers combine object code modules. Loaders place object code in memory for execution. 2. Pass 2 of an assembler reads the intermediate file line by line. It writes object code records, looking up opcodes and operands. Undefined symbols cause errors. 3. Advanced macro facilities include conditional expansion using sequencing symbols, expansion time variables, and altering control flow with statements like AIF, AGO, and ANOP.

Uploaded by

solo 2002

Write a note on compiler, interpreter, assembler, linker and loader.

Compiler: A compiler is a software program that compiles program source code files into an executable program. It takes the entire program and converts it into object code, which is typically stored in a file. The object code is also referred to as binary code and can be directly executed by the machine after linking. Examples of compiled programming languages are C and C++.

Interpreter: An interpreter transforms the high-level program into an intermediate language that it then executes, or it parses the high-level source code and performs the commands directly, line by line or statement by statement. It directly executes instructions written in a programming or scripting language without previously converting them to object code or machine code. Examples of interpreted languages are Perl, Python and Matlab.

Assembler: An assembler is a program that converts assembly language into machine code. It takes the basic commands and operations from assembly code and converts them into binary code that can be recognized by a specific type of processor.

Linker: A linker is a program that allows a user to link library programs or separate modules of code into their own programs. It is used to combine different modules of object code into one single executable program. This may involve combining a program with library programs, recombining blocks of object code from the same program, or a mixture of both.

Loader: A loader is a piece of software that chooses exactly where to put object code in RAM, ready for it to be run. It also adjusts the memory references in programs. The job of a loader is to take the object code generated by compilation and to find a 'good' place for it in RAM, where it can then be executed.

Explain about layered structure of operating system?

An operating system is a construct that allows the user application programs to interact with the system hardware. Since the operating system is such a complex structure, it should be created with utmost care so it can be used and modified easily. An easy way to do this is to create the operating system in parts. Each of these parts should be well defined with clear inputs, outputs and functions.

[Figure: layered structure of the operating system]

As seen from the image, each upper layer is built on the layer below it. All the layers hide some structures, operations etc. from their upper layers. One problem with the layered structure is that each layer needs to be carefully defined. This is necessary because the upper layers can only use the functionalities of the layers below them.

There are six layers in the layered operating system. A diagram demonstrating these layers is as follows:

[Figure: the six layers of the operating system]

• Hardware
This layer interacts with the system hardware and coordinates with all the peripheral devices used, such as printer, mouse, keyboard, scanner etc. The hardware layer is the lowest layer in the layered operating system architecture.
• CPU Scheduling
This layer deals with scheduling the processes for the CPU. There are many scheduling queues that are used to handle processes. When the processes enter the system, they are put into the job queue. The processes that are ready to execute in the main memory are kept in the ready queue.
• Memory Management
Memory management deals with memory and the moving of processes from disk to primary memory for execution and back again. This is handled by the third layer of the operating system.
• Process Management
This layer is responsible for managing the processes, i.e. assigning the processor to one process at a time. This is known as process scheduling. The different algorithms used for process scheduling are FCFS (first come first served), SJF (shortest job first), priority scheduling, round-robin scheduling etc.
• I/O Buffer
I/O devices are very important in computer systems. They provide users with the means of interacting with the system. This layer handles the buffers for the I/O devices and makes sure that they work correctly.
• User Programs
This is the highest layer in the layered operating system. This layer deals with the many user programs and applications that run in an operating system, such as word processors, games, browsers etc.

Explain about the design of an assembler?

The assembly process is divided into two phases: ANALYSIS and SYNTHESIS. The primary function of the analysis phase is building the symbol table. For this, it uses the addresses of symbolic names in the program (memory allocation). To determine these addresses, a data structure called the location counter (LC) is used, which points to the next memory word of the target program. This is called LC processing. The synthesis phase uses both the symbol table, for symbolic names, and the mnemonic table, for the accepted list of mnemonics. Thus, the machine code is obtained.

Analysis phase:
• Isolate the label, mnemonic opcode and operand fields of a statement.
• If a label is present, enter the pair (symbol, LC value) in the symbol table.
• Check the validity of the mnemonic opcode using the mnemonic table.
• Perform LC processing.

Synthesis phase:
• Obtain the machine code for the mnemonic from the mnemonic table.
• Obtain the address of a memory operand from the symbol table.

Explain about the pass structure of an assembler?

A pass is defined as one complete scan of the source program, or its equivalent representation.

Single Pass Assemblers
In a single pass assembler the translation of the assembly language program into the object program is done in only one pass. The source program is read only once. These assemblers suffer from the problem of forward references. (Forward reference: a reference to an entity that appears in the program before the definition of that entity.) Handling forward references in a single pass assembler is difficult, so this type of assembler avoids forward references.

Two Pass Assemblers
Two pass assemblers are widely used, and the translation process is done in two passes. Two pass assemblers resolve the problem of forward references conveniently. An assembler which goes through an assembly language program twice is called a two pass assembler. During the first pass it collects all labels. During the second pass it produces the machine instructions and assigns an address to each of them. It assigns addresses to labels by counting their position from the starting address.

What are the data structures used in assembler design?

OPTAB, SYMBOL TABLE, LOCATION COUNTER (LOC)

OPTAB (Operation Code Table)
• Opcode, instruction format, and length.
• Used to look up mnemonic operation codes and translate them into machine language equivalents.

SYMTAB (Symbol Table)
• Label name and value, error flags.
• Used to store the values (addresses) assigned to labels.
• Includes the name and value for each label.
• Flags to indicate error conditions, e.g. duplicate definition of a label.

LOCCTR (Location Counter)
• Initialized to the beginning address specified in the START statement.
• After each source statement is processed, the length of the assembled instruction to be generated is added.

Write down the pass 1 algorithm of assembler?

Write down the pass 2 algorithm of assembler?

    read first input line from intermediate file
    if OPCODE = 'START' then
        write listing line
        read next input line
    write Header record to object program
    initialize first Text record
    while OPCODE ≠ 'END' do
        if this is not a comment line then
            search OPTAB for OPCODE
            if found then
                if there is a symbol in OPERAND field then
                    search SYMTAB for OPERAND
                    if found then
                        store symbol value as operand address
                    else
                        store 0 as operand address
                        set error flag (undefined symbol)
                else
                    store 0 as operand address
                assemble the object code instruction
            else if OPCODE = 'BYTE' or 'WORD' then
                convert constant to object code
            if object code will not fit into the current Text record then
                write Text record to object file
                initialize new Text record
            add object code to Text record
        write listing line
        read next input line
    write last Text record to object file
    write End record to object program
    write last listing line

What are the different ways of specifying macro instruction parameters?

There are generally 2 ways of specifying parameters:
• Positional parameters
• Keyword parameters

Illustrate the default parameters with an example?

What is the meaning of mixed parameter list?

What is nested and recursive macro call?

If a macro call is encountered during the expansion of a macro, the assembler immediately starts expanding the called macro. Its expanded body lines are simply inserted into the expanded body of the calling macro until the called macro is completely expanded. Then the expansion of the calling macro continues with the body line following the nested macro call.

[Figure: example of a nested macro call]

What are the advanced macro facilities?

The advanced macro facilities can be grouped into
1. Facilities to alter the flow of control during expansion
2. Expansion time variables
3. Attributes of parameters

1. Altering the flow of control during expansion (conditional expansion)
It can be achieved through the use of
1. Expansion time sequencing symbols
2. Expansion time statements AIF, AGO and ANOP
A sequencing symbol has the syntax
    .<ordinary string>
and is defined by putting it in the label field of a statement in the macro body.
• An AIF statement has the syntax
    AIF (<expression>) <sequencing symbol>
where <expression> is a relational expression used to specify the branching condition. This statement provides a conditional branching facility.
• An AGO statement has the syntax
    AGO <sequencing symbol>
This statement provides an unconditional branching facility; no condition is specified.
• An ANOP statement is written as
    <sequencing symbol> ANOP
This statement performs no operation.

2. Expansion time variables
Expansion time variables (EVs) are variables which can only be used during the expansion of macro calls. An EV can be local in scope (in which case it is created for use during a particular macro call) or global in scope (in which case it exists across all the macro calls situated in a program).
• Local and global EVs are created through declaration statements with the following syntax:
    LCL <EV specification>[,<EV specification>...]
    GBL <EV specification>[,<EV specification>...]
where <EV specification> has the syntax &<EV name>.

3. Attributes of formal parameters
• An attribute is written using the syntax
    <attribute name>'<formal parameter spec>
and represents information about the value of the formal parameter. The type, size and length attributes have the names T, S and L.

Example:
    MACRO
    DCLCONST &A
    AIF (L'&A EQ 1) .NEXT
    .NEXT .....
    .....
    ......
    MEND
Here expansion time control is transferred to the statement having .NEXT in the label field only if the actual parameter corresponding to the formal parameter &A has a length of 1.

Explain about macro processor?

The macro processor accepts an assembly language program containing macro definitions and macro calls and translates it into an assembly program which does not contain any macro definitions or calls. The program output from the macro processor can now be handed over to an assembler to obtain the target language form of the program. Thus the macro processor separates macro expansion from the process of program assembly.

Compare between macro processor and macro assembler?

A macro processor is not as efficient as a macro assembler, because a macro assembler performs macro expansion as well as assembly. With a macro processor the number of passes over the source program is large and many functions get duplicated. Similar functions can be merged if macros are handled by a macro assembler, which performs macro expansion and program assembly simultaneously. This may also reduce the number of passes.
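The expansion-time control flow described under the advanced macro facilities (sequencing symbols, AIF, AGO and ANOP) can be illustrated with a small sketch. The Python below is only an illustration under stated assumptions, not a real macro processor: the tuple layout of a body line, the function names expand and evaluate, the opcodes DB and DW, and the choice to treat the length attribute L'&A as the character count of the actual parameter are all hypothetical.

```python
import re

def evaluate(condition, args):
    """Evaluate a relational expression such as "L'&A EQ 1" (EQ and NE only)."""
    def value(token):
        if token.startswith("L'"):            # length attribute of a formal parameter
            return str(len(args[token[2:]]))  # assumption: length = character count
        if token.startswith("&"):             # formal parameter: use its actual value
            return args[token]
        return token                          # literal
    left, op, right = condition.split()
    equal = value(left) == value(right)
    if op == "EQ":
        return equal
    if op == "NE":
        return not equal
    raise ValueError(f"unsupported operator: {op}")

def expand(body, args):
    """Expand one macro call.

    body: list of (label, opcode, operand) tuples (the macro body).
    args: dict mapping formal parameters such as "&A" to actual parameters.
    """
    # A sequencing symbol is a label starting with '.'; record where each is defined.
    targets = {label: i for i, (label, _, _) in enumerate(body)
               if label.startswith(".")}
    output, pc = [], 0
    while pc < len(body):
        label, opcode, operand = body[pc]
        if opcode == "AIF":                   # conditional branch
            match = re.fullmatch(r"\((.+)\)\s*(\.\w+)", operand)
            if evaluate(match.group(1), args):
                pc = targets[match.group(2)]
                continue
        elif opcode == "AGO":                 # unconditional branch
            pc = targets[operand.strip()]
            continue
        elif opcode != "ANOP":                # ANOP generates nothing
            # Any other line is a model statement: substitute parameters and emit it.
            text = re.sub(r"&\w+", lambda m: args[m.group(0)], operand)
            parts = ["" if label.startswith(".") else label, opcode, text]
            output.append(" ".join(p for p in parts if p))
        pc += 1
    return output

# Hypothetical macro body: emit DB for a one-character constant, DW otherwise.
DCLCONST = [
    ("",      "AIF",  "(L'&A EQ 1) .NEXT"),
    ("",      "DW",   "&A"),
    ("",      "AGO",  ".END"),
    (".NEXT", "DB",   "&A"),
    (".END",  "ANOP", ""),
]

print(expand(DCLCONST, {"&A": "5"}))    # ['DB 5']
print(expand(DCLCONST, {"&A": "300"}))  # ['DW 300']
```

The sequencing symbols never reach the output: they exist only to give AIF and AGO somewhere to branch to while the body is being expanded.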
Explain about the pass structure of a macro assembler?

A macro assembler is an assembler which performs macro expansion and program assembly simultaneously. The use of a macro processor followed by a conventional assembler is an expensive way of handling macros: the number of passes over the source program is large and many functions get duplicated. To design the pass structure of a macro assembler, it is necessary to identify the functions of a macro processor and of a conventional assembler which can be merged to advantage. After merging, the functions can be structured into passes of the macro assembler. This process leads to the following pass structure:
• Pass 1:
1. Macro definition processing.
2. Collect names of declared symbols and information concerning their attributes.
• Pass 2:
1. Macro expansion.
2. Memory allocation and LC processing.
3. Processing of literals.
4. Intermediate code generation.
• Pass 3:
1. Target code generation.

Explain about Compile-and-go loader or Assemble-and-go loader?

In this type of loader, the instruction is read line by line, its machine code is obtained and it is directly put in the main memory at some known address. That means the assembler runs in one part of memory and the assembled machine instructions and data are directly put into their assigned memory locations. After completion, the assembly process assigns the starting address of the program to the location counter.

Disadvantages
• A portion of memory is wasted because the memory occupied by the assembler or compiler is unavailable to the object program.
• It is necessary to retranslate (assemble) the user's program every time it is run.
• It is very difficult to handle multiple segments, especially if the source programs are in different languages, and hence it is difficult to produce modular programs.

Explain about general loader schemes?

Explain about direct linking loader?

Explain about Absolute loader or Bootstrap loader?

The absolute loader is a kind of loader in which relocated object files are created; the loader accepts these files and places them at a specified location in the memory. This type of loader is called an absolute loader because no relocation information is needed; rather, it is obtained from the programmer or assembler. The starting address of every module is known to the programmer. This starting address is stored in the object file, so the task of the loader becomes very simple: it simply places the executable form of the machine instructions at the locations mentioned in the object file.

Advantages:
• It is simple to implement.
• This scheme allows multiple programs, or source programs written in different languages. If there are multiple programs written in different languages, the respective language assembler will convert each one and a common object file can be prepared.
• The task of the loader becomes simpler, as it simply obeys the instruction regarding where to place the object code in the main memory.
• The process of execution is efficient.

Disadvantages:
• In this scheme, it is the programmer's duty to adjust all the inter-segment addresses and manually do the linking activity. For that, it is necessary for the programmer to know about memory management.
• If any modification is done to some segment, the starting addresses of the immediately following segments may change. The programmer has to take care of this issue and needs to update the corresponding starting addresses on any modification in the source.

Explain about relocating loader or relative loader?

A relocating loader is capable of loading a program to begin anywhere in memory. The object program can be executed using any part of the available memory that is sufficiently large. The object program is loaded into memory wherever there is room for it.

Explain about binders and module loaders?

One disadvantage of the direct linking loader is that it is necessary to allocate, relocate, link and load all of the modules each time in order to execute a program, so the loading process can be extremely time consuming. Although the loader program may be smaller than a translator, it does absorb a considerable amount of memory space. These problems can be solved by dividing the loading process into 2 separate programs: a binder and a module loader.

A binder is a program that performs the same functions as the direct linking loader in binding modules together, but it outputs the text as a file or card deck, rather than placing the relocated and linked text directly into memory. The output files are in a format ready to be loaded and are called load modules. The module loader loads the module into memory. The binder performs the functions of allocation, relocation and linking. The module loader performs the function of loading.

Explain about the two major classes of binders?

There are 2 major classes of binders: the core image builder and the linkage editor.

Core image builder: A specific memory allocation of the program is performed at the time that the subroutines are bound together. The output is called a core image module and the corresponding binder is called a core image builder.
Advantages: Simple to implement and fast to execute.
Disadvantages: Difficult to allocate and load the program.

Linkage editor: The linkage editor can keep track of relocation information so that the resulting load module can be further relocated. In that case the module loader must perform additional allocation and relocation as well as loading, but it does not have to worry about the problem of linking.
Advantages: More flexible allocation and loading scheme.
Disadvantages: The implementation is complex.

Explain about overlays?

In a general computing sense, overlaying means "the process of transferring a block of program code or other data into main memory, replacing what is already stored". Overlaying is a programming method that allows programs to be larger than the computer's main memory. A program containing overlays is called an overlay-structured program.

An overlay-structured program consists of:
• A permanently resident portion, called the root, and
• A set of overlays.

Execution of an overlay-structured program proceeds as follows:
• To start with, the root is loaded in the memory and given control for the purpose of execution.
• Other overlays are loaded as and when necessary.
• The loading of an overlay overwrites a previously loaded overlay with the same load origin.
This reduces the memory requirement of a program.

Explain about dynamic loading using overlays?

We have assumed that all of the modules needed are loaded into the memory at the same time. If the total amount of memory required by all of these modules exceeds the amount of available memory, the program cannot execute. This problem can be avoided by dynamic loading based on overlays. Usually the modules of a program are needed at different times. By explicitly recognizing which modules call other modules, it is possible to produce an overlay structure that identifies mutually exclusive modules.

Explain about dynamic binding or linking?

A major disadvantage of all the previous loading schemes is that if a module is referenced but never executed, the loader will still incur the overhead of linking the module. In the dynamic linking scheme, linking and loading of external references are postponed until execution time. To start with, the loader loads only the main module. If the main module references an external reference, the loader is called. Only then is the segment containing the external reference loaded. An advantage of this scheme is that no overhead is incurred unless the module to be called is actually used. The major disadvantage of this scheme is the considerable overhead and complexity incurred, due to the postponement of most of the binding process until execution time.

Explain about design of absolute loader?

Absolute loaders are loaders that do not perform any relocation or linking. The loader reads the object program and moves the text in the object program into the absolute locations specified by the translator. All functions are accomplished in a single pass. The object program contains 3 types of records: Header, Text and End.
• Header: program name, starting address and length.
• Text: translated instructions and data of the program, and the address where these are to be loaded.
• End: indicates the end of the program and specifies the address in the program where execution is to begin.

Explain about the different types of compilers?

1. Cross Compiler
A cross compiler is a compiler that runs on one platform and creates executable code for another platform. A cross compiler helps separate the build environment from the target environment and is useful in a number of ways, such as compiling for embedded systems and compiling an operating system for the first time.
2. One-pass or Multi-pass Compiler
A compiler performs many operations and consumes a lot of memory space, so compilers are split into smaller programs that perform analysis and translation. A small program can be compiled in one pass, while a big program is divided into a number of sub-programs, which are compiled in multiple passes; this is known as multi-pass compilation. The ability to compile in one pass is beneficial because it simplifies the job of writing a compiler. One-pass compilers are faster than multi-pass compilers.
3. Source-to-source Compiler
Source-to-source compilers are compilers that take a high level language as input and give output that is also written in a high level language.
4. Stage Compiler
A stage compiler compiles to the assembly language of a theoretical machine. For example, the Warren Abstract Machine (WAM) is used as the target of a stage compiler.
5. Just-in-time Compiler
A just-in-time compiler allows applications to be delivered in byte code, which is compiled to machine code just before execution. For example, Smalltalk and Java systems use a just-in-time compiler.

Explain about the common structure of a compiler?

A typical compiler consists of various phases, such as the lexical phase, parser phase, contextual analysis phase, optimization phase and code generation phase. Each phase of the compiler passes its output to the next phase. The following figure shows the structure of a compiler that takes the source code as input and gives the target code as output.

[Figure: structure of a compiler]

1. Scanner
The scanner groups the input characters into tokens and constructs a symbol table that is used for contextual analysis of the program code in a later phase. The tokens, whose text is known as lexemes, include:
• Keyword
• Identifier
• Operators
• Constants
• Comments
The scanner phase is also known as the lexical phase; it groups input characters into lexical units or tokens. The output of the lexical phase is a stream of tokens.
2. Parser
The parser groups tokens into syntactical units. The output we get from the parser phase is a parse tree, which is a tree representation of a program. The program structure we get from the parser phase is defined using a context-free grammar, which is defined by the following components:
• A finite terminal vocabulary Vt, whose symbols are the tokens produced by the scanner
• A finite set of symbols known as the non-terminal vocabulary Vn
• A start symbol S for the grammar
• A finite set of productions, P
The language defined by such a grammar is recognized by a push down automaton.
3. Symbol Tables and Error Handler
The additional information that we acquire during any phase of the compiler and which we can use in a later phase is stored in symbol tables. The information that we store can be a name encountered in the source program or an attribute. The error handler reports and rectifies errors that occur during the compilation of the source program.
4. Contextual Checkers
We use context checkers to analyze the parse tree for context sensitive information, which is known as static semantics. The output of the semantic analysis phase is an annotated parse tree.
5. Intermediate Code Generator
The data structure that we pass between the analysis and synthesis phases is called the Intermediate Representation (IR) of the program. The intermediate representation of a source program can be:
• Assembly language
• Abstract syntax tree
6. Code Optimizer
The reconstruction of a parse tree to reduce the size of the tree, or to produce an equivalent tree that gives more efficient code, is known as code optimization.
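One classic way to obtain equivalent but more efficient code is to identify common subexpressions. The sketch below is a minimal illustration and not the method of these notes: it works on a flat list of three-address instructions rather than on a parse tree, and the tuple layout (dest, op, left, right), the "copy" pseudo-operation and the function name are assumptions made for the example.

```python
def eliminate_common_subexpressions(block):
    """Local common-subexpression elimination on one basic block.

    block: list of (dest, op, left, right) three-address instructions.
    Returns a new list in which a repeated computation becomes a copy.
    """
    available = {}  # (op, left, right) -> name that currently holds that value
    result = []
    for dest, op, left, right in block:
        key = (op, left, right)
        if key in available and available[key] != dest:
            # Same expression already computed: reuse the earlier result.
            result.append((dest, "copy", available[key], None))
        else:
            result.append((dest, op, left, right))
        # dest has a new value: drop expressions that read dest or were held in dest.
        available = {k: v for k, v in available.items()
                     if v != dest and dest not in (k[1], k[2])}
        # Record the expression unless it reads the name it has just overwritten.
        if dest not in (left, right):
            available.setdefault(key, dest)
    return result

block = [
    ("t1", "*", "id3", "5.0"),
    ("t2", "*", "id3", "5.0"),   # recomputes t1
    ("id1", "+", "t1", "t2"),
]
for instruction in eliminate_common_subexpressions(block):
    print(instruction)
```

Only the second instruction changes: it becomes a copy of t1. An expression is forgotten as soon as one of its operands is reassigned, which is what keeps the rewrite safe within a single basic block.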
The following example shows the use of code optimization:

[Figure: code optimization example]

7. Code Generator
The task of the code generator is to translate the intermediate representation of the source code into the code of the target machine. The code of the target machine can be binary, assembly or any high level language. A code generator can also be integrated with a parser.
8. Peephole Optimizer
A peephole optimizer scans small segments of the target code in order to improve the efficiency of the instructions in the program code. Peephole optimization is the last phase of the compilation process; it helps discard redundant instructions in the program code.

Explain about the 2 different parts of a compiler?

The structure of a compiler consists of two parts:
1. Analysis part
The analysis part breaks the source program into constituent pieces and imposes a grammatical structure on them. It then uses this structure to create an intermediate representation of the source program. It is also termed the front end of the compiler. Information about the source program is collected and stored in a data structure called the symbol table.
2. Synthesis part
The synthesis part takes the intermediate representation as input and transforms it into the target program. It is also termed the back end of the compiler.

The compilation process is a sequence of various phases. Each phase takes input from its previous stage, has its own representation of the source program, and feeds its output to the next phase of the compiler. A compiler is divided into several logical phases, which helps it process the source code efficiently. The common logical phases that we use in a compiler for translating source code into target code are:

    +    + (addition symbol)
    B    identifier
    *    * (multiplication symbol)
    5    5 (number)
Hence, <id, 1> <=> <id, 2> <+> <id, 3> <*> <5>

2. SYNTAX ANALYSIS PHASE
Syntax analysis is the second phase of the compiler and is also called parsing. The parser converts the tokens produced by the lexical analyzer into a tree-like representation called a parse tree. A parse tree describes the syntactic structure of the input. A syntax tree is a compressed representation of the parse tree in which the operators appear as interior nodes and the operands of an operator are the children of the node for that operator. The input of the syntax analyser is the stream of tokens, and the output is a syntax tree. The parser checks whether the expression made by the tokens is syntactically correct.

3. SEMANTIC ANALYSIS PHASE
Semantic analysis checks whether the parse tree constructed follows the rules of the language. For example, it checks that values are assigned between compatible data types and flags operations such as adding a string to an integer. The semantic analyzer also keeps track of identifiers, their types and expressions, and of whether identifiers are declared before use. The semantic analyzer produces an annotated syntax tree as output. A semantic analyzer takes its input from the syntax analysis phase in the form of a parse tree and a symbol table. Its purpose is to determine whether the input has a well-defined meaning; in practice semantic analyzers are mainly concerned with type checking and type correction based on type rules.
Semantic analysis is the 3rd phase of the compiler.
• It checks for semantic consistency.
• Type information is gathered and stored in the symbol table or syntax tree.
• It performs type checking.

Identifying common subexpressions,
Unfolding loops,
Eliminating procedures.
    t1 = id3 * 5.0
    id1 = id2 + t1

6. CODE GENERATION PHASE
The final phase of the compiler is to generate code for a specific machine. In this phase we consider:
• Memory management,
• Register assignment and
• Machine-specific optimization.
The output from this phase is usually assembly language or relocatable machine code. Finally, the compiler converts the (optimized) program in the intermediate code representation to the required machine language.

Explain about symbol table?

The additional information that we acquire during any phase of the compiler and which we can use in a later phase is stored in symbol tables. The information that we store can be a name encountered in the source program or an attribute. The symbol table is a data structure maintained throughout all the phases of a compiler. All the identifiers' names along with their types are stored here. The symbol table makes it easier for the compiler to quickly search for an identifier record and retrieve it. Whenever an identifier is detected in any of the phases, it is stored in the symbol table.
Example:
    int a,b; float c;char z;

Explain about error correcting routines in compiler?

The error correcting routine (error handler) reports and rectifies errors that occur during the compilation of the source program. Each phase can encounter errors. After detecting an error, a phase must handle the error so that compilation can proceed.
• In lexical analysis, errors occur in the separation of tokens.
• In syntax analysis, errors occur during construction of the syntax tree.
• In semantic analysis, errors may occur in the following cases:
a. When the compiler detects constructs that have the right syntactic structure but no meaning.
b. During type conversion.
• In code optimization, errors occur when the result is affected by the optimization.
• In code generation, an error is reported when, for example, code is missing.

Explain about different errors in different phases of compiler?

Each phase can encounter errors. After detecting an error, a phase must somehow deal with the error, so that compilation can proceed. A program may have the following kinds of errors at various stages:
Lexical Errors:- It includes incorrect or

... precedes its definition in the program. This problem can be solved by postponing the generation of target code until more information concerning the entity becomes available. It leads to the multi-pass model of compilation.
• In Pass I: Perform analysis of the source program and note relevant information.
• In Pass II: Generate target code using the information noted in Pass I.

A standard way to classify compilers is by the number of "passes". Compiling is a relatively resource intensive process, and early computers did not have enough memory to hold a single program that did the complete job. Due to this limitation of hardware resources in early computers, compilers were broken down into smaller sub-programs that each did part of the job by going over the source code (making a "pass" over the source or some other form of it) and performed the analysis, transformation and translation tasks separately. Depending on this classification, compilers are identified as one-pass or multi-pass compilers.

As the name suggests, a one-pass compiler compiles in a single pass. It is easier to write a one-pass compiler, and one-pass compilers also run faster than multi-pass compilers. Therefore, at the time when resources were limited, languages were designed so that they could be compiled in one pass (e.g. Pascal).

On the other hand, a typical multi-pass compiler is made up of several main stages. The first stage is the scanner (also known as the lexical analyzer). The scanner reads the program and converts it to a string of tokens. The second stage is the parser. It converts the string of tokens into a parse tree (or an abstract syntax tree), which captures the syntactic structure of the program. The next stage interprets the semantics of the syntactic structure. The code optimization stages and the final code generation stage follow this.

What is LEX? Explain.

o Lex is a program that generates a lexical analyzer. It is used with the YACC parser generator.
o The lexical analyzer is a program that transforms an input stream into a sequence of tokens.
o Lex reads its input specification and produces, as output, C source code that implements the lexical analyzer.

The function of Lex is as follows:
o First, the specification of the lexical analyzer is written as a program lex.l in the Lex language. Then the Lex compiler runs the lex.l program and produces a C program lex.yy.c.
o Finally C compiler runs the lex.yy.c
misspelled name of some identifier i.e., identifiers program and produces an object
• t1= int to float (5)
typed incorrectly. program a.out.
• t2= id3*t1 Syntactical Errors:- It includes missing
• t3= id2+t2 semicolon or unbalanced parenthesis. Syntactic o a.out is lexical analyzer that
errors are handled by syntax analyzer (parser).
• id1=t3 transforms an input stream into a
When an error is detected, it must be handled by
sequence of tokens.
parser to enable the parsing of the rest of the input.
4.The intermediate code generation phase In general, errors may be expected at various
The intermediate code generation phase translates Lex file format
stages of compilation but most of the errors are
the source code into target code. A Lex program is separated into three
syntactic errors and hence the parser should be
The different phases of compiler are as follows: An intermediate code is generated when a able to detect and report those errors in the sections by %% delimiters. The formal
1. Lexical analysis direct target code is not desired. program. The goals of error handler in parser are: of Lex source is as follows:
The two main reasons, which are responsible for Report the presence of errors clearly and 1. { definitions }
2. Syntax analysis the generation of an intermediate code rather than accurately. Recover from each error quickly
3. Semantic analysis generating a direct target code, are: enough to detect subsequent errors. Add minimal 2. %%
The intermediate form is a simple version of the overhead to the processing of correcting programs.
4. Intermediate code generation source code that helps an optimizer to apply the 3. { rules }
5. Code optimization optimizations, such as common sub-expression Explain about different passes of a compiler?
elimination and strength reduction. In computer programming, a one-pass compiler is 4. %%
6. Code generation Many compilers such as cross compilers generate a compiler that passes through the parts of each
• All of the above mentioned phases involve the target code for many CPUs. compilation unit only once, immediately 5. { user subroutines }
following tasks: Intermediate code generation produces translating each part into its final machine code.
intermediate representations for the source
• Symbol table management. This is in contrast to a multi-pass compiler which Definitions include declarations of
program which are of the following forms: converts the program into one or more constant, variable and regular
• Error handling. Postfix notation
Three address code
intermediate representations in steps between definitions.
source code and machine code, and which Rules define the statement of form p1
Explain about the different phases of a Syntax tree reprocesses the entire compilation unit in each
Most commenly used form is the three address {action1} p2 {action2}....pn {action}.
compiler? sequential pass.
code. Where pi describes the regular
1.LEXICAL ANALYSIS PHASE Each pass takes the result of the previous pass as
Lexical analysis is the first phase of compiler Properties of intermediate code expression and action1 describes the
the input, and creates an intermediate output. In
which is also termed as scanning. It should be easy to produce. actions what action the lexical analyzer
this way, the (intermediate) code is improved pass
Source program is scanned to read the stream of It should be easy to translate into target by pass, until the final pass produces the final should take when pattern pi matches a
characters and those characters are grouped to program. code. lexeme.
form a sequence called lexemes which produces 5.CODE OPTIMIZATION PHASE • A one-pass compilers is faster than multi-pass User subroutines are auxiliary
token as output. The next phase does code optimization of the compilers procedures needed by the actions. The
Token: Token is a sequence of characters that intermediate code. Optimization can be assumed subroutine can be loaded with the
represent lexical unit, which matches with the as something that removes unnecessary code lines, • A one-pass compiler has limited scope of passes lexical analyzer and compiled
pattern, such as keywords, operators, identifiers and arranges the sequence of statements in order but multi-pass compiler has wide scope of passes. separately.
etc. to speed up the program execution without
wasting resources (CPU, memory). • Multi-pass compilers are sometimes called wide
Lexeme: Lexeme is instance of a token i.e., group compilers where as one-pass compiler are
of characters forming a token. sometimes called narrow compiler.
Pattern: Pattern describes the rule that the
lexemes of a token takes. It is the structure that • Many programming languages cannot be
must be matched by strings. represented with single pass compilers, for
Once a token is generated the corresponding entry An optimizer attempts to improve the time and example Pascal can be implemented with a single
is made in the symbol table. space requirements of a program. There are many pass compiler where as languages like Java
Example: ways in which code can optimized, but most are require a multi-pass compiler.
c=a+b*5; expensive in terms of time and space to
Lexemes Tokens implement. It is difficult to compile the source program into
c identifier Common optimizations include: single pass due to:
= assignment symbol Removing redundant identifiers, Forward reference: a forward reference to a
a identifier Removing unreachable sections of code program entity is a reference to the entity which
• The lexical analyzer returns a single quantity, the token, to the parser. To pass an attribute value with information about the lexeme, we can set the global variable yylval.
• For example, suppose the lexical analyzer returns a single token for all the relational operators, in which case the parser won't be able to distinguish between "<=", ">=", "<", ">", "==" etc. We can set yylval appropriately to specify the nature of the operator.
The two variables yytext and yyleng describe the matched lexeme:
1. yytext is a variable that is a pointer to the first character of the lexeme.
2. yyleng is an integer telling how long the lexeme is.
(Lexemes and Tokens. A lexeme is a string of characters that is a lowest-level syntactic unit in the programming language. These are the "words" and punctuation of the programming language. A token is a syntactic category that forms a class of lexemes.)

What is the difference between LEX and YACC?
• Lex is used to split the text into a list of tokens; which text becomes which token is specified using regular expressions in the lex file.
• Yacc is used to give some structure to those tokens. For example, programming languages have assignment statements like int a = 1 + 2; and we want to make sure that the left hand side of '=' is an identifier and the right side is an expression (which could be more complex than this one). This can be coded using a CFG rule, which is what you specify in the yacc file. It cannot be done using lex, because lex cannot handle recursive languages.
• A typical application of lex and yacc is implementing programming languages.
• Lex tokenizes the input, breaking it up into keywords, constants, punctuation, etc.
• Yacc then implements the actual computer language, recognizing a for statement, for instance, or a function definition.
• Lex and yacc are normally used together. This is how you usually construct an application using both:
• Input Stream (characters) -> Lex (tokens) -> Yacc (Abstract Syntax Tree) -> Your Application
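The division of labour described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the token names (KEYWORD, ID, NUM, OP), the tokenize function and the Parser class are hypothetical, and the hand-written recursive-descent parser merely stands in for the grammar rules that would normally be written in a yacc file (yacc itself generates an LALR(1) parser, not this kind of parser).

```python
import re

# Hypothetical token categories for this sketch; a real lex file declares its own.
TOKEN_SPEC = [
    ("KEYWORD", r"\bint\b"),
    ("ID",      r"[A-Za-z_]\w*"),
    ("NUM",     r"\d+"),
    ("OP",      r"[=+\-*/;()]"),
    ("SKIP",    r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(text):
    """Lex's role: turn a character stream into (token, lexeme) pairs."""
    tokens, pos = [], 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        if m.lastgroup != "SKIP":          # white space produces no token
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


class Parser:
    """Yacc's role: check that the tokens form  [int] ID = expr ;  and build a tree."""

    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else ("EOF", "")

    def expect(self, kind, lexeme=None):
        tok = self.peek()
        if tok[0] != kind or (lexeme is not None and tok[1] != lexeme):
            raise SyntaxError(f"expected {lexeme or kind}, got {tok[1]!r}")
        self.pos += 1
        return tok

    def statement(self):
        if self.peek()[0] == "KEYWORD":
            self.expect("KEYWORD")
        target = self.expect("ID")[1]      # left side must be an identifier
        self.expect("OP", "=")
        tree = self.expr()                 # right side must be an expression
        self.expect("OP", ";")
        self.expect("EOF")
        return ("assign", target, tree)

    def expr(self):
        node = self.term()
        while self.peek() in (("OP", "+"), ("OP", "-")):
            op = self.expect("OP")[1]
            node = (op, node, self.term())
        return node

    def term(self):
        node = self.factor()
        while self.peek() in (("OP", "*"), ("OP", "/")):
            op = self.expect("OP")[1]
            node = (op, node, self.factor())
        return node

    def factor(self):
        kind, lexeme = self.peek()
        if kind == "NUM":
            self.pos += 1
            return int(lexeme)
        if kind == "ID":
            self.pos += 1
            return lexeme
        self.expect("OP", "(")             # recursion: an expression inside parentheses
        node = self.expr()
        self.expect("OP", ")")
        return node
```

For example, Parser(tokenize("int a = 1 + 2;")).statement() yields ("assign", "a", ("+", 1, 2)), while "1 = a;" is rejected because the left side is not an identifier. The recursive call in factor is the part that regular expressions alone cannot express.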

Explain how yacc works?
• yacc is designed for use with C code and generates a parser written in C.
• The parser is configured for use in conjunction with a lex-generated scanner and relies on standard shared features (token types, yylval, etc.) and calls the function yylex as a scanner coroutine.
• You provide a grammar specification file, which is traditionally named using a .y extension.
• You invoke yacc on the .y file and it creates the y.tab.h and y.tab.c files (y.tab.h is written when yacc is run with the -d option). y.tab.c contains a thousand or so lines of intense C code that implements an efficient LALR(1) parser for your grammar, including the code for the actions you specified.
• The generated file provides an extern function yyparse that will attempt to successfully parse a valid sentence.
• You compile that C file normally, link with the rest of your code, and you have a parser! By default, the parser reads from stdin and writes to stdout, just like a lex-generated scanner does.
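The calling convention between parser and scanner can be imitated in Python to make the idea concrete. The sketch below is an analogy, not the generated C code: the Scanner class, the token codes and parse_comparison are hypothetical names, and yylval, yytext and yyleng are modelled as attributes of the scanner object rather than as C globals. It shows a single RELOP token being returned for every relational operator, with yylval carrying which operator was seen.

```python
import re

# Token codes shared by scanner and parser (hypothetical values for this sketch).
RELOP, NUM, ID, EOF = "RELOP", "NUM", "ID", "EOF"


class Scanner:
    """Imitates the yylex convention: yylex() returns a token code and leaves
    the attribute value in yylval, the lexeme in yytext and its length in yyleng."""

    RULES = [
        (re.compile(r"<=|>=|==|<|>"), RELOP),   # longer operators listed first
        (re.compile(r"\d+"), NUM),
        (re.compile(r"[A-Za-z_]\w*"), ID),
    ]

    def __init__(self, text):
        self.text, self.pos = text, 0
        self.yylval, self.yytext, self.yyleng = None, "", 0

    def yylex(self):
        # Skip white space without returning to the caller.
        while self.pos < len(self.text) and self.text[self.pos].isspace():
            self.pos += 1
        if self.pos >= len(self.text):
            return EOF
        for pattern, token in self.RULES:
            m = pattern.match(self.text, self.pos)
            if m:
                self.yytext = m.group()
                self.yyleng = len(self.yytext)
                # The attribute tells the parser which operator or value was seen.
                self.yylval = int(self.yytext) if token == NUM else self.yytext
                self.pos = m.end()
                return token
        raise SyntaxError(f"unexpected character {self.text[self.pos]!r}")


def parse_comparison(text):
    """Parser side: accept  operand RELOP operand  by calling yylex() repeatedly."""
    scanner = Scanner(text)

    def operand():
        if scanner.yylex() not in (NUM, ID):
            raise SyntaxError("operand expected")
        return scanner.yylval

    left = operand()
    if scanner.yylex() != RELOP:
        raise SyntaxError("relational operator expected")
    op = scanner.yylval                 # distinguishes <=, >=, <, >, ==
    right = operand()
    if scanner.yylex() != EOF:
        raise SyntaxError("unexpected trailing input")
    return (op, left, right)
```

Here parse_comparison("x <= 10") returns ("<=", "x", 10): the parser only ever sees the token code RELOP, and reads the specific operator from yylval.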

What is YACC?
• Yacc generates what is officially known as a "parser".
• The parser's job is to analyse the structure of the input stream, and operate on the "big picture".
• In the course of its normal work, the parser also verifies that the input is syntactically sound.
• Consider again the example of a C compiler. In the C language, a word can be a function name or a variable, depending on whether it is followed by a ( or a =. There should be exactly one } for each { in the program.
• YACC stands for "Yet Another Compiler Compiler". This is because this kind of analysis of text files is normally associated with writing compilers.
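The brace rule mentioned above is a small example of a structural check that needs counting rather than pattern matching. The function below is a hypothetical, hand-written stand-in for what a grammar rule enforces; as a simplifying assumption it ignores braces that appear inside string literals or comments.

```python
def braces_balanced(source):
    """Return True if every { has exactly one matching } that follows it."""
    depth = 0
    for ch in source:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth < 0:          # a } appeared before its {
                return False
    return depth == 0              # no { left unclosed
```

For instance, braces_balanced("int main() { if (x) { y = 1; } }") is True, while a source with one brace too many or too few gives False.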

Explain how the lexical analyzer works?
• The lexical analyzer created by Lex behaves in concert with a parser in the following manner.
• When activated by the parser, the lexical analyzer begins reading its remaining input, one character at a time, until it has found the longest prefix of the input that is matched by one of the regular expressions pi.
• Then it executes the corresponding action. Typically the action will return control to the parser.
• However, if it does not, then the lexical analyzer proceeds to find more lexemes, until an action causes control to return to the parser.
• The repeated search for lexemes until an explicit return allows the lexical analyzer to process white space and comments conveniently.
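The behaviour described above can be sketched as follows. This is a minimal illustration, assuming a hypothetical RULES table of (pattern, action) pairs; it is not how Lex is implemented internally (Lex compiles the patterns into a single automaton). An action that returns None plays the part of an action with no return statement, so scanning continues. Ties between equally long matches go to the rule listed first, which is the usual Lex convention.

```python
import re

# Each rule pairs a pattern p_i with an action. Returning None means
# "do not return to the parser, keep looking for the next lexeme".
RULES = [
    (re.compile(r"\s+"),          lambda lexeme: None),            # white space
    (re.compile(r"//[^\n]*"),     lambda lexeme: None),            # line comment
    (re.compile(r"if"),           lambda lexeme: ("IF", lexeme)),
    (re.compile(r"[A-Za-z_]\w*"), lambda lexeme: ("ID", lexeme)),
    (re.compile(r"\d+"),          lambda lexeme: ("NUM", lexeme)),
    (re.compile(r"<"),            lambda lexeme: ("OP", lexeme)),
    (re.compile(r"<="),           lambda lexeme: ("OP", lexeme)),
    (re.compile(r"="),            lambda lexeme: ("OP", lexeme)),
]


def next_token(text, pos):
    """Return (token, new_pos); token is None once the input is exhausted."""
    while pos < len(text):
        best_len, best_action = 0, None
        for pattern, action in RULES:
            m = pattern.match(text, pos)
            # Only a strictly longer match replaces the current best,
            # so the earliest rule wins a tie.
            if m and len(m.group()) > best_len:
                best_len, best_action = len(m.group()), action
        if best_action is None:
            raise SyntaxError(f"no rule matches at position {pos}")
        lexeme = text[pos:pos + best_len]
        pos += best_len
        result = best_action(lexeme)
        if result is not None:         # the action "returned" to the parser
            return result, pos
        # Otherwise (white space, comment): loop and look for another lexeme.
    return None, pos


def scan(text):
    """Collect all tokens, as a parser would by calling next_token repeatedly."""
    tokens, pos = [], 0
    while True:
        token, pos = next_token(text, pos)
        if token is None:
            return tokens
        tokens.append(token)
```

With this table, "iffy" is scanned as one identifier rather than the keyword if followed by "fy", and "<=" is scanned as one operator rather than "<" followed by "=", because the longest matching prefix is chosen in each case.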
