0% found this document useful (0 votes)
19 views40 pages

CS6109 Module 1

The document provides an overview of compiler design, covering the phases of a compiler, including lexical analysis, syntax analysis, semantic analysis, intermediate code generation, code optimization, and code generation. It also discusses the roles of assemblers, macro-processors, loaders, and linkers in the compilation process. Additionally, it highlights the importance of compiler construction tools and the structure of a compiler, emphasizing the analysis and synthesis parts.

Uploaded by

cip29.mit
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
19 views40 pages

CS6109 Module 1

The document provides an overview of compiler design, covering the phases of a compiler, including lexical analysis, syntax analysis, semantic analysis, intermediate code generation, code optimization, and code generation. It also discusses the roles of assemblers, macro-processors, loaders, and linkers in the compilation process. Additionally, it highlights the importance of compiler construction tools and the structure of a compiler, emphasizing the analysis and synthesis parts.

Uploaded by

cip29.mit
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
You are on page 1/ 40

CS6109 – COMPILER DESIGN

Module – 1

Presented By
Dr. S. Muthurajkumar,
Assistant Professor,
Dept. of CT, MIT Campus,
Anna University, Chennai.
MODULE - 1
 Phases of the compiler
 Compiler construction tools
 Role of assemblers
 Macro-processors
 Loaders
 Linkers

2
COMPILER
 Programming languages are notations for describing computations to people
and to machines.
 The world as we know it depends on programming languages, because all the
software running on all the computers was written in some programming
language.
 But, before a program can be run, it first must be translated into a form in
which it can be executed by a computer.
 The software systems that do this translation are called compilers.

3
COMPILER
 A compiler is a program that can read a program in one language - the source
language - and translate it into an equivalent program in another language - the
target language.
 An important role of the compiler is to report any errors in the source program
that it detects during the translation process.

4
RUNNING THE TARGET PROGRAM
 If the target program is an executable machine-language program, it can then
be called by the user to process inputs and produce outputs.

5
INTERPRETER
 An interpreter is another common kind of language processor.
 Instead of producing a target program as a translation, an interpreter appears to
directly execute the operations specified in the source program on inputs
supplied by the user

6
INTERPRETER
 The machine-language target program produced by a compiler is usually much
faster than an interpreter at mapping inputs to outputs .
 An interpreter, however, can usually give better error diagnostics than a
compiler, because it executes the source program statement by statement.

7
HYBRID COMPILER
 A Java source program may first be compiled into an intermediate form called
bytecodes. The bytecodes are then interpreted by a virtual machine.
 A benefit of this arrangement is that bytecodes compiled on one machine can
be interpreted on another machine, perhaps across a network.

8
LANGUAGE-PROCESSING SYSTEM
 A source program may be divided into modules stored in separate files.
 The task of collecting the source program is sometimes entrusted to a separate
program, called a pre-processor.
 The pre-processor may also expand short hands, called macros, into source
language statements.
 The modified source program is then fed to a compiler.
 The compiler may produce an assembly language program as its output,
because assembly language is easier to produce as output and is easier to
debug.
 The assembly language is then processed by a program called an assembler that
produces relocatable machine code as its output.

9
LANGUAGE-PROCESSING SYSTEM
 Large programs are often compiled in
pieces, so the relocatable machine
code may have to be linked together
with other relocatable object files and
library files into the code that actually
runs on the machine.
 The linker resolves external memory
addresses, where the code in one file
may refer to a location in another file.
 The loader then puts together all of
the executable object files into
memory for execution.

10
STRUCTURE OF A COMPILER/PHASES
OF A COMPILER
 Two Parts
 Analysis
 Synthesis

11
ANALYSIS
 The analysis part breaks up the source program into constituent pieces and
imposes a grammatical structure on them.
 It then uses this structure to create an intermediate representation of the source
program.
 If the analysis part detects that the source program is either syntactically ill
formed or semantically unsound, then it must provide informative messages, so
the user can take corrective action.
 The analysis part also collects information about the source program and stores
it in a data structure called a symbol table, which is passed along with the
intermediate representation to the synthesis part.

12
SYNTHESIS
 The synthesis part constructs the desired target program from the intermediate
representation and the information in the symbol table.
 The analysis part is often called the front end of the compiler; the synthesis part is the
back end.
 It operates as a sequence of phases, several phases may be grouped together, and the
intermediate representations between the grouped phases need not be constructed
explicitly.
 The symbol table, which stores information about the entire source program, is used
by all phases of the compiler.
 Some compilers have a machine-independent optimization phase between the front
end and the back end.
 The purpose of this optimization phase is to perform transformations on the
intermediate representation, so that the back end can produce a better target program
than it would have otherwise produced from an unoptimized intermediate
representation.
13
PHASES OF A COMPILER

14
LEXICAL ANALYSIS
 The first phase of a compiler is called lexical analysis or scanning. The lexical
analyzer reads the stream of characters making up the source program and
groups the characters into meaningful sequences called lexemes. For each
lexeme, the lexical analyzer produces as output a token of the form
<token-name; attribute-value>
 that it passes on to the subsequent phase, syntax analysis.
 In the token, the first component token-name is an abstract symbol that is used
during syntax analysis, and the second component attribute-value points to an
entry in the symbol table for this token. Information from the symbol-table
entry is needed for semantic analysis and code generation.
 For example, suppose a source program contains the assignment statement
 position = initial + rate * 60
15
LEXICAL ANALYSIS
 The first phase of a compiler is called lexical analysis or scanning. The lexical
For example, suppose a source program contains the assignment statement
 position = initial + rate * 60
 Position  id1
 =  token (assignment symbol)
 Initial  id2
 +  token (plus operator)
 Rate  id3
 *  token (multiplication operator)
 60  token (number 60)
 <id1> <=> <id2> <+> <id3> <*> <60>

16
LEXICAL ANALYSIS

17
SYNTAX ANALYSIS
 The second phase of the compiler is syntax analysis or parsing.
 The parser uses the first components of the tokens produced by the lexical
analyzer to create a tree like intermediate representation that depicts the
grammatical structure of the token stream.
 A typical representation is a syntax tree in which each interior node represents
an operation and the children of the node represent the arguments of the
operation.
 position = initial + rate * 60

18
SEMANTIC ANALYSIS
 The semantic analyzer uses the syntax tree and the information in the symbol
table to check the source program for semantic consistency with the language
definition.
 It also gathers type information and saves it in either the syntax tree or the
symbol table, for subsequent use during intermediate-code generation.
 An important part of semantic analysis is type checking, where the compiler
checks that each operator has matching operands.
 For example, many programming language definitions require an array index to
be an integer; the compiler must report an error if a floating-point number is
used to index an array.

19
INTERMEDIATE CODE GENERATION
 A compiler may construct one or more intermediate representations, which can
have a variety of forms.
 After syntax and semantic analysis of the source program, many compilers
generate an explicit low-level or machine-like intermediate representation,
which we can think of as a program for an abstract machine.
 This intermediate representation should have two important properties: it
should be easy to produce and it should be easy to translate into the target
machine.

20
INTERMEDIATE CODE GENERATION
 An intermediate form called three-address code, which consists of a sequence
of assembly-like instructions with three operands per instruction.
 Each operand can act like a register.
 t1 = inttofloat(60)
 t2 = id3 * t1
 t3 = id2 + t2
 id1 = t3

21
CODE OPTIMIZATION
 The machine-independent code-optimization phase attempts to improve the
intermediate code so that better target code will result.
 Usually better means faster, but other objectives may be desired, such as
shorter code, or target code that consumes less power.
 t1 = id3 * 60.0
 id1 = id2 + t1

22
CODE GENERATION
 The code generator takes as input an intermediate representation of the source
program and maps it into the target language.
 If the target language is machine code, registers or memory locations are selected for
each of the variables used by the program.
 Then, the intermediate instructions are translated into sequences of machine
instructions that perform the same task.
 A crucial aspect of code generation is the judicious assignment of registers to hold
variables.
 LDF R2, id3
 MULF R2, R2, #60.0
 LDF R1, id2
 ADDF R1, R1, R2
 STF id1, R1
23
SYMBOL-TABLE MANAGEMENT
 An essential function of a compiler is to record the variable names used in the
source program and collect information about various attributes of each name.
 These attributes may provide information about the storage allocated for a name,
its type, its scope (where in the program its value may be used), and in the case of
procedure names, such things as the number and types of its arguments, the
method of passing each argument (for example, by value or by reference), and the
type returned.
 The symbol table is a data structure containing a record for each variable name,
with fields for the attributes of the name.
 The data structure should be designed
to allow the compiler to find the record
for each name quickly and to store or retrieve
data from that record quickly.
24
COMPILER CONSTRUCTION TOOLS
 The compiler writer, like any software developer, can profitably use modern
software development environments containing tools such as language editors,
debuggers, version managers, profilers, test harnesses, and so on.
 In addition to these general software-development tools, other more specialized
tools have been created to help implement various phases of a compiler.
§ Parser generators that automatically produce syntax analyzers from a
grammatical description of a programming language.
§ Scanner generators that produce lexical analyzers from a regular-expression
description of the tokens of a language.
§ Syntax-directed translation engines that produce collections of routines for
walking a parse tree and generating intermediate code.

25
COMPILER CONSTRUCTION TOOLS
4. Code-generator generators that produce a code generator from a collection of
rules for translating each operation of the intermediate language into the
machine language for a target machine.
5. Data-flow analysis engines that facilitate the gathering of information about
how values are transmitted from one part of a program to each other part.
Data-flow analysis is a key part of code optimization.
6. Compiler-construction toolkits that provide an integrated set of routines for
constructing various phases of a compiler.

26
ROLE OF ASSEMBLERS
 Assembler is a program for converting instructions written in low-level
assembly code into relocatable machine code and generating along information
for the loader.

 An assembler translates assembly language programs into machine code.


 The output of an assembler is called an object file, which contains a
combination of machine instructions as well as the data required to place these
instructions in memory.
27
ROLE OF ASSEMBLERS
 Assembler divide these tasks in two passes:
 Pass-1:
 Define symbols and literals and remember them in symbol table and literal
table respectively.
 Keep track of location counter
 Process pseudo-operations
 Pass-2:
 Generate object code by converting symbolic opcode into respective
numeric opcode
 Generate data for literals and look for values of symbols

28
MACRO-PROCESSORS
 A macro processor is a system software.
 Macro is that the Section of code that the programmer writes (defines) once,
and then can use or invokes many times.
 A Macro instruction is the notational convenience for the programmer.
 For every occurrence of macro the whole macro body or macro block of
statements gets expanded in the main source code.
 Thus Macro instructions make writing code more convenient.

29
MACRO-PROCESSORS
 Salient features of Macro Processor:
 Macro represents a group of commonly used statements in the source
programming language.
 Macro Processor replaces each macro instruction with the corresponding group
of source language statements. This is known as the expansion of macros.
 Using Macro instructions programmer can leave the mechanical details to be
handled by the macro processor.
 Macro Processor designs are not directly related to the computer architecture
on which it runs.
 Macro Processor involves definition, invocation, and expansion.

30
MACRO-PROCESSORS
 Macro Definition and Expansion:
 Line Label Opcode Operand
5 COPY START 0
 10 RDBUFF MACRO &INDEV, &BUFADR
 15
.
.
 90
 95 MEND

31
MACRO-PROCESSORS
 Macro Definition and Expansion:
 Line 10:
 RDBUFF (Read Buffer) in the Label part is the name of the Macro or
definition of the Macro. &INDEV and &BUFADR are the parameters present
in the Operand part. Each parameter begins with the character &.
 Line 15 – Line 90:
 From Line 15 to Line 90 Macro Body is present. Macro directives are the
statements that make up the body of the macro definition.
 Line 95:
 MEND is the assembler directive that means the end of the macro definition.

32
MACRO-PROCESSORS
 Macro Invocation:
 Line Label Opcode Operand
 180 FIRST STL RETADR
 190 CLOOP RDBUFF F1, BUFFER
 15
.
.
 255 END FIRST
 Line 190:
 RDBUFF is the Macro invocation or Macro Call that gives the name of the macro
instruction being invoked and F1, BUFFER are the arguments to be used in expanding
the macro. The statement that form the expansion of a macro are generated each time
the macro is invoked.
33
LOADERS
 Loader is a part of operating system and is responsible for loading executable
files into memory and execute them.
 It calculates the size of a program (instructions and data) and creates memory
space for it.
 It initializes various registers to initiate execution.

34
LOADERS
 The loader is special program that takes input of object code from linker, loads
it to main memory, and prepares this code for execution by computer.
 Loader allocates memory space to program.
 Even it settles down symbolic reference between objects.
 It is in charge of loading programs and libraries in operating system.
 The embedded computer systems don’t have loaders. In them, code is executed
through ROM.
 There are following various loading schemes:
1. Absolute Loaders
2. Reloacting Loaders
3. Direct Linking Loaders
4. Bootstrap Loaders
35
LINKERS
 Linker is a computer program that links and merges various object files
together in order to make an executable file.
 All these files might have been compiled by separate assemblers.
 The major task of a linker is to search and locate referenced module/routines in
a program and to determine the memory location where these codes will be
loaded, making the program instruction to have absolute references.

36
LINKERS
 A linker is special program that combines the object files, generated by
compiler/assembler, and other pieces of codes to originate an executable file
have. exe extension.
 In the object file, linker searches and append all libraries needed for execution
of file.
 It regulates the memory space that will hold the code from each module.
 It also merges two or more separate object programs and establishes link
among them.
 Generally, linkers are of two types :
1. Linkage Editor
2. Dynamic Linker

37
DIFFERENCE BETWEEN LINKER AND
LOADER
S. No. LINKER LOADER
The main function of Linker is to generate Whereas main objective of Loader is to load
1
executable files. executable files to main memory.
The linker takes input of object code generated And the loader takes input of executable files
2
by compiler/assembler. generated by linker.

Linking can be defined as process of Loading can be defined as process of loading


3 combining various pieces of codes and source executable codes to main memory for further
code to obtain executable code. execution.

Linkers are of 2 types: Linkage Editor and Loaders are of 4 types: Absolute, Relocating,
4
Dynamic Linker. Direct Linking, Bootstrap.
Another use of linker is to combine all object It helps in allocating the address to executable
5
modules. codes/files.
Linker is also responsible for arranging objects Loader is also responsible for adjusting references
6
in program’s address space. which are used within the program.
38
REFERENCE
1. Alfred V. Aho, Monica S. Lam, Ravi Sethi, Jeffrey D. Ullman, “Compilers:
Principles, Techniques and Tools”, Second Edition, Pearson Education Limited,
2014.
2. Randy Allen, Ken Kennedy, “Optimizing Compilers for Modern Architectures:
A Dependence-based Approach”, Morgan Kaufmann Publishers, 2002.
3. Steven S. Muchnick, “Advanced Compiler Design and Implementation”,
Morgan Kaufmann Publishers - Elsevier Science, India, Indian Reprint, 2003.
4. Keith D Cooper and Linda Torczon, “Engineering a Compiler”, Morgan
Kaufmann Publishers, Elsevier Science, 2004.
5. V. Raghavan, “Principles of Compiler Design”, Tata McGraw Hill Education
Publishers, 2010.
6. Allen I. Holub, “Compiler Design in C”, Prentice-Hall Software Series, 1993.

39
40

You might also like

pFad - Phonifier reborn

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.


Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy