Introduction To CD
Lecture-Module 1
INTRODUCTION TO COMPILERS:
AN OVERVIEW
Parimal Giri
Topics
Overview of compilers
Lexical analysis (Scanning)
Syntactic analysis (Parsing)
Context-sensitive analysis
Type checking
Runtime environments
Symbol tables
Intermediate representations
Intermediate code generation
Code optimization
A bit of history
Compiler learning
Terminology
Compiler:
a program that translates a source program in one
language into an executable program in another language
we expect the program produced by the compiler to be
better, in some way, than the original
Interpreter:
a program that reads a source program and produces
the results of running that program
usually, this involves executing the source program in
some fashion
Our course is mainly about compilers, but many of
the same issues arise in interpreters
Disciplines involved
Algorithms
Languages and machines
Operating systems
Computer architectures
Compilers
What is a compiler?
A program that translates a program in one language (source
language) into an equivalent program in another language (target
language), and reports errors in the source program
A compiler typically lowers the level of abstraction of the
program
C -> assembly code for EOS machine
Java -> Java bytecode
What is an interpreter?
A program that reads an executable program and produces the
results of executing that program
C is typically compiled
Scheme is typically interpreted
Java is compiled to bytecodes, which are then
interpreted
Why build compilers?
Why study compilers?
Compilers embody a wide range of theoretical
techniques and their application to practice
DFAs, PDAs, formal languages, formal grammars, fixpoint
algorithms, lattice theory
Compiler construction teaches programming and
software engineering skills
Compiler construction involves a variety of areas
theory, algorithms, systems, architecture
The techniques used in various parts of compiler
construction are useful in a wide variety of applications
Many practical applications have embedded languages,
commands, macros, etc.
Is compiler construction a solved problem?
No! New developments in programming languages (Java) and
machine architectures (multicore machines) present new
challenges
Compiler Architecture
In more detail:
[Figure: Analysis -> Intermediate Language -> Synthesis]
•Separation of Concerns
•Retargeting
Abstract view
[Figure: Source code -> Compiler -> Machine code, with errors reported]
Recognizes legal (and illegal) programs
Generates correct code
Manages storage of all variables and code
Agrees on a format for object (or assembly) code
Front-end, Back-end division
[Figure: Source code -> Front end -> IR -> Back end -> Machine code, with errors reported]
Front end
[Figure: Source code -> Scanner -> tokens -> Parser -> IR, with errors reported]
Scanner:
Maps characters into tokens – the basic unit of
syntax
x = x + y becomes <id, x> = <id, x> + <id, y>
Typical tokens: number, id, +, -, *, /, do, end
Eliminates white space (tabs, blanks, comments)
A key issue is speed, so instead of using a tool
like LEX it is sometimes necessary to write your
own scanner
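The scanner's job can be illustrated with a small hand-written sketch in Python. This is only one possible approach, not the course's scanner: the names TOKEN_SPEC, KEYWORDS and scan are invented for this example, the token classes are a guess based on the "typical tokens" listed above, and comment stripping is left out.

```python
import re

# Hypothetical token classes; order matters because earlier patterns win.
TOKEN_SPEC = [
    ("num", r"\d+"),
    ("id", r"[A-Za-z_]\w*"),
    ("op", r"[+\-*/=]"),
    ("skip", r"\s+"),      # white space is recognized and then dropped
    ("error", r"."),       # any other character is illegal
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))
KEYWORDS = {"do", "end"}   # assumed keyword set, taken from the token list above


def scan(text):
    """Map a character string to a list of tokens."""
    tokens = []
    for m in MASTER.finditer(text):
        kind, lexeme = m.lastgroup, m.group()
        if kind == "skip":
            continue
        if kind == "error":
            raise SyntaxError(f"illegal character {lexeme!r} at offset {m.start()}")
        if kind == "op" or (kind == "id" and lexeme in KEYWORDS):
            tokens.append(lexeme)               # operators and keywords stand for themselves
        else:
            tokens.append(f"<{kind}, {lexeme}>")
    return tokens


print(scan("x = x + y"))
```

For the input x = x + y this yields the token sequence shown on the slide. A table-driven or generated scanner would do the same job; the regular-expression loop is used here only because it is short.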
Parser:
Recognizes context-free syntax
Guides context-sensitive analysis
Constructs IR
Produces meaningful error messages
Attempts error correction
There are parser generators like YACC which
automate much of the work
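As a rough illustration of what a parser does, here is a minimal hand-written recursive-descent sketch in Python. It is not what YACC generates (YACC produces table-driven bottom-up parsers). The grammar (expr -> term ('+' term)*, term -> factor ('*' factor)*), the tuple-based tree format and all function names are assumptions made for this example.

```python
import re


def tokenize(text):
    # Simplified stand-in for a scanner: identifiers, numbers, + * ( ).
    # Characters outside this set are silently skipped in this sketch.
    return re.findall(r"[A-Za-z_]\w*|\d+|[+*()]", text)


def parse_expr(tokens, pos=0):
    # expr -> term ('+' term)*
    node, pos = parse_term(tokens, pos)
    while pos < len(tokens) and tokens[pos] == "+":
        right, pos = parse_term(tokens, pos + 1)
        node = ("+", node, right)
    return node, pos


def parse_term(tokens, pos):
    # term -> factor ('*' factor)*
    node, pos = parse_factor(tokens, pos)
    while pos < len(tokens) and tokens[pos] == "*":
        right, pos = parse_factor(tokens, pos + 1)
        node = ("*", node, right)
    return node, pos


def parse_factor(tokens, pos):
    # factor -> id | num | '(' expr ')'
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    tok = tokens[pos]
    if tok == "(":
        node, pos = parse_expr(tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise SyntaxError("expected ')'")
        return node, pos + 1
    if tok.isdigit():
        return ("num", int(tok)), pos + 1
    if tok[0].isalpha() or tok[0] == "_":
        return ("id", tok), pos + 1
    raise SyntaxError(f"unexpected token {tok!r}")


def parse(text):
    """Build a tree (the IR of this sketch) or raise SyntaxError."""
    tokens = tokenize(text)
    tree, pos = parse_expr(tokens)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree


print(parse("sum + x*10"))
```

Each grammar rule becomes one function, so the tree shape follows operator precedence: x*10 ends up below the + node. The SyntaxError messages stand in for the "meaningful error messages" bullet; error correction is not attempted.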
Back end
Traditional three-pass compiler
[Figure: Source code -> Front end -> IR -> Middle end -> IR -> Back end -> Machine code, with errors reported]
Desirable Properties of Compilers
What are the issues in compiler construction?
Source code
written in a high-level programming language

//simple example
while (sum < total)
{
sum = sum + x*10;
}

Target code
Assembly language (chapter 9), which in turn is translated to machine code

L1:MOV total,R0
CMP sum,R0
CJ< L2
GOTO L3
L2:MOV #10,R0
MUL x,R0
ADD sum,R0
MOV R0,sum
GOTO L1
L3:first instruction following the while statement
What is the input?
//simple example
while (sum < total)
{
sum = sum + x*10;
}
//simple\bexample\nwhile\b(sum\b<\btotal)\b{\n\tsum\b=\bsum\b+\bx*10;\n}\n
The Structure of a Compiler (1)
The Structure of a Compiler (2)
[Figure: Source Program (Character Stream) -> Scanner -> Tokens -> Parser -> Syntactic Structure -> Semantic Routines -> Intermediate Representation -> Code Generator -> Target machine code]
Lexical Analyzer
WHILE,LPAREN,<ID,sum>,LT,<ID,total>,RPAREN,LBRACE,
<ID,sum>,EQ,<ID,sum>,PLUS,<ID,x>,TIMES,<NUM,10>,
SEMICOL,RBRACE
Lexical Analysis (Scanning)
Syntax Analyzer
[Figure: parse tree; surviving leaf labels: oldval, 12]
Next Step: Syntax Analysis (Parsing)
Semantic Analyzer
The type of the identifier newval must match the type of the expression
(oldval+12)
Next Step: Semantic (Context-Sensitive) Analysis
+
  <id,sum>
  *
    <id,x>
    <num,10>
may become
+
  <id,sum>
  int2float
    *
      <id,x>
      <num,10>
Symbol Table
sum float
x int
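The rewrite above can be sketched as a small type-annotation pass in Python. This is a minimal sketch, not the course's algorithm. The tuple tree format, the SYMBOLS dictionary and the function annotate are assumptions made for this example; only the types int and float and the coercion name int2float come from the slide.

```python
# Symbol table from the slide: sum is float, x is int.
SYMBOLS = {"sum": "float", "x": "int"}


def annotate(node):
    """Return (tree with coercions inserted, type of the tree)."""
    kind = node[0]
    if kind == "id":
        if node[1] not in SYMBOLS:
            raise NameError(f"undeclared identifier {node[1]!r}")
        return node, SYMBOLS[node[1]]
    if kind == "num":
        return node, "int"        # assumption: numeric literals are integers
    op, left, right = node
    left, left_type = annotate(left)
    right, right_type = annotate(right)
    if left_type == right_type:
        return (op, left, right), left_type
    # Mixed int/float operands: wrap the int side in an explicit conversion.
    if left_type == "int":
        left = ("int2float", left)
    else:
        right = ("int2float", right)
    return (op, left, right), "float"


tree = ("+", ("id", "sum"), ("*", ("id", "x"), ("num", 10)))
typed_tree, result_type = annotate(tree)
print(typed_tree, result_type)
```

The multiplication x*10 is checked first and stays an int operation; the conversion is inserted only where the int result meets the float operand sum, which matches the second tree on the slide.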
Intermediate Code Generation
Intermediate Representations
[Figure: expression tree; surviving node labels: <id,sum>, *, <id,x>, <num,10>]
Intermediate Code Generator
Code Optimizer (for Intermediate Code Generator)
Ex:
MULT id2,id3,temp1
ADD temp1,#1,id1
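One simple optimization that could produce code like this is removing a copy out of a temporary. The Python sketch below is a guess at the idea, not the optimizer used in the course. The three-instruction input, the MOV opcode for copies, and the function fold_copies are assumptions for this example; the slide shows only the two-instruction result.

```python
def fold_copies(code):
    """Fold `MOV tempN,dst` into the preceding instruction that defines tempN.

    Assumes the temporary is not used again after the copy.
    """
    out = []
    for instr in code:
        op, *args = instr
        if (op == "MOV" and out
                and args[0].startswith("temp")
                and out[-1][-1] == args[0]):
            prev = out.pop()
            out.append(prev[:-1] + (args[1],))   # write straight to the destination
        else:
            out.append(instr)
    return out


# Hypothetical unoptimized intermediate code (not shown on the slide).
unoptimized = [
    ("MULT", "id2", "id3", "temp1"),
    ("ADD", "temp1", "#1", "temp2"),
    ("MOV", "temp2", "id1"),
]
for instr in fold_copies(unoptimized):
    print(instr[0], ",".join(instr[1:]))
```

Under these assumptions the output is the two instructions of the example: the ADD writes directly into id1 and temp2 disappears.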
Improving the Code: Code Optimization
Code Generator
Ex:
(assume an architecture in which at least one operand of every
instruction is a machine register)
MOVE id2,R1
MULT id3,R1
ADD #1,R1
MOVE R1,id1
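The translation from three-address intermediate code to this register-based form can be sketched with a deliberately naive single-register generator in Python. The function gen, the tuple instruction format and the rule "names starting with temp live only in the register" are assumptions for this example; real code generators handle many registers and instruction forms.

```python
def gen(code, reg="R1"):
    """Translate (op, a, b, dst) tuples into two-operand register code."""
    out = []
    in_reg = None                          # name of the value currently held in reg
    for op, a, b, dst in code:
        if in_reg != a:
            out.append(f"MOVE {a},{reg}")  # load the first operand if not already there
        out.append(f"{op} {b},{reg}")      # reg := reg op b
        in_reg = dst
        if not dst.startswith("temp"):
            out.append(f"MOVE {reg},{dst}")  # store results that are real variables
    return out


intermediate = [
    ("MULT", "id2", "id3", "temp1"),
    ("ADD", "temp1", "#1", "id1"),
]
print("\n".join(gen(intermediate)))
```

Because temp1 is still in R1 when the ADD needs it, no load or store is emitted for it, and the result is the four instructions shown above. The sketch relies on each temporary being consumed by the very next instruction.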
Next Step: Code Generation
Target code
If we generate code for each statement separately
we will not generate efficient code

code for first statement
MOV b,R0
ADD c,R0
MOV R0,a
code for second statement
MOV a,R0    This instruction is redundant
ADD e,R0
MOV R0,d
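The redundant load can be removed by a small peephole pass over adjacent instructions. The Python sketch below is one possible way to do it, assuming straight-line code with no label or jump landing between the two instructions. The function drop_redundant_loads and the tuple format are invented for this example.

```python
def drop_redundant_loads(code):
    """Remove `MOV a,R` when it directly follows `MOV R,a`."""
    out = []
    for instr in code:
        op, src, dst = instr
        if out:
            prev_op, prev_src, prev_dst = out[-1]
            # The register already holds the value that was just stored.
            if op == "MOV" and prev_op == "MOV" and src == prev_dst and dst == prev_src:
                continue
        out.append(instr)
    return out


code = [
    ("MOV", "b", "R0"), ("ADD", "c", "R0"), ("MOV", "R0", "a"),   # first statement
    ("MOV", "a", "R0"), ("ADD", "e", "R0"), ("MOV", "R0", "d"),   # second statement
]
for op, src, dst in drop_redundant_loads(code):
    print(f"{op} {src},{dst}")
```

Only the load MOV a,R0 is dropped; the store MOV R0,a must stay because a may be read later.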
Code Generation: Register Allocation
The Structure of a Compiler (8)
[Figure: compiler phases; surviving labels in order: Tokens -> Parser [Syntax Analyzer] -> Parse tree -> Semantic Process [Semantic analyzer] -> Code Generator [Intermediate Code Generator] -> Code Optimizer -> Optimized Intermediate Code -> Code Optimizer -> Target machine code]
Issues Driving Compiler Design
Correctness
Speed (runtime and compile time)
Degrees of optimization
Multiple passes
Space
Feedback to user
Debugging
Tools