
2.question Bank

The document discusses various aspects of code generation in compiler design, including issues like input to the code generator, target program types, memory management, instruction selection, register allocation, and evaluation order. It also covers basic blocks, flow graphs, peephole optimization, stack allocation for function calls, directed acyclic graphs (DAG), activation records, and different representations of three-address codes such as quadruples, triples, and indirect triples. Each section provides explanations and examples to illustrate the concepts involved in compiler design and code optimization.

Uploaded by

I Know

Question Bank

Unit-4

Q1. Explain the various issues in the design of a code generator.

Answer: The following issues arise in the code generation phase:

1. Input to the code generator
2. Target program
3. Memory management
4. Instruction selection
5. Register allocation
6. Evaluation order

1. Input to the code generator

o The input to the code generator consists of the intermediate representation of the source
program, produced by the front end, together with the information in the symbol table.
o There are several choices for the intermediate representation:
a) Postfix notation
b) Syntax tree
c) Three-address code
o We assume that the front end produces a low-level intermediate representation, i.e. the values
of the names in it can be directly manipulated by machine instructions.
o The code generator requires complete, error-free intermediate code as its input.

2. Target program:

The target program is the output of the code generator. The output can be:
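a) Assembly language: It makes the process of code generation easier, because the code
generator can emit symbolic instructions and leave the final encoding to the assembler.

b) Relocatable machine language: It allows subprograms to be compiled separately and linked
together afterwards.

c) Absolute machine language: It can be placed in a fixed location in memory and can be
executed immediately.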

3. Memory management

o During code generation, the names in the symbol table have to be mapped to actual run-time
addresses, and labels have to be mapped to instruction addresses.
o Mapping the names in the source program to the addresses of data objects is done
cooperatively by the front end and the code generator.
o Local variables are allocated on the stack, in the activation record, while global variables are
placed in the static area.

4. Instruction selection:

o The uniformity and completeness of the target machine's instruction set are important factors.
o If the efficiency of the target program matters, instruction speeds and machine idioms are
important factors as well.
o The quality of the generated code is determined by its speed and size.

Example:

The three-address code is:

    a := b + c
    d := a + e

Translating each statement independently gives inefficient assembly code:

    MOV b, R0      R0 ← b
    ADD c, R0      R0 ← R0 + c
    MOV R0, a      a ← R0
    MOV a, R0      R0 ← a
    ADD e, R0      R0 ← R0 + e
    MOV R0, d      d ← R0

The fourth instruction is redundant, because R0 already holds the value of a.
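The statement-by-statement translation in this example can be reproduced with a short sketch. It is only an illustration: the function name naive_codegen, the single register R0, and the restriction to statements of the form x := y + z are assumptions made here, not part of any real code generator.

```python
def naive_codegen(statements):
    """Translate statements of the form 'x := y + z' one at a time, ignoring context."""
    code = []
    for stmt in statements:
        target, expr = [part.strip() for part in stmt.split(":=")]
        left, right = [part.strip() for part in expr.split("+")]
        code.append(f"MOV {left}, R0")    # R0 <- left operand
        code.append(f"ADD {right}, R0")   # R0 <- R0 + right operand
        code.append(f"MOV R0, {target}")  # target <- R0
    return code


asm = naive_codegen(["a := b + c", "d := a + e"])
for line in asm:
    print(line)
```

Because each statement is handled in isolation, the generator stores a and then immediately reloads it. A better instruction selector, or a later optimization pass, has to notice that R0 still holds a.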

5. Register allocation

Registers can be accessed faster than memory. Instructions whose operands are in registers are
shorter and faster than those whose operands are in memory.

The following subproblems arise when we use registers:

Register allocation: We select the set of variables that will reside in registers at each point in
the program.

Register assignment: We pick the specific register in which each such variable will reside.

Certain machines require even-odd register pairs for some operands and results.

For example:

Consider a division instruction of the form:

    D x, y

Where,

x is the even register of an even/odd register pair that holds the dividend

y is the divisor

After the division, the even register holds the remainder.

The odd register holds the quotient.

6. Evaluation order
The order in which computations are performed can affect the efficiency of the target code.
Some computation orders require fewer registers to hold intermediate results than others.
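To see how much the order can matter, the sketch below counts registers for one expression tree under two strategies. It assumes a simple machine model in which every operand must be loaded into a register before use; the tuple encoding of the tree and both function names are chosen here for illustration. The first function computes what are usually called Ershov numbers (the labels used in Sethi-Ullman style code generation), a technique the notes above do not cover.

```python
def ershov_number(node):
    """Fewest registers needed when the costlier subtree is evaluated first."""
    if isinstance(node, str):          # a leaf needs one register for its value
        return 1
    _, left, right = node
    l, r = ershov_number(left), ershov_number(right)
    return max(l, r) if l != r else l + 1


def left_first_registers(node):
    """Registers needed when the left operand is always evaluated first."""
    if isinstance(node, str):
        return 1
    _, left, right = node
    l, r = left_first_registers(left), left_first_registers(right)
    # The left result occupies one register while the right operand is computed.
    return max(l, 1 + r)


# a + (b * (c - d))
expr = ("+", "a", ("*", "b", ("-", "c", "d")))
print(ershov_number(expr), left_first_registers(expr))
```

Under these assumptions, strict left-to-right evaluation of a + (b * (c - d)) needs four registers, while evaluating the deeper right-hand subtree first needs only two.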

Q2. Write short notes on basic blocks and flow graphs.

Answer: Basic Block

A basic block is a sequence of consecutive statements in which the flow of control enters at the
first statement and leaves at the last, without halting or branching except possibly at the last
statement of the block.

The following sequence of three-address statements forms a basic block:

    t1 := x * x
    t2 := x * y
    t3 := 2 * t2
    t4 := t1 + t3
    t5 := y * y
    t6 := t4 + t5

Basic block construction:

Algorithm: Partition into basic blocks

Input: A sequence of three-address statements

Output: A list of basic blocks, with each three-address statement in exactly one block

Method: First identify the leaders in the code. The rules for finding leaders are as follows:

o The first statement is a leader.
o Any statement that is the target of a conditional or unconditional goto (if ... goto L or goto L)
is a leader.
o Any statement that immediately follows a conditional or unconditional goto is a leader.

For each leader, its basic block consists of the leader and all statements up to, but not
including, the next leader or the end of the program.

Consider the following source code for the dot product of two vectors a and b of length 10:

    begin
        prod := 0;
        i := 1;
        do begin
            prod := prod + a[i] * b[i];
            i := i + 1;
        end
        while i <= 10
    end

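The three-address code for the above source program is given below:

B1

    (1) prod := 0
    (2) i := 1

B2

    (3) t1 := 4 * i
    (4) t2 := a[t1]
    (5) t3 := 4 * i
    (6) t4 := b[t3]
    (7) t5 := t2 * t4
    (8) t6 := prod + t5
    (9) prod := t6
    (10) t7 := i + 1
    (11) i := t7
    (12) if i <= 10 goto (3)

Statement (1) is a leader because it is the first statement, and statement (3) is a leader because
it is the target of the goto in statement (12).

Basic block B1 contains statements (1) and (2).

Basic block B2 contains statements (3) to (12).

Flow Graph

A flow graph has the basic blocks as its nodes and an edge from block X to block Y whenever Y
can execute immediately after X. Here there is an edge from B1 to B2, because B2 follows B1,
and an edge from B2 back to itself, because of the jump in statement (12).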
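The leader rules can be applied mechanically to the twelve statements above. The following is a minimal sketch, assuming that statements are numbered from 1 and that every jump is written as goto (n); the function names find_leaders and partition are invented for this example.

```python
import re


def find_leaders(code):
    """Return the sorted statement numbers (1-based) that are leaders."""
    leaders = {1}                                  # rule 1: the first statement
    for number, stmt in enumerate(code, start=1):
        match = re.search(r"goto \((\d+)\)", stmt)
        if match:
            leaders.add(int(match.group(1)))       # rule 2: the target of a goto
            if number < len(code):
                leaders.add(number + 1)            # rule 3: the statement after a goto
    return sorted(leaders)


def partition(code):
    """Split the statement numbers into basic blocks, one block per leader."""
    leaders = find_leaders(code)
    bounds = leaders + [len(code) + 1]
    return [list(range(bounds[k], bounds[k + 1])) for k in range(len(leaders))]


tac = [
    "prod := 0", "i := 1",
    "t1 := 4 * i", "t2 := a[t1]", "t3 := 4 * i", "t4 := b[t3]",
    "t5 := t2 * t4", "t6 := prod + t5", "prod := t6",
    "t7 := i + 1", "i := t7", "if i <= 10 goto (3)",
]
blocks = partition(tac)
print(blocks)
```

The sketch finds statements (1) and (3) as leaders, so it returns two blocks: statements (1) to (2) and statements (3) to (12).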

Q3. Explain the peephole optimization in detail.

Answer: PEEPHOLE OPTIMIZATION

A statement-by-statement code-generation strategy often produces target code that contains
redundant instructions and suboptimal constructs. The quality of such target code can be
improved by applying “optimizing” transformations to the target program.

A simple but effective technique for improving the target code is peephole optimization,
a method for trying to improve the performance of the target program by examining a short
sequence of target instructions (called the peephole) and replacing these instructions by a shorter
or faster sequence, whenever possible.

The peephole is a small, moving window on the target program. The code in the peephole
need not be contiguous, although some implementations do require this. It is characteristic of
peephole optimization that each improvement may spawn opportunities for additional
improvements.

Characteristics of peephole optimizations:

Redundant-instruction elimination
Flow-of-control optimizations
Algebraic simplifications
Use of machine idioms
Unreachable-code elimination

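As an illustration of redundant-instruction elimination, the sketch below slides a two-instruction peephole over a list of instructions and drops a MOV that merely undoes the MOV just before it. It assumes that no instruction in the list carries a label; a labelled instruction could be reached by a jump, in which case the deletion would not be safe. The function name and the textual instruction format are assumptions made for this example.

```python
def eliminate_redundant_moves(code):
    """Drop 'MOV y, x' when it directly follows 'MOV x, y' (no labels assumed)."""
    result = []
    for instr in code:
        if result and instr.startswith("MOV ") and result[-1].startswith("MOV "):
            prev_src, prev_dst = [p.strip() for p in result[-1][4:].split(",")]
            src, dst = [p.strip() for p in instr[4:].split(",")]
            if src == prev_dst and dst == prev_src:
                continue                 # the value is already where it is needed
        result.append(instr)
    return result


naive = [
    "MOV b, R0", "ADD c, R0", "MOV R0, a",
    "MOV a, R0", "ADD e, R0", "MOV R0, d",
]
optimized = eliminate_redundant_moves(naive)
print(optimized)
```

Applied to the six-instruction sequence from Q1, the sketch removes the reload MOV a, R0 and leaves five instructions.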
Q4. Explain the sequence of stack allocation process for a function call.

Answer: Stack allocation is a run-time storage management technique. Activation records
are pushed and popped as activations begin and end, respectively.

Storage for the locals in each call of a procedure is contained in the activation record for that
call. Thus, locals are bound to fresh storage in each activation, because a new activation record is
pushed onto the stack when the call is made.

The sizes of variables can be determined at run time, and local variables can have
different storage locations and different values in different activations. Suppose that the
register top marks the top of the stack. At run time, an activation record can be allocated and
deallocated by incrementing and decrementing top, respectively, by the size of the record.

If procedure q has an activation record of size a, then top is incremented by a just before the
target code of q is executed. When control returns from q, top is decremented by a.
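The way top moves can be sketched with a small simulation. The class RuntimeStack, its method names, and the record sizes 16 and 24 are hypothetical; real target code adjusts a machine register, not a Python attribute.

```python
class RuntimeStack:
    """Tracks only the position of top, not the contents of the records."""

    def __init__(self):
        self.top = 0
        self.record_sizes = []

    def call(self, record_size):
        """Allocate an activation record: top is incremented by its size."""
        base = self.top
        self.top += record_size
        self.record_sizes.append(record_size)
        return base                      # address where the new record starts

    def ret(self):
        """Deallocate the most recent record: top is decremented by its size."""
        self.top -= self.record_sizes.pop()


stack = RuntimeStack()
base_p = stack.call(16)      # procedure p, record of size 16
base_q = stack.call(24)      # p calls q, record of size a = 24
top_during_q = stack.top
stack.ret()                  # control returns from q
top_after_q = stack.top
print(base_p, base_q, top_during_q, top_after_q)
```

After q returns, top is back at the value it had while only p was active, so the storage used by q can be reused by the next call.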

The memory organization for a C program on the UNIX platform is as follows:

In C, data can be global, meaning it is allocated static storage and is available to any procedure, or
local, meaning it can be accessed only by the procedure in which it is declared. A program
consists of a list of global data declarations and procedures.

Two pointers are maintained. The stack pointer (SP) always points to a particular position in the
activation record of the currently active procedure. The second pointer, called top,
always points to the top of the stack, i.e. the top of the topmost activation record.

Temporaries are used for expression evaluation and are allocated above the activation record. An
activation record is a data structure that is created when a procedure or function is
invoked, and it contains the following information about the call.

An activation record in C consists of:

● Actual parameters
● Number of arguments
● Return address
● Return value
● Old stack pointer (SP)
● Local data of the function or procedure

Q5. Define a Directed Acyclic graph. Construct a DAG and write the sequence of
Instructions for the expression a+a*(b-c)+(b-c)*d.

Answer:
Directed Acyclic Graph:
A directed acyclic graph (DAG) is a directed graph that contains no cycles. In compiler design, a
DAG is used to represent the structure of a basic block, to show how values flow among the
statements of the block, and to support optimization of the block. To apply an optimization to a
basic block, a DAG is constructed from the three-address code produced by intermediate code
generation.
● DAGs are a data structure used to apply transformations to basic blocks.
● A DAG is an efficient means of identifying common subexpressions.
● It shows how the value computed by a statement is used in subsequent statements.

Directed Acyclic Graph Characteristics:
A DAG for a basic block is a directed acyclic graph with the following labels
on its nodes.
● Each leaf is labelled with a unique identifier, which is either a variable name or a constant.
● Each interior node is labelled with an operator symbol.
● In addition, a node may carry a list of identifiers, which name the variables that hold its
computed value.
● As graphs, DAGs have well-defined transitive closures and transitive reductions.
● Every DAG has at least one topological ordering.
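Steps for constructing the DAG for a+a*(b-c)+(b-c)*d:

1. d1 = leaf(id, entry-a)
2. d2 = leaf(id, entry-a) = d1
3. d3 = leaf(id, entry-b)
4. d4 = leaf(id, entry-c)
5. d5 = node('-', d3, d4)
6. d6 = node('*', d1, d5)
7. d7 = node('+', d1, d6)
8. d8 = leaf(id, entry-b) = d3
9. d9 = leaf(id, entry-c) = d4
10. d10 = node('-', d3, d4) = d5
11. d11 = leaf(id, entry-d)
12. d12 = node('*', d5, d11)
13. d13 = node('+', d7, d12)

Steps 2, 8, 9 and 10 create no new nodes: they return the existing nodes d1, d3, d4 and d5, so
the common subexpression b-c is represented only once.
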
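The construction steps, and the instruction sequence the question asks for, can be sketched as follows. The sketch reuses an existing node whenever the same leaf or the same (operator, left, right) combination is requested again, which is what leaf and node do in the steps above. The tuple encoding of the expression, the function names, and the temporary names t1, t2, ... are assumptions made for this example.

```python
def build_dag(expr_tree):
    """Return (nodes, root): nodes[k] is ('leaf', name) or (op, left_idx, right_idx)."""
    nodes = []
    index = {}                      # node signature -> position in nodes

    def visit(tree):
        if isinstance(tree, str):
            key = ("leaf", tree)
        else:
            op, left, right = tree
            key = (op, visit(left), visit(right))
        if key not in index:        # create a node only if none exists yet
            index[key] = len(nodes)
            nodes.append(key)
        return index[key]

    root = visit(expr_tree)
    return nodes, root


def emit(nodes):
    """Emit one three-address instruction per interior node, operands first."""
    names, code = {}, []
    for k, node in enumerate(nodes):
        if node[0] == "leaf":
            names[k] = node[1]
        else:
            op, left, right = node
            names[k] = f"t{len(code) + 1}"
            code.append(f"{names[k]} := {names[left]} {op} {names[right]}")
    return code


b_minus_c = ("-", "b", "c")
expr = ("+", ("+", "a", ("*", "a", b_minus_c)), ("*", b_minus_c, "d"))
nodes, root = build_dag(expr)
code = emit(nodes)
for line in code:
    print(line)
```

Under these assumptions the DAG has four leaves and five interior nodes, and the emitted sequence is t1 := b - c, t2 := a * t1, t3 := a + t2, t4 := t1 * d, t5 := t3 + t4. The subexpression b - c is computed once and used twice.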
Q6. What is an activation record? Explain how it is related with run time storage
organization.

Answer: Activation Record

o The control stack is a run-time stack used to keep track of live procedure
activations, i.e. the procedures whose execution has not yet been
completed.
o When a procedure is called (activation begins), its activation is pushed onto the stack,
and when it returns (activation ends), the activation is popped.
o An activation record holds the information needed by a single execution of a
procedure.
o An activation record is pushed onto the stack when a procedure is called and is popped
when control returns to the caller.

An activation record contains the following fields:

Return Value: It is used by the called procedure to return a value to the calling procedure.

Actual Parameters: They are used by the calling procedure to supply parameters to the called
procedure.

Control Link: It points to the activation record of the caller.

Access Link: It is used to refer to non-local data held in other activation records.

Saved Machine Status: It holds information about the state of the machine just before the
procedure is called.

Local Data: It holds the data that is local to the execution of the procedure.

Temporaries: It stores the values that arise in the evaluation of an expression.
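The relationship between activation records and the control stack can be sketched with a recursive procedure. The dataclass below models only some of the fields listed above (the access link, saved machine status and temporaries are left out), and all names in it are chosen for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ActivationRecord:
    procedure: str
    actual_parameters: tuple
    control_link: "ActivationRecord | None" = None   # record of the caller
    return_value: object = None
    local_data: dict = field(default_factory=dict)


control_stack = []
max_depth = 0


def fact(n):
    """Factorial that pushes and pops its own activation record."""
    global max_depth
    caller = control_stack[-1] if control_stack else None
    record = ActivationRecord("fact", (n,), control_link=caller)
    control_stack.append(record)                 # activation begins
    max_depth = max(max_depth, len(control_stack))
    record.return_value = 1 if n <= 1 else n * fact(n - 1)
    control_stack.pop()                          # activation ends
    return record.return_value


result = fact(4)
print(result, max_depth, len(control_stack))
```

While fact(1) is running, four records are live on the control stack, one per unfinished call, and each control link points to the record directly below it. Once the outermost call returns, the stack is empty again.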

Q7. Define triples, indirect triples and quadruples.

Answer:
There are three different ways to represent three-address code:
● Quadruples
● Triples
● Indirect Triples

Quadruples
A quadruple is a structure with four fields: op, arg1, arg2, and result. The op field holds the
operator, arg1 and arg2 hold the operands, and result holds the name in which the outcome of
the operation is stored.
Quadruples break high-level language statements down into simple steps, which makes analysis
and optimization during compilation easier.
Benefits of Quadruples
● Code is simple to rearrange for global optimization, because every result is named explicitly.
● The value of a temporary variable can be looked up quickly through the symbol table.
Drawbacks of Quadruples
● Many temporaries are created.
● Creating the temporary variables costs extra time and space.

Example
Convert a = -b * c + d into three-address code.
Unary minus binds more tightly than *, and * binds more tightly than +, so the expression is
evaluated as ((-b) * c) + d. The three-address code is:
t₁ = -b
t₂ = t₁ * c
t₃ = t₂ + d
a = t₃

These statements are represented by the following quadruples:

#     Op       Arg1   Arg2   Result
(0)   uminus   b      -      t₁
(1)   *        t₁     c      t₂
(2)   +        t₂     d      t₃
(3)   =        t₃     -      a
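Quadruples can be produced by a post-order walk over the expression tree. The following is a minimal sketch for a = -b * c + d, assuming the usual precedence in which unary minus binds tightest and * binds tighter than +, so that the tree is ((-b) * c) + d. The tuple encoding, the operator name uminus, and the function name are assumptions made for this example.

```python
def to_quadruples(target, tree):
    """Return a list of (op, arg1, arg2, result) tuples for 'target = tree'."""
    quads = []

    def gen(node):
        if isinstance(node, str):            # a variable name is used directly
            return node
        arg1 = gen(node[1])
        arg2 = gen(node[2]) if len(node) == 3 else None   # unary ops have no arg2
        temp = f"t{len(quads) + 1}"          # every result gets an explicit name
        quads.append((node[0], arg1, arg2, temp))
        return temp

    value = gen(tree)
    quads.append(("=", value, None, target))
    return quads


tree = ("+", ("*", ("uminus", "b"), "c"), "d")
quads = to_quadruples("a", tree)
for number, quad in enumerate(quads):
    print(number, quad)
```

Each operator node yields one quadruple whose result field names a fresh temporary, which is why quadruples create many temporaries but are easy to move around.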

Triples
A triple has only three fields: op, arg1, and arg2. Instead of storing each result in an explicit
temporary variable, a triple refers to the result of another triple by that triple's position in the
list. Each argument is therefore either a pointer to a symbol table entry or a pointer to another
triple.

Benefits of Triples
● Triples make it easier to analyze and optimize code by breaking complex high-level
language constructs into smaller, more manageable parts.
● Triples support error, data-flow, and control-flow analysis of the code, which helps
debugging and comprehension.

Drawbacks of Triples
● Temporaries are implicit, so statements are difficult to rearrange.
● Optimization is hard because it often moves intermediate code: when a triple is moved, all
triples that refer to it must be changed as well.

Example
Convert a = -b * c + d into three-address code.
Unary minus binds more tightly than *, and * binds more tightly than +, so the expression is
evaluated as ((-b) * c) + d. The three-address code is:
t₁ = -b
t₂ = t₁ * c
t₃ = t₂ + d
a = t₃

The following triples represent these statements:

#     Op       Arg1   Arg2
(0)   uminus   b      -
(1)   *        (0)    c
(2)   +        (1)    d
(3)   =        a      (2)
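A sketch of triple generation for a = -b * c + d follows, assuming the usual precedence, which gives the tree ((-b) * c) + d. In this illustration an integer argument stands for a reference to the triple at that position and a string stands for a name in the symbol table; that convention, the tuple encoding, and the function name are assumptions.

```python
def to_triples(target, tree):
    """Return a list of (op, arg1, arg2); an int argument refers to another triple."""
    triples = []

    def gen(node):
        if isinstance(node, str):            # a variable name is used directly
            return node
        arg1 = gen(node[1])
        arg2 = gen(node[2]) if len(node) == 3 else None   # unary ops have no arg2
        triples.append((node[0], arg1, arg2))
        return len(triples) - 1              # the position stands for the result

    value = gen(tree)
    triples.append(("=", target, value))
    return triples


tree = ("+", ("*", ("uminus", "b"), "c"), "d")
triples = to_triples("a", tree)
for number, triple in enumerate(triples):
    print(number, triple)
```

No temporary names appear anywhere. The price is that the references are positions, so moving a triple to another position invalidates every reference to it.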



Indirect Triples
This representation keeps the triples in a table and adds a separate list of pointers to them; the
list gives the order in which the triples are executed. Its usefulness is comparable to that of
quadruples, but it takes up less space. Statements are easy to rearrange, because reordering the
pointer list moves a statement without changing any triple or any reference to it.
Benefits of Indirect Triples
● An optimizer can move or delete a statement by editing the pointer list alone; the references
inside the triples stay valid.
● As with triples, no explicit temporary names are needed.
Drawbacks of Indirect Triples
● The extra pointer list takes additional space and adds complexity to the intermediate
representation.
● Every access to a statement goes through one more level of indirection, which adds some
overhead inside the compiler.

Example
Convert a = b * -c + b * -c into three-address code.
The following is the three-address code:
t1 = uminus c
t2 = b * t1
t3 = uminus c
t4 = b * t3
t5 = t2 + t4
a = t5

The triples are stored at positions (14) to (19):

#      Op       Arg1   Arg2
(14)   uminus   c      -
(15)   *        b      (14)
(16)   uminus   c      -
(17)   *        b      (16)
(18)   +        (15)   (17)
(19)   =        a      (18)

The statement list points to the triples in execution order:

#     Statement
(0)   (14)
(1)   (15)
(2)   (16)
(3)   (17)
(4)   (18)
(5)   (19)
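The effect of the extra pointer list can be sketched with the triples at positions 14 to 19. The small interpreter below exists only to show that reordering the statement list leaves the triples untouched and the result unchanged; the dictionary layout, the convention that an integer argument refers to a triple, and the sample values b = 3 and c = 2 are assumptions.

```python
triples = {
    14: ("uminus", "c", None),
    15: ("*", "b", 14),
    16: ("uminus", "c", None),
    17: ("*", "b", 16),
    18: ("+", 15, 17),
    19: ("=", "a", 18),
}
statements = [14, 15, 16, 17, 18, 19]      # execution order, as pointers to triples


def evaluate(statements, triples, env):
    """Execute the triples in statement-list order and return the variables."""
    env = dict(env)
    results = {}                           # triple position -> computed value

    def value(arg):
        return results[arg] if isinstance(arg, int) else env[arg]

    for pointer in statements:
        op, arg1, arg2 = triples[pointer]
        if op == "uminus":
            results[pointer] = -value(arg1)
        elif op == "*":
            results[pointer] = value(arg1) * value(arg2)
        elif op == "+":
            results[pointer] = value(arg1) + value(arg2)
        elif op == "=":
            env[arg1] = value(arg2)
    return env


# Move the second uminus ahead of the first multiplication: only the list changes.
reordered = [14, 16, 15, 17, 18, 19]
before = evaluate(statements, triples, {"b": 3, "c": 2})
after = evaluate(reordered, triples, {"b": 3, "c": 2})
print(before["a"], after["a"])
```

With plain triples the same move would require renumbering the triples and patching every reference to them.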

Q8. Translate the arithmetic expression a*- (b+c) into syntax tree and postfix notation.
Answer:
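The syntax tree for a * -(b + c) has * at the root, with the leaf a as its left child and a unary minus node as its right child. The unary minus node has a single child, the + node, whose children are the leaves b and c. Visiting the tree in post-order (children first, then the operator) gives the postfix notation a b c + uminus *. The sketch below checks this; the tuple encoding of the tree and the name uminus for unary minus are assumptions made for illustration.

```python
def postfix(node):
    """Post-order traversal: operands first, then the operator."""
    if isinstance(node, str):          # a leaf is an identifier
        return [node]
    op, *children = node
    output = []
    for child in children:
        output += postfix(child)
    return output + [op]


# Syntax tree for a * -(b + c)
tree = ("*", "a", ("uminus", ("+", "b", "c")))
result = " ".join(postfix(tree))
print(result)
```

Writing uminus for the unary operator keeps the postfix form unambiguous, since a plain - could otherwise be read as binary subtraction.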

Q9. What are the applications of three address statements?

Answer: Three-address code (3AC) is a crucial concept in compiler design and has several
important applications in the compilation process:
●​ Intermediate Representation (IR): 3AC serves as an intermediate representation of the
source code. It simplifies the complexity of high-level language constructs into a format
that's easier to analyze, optimize, and translate into machine code. It provides a structured
representation of program semantics that retains essential information while abstracting
away from the specifics of the source language.
●​ Code Optimization: 3AC facilitates various optimization techniques such as constant
folding, common subexpression elimination, dead code elimination, and loop optimizations.
Because 3-address code is relatively simple and uniform, it becomes easier for compilers to
apply optimization algorithms to improve the efficiency and performance of generated code.
●​ Register Allocation: Register allocation is a critical optimization phase where the compiler
assigns variables and temporary values to processor registers. 3AC can be transformed into
a form suitable for register allocation algorithms, helping compilers efficiently utilize
available hardware resources.
● Code Generation: Once the code has been optimized and registers have been allocated,
3-address code can be translated into the target machine code. The simplicity and structured
nature of 3AC make code generation more manageable and enable compilers to produce
efficient and correct machine code for different target architectures.
●​ Target Independence: 3-address code abstracts away from the intricacies of specific
hardware architectures, allowing compilers to generate code for various target platforms
from the same intermediate representation. This level of abstraction makes it easier to port
compilers across different architectures and optimize code for multiple platforms.

Q10. Define loop unrolling with an example.

Answer: Loop Unrolling in Compiler Design

Let's start by defining what a loop means in computer programming. It is a sequence of code that
runs repeatedly and ends only when a specific condition is met. As a basic component of
algorithms, loops are widely used in programming.
The fundamental idea of loop unrolling is to decrease the number of loop iterations by
replicating the loop body.
Let's understand this with an example.
// Assumes #include <iostream> and using namespace std.
for (int i = 0; i < 9; i++) {
    cout << "Coding Ninjas\n";
}

The body of this loop runs 9 times. We can, however, cut down on the
number of iterations if we unroll the loop. For instance, if the
loop is unrolled by a factor of 3, the code looks as follows:
for (int i = 0; i < 9; i += 3) {
    cout << "Coding Ninjas\n";
    cout << "Coding Ninjas\n";
    cout << "Coding Ninjas\n";
}

In this case, each iteration of the loop executes the body 3 times. This reduces the number of
iterations to 3, which can improve performance, because the loop condition and the increment
are evaluated 3 times instead of 9. The result is the same in both versions: the line is
printed 9 times.
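The example works out evenly because 9 is a multiple of the unroll factor 3. When the iteration count is not a multiple of the factor, the leftover iterations need a separate cleanup loop. A minimal sketch of that idea is given below, written in Python for brevity and summing a list where the C++ example prints a line; the function name is an assumption.

```python
def sum_unrolled_by_3(values):
    """Sum a list with the loop body unrolled three times plus a cleanup loop."""
    total, i, n = 0, 0, len(values)
    while i + 2 < n:                  # main loop: one test per three elements
        total += values[i] + values[i + 1] + values[i + 2]
        i += 3
    while i < n:                      # cleanup loop: the remaining 0 to 2 elements
        total += values[i]
        i += 1
    return total


print(sum_unrolled_by_3(list(range(10))))
```

For ten elements the main loop runs three times and the cleanup loop handles the last element, so the result matches that of an ordinary loop.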
