sharvari Govilkar
• Definition of an Assembler
• Basic functions of Assembler
• Features and Elements of Assembly Language Programming
• Forward Reference Problem and Solutions to it
• Multi-pass (2-Pass) Assembler Design
• 1-Pass Assembler Design
• SPARC Assembler
• Possible Errors during assembly of a program
An Assembler is a language translator that accepts an assembly language program as input and
produces its machine language equivalent as output, along with information required by the
Loader.
SPCC : T. E COMP Prof. sharvari Govilkar
The primary function of an assembler is to replace each mnemonic with its equivalent binary opcode
and to replace operand symbols and literals with their storage addresses.
Comments help the programmer understand the meaning of an ALP statement in the context of
the remaining code. The Assembler must ignore them, as they are not meant for actual processing.
Literals are plain values that are not associated with any variable name. They are often used by
programmers as ad-hoc values that are not needed throughout the program, but only for a few
statements. An Assembler handles these literals by using a separate data structure called “Literal
Table” (LT).
For example, to increase an integer abc by 4 units, we write the statement abc = abc + 4;
Here, it is assumed that 4 is not needed again in the program. So, the value ‘4’ is used as an integer
literal here. (Similarly, in strName = “John”, “John” is a string literal.)
Symbols are the variable names that hold values needed throughout the program. The Assembler
stores these symbols and their definition addresses in a data structure called the “Symbol Table”
(ST).
Symbols are also referred to as symbolic references, because they are practically implemented as
references to memory locations where their values are stored.
Symbols can be used for Labels and Operands. The function of Assembler is to search the ST
when a symbol is found in the program and replace it with its definition address, where the value is
stored.
For example, suppose an integer abc is increased by 4 units once, and then this increment value is
itself incremented and added repeatedly to abc in a loop. Here, the value 4 is set once but is referred
to from inside the loop again and again. So, we define a variable (i.e. a symbol) incr = 4, and this
variable is then incremented repeatedly inside the loop using incr = incr + 1.
Procedures are nothing but functions in ALP; they are used when a relatively large number of
statements are to be used repeatedly in the program.
When a procedure is called, the current execution state of the program is pushed onto the stack. When
the procedure returns, the execution state is popped off the stack and the calling program is
resumed.
2. Declarative Statements
3. Imperative Statements
Exam Question:
Q) What are Assembler Directives? Explain with examples (May ’06 [Comps] – 6M, Dec
’08 [Comps] – 5M)
These statements direct the assembler to take the action associated with them. They are not
executable instructions, so they are not included in the final object file generated by the assembler
as output.
ii. ORG has a similar purpose as START, but ORG is included in (and is implicitly associated
with) the program, whereas START is used independently of a program. So, START
statement must also include the program name in its label field.
E.g:
PG1 START 2000 ; Start storing the program PG1 from address 2000
ORG 3000 ; Start storing the current program (in which ORG is used) from address 3000
iii. USING is an assembler directive to indicate which register is to be used as base register
and what will be its value for the base.
E.g: USING 1500,15 ; Use register 15 as base register and 1500 as its value for the base
iv. DROP is an assembler directive used to drop a base register that has been allocated by a
USING statement.
v. EQU is an assembler directive used to equate a symbolic name to a value and make the program
more readable. Whenever a symbol is defined using the EQU directive, no memory is allocated
to it; only an entry is made in the symbol table (ST).
E.g:
SUNDAY EQU 1
MONDAY EQU SUNDAY + 1
Now, wherever in the program we use the symbols SUNDAY and MONDAY, assembler will
replace them by 1 and 2 i.e. their respective equated values.
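The effect of EQU can be sketched in Python. This is only an illustration under stated assumptions: the function name `process_equ` is hypothetical, and the sketch handles only integer operands and sums of previously equated symbols, which is enough for the SUNDAY / MONDAY example above.

```python
def process_equ(lines):
    """Build a symbol table from 'NAME EQU expr' lines (no memory is allocated)."""
    symtab = {}
    for line in lines:
        name, directive, expr = line.split(maxsplit=2)
        if directive != "EQU":
            raise ValueError(f"not an EQU statement: {line!r}")
        # Evaluate a sum of integers and previously equated symbols.
        total = 0
        for term in expr.split("+"):
            term = term.strip()
            total += symtab[term] if term in symtab else int(term)
        symtab[name] = total
    return symtab


equates = process_equ(["SUNDAY EQU 1", "MONDAY EQU SUNDAY + 1"])
print(equates)  # {'SUNDAY': 1, 'MONDAY': 2}
```

Note that only the table is updated; nothing is written to the output file, which matches the statement that EQU allocates no memory.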
2. Declarative Statements – These are special cases of assembler directives; they are used to
declare symbols and associate them with values. For example, var1 DW A43B declares a
symbol var1 with hex value A43B; DW tells the assembler that var1 occupies a 16-bit word in
memory.
1. Use of Mnemonics to specify Opcodes makes the assembly language program much more
readable and debugging is also easier.
2. Use of Symbols to specify Operands means that the program can be modified with no
overhead. That is, if the definition address of a symbol changes, the change in the ALP is made
only at the place where the symbol is declared; the places where the symbol has been
used need not be updated.
3. Separation of Code and Data Segments allows the programmer to keep aside some
portion of memory for the data to be used by the program.
The Assembler needs the following basic data structures (also called databases) as input:
1. ALP Source Code – ALP statements are contained in the source file. For example,
A sample ALP code for adding 2 numbers can be given as follows:
2. Mnemonic Opcode Table (MOT) – All different machine architectures have their own set
of mnemonics to be used for their respective assembly language (So, ALP of one machine
cannot run on another machine).
MOT is a fixed-length table defined by the programmer as per the assembly language of the
underlying machine. During translation, the assembler searches for the mnemonic in MOT
and replaces it with the associated binary opcode given in MOT.
3. Pseudo-Opcode Table (POT) – POT is also a fixed-length table defined by the programmer as
per the assembly language of the underlying machine. During translation, the assembler
searches for the pseudo-opcode in POT and takes the necessary action associated with it.
Pseudo-Opcode    Number of Operands
DB               1
DW               1
DD               1
CONST            1
START            1
LTORG            1
ENDP             0
END              0
[Note that these MOT and POT are only for a Hypothetical machine (and not for any specific
machine like 8085/8086); the ALP given in above example uses only a few of these mnemonics
and pseudo-ops]
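A table search of this kind can be sketched in Python using dictionaries. The sketch below is not the book's algorithm; the function name `classify` is hypothetical, the three MOT entries are sample values for the hypothetical machine, and the meaning assumed for each MOT tuple (opcode, number of operands, instruction length) is an assumption.

```python
# Hypothetical tables for illustration. The MOT tuple layout
# (opcode, number of operands, instruction length) is an assumed reading.
MOT = {"JMP": ("07", 1, 2), "JNZ": ("08", 1, 2), "STOP": ("09", 0, 1)}
POT = {"DB": 1, "DW": 1, "DD": 1, "CONST": 1,
       "START": 1, "LTORG": 1, "ENDP": 0, "END": 0}


def classify(word):
    """Return ('mnemonic', opcode), ('pseudo-op', operand count) or ('unknown', None)."""
    if word in MOT:
        return ("mnemonic", MOT[word][0])
    if word in POT:
        return ("pseudo-op", POT[word])
    return ("unknown", None)


print(classify("JNZ"))    # ('mnemonic', '08')
print(classify("START"))  # ('pseudo-op', 1)
```

Both tables stay fixed for a given machine, so they are written here as constants rather than being built during assembly.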
4. Symbol Table (ST) – ST is used to keep track of the symbols assigned to variables being
used in the program.
When a symbol gets defined, the Assembler makes its entry into the Symbol Table (ST) along
with its definition address. When a symbol is used in an instruction, the Assembler first
verifies the validity of the symbol using ST and, if validation is successful, the definition address
of the symbol is written into the output file.
(Note that, ST is not language-specific like MOT / POT; ST is program specific i.e. every
program has its own ST).
Structure of ST is as follows:
Symbol    Type    Definition Address
.         .       .
.         .       .
.         .       .
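A Symbol Table with these three fields can be sketched in Python as follows. This is a minimal sketch, not a definitive design: the class and method names are hypothetical, and the two sample entries are illustrative values for the hypothetical machine.

```python
class SymbolTable:
    """Maps each symbol name to its type and definition address."""

    def __init__(self):
        self.entries = {}

    def define(self, name, sym_type, address):
        entry = self.entries.get(name)
        if entry is not None and entry["address"] is not None:
            raise ValueError(f"symbol already defined: {name}")
        self.entries[name] = {"type": sym_type, "address": address}

    def lookup(self, name):
        """Validate the symbol and return its definition address."""
        entry = self.entries.get(name)
        if entry is None or entry["address"] is None:
            raise KeyError(f"undefined symbol: {name}")
        return entry["address"]


st = SymbolTable()
st.define("N1", "VAR", 1022)
st.define("repeat", "LABEL", 1008)
print(st.lookup("repeat"))  # 1008
```

Since ST is program specific, a fresh `SymbolTable` object would be created for every program that is assembled.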
5. Literal Table – The Assembler tracks the usage of literals in a program through a Literal Table
(LT).
When a literal is first encountered in a program, its entry is made into LT along with its
usage address, but the definition address is not yet filled in. With each further usage of the
same literal, the assembler adds the new usage address to the same entry. The
literals do not get defined until EOF is reached.
Once EOF is reached, all the literals in LT get defined in the area after the end of the program
and their definition addresses are updated in LT. This area at the end of the program where
literals get defined is often called a “Literal Pool”.
Structure of LT is given as follows:
Literal Notation    Value    Usage Address    Definition Address
.                   .        .                .
.                   .        .                .
.                   .        .                .
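The life cycle of a literal (usage addresses collected first, definition address assigned at EOF) can be sketched in Python. Everything named here is an assumption made for illustration: the function names, the `=4` notation for literals, the sample addresses, and the choice of one byte per literal.

```python
def record_literal(lt, literal, usage_address):
    """Enter the literal on first use; afterwards only add the new usage address."""
    entry = lt.setdefault(literal, {"usage": [], "definition": None})
    entry["usage"].append(usage_address)


def define_literal_pool(lt, pool_start, size=1):
    """At EOF, give every literal a definition address inside the literal pool."""
    address = pool_start
    for entry in lt.values():
        entry["definition"] = address
        address += size
    return address  # first free address after the pool


lt = {}
record_literal(lt, "=4", 1003)
record_literal(lt, "=7", 1005)
record_literal(lt, "=4", 1009)
next_free = define_literal_pool(lt, pool_start=1020)
print(lt["=4"])  # {'usage': [1003, 1009], 'definition': 1020}
```

The Value column is left out of the sketch because the literal notation already carries the value in this simple case.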
1012 STORE B
1014 STOP ; program execution ends
1015 ENDP ; code segment ends
1016 A DB 04 ; Define a Byte for variable ‘A’ with value 04
1017 B DB 04 ; Define a Byte for variable ‘B’ with value 04
1018 C DB 06 ; Define a Byte for variable ‘C’ with value 06
1019 END ; End of program
So, the LT for above example is given as follows:
Exam Questions:
Q) State the reason for assembler to be a multi-pass program. (May ’05 [Comps] – 4M, May
’04 [IT] – 4M, Dec ’06 [Comps] – 4M)
Q) Explain Forward Reference Problem in Assembler. (May ’05 [IT] – 4M, Dec ’05 [IT] – 4M)
[NOTE: These tables and algorithms in this book are for a hypothetical machine. So some major
changes might be required, for actual implementation of Assembler (or even simulation of these
algorithms)]
In a Multi-pass approach, the assembler scans the program and constructs the ST in Pass 1. In
Pass 2, the input file is read again along with the ST and an assembled object code file is generated
as output. This is a 2-Pass approach, but some complex programs may need more than 2 passes
to complete the assembly of the program. Such an approach is called a Multi-pass approach.
In a Single-pass approach, the Assembler makes use of a special data structure called the
Forward-Reference Table (FRT) to keep track of only the forward-referenced symbols. At the end of
the pass, this data structure is used to update the definition addresses of forward-referenced
symbols in the ST and in the program.
Exam Questions:
Q) Explain with neat flowchart and database working of two-pass assembler. (Dec ’04 [IT] –
12M, Dec ’07 [IT] – 10M, June ’08 [Comps] – 10M, Dec ’08 [Comps] – 10M, June ’08 [IT] –
10M) [Often asked as a compulsory question]
Q) Explain working of each of the two passes of a two-pass assembler, with the help of
databases. (Dec ’04 [Comps] – 10M, Dec ’05 [IT] – 10M)
Q) Give the analysis and design of a two-pass assembler w.r.t flowchart, data structures and
algorithms? (Dec ’05 [Comps] – 20M, June ’06 [Comps] – 20M).
When an Assembler needs more than one pass through the input program to complete the
assembly of ALP, it’s called a Multi-pass Assembler.
In general, not more than 2 passes are actually required for complete assembly of an ALP.
[Complex programs requiring more than 2 passes for assembly are beyond the scope of our
syllabus (and this book). So we are essentially looking at a Two-Pass Assembler Design]
Pass 1 is used to collect all symbol definitions (for variables and labels) in the ST. In Pass 2, the
source file is read again and, since the addresses of all symbols are now known through the ST, all
instructions get fully assembled and a final object file is generated as output.
A Symbol Table (ST) containing all symbol definitions used in the program is generated at the end
of Pass 1.
[For the purpose of simplicity, we break down algorithm based on how each different element of
an ALP statement is processed by the assembler, in each pass]
(PLEASE NOTE:
1. From here onwards in this book, NEXT STEP refers to the next LOGICAL step. So,
after processing a label, if there is a mnemonic, the next step is processing that
mnemonic. If there is no mnemonic, the next line is read. Similarly, after a mnemonic,
if the instruction has operands, the next step is processing those operands. After the
operands, any comments are ignored and the next step is reading the next line.
2. In the exam, you need not draw these smaller flowcharts; they are meant only for
understanding. The main overall flowcharts of each pass are enough. But make
sure you explain clearly how each element is processed).
When a label definition is encountered in the program, the assembler searches for it in ST. If the
label already exists, it is a case of duplicate label definition, so the error flag is
turned on and a “Duplicate Label” error is displayed. If the label is not found in ST, its entry is
made into ST and its definition address is updated. (Label references or label “calls” are treated
as symbol references by the Assembler)
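The label check described above can be sketched in Python. The function name `process_label_pass1` and the dictionary layout of ST are hypothetical. The sketch also assumes that a symbol which was only referenced so far sits in ST with an empty definition address, so only an entry that already has an address counts as a duplicate.

```python
def process_label_pass1(st, label, lc, errors):
    """Enter a label definition into ST, or report a duplicate definition."""
    entry = st.get(label)
    if entry is not None and entry["address"] is not None:
        errors.append(f"Duplicate Label: {label}")  # the error flag is turned on
        return False
    st[label] = {"type": "LABEL", "address": lc}
    return True


st, errors = {}, []
process_label_pass1(st, "UP1", 1002, errors)
process_label_pass1(st, "UP1", 1010, errors)  # second definition of the same label
print(errors)  # ['Duplicate Label: UP1']
```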
[Flowchart: Label Definition processing in Pass 1]
If an instruction has operands, the assembler first checks whether the operand is a literal.
If the operand is a literal, it is searched in LT. If found, its usage address is updated to the current
LC. If not found, its entry is first made in LT and then the usage address is updated.
If the operand is a symbol, it is searched in ST. If found, it indicates that it has already been defined.
If not found, its entry is made in ST, but neither its type nor its definition address is updated.
[Flowchart: Operand processing in Pass 1]
1. START: The START pseudo-op is used to indicate the starting location of program in the
memory. If its operand field is 0 or blank, it means program can be moved anywhere in the
memory (i.e. it is “re-locatable”). Otherwise, the operand gives a fixed starting address in
memory (i.e. program is “static”).
2. ENDP: The ENDP pseudo-op (without any operand) is used to indicate end of code segment
and start of data segment. So, to process this, the assembler simply resets the code_flag to 0.
3. DB / DW: These two pseudo-ops are used to define variables. The label field is the variable
name and the operand field is the variable value.
So, when DB or DW is encountered, its label is searched in ST. If found, its type is updated
to VAR and its definition address is updated to the current LC. If not found, its entry is first made
in ST and then its type and definition address are updated.
4. CONST: The CONST pseudo-op is used to define a constant variable whose value is fixed at
the time of definition and cannot be changed during program execution. Its label field is the
variable name and operand field is variable’s fixed value.
When CONST is encountered, its label is searched in ST. If found, its type is updated to
CONST and definition address is updated to current LC. If not found, first its entry is made in
ST and then its type and definition address are updated.
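The four cases above can be put together in a small Python sketch. It is only an illustration: the function name and the `state` dictionary are hypothetical, setting code_flag to 1 on START is an assumption, and the LC increments (1 byte for DB / CONST, 2 bytes for DW) are borrowed from the one-pass design described later in these notes.

```python
def process_pseudo_op_pass1(label, op, operand, state, st):
    """Handle START, ENDP, DB, DW and CONST; state holds 'lc' and 'code_flag'."""
    if op == "START":
        # Operand 0 or blank means the program is re-locatable.
        state["lc"] = int(operand) if operand else 0
        state["code_flag"] = 1  # assumed: the code segment begins here
    elif op == "ENDP":
        state["code_flag"] = 0  # code segment ends, data segment starts
    elif op in ("DB", "DW", "CONST"):
        sym_type = "CONST" if op == "CONST" else "VAR"
        st[label] = {"type": sym_type, "address": state["lc"]}
        state["lc"] += 2 if op == "DW" else 1


state, st = {"lc": 0, "code_flag": 0}, {}
for stmt in [(None, "START", "1000"), (None, "ENDP", None),
             ("A", "DB", "04"), ("W", "DW", "0"), ("K", "CONST", "9")]:
    process_pseudo_op_pass1(*stmt, state, st)
print(st["W"], state["lc"])  # {'type': 'VAR', 'address': 1001} 1004
```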
[Flowchart: Pseudo-Opcode processing in Pass 1 (START, ENDP, DB / DW, CONST)]
1. PROC Mnemonic: The PROC mnemonic indicates start of a procedure definition. Its
operand field gives the procedure name. This procedure name is searched in ST. If it
already exists in ST, then a “Duplicate Procedure” error is displayed. If not found, its entry
is made in ST and type is set to PROC and definition address is set to current LC.
Also, the assembler creates a flag with name ‘procname_flag’ and sets it ON, indicating the
procedure definition is on. (When ENDP with same procname as its operand is found,
procname_flag is turned off).
[Flowchart: PROC mnemonic processing in Pass 1]
3. CALL Mnemonic: The CALL mnemonic indicates a procedure call; its operand is the
name of the procedure to be called. When CALL is encountered, it is first searched in MOT
and its operand is searched in ST for type=”PROC”. If found, it means the procedure is
well-defined, so only LC is incremented by LOI. If not found, its entry is made into ST,
but neither the type nor the definition address is updated.
[Flowchart: CALL mnemonic processing in Pass 1]
At the end of Pass 1, all the literals in LT get defined in the area after the end of the program and
their definition addresses are updated in LT. This area at the end of the program where literals get
defined is often called a “Literal Pool”.
After Pass 1 is completed successfully, assembler re-scans the source file, assembles the
mnemonics in it and writes their equivalent binary code into output file. Symbols and Literals are
resolved using ST and LT respectively.
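How the two passes cooperate can be seen in a very small Python sketch. It is a simplified illustration, not the full design: literals, procedures and error flags are left out, the MOT entries and the five-line sample program are hypothetical, and every statement is assumed to be already split into (label, operation, operand).

```python
# Hypothetical MOT: mnemonic -> (opcode, length of instruction in bytes)
MOT = {"LOAD": ("03", 2), "STORE": ("04", 2), "JMP": ("07", 2), "STOP": ("09", 1)}


def pass1(program, start):
    """Collect every symbol definition with its address (builds the ST)."""
    st, lc = {}, start
    for label, op, _operand in program:
        if label:
            if label in st:
                raise ValueError(f"Duplicate Label: {label}")
            st[label] = lc
        lc += MOT[op][1] if op in MOT else 1  # DB is assumed to occupy one byte
    return st


def pass2(program, start, st):
    """Re-scan the source and emit (address, opcode, operand address) records."""
    out, lc = [], start
    for _label, op, operand in program:
        if op in MOT:
            opcode, length = MOT[op]
            out.append((lc, opcode, st[operand] if operand else None))
            lc += length
        else:  # DB: store the value itself
            out.append((lc, "DB", int(operand)))
            lc += 1
    return out


program = [
    (None, "LOAD", "X"),     # X is forward-referenced
    (None, "JMP", "DONE"),   # DONE is forward-referenced
    (None, "STORE", "X"),
    ("DONE", "STOP", None),
    ("X", "DB", "5"),
]
st = pass1(program, 1000)
print(st)                            # {'DONE': 1006, 'X': 1007}
print(pass2(program, 1000, st)[:2])  # [(1000, '03', 1007), (1002, '07', 1006)]
```

Because Pass 1 has already filled the ST, the forward references to X and DONE cause no difficulty in Pass 2.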
[Flowchart: Operand processing in Pass 2]
1. START: The pseudo-op is searched in POT; its equivalent binary opcode is written
into output file along with its operand.
2. ENDP: It is used to indicate end of code segment and start of data segment. So, the
assembler simply resets the code_flag to 0.
3. DB / DW / CONST: They are used to define variables. In Pass 2, their labels (i.e.
variable name) are searched in ST and their definition address is retrieved. Their
operands (i.e. value of variable) are now written into this definition address. For
CONST, the processing is same, except that values once written cannot be modified
by the program.
[Flowchart: Pseudo-Opcode processing in Pass 2 (START, ENDP, DB / DW, CONST)]
1. PROC Mnemonic: The PROC mnemonic is searched in MOT; its binary opcode is
written into output file along with its operand. Operand is read into procname and
procname_flag is set ON. LC is then incremented by LOI.
[Flowchart: PROC mnemonic processing in Pass 2]
2. ENDP procname: It indicates end of a procedure with name procname. So, the
assembler sets procname_flag to OFF.
3. CALL Mnemonic: The CALL mnemonic is searched in MOT; its operand symbol is
searched in ST. The binary opcode of the mnemonic and the definition address of the operand
symbol are written into the output file.
[Flowchart: CALL mnemonic processing in Pass 2]
Exam Questions:
Q) Using any assembler language, write a sample assembly program and w.r.t that program,
describe how a two-pass assembler will translate it. (May ’04 [IT] – 10M).
(Hint: Give a brief explanation of the steps as well)
JMP 07 1 2
JNZ 08 1 2
STOP 09 0 1
Pseudo-Opcode Table (POT) (Remains static for both passes)
Pseudo-Opcode    Number of Operands
DB               1
DW               1
DD               1
CONST            1
START            1
LTORG            1
ENDP             0
END              0
(NOTE: If an exam question asks you to assemble an entire program, keep MOT / POT limited to
only mnemonics and pseudo-ops used in your program, as it is only for a hypothetical machine)
Address Value
1018 07
1019 08
1020 NULL
1021,1022 NULL
Address Value
1024 00
Exam Questions:
Q) Explain with the help of flowchart and data structures, the working of a single-pass
assembler. (May ’04 [Comps] – 10M, May ’05 [Comps] – 10M, May ’05 [IT] – 10M, May ’06 [IT]
– 10M, Dec ’07 [IT] – 10M, May ’07 [Comps] – 10M).
The Forward Reference Problem (FRP) occurs when a symbol is referenced before it gets defined. Due
to this, the assembler cannot assemble the instruction right at the time when it is encountered.
To resolve FRP, we used the Multi-pass approach, which collects all symbol definitions in Pass 1 and
then assembles all instructions in Pass 2. But it requires an extra pass to handle forward-referenced
symbols, which is inefficient.
A Single-pass Assembler solves FRP efficiently in one pass. It uses a special data structure called
the Forward Reference Table (FRT) and assembles all the instructions completely, right at the time
when they are encountered.
If the instruction involves a forward-referenced symbol (that is not yet defined), such symbols are
entered into FRT. At the end of pass, when all symbol definitions have been collected in ST, the
assembler uses ST to update definition addresses of forward-referenced symbols in FRT.
It then uses FRT to update the usage locations of forward-referenced symbols in output file. Thus,
a single-pass assembler handles forward reference symbols efficiently.
When One-Pass Assembler encounters a forward-referenced symbol, it enters the symbol in FRT
and updates its usage address. For multiple references to a symbol in FRT, the assembler will
append the usage address in the corresponding field.
When the symbol gets defined, its definition address is updated in FRT. At the end of pass, the
assembler copies the definition address from FRT into usage addresses, for every forward-
referenced symbol.
1000 LOAD X
1002 UP1: SUB Y
1004 JZ DOWN1 ; Forward-referenced Symbol encountered
1006 ADD Y
1008 JNZ DOWN1 ; Forward-referenced Symbol encountered
1010 JMP UP1
1012 DOWN1: STORE Y
1014 ENDP
1015 X DB 02
1016 Y DB 04
1017 STOP
1018 END
At the end of assembly of this program using One-Pass Assembler, FRT will be as follows:
Now, the assembler copies definition address ‘1012’ at usage addresses 1005 and 1009.
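The same bookkeeping can be sketched in Python for the symbol DOWN1 of the program above. The function names and the dictionary layout are hypothetical, and only the two jump instructions that refer to DOWN1 are modelled; `None` stands for an operand location that is still waiting for its address.

```python
def note_forward_reference(frt, symbol, usage_address):
    """Enter the symbol in FRT on first use; later uses only append an address."""
    entry = frt.setdefault(symbol, {"usage": [], "definition": None})
    entry["usage"].append(usage_address)


def note_definition(frt, symbol, address):
    """Called when a symbol gets defined; matters only if it is in FRT."""
    if symbol in frt:
        frt[symbol]["definition"] = address


def backpatch(output, frt):
    """At the end of the pass, copy definition addresses into usage locations."""
    for symbol, entry in frt.items():
        if entry["definition"] is None:
            raise ValueError(f"undefined symbol: {symbol}")
        for address in entry["usage"]:
            output[address] = entry["definition"]


output = {1004: "JZ", 1005: None, 1008: "JNZ", 1009: None}
frt = {}
note_forward_reference(frt, "DOWN1", 1005)  # operand of JZ at 1004
note_forward_reference(frt, "DOWN1", 1009)  # operand of JNZ at 1008
note_definition(frt, "DOWN1", 1012)
backpatch(output, frt)
print(output[1005], output[1009])  # 1012 1012
```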
10. FRT_PTR (to keep track of location being read from/written into FRT)
11. LC (Location Counter is needed to keep track of which location from source file is being
read)
12. CODE_FLAG (to indicate where code segment ends and data segment starts)
[Flowchart: overall working of the One-Pass Assembler]
(Note that the design of 1P-Assembler given in this book does not handle procedures)
2.9.4 Initializations in One-Pass Assembler
When a label definition is encountered in the program, the assembler searches for the label in ST.
If the label already exists, it is a case of duplicate label definition, so the error flag is
turned on and a “Duplicate Label” error is displayed.
If the label is not found in ST, it is searched in FRT to see if it has been “forward-referenced”.
If found in FRT, it is surely a case of a forward-referenced LABEL. So, its definition address is
entered in the corresponding field of FRT.
If not found in FRT, it is a regular non-forward-referenced LABEL. So, its entry is made into
ST with type as LABEL and definition address as the current LC.
[Flowchart: Label Definition processing in the One-Pass Assembler]
The mnemonic is searched in MOT; its equivalent binary opcode is read from MOT and written
into output file. LC is incremented by LOI.
Fig. 2.18
If the operand is a literal, it’s first searched in LT. If found, its usage address is updated.
Otherwise, first its entry is made into LT and then its usage address is updated.
If the operand is a symbol, it is searched in ST. If found, its definition address is written into
output file and LC is incremented by LOI.
If the operand symbol is not found in ST, then it’s a case of forward-referenced symbol. So, the
entry for that symbol is made into FRT with type=”LABEL” and usage address= LC. Now, if this
symbol is referenced again in the program, its usage address gets appended in the corresponding
field. When this symbol finally gets defined, its definition address is updated in FRT.
[Flowchart: Operand processing in the One-Pass Assembler]
1. START: The pseudo-op is searched in POT; its equivalent binary opcode is written into
output file along with its operand.
2. ENDP: It is used to indicate end of code segment and start of data segment. So, the
assembler simply resets the code_flag to 0.
3. DB / DW / CONST: They are used to define variables / constants. Their labels (i.e.
variable names) are searched in ST. If not found, its entry is first made into ST and then its
definition address is updated. If found, its definition address can be directly updated.
After the definition address is updated, the operand of instruction (i.e. value of symbol) is
written at this address.
Next, to check whether the symbol was also forward-referenced, the label field (i.e. symbol name)
is searched in FRT. If found, its definition address is updated. If not found, it means the
symbol was not forward-referenced.
Finally, LC is incremented by 1 for DB or CONST (since each of these two occupies one
byte) or by 2 for DW (since one word occupies two bytes).
[Flowchart: Pseudo-Opcode processing in the One-Pass Assembler (START, ENDP, DB / DW, CONST)]
LOAD 03 1 2
STORE 04 1 2
INC 05 1 2
DEC 06 1 2
JMP 07 1 2
JNZ 08 1 2
JNC 09 1 2
STOP 10 0 1
Pseudo-Opcode    Number of Operands
DB               1
DW               1
DD               1
CONST            1
START            1
LTORG            1
ENDP             0
END              0
(NOTE: If an exam question asks you to assemble an entire program, keep MOT / POT limited to
only mnemonics and pseudo-ops used in your program, as it is only for a hypothetical machine)
N1 VAR 1022
repeat LABEL 1008
SUM VAR 1025
down LABEL 1020
Output File:
Address Opcode Operand
1002 03 1023
1004 04 1024
1006 03 1029
1008 01 1022
1010 06 1024
1012 08 1008
1014 04 1025
1016 09 1020
1018 05 1025
1020 10 -
Address Value
1022 07
1023 08
1024 NULL
1025,1026 NULL
Address Value
1028 00
Exam Questions:
Q) Explain SPARC Assembler. (May ’04 [Comps] – 5M, Dec ’04 [Comps] – 10M, May ’06
[Comps] – 6M, May ’07 [Comps] – 5M, Dec ’07 [IT] – 10M, June ’08 [Comps] – 10M, Dec ’08
[Comps] – 5M).
The memory given to a program is divided into different segments called “sections”. Some
examples of such sections are
• .TEXT – for executable instructions.
• .DATA – for data needed by program, it is a read-write section.
• .BSS (Block Starting Symbol) – for uninitialized or zero-initialized data sections (when
programs become large and complex, often considerable number of data variables may
not be initialized at all (or initialized to zero) at the start of program; they may get
values during execution. .BSS section is needed to store such data).
The Assembler maintains a different location counter (LC) for each named section. When control
is switched from one section to another, the associated LC is also switched.
Thus, sections are like blocks of same program. But inter-section references (within the same
program) are to be resolved by the Linker and not the Assembler.
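The idea of one location counter per section can be sketched in Python. The class `SectionCounters` and its methods are hypothetical, and starting every section at offset 0 is an assumption. The size of 4 bytes used in the example matches the fixed 32-bit width of SPARC instructions.

```python
class SectionCounters:
    """Keeps a separate location counter (LC) for every named section."""

    def __init__(self):
        self.lc = {}
        self.current = None

    def switch(self, name):
        """Switch control to a section; its LC resumes where it left off."""
        self.lc.setdefault(name, 0)
        self.current = name

    def emit(self, size):
        """Reserve 'size' bytes in the current section; return (section, offset)."""
        offset = self.lc[self.current]
        self.lc[self.current] += size
        return (self.current, offset)


sections = SectionCounters()
sections.switch(".TEXT")
sections.emit(4)
sections.emit(4)
sections.switch(".DATA")
sections.emit(4)
sections.switch(".TEXT")
print(sections.emit(4))  # ('.TEXT', 8)
```

Switching back to .TEXT continues from offset 8 rather than restarting, which is the reason for keeping the counters separate.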
2. Object File
• Local Symbols: Symbols defined and used in same program are called Local Symbols.
A section can freely refer to local symbols defined in other sections of same program.
• Global Symbols: Symbols that are used in a program, but defined externally or the
ones that are defined locally in a program but can be accessed by other programs are
called Global Symbols.
• Weak Symbols: If in a section there are two symbols of the same name, one a local
symbol and the other a global symbol, then the local symbol may be overridden by
the global symbol having the same name.
Such local symbols that may be overridden by a global symbol are called weak
symbols.
4. Delay Slot
The SPARC Architecture uses delayed branching logic i.e. the instruction appearing immediately
after a branch instruction is executed before the branch is actually taken. Such an instruction is
said to be in the “delay slot” of the branch.
SPARC assembly programmers often fill the delay slot with a NOP (No Operation) instruction, or
with another useful instruction to optimize performance.
• Wrong number (or wrong type) of parameters (i.e. parameter mismatch), in a procedure
call.
• Missing End-of-Procedure
• Missing End-of-File
this ST. Since one pass is used only to collect symbols, it may be inefficient for
programs having very few or no forward-referenced symbols at all.
• So, we use a Single-pass approach that makes only one pass through the program and uses
an extra data structure, Forward Reference Table (FRT) to resolve forward-reference
symbols.
• In a Single-pass assembler, a forward-referenced symbol (along with its usage locations) is
first inserted in FRT. When these symbols get defined in ST, their definition addresses are
updated in FRT. At the end of the pass, FRT is used to insert definition address value into
usage locations for all forward-referenced symbols.
• Features of SPARC Assembler include:
o Division of program memory into different types of “sections” and maintaining
different location counters for each different section.
o Use of global, local and weak symbols.
o Object file containing a list of relocation and linking operations for the Linker
o A “Delayed Branching” facility i.e. the instruction immediately following a branch
instruction is actually executed before the branch is taken. Programmers can
place useful instructions in this “delay slot” to optimize program
performance.
Q.3) The Usage Address field in the Literal Table (LT) and Forward Reference Table (FRT) can have
multiple values for the same literal or symbol. How can it be practically implemented?
(Ans: A Linked List! The usage address column for an entry, will point to start [i.e. head] of
the linked list. Each usage address node of the linked list will point to the next node.
Whenever a new usage address is to be appended, it will be linked to the last node of the list).
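The linked list from the answer above can be sketched in Python. The class names are hypothetical, and keeping a `tail` pointer (so that appending does not need to walk the whole list) is a design choice of this sketch, not something the answer requires.

```python
class UsageNode:
    """One usage address, pointing to the next usage address node."""

    def __init__(self, address):
        self.address = address
        self.next = None


class UsageList:
    """Usage Address column of one LT / FRT entry, kept as a linked list."""

    def __init__(self):
        self.head = None  # the table entry points to the head of the list
        self.tail = None

    def append(self, address):
        node = UsageNode(address)
        if self.head is None:
            self.head = node
        else:
            self.tail.next = node  # link the new node after the last node
        self.tail = node

    def __iter__(self):
        node = self.head
        while node is not None:
            yield node.address
            node = node.next


usages = UsageList()
usages.append(1005)
usages.append(1009)
print(list(usages))  # [1005, 1009]
```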
Q.4) What is a “Literal Pool”? (Note: Sometimes, literals may not be defined in memory
immediately after the program; a separate memory area may be reserved for them. So, literal
pool may not necessarily be immediately after the program).
Q.6) How are literals different from regular symbols? (Also, mention about difference in structure
of ST and LT).
Q.7) Why is assembly language preferred over machine language? (Refer section 2.3)
Q.9) What could be the possible errors during the assembly of the program?