CSC 225 Lecture 4 - 110522
Lesson Objectives
4.1 Introduction
For humans, communicating with a machine is a tedious task. We cannot, for example, just
say “add this number and that number and store the result here”. A computer has no way
of even beginning to understand what this means.
• As we stated before, the alphabet of the machine’s language is binary – it simply contains
the digits 0 and 1.
• Continuing with this analogy, instructions are the words of a machine’s language. That is,
they are meaningful constructions of the machine’s alphabet.
• The instruction set, then, constitutes the vocabulary of the machine. These are the words
understood by the machine itself.
To work with the machine, we need a translator. Assembly languages serve as an
intermediate form between the human-readable programming language and the machine-
understandable binary form.
Definition
Assembly language is a low-level programming language that is one step above machine
code. It provides symbolic representations of machine instructions, making it easier for
humans to write and understand.
Generally speaking, compiling a program into an executable format involves the following
stages:
4.2 KEY TERMS AND COMPONENTS
Register: Small, fast storage inside the CPU (e.g., AX, BX, CX, DX)
• Operand(s) (usually required): The data or the location of the data (e.g., AX, BX, 1234H)
• Comment (optional)
This is the basic syntax: [label:] mnemonic [operands] [; comment]
Label: A label is an identifier that acts as a place marker for instructions and data. A label
placed just before an instruction implies the instruction’s address. Similarly, a label placed
just before a variable implies the variable’s address.
Data Label: A data label identifies the location of a variable, providing a convenient way to
reference the variable in code. The following, for example, defines a variable named count:
count DWORD 100
The assembler assigns a numeric address to each label. It is possible to
define multiple data items following a label. In the following example, array defines the
location of the first number (1024). The other numbers follow in memory immediately
afterward:
array DWORD 1024, 2048
      DWORD 4096, 8192 ; 4096 and 8192 are data values stored at array+8 and array+12
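The layout of array can be checked by reading individual elements through the label plus a byte offset. The following is a minimal sketch, assuming 32-bit MASM and the Irvine32.inc library used later in these notes; the offsets follow from each DWORD occupying 4 bytes.

```asm
INCLUDE Irvine32.inc

.data
array DWORD 1024, 2048        ; array names the address of the first value
      DWORD 4096, 8192        ; stored next, at array+8 and array+12

.code
main PROC
    mov eax, array            ; EAX = 1024 (offset 0)
    mov ebx, [array + 4]      ; EBX = 2048 (second DWORD)
    mov ecx, [array + 12]     ; ECX = 8192 (fourth DWORD)
    call DumpRegs             ; display registers (values shown in hex)
    exit
main ENDP
END main
```

DumpRegs prints hexadecimal, so the three values should appear as 400, 800, and 2000 with leading zeros.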
Code Label: A label in the code area of a program (where instructions are located) must end
with a colon (:) character. Code labels are used as targets of jumping and looping
instructions. For example, the following JMP (jump) instruction transfers control to the
location marked by the label named target, creating a loop:
target:
mov ax,bx
...
jmp target
Label names are created using the rules for identifiers. You can use the same code label
more than once in a program as long as each label is unique within its enclosing procedure.
(A procedure is like a function.)
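The jmp target example above repeats forever because JMP is unconditional. A loop that stops needs a conditional jump back to the code label. The following is a minimal sketch, assuming 32-bit MASM with Irvine32.inc; the DEC and JNZ instructions and the label name L1 are not introduced in these notes and are used here only for illustration.

```asm
INCLUDE Irvine32.inc

.code
main PROC
    mov eax, 0                ; running total
    mov ecx, 5                ; loop counter
L1:                           ; code label: target of the jump below
    add eax, ecx              ; total = total + counter
    dec ecx                   ; counter = counter - 1, sets Zero flag at 0
    jnz L1                    ; jump back to L1 while ECX is not zero
    call DumpRegs             ; EAX should hold 5+4+3+2+1 = 15 (0Fh)
    exit
main ENDP
END main
```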
Instruction Mnemonic: An instruction mnemonic is a short word that identifies an
instruction. In English, a mnemonic is a device that assists memory. Similarly, assembly
language instruction mnemonics such as mov, add, and sub provide hints about the type of
operation they perform. Following are examples of instruction mnemonics:
mov: Move (assign) one value to another
add: Add two values
sub: Subtract one value from another
mul: Multiply two values
jmp: Jump to a new location
call: Call a procedure
Operands: Assembly language instructions can have between zero and three operands,
each of which can be a register, memory operand, constant expression, or input-output
port. A memory operand is specified by the name of a variable or by one or more registers
containing the address of a variable. A variable name implies the address of the variable and
instructs the computer to reference the contents of memory at the given address. Following
are examples of assembly language instructions having varying numbers of operands.
The STC and NOP instructions, for example, have no operands:
stc ; set Carry flag
nop ; no operation
The INC instruction has one operand:
inc eax ; add 1 to EAX
The MOV instruction has two operands:
mov count,ebx ; move EBX to count
In a two-operand instruction, the first operand is called the destination. The second
operand is the source. In general, the contents of the destination operand are modified by
the instruction. In a MOV instruction, for example, data is copied from the source to the
destination.
The IMUL instruction can have three operands, in which the first operand is the destination
and the following two operands are source operands:
imul eax,ebx,5
In this case, EBX is multiplied by 5, and the product is stored in the EAX register.
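The operand kinds described above can be compared side by side. The following is a minimal sketch, assuming 32-bit MASM with Irvine32.inc and the count variable defined earlier; the OFFSET operator, which yields a variable's address, is an addition not covered in these notes.

```asm
INCLUDE Irvine32.inc

.data
count DWORD 100

.code
main PROC
    mov ebx, 25               ; constant source operand: EBX = 25
    mov eax, ebx              ; register operand: EAX = 25
    mov eax, count            ; memory operand by name: EAX = 100
    mov esi, OFFSET count     ; ESI = address of count
    mov edx, [esi]            ; memory operand by register: EDX = 100
    mov count, ebx            ; memory as destination: count = 25
    call DumpRegs             ; display registers (values shown in hex)
    exit
main ENDP
END main
```

The difference between mov eax, ebx and mov edx, [esi] is that the first copies the register's own value, while the second treats the register's value as an address and copies what is stored there.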
Comments: Comments are an important way for the writer of a program to communicate
information about the program’s design to a person reading the source code. The following
information is typically included at the top of a program listing:
• Description of the program’s purpose
• Names of persons who created and/or revised the program
• Program creation and revision dates
• Technical notes about the program’s implementation
Comments can be specified in two ways: Single-line comments, beginning with a semicolon
character (;). All characters following the semicolon on the same line are ignored by the
assembler.
mov eax, 5 ; I am a comment.
Block comments, beginning with the COMMENT directive and a user-specified symbol. All
subsequent lines of text are ignored by the assembler until the same user-specified symbol
appears. For example,
COMMENT !
This line is a comment.
This line is also a comment.
!
Other symbols can also be used:
COMMENT &
This line is a comment.
This line is also a comment.
&
The NOP (No Operation) Instruction: The safest instruction you can write is called NOP (no
operation). It takes up 1 byte of program storage and doesn’t do any work. It is sometimes
used by compilers and assemblers to align code to even-address boundaries. In the
following example, the first MOV instruction generates three machine code bytes. The NOP
instruction aligns the address of the third instruction to a doubleword boundary (an even
multiple of 4). x86 processors are designed to load code and data more quickly from even
doubleword addresses.
00000000 66 8B C3   mov ax,bx
00000003 90         nop ; align next instruction
00000004 8B D1      mov edx,ecx
A. Data Transfer Instructions: Used to move data between memory and registers.
Instruction      Meaning
MOV AX, BX       Copy contents of BX into AX
MOV AL, 34H      Load 34H into AL, the low byte of register AX
B. Arithmetic Instructions: Used to perform arithmetic on values.
Instruction      Meaning
ADD AX, BX       Add BX to AX (result in AX)
SUB AX, 01H      Subtract 1 from AX (result in AX)
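C. Logical Instructions: Used to perform bitwise operations.
Instruction      Meaning
AND AX, BX       Bitwise AND of AX and BX (result in AX)
OR AX, BX        Bitwise OR of AX and BX (result in AX)
NOT AX           Bitwise NOT of AX (every bit inverted)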
D. Control Transfer Instructions: Change the flow of the program.
Instruction      Meaning
JMP LABEL        Jump to the location marked LABEL
JE LABEL         Jump to LABEL if equal (Zero flag is set)
CALL SUB         Call subroutine at label SUB
RET              Return from subroutine
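E. Comparison Instructions: Used to compare values. They set the CPU flags and leave their operands unchanged.
Instruction      Meaning
CMP AX, BX       Compare AX with BX (computes AX - BX, sets flags, discards the result)
TEST AX, AX      Bitwise AND for testing purposes (sets flags, discards the result)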
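Comparison and control transfer instructions are normally used together: CMP sets the flags and a conditional jump such as JE reads them. The following is a minimal sketch, assuming 32-bit MASM with Irvine32.inc; the label names is_equal and finish are hypothetical.

```asm
INCLUDE Irvine32.inc

.code
main PROC
    mov eax, 10
    mov ebx, 10
    cmp eax, ebx              ; computes EAX - EBX, sets flags, keeps EAX
    je  is_equal              ; taken when the Zero flag is set
    mov ecx, 0                ; reached only when EAX and EBX differ
    jmp finish
is_equal:
    mov ecx, 1                ; reached only when EAX equals EBX
finish:
    call DumpRegs             ; ECX should be 1 for the values above
    exit
main ENDP
END main
```

Changing the value loaded into EBX should make the program fall through to the mov ecx, 0 path instead.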
Let’s go through the program line by line. Each line of program code will appear before its
explanation.
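For reference, the code lines discussed in this walkthrough can be collected into one listing. This is a reconstruction assembled from the lines explained here, assuming they appear in the same order in the original program and that Irvine32.inc is available to the assembler.

```asm
TITLE Add and Subtract (AddSub.asm)

; This program adds and subtracts 32-bit integers.

INCLUDE Irvine32.inc

.code
main PROC
    mov eax,10000h            ; EAX = 10000h
    add eax,40000h            ; EAX = 50000h
    sub eax,20000h            ; EAX = 30000h
    call DumpRegs             ; display registers
    exit
main ENDP
END main
```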
TITLE Add and Subtract (AddSub.asm)
The TITLE directive marks the entire line as a comment. Anything you want can be
put on this line.
; This program adds and subtracts 32-bit integers.
All text to the right of a semicolon is ignored by the assembler, so we use it for
comments.
INCLUDE Irvine32.inc
The INCLUDE directive copies necessary definitions and setup information from a
text file named Irvine32.inc, located in the assembler’s INCLUDE directory.
.code
The .code directive marks the beginning of the code segment, where all executable
statements in a program are located.
main PROC
The PROC directive identifies the beginning of a procedure. The name chosen for the
only procedure in our program is main.
mov eax,10000h ; EAX = 10000h
The MOV instruction moves (copies) the integer 10000h to the EAX register. The first
operand (EAX) is called the destination operand, and the second operand is called
the source operand.
The comment on the right side shows the expected new value in the EAX register.
add eax,40000h ; EAX = 50000h
The ADD instruction adds 40000h to the EAX register. The comment shows the
expected new value in EAX.
sub eax,20000h ; EAX = 30000h
The SUB instruction subtracts 20000h from the EAX register.
call DumpRegs ; display registers
The CALL statement calls a procedure that displays the current values of the CPU
registers.
This can be a useful way to verify that a program is working correctly.
exit
main ENDP
The exit statement (indirectly) calls a predefined MS-Windows function that halts
the program.
The ENDP directive marks the end of the main procedure. Note that exit is not a
MASM keyword; instead, it’s a macro command defined in the Irvine32.inc include
file that provides a simple way to end a program.
END main
The END directive marks the last line of the program to be assembled. It identifies
the name of the program’s startup procedure (the procedure that starts the program
execution).
Program Output
The following is a snapshot of the program’s output, generated by the call to DumpRegs:
Example 2:
.MODEL SMALL
.STACK 100H
.DATA
NUM1 DB 05H
NUM2 DB 03H
RESULT DB ?
.CODE
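START:
MOV AX, @DATA ; Get the address of the data segment
MOV DS, AX ; DS must point to .DATA before NUM1 is read
MOV AL, NUM1 ; Load first number into AL
ADD AL, NUM2 ; Add second number (AL = 08H)
MOV RESULT, AL ; Store result
MOV AH, 4CH ; DOS function 4CH: terminate program
INT 21H ; Call DOS
END START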
Explanation
4.7 COMMON ERRORS AND TIPS (10 mins)
Summary:
1. What is the difference between MOV AX, BX and MOV AX, [BX]?
2. Why is INT 21H used in programs?
3. What is the role of the assembler?
REFERENCE MATERIALS