Operating Systems From 0 To 1 PT1

This document provides an in-depth overview of operating systems design and implementation. It discusses relevant reference documents, computer architecture, assembly, program structure, debugging tools, bootloaders, linking and loading. The goal is to guide readers through building their own operating system from scratch in a hands-on manner.

Uploaded by

Islam Tazerout

TU, DO HOANG

O P E R AT I N G S Y S T E M S :
FROM 0 TO 1
Contents

Preface

I Preliminary

1 Domain documents
1.1 Problem domains
1.2 Documents for implementing a problem domain
1.3 Documents for writing an x86 Operating System

2 From hardware to software: Layers of abstraction
2.1 The physical implementation of a bit
2.2 Beyond transistors: digital logic gates
2.3 Beyond Logic Gates: Machine Language
2.4 Abstraction

3 Computer Architecture
3.1 What is a computer?
3.2 Computer Architecture
3.3 x86 architecture
3.4 Intel Q35 Chipset
3.5 x86 Execution Environment

4 x86 Assembly and C
4.1 objdump
4.2 Reading the output
4.3 Intel manuals
4.4 Experiment with assembly code
4.5 Anatomy of an Assembly Instruction
4.6 Understand an instruction in detail
4.7 Example: jmp instruction
4.8 Examine compiled data
4.9 Examine compiled code

5 The Anatomy of a Program
5.1 Reference documents
5.2 ELF header
5.3 Section header table
5.4 Understand Section in-depth
5.5 Program header table
5.6 Segments vs sections

6 Runtime inspection and debug
6.1 A sample program
6.2 Static inspection of a program
6.3 Runtime inspection of a program
6.4 How debuggers work: A brief introduction

II Groundwork

7 Bootloader
7.1 x86 Boot Process
7.2 Using BIOS services
7.3 Boot process
7.4 Example Bootloader
7.5 Compile and load
7.6 Loading a program from bootloader
7.7 Improve productivity with scripts

8 Linking and loading on bare metal
8.1 Understand relocations with readelf
8.2 Crafting ELF binary with linker scripts
8.3 C Runtime: Hosted vs Freestanding
8.4 Debuggable bootloader on bare metal
8.5 Debuggable program on bare metal

III Kernel Programming

9 x86 Descriptors
9.1 Basic operating system concepts
9.2 Drivers
9.3 Userspace and kernel space
9.4 Memory Segment
9.5 Segment Descriptor
9.6 Types of Segment Descriptors
9.7 Descriptor Scope
9.8 Segment Selector
9.9 Enhancement: Bootloader with descriptors

10 Process
10.1 Concepts
10.2 Process
10.3 Threads
10.4 Task: x86 concept of a process
10.5 Task Data Structure
10.6 Process Implementation
10.7 Milestone: Code Refactor

11 Interrupt

12 Memory management

13 File System

Index

Bibliography
Preface

Greetings!

You’ve probably asked yourself at least once how an operating system

is written from the ground up. You might even have years of program-
ming experience under your belt, yet your understanding of operating
systems may still be a collection of abstract concepts not grounded in
actual implementation. To those who’ve never built one, an operating
system may seem like magic: a mysterious thing that can control hard-
ware while handling a programmer’s requests via the API of their favorite
programming language. Learning how to build an operating system seems
intimidating and diﬃcult; no matter how much you learn, it never feels
like you know enough. You’re probably reading this book right now to
gain a better understanding of operating systems to be a better software
engineer.

If that is the case, this book is for you. By going through this book,
you will be able to ﬁnd the missing pieces that are essential and enable
you to implement your own operating system from scratch! Yes, from
scratch, without going through any existing operating system layer to
prove to yourself that you are an operating system developer. You may
ask, “Isn’t it more practical to learn the internals of Linux?”

Yes...

and no.

Learning Linux can help your workﬂow at your day job. However, if
you follow that route, you still won’t achieve the ultimate goal of writ-
ing an actual operating system. By writing your own operating system,
you will gain knowledge that you will not be able to glean just from learning Linux.

Here’s a list of some beneﬁts of writing your own OS:

✄ You will learn how a computer works at the hardware level, and you
will learn to write software to manage that hardware directly.

✄ You will learn the fundamentals of operating systems, allowing you
to adapt to any operating system, not just Linux.

✄ To hack on Linux internals properly, you’ll need to write at least one
operating system on your own. This is just like applications programming:
to write a large application, you’ll need to start with simple ones.

✄ You will open pathways to various low-level programming domains

such as reverse engineering, exploits, building virtual machines, game
console emulation and more. Assembly language will become one of
your most indispensable tools for low-level analysis. (But that does
not mean you have to write your operating system in Assembly!)

✄ Writing an operating system is fun!

Why another book on Operating Systems?

There are many books and courses on this topic made by famous profes-
sors and experts out there already. Who am I to write a book on such
an advanced topic? While it’s true that many quality resources exist, I
find them lacking. Do any of them show you how to compile your C code
and the C runtime library independent of an existing operating system?
Most books on operating system design and implementation discuss only
the software side and skip how the operating system communicates with the
hardware. With these important hardware details left out, it is
difficult for a self-learner to find relevant resources on the Internet. The
aim of this book is to bridge that gap: not only will you learn how to pro-
gram hardware directly, but also how to read official documents from hard-
ware vendors to program it. You no longer have to seek out resources to
help yourself interpret hardware manuals and documentation: you can
do it yourself. Lastly, I wrote this book from an autodidact’s perspec-
tive. I made this book as self-contained as possible so you can spend more
time learning and less time guessing or seeking out information on the
Internet.

One of the core focuses of this book is to guide you through the pro-
cess of reading official documentation from vendors to implement your
software. Official documents from hardware vendors like Intel are criti-
cal for implementing an operating system or any other software that di-
rectly controls the hardware. At a minimum, an operating system devel-
oper needs to be able to comprehend these documents and implement
software based on a set of hardware requirements. Thus, the first chap-
ter is dedicated to discussing relevant documents and their importance.

Another distinct feature of this book is that it is “Hello World” centric.
Most examples revolve around variants of a “Hello World” program,
which will acquaint you with core concepts. These concepts must be learned
before attempting to write an operating system. Anything beyond a sim-
ple “Hello World” example gets in the way of teaching the concepts, thus
lengthening the time spent on getting started writing an operating sys-
tem.

Let’s dive in. With this book, I hope to provide enough foundational
knowledge that will open doors for you to make sense of other resources.
This book will be especially beneficial to students who’ve just finished their
first C/C++ course. Imagine how cool it would be to show prospective
employers that you’ve already built an operating system.

Prerequisites

✄ Basic knowledge of circuits

– Basic concepts of electricity: atoms, electrons, protons, neutrons, current flow.

– Ohm’s law

If you are unfamiliar with these concepts, you can quickly learn them
here: http://www.allaboutcircuits.com/textbook/, by reading chap-
ter 1 and chapter 2.

✄ C programming. In particular:

– Variable and function declarations/deﬁnitions

– While and for loops

– Pointers and function pointers

– Fundamental algorithms and data structures in C

✄ Linux basics:

– Know how to navigate directories with the command line

– Know how to invoke a command with options

– Know how to pipe output to another program

✄ Touch typing. Since we are going to use Linux, touch typing helps. I
know typing speed does not relate to problem-solving, but at least your
typing speed should be fast enough not to let it get in the way and de-
grade the learning experience.

In general, I assume that the reader has basic C programming knowledge,

and can use an IDE to build and run a program.

What you will learn in this book

✄ How to write an operating system from scratch by reading hardware

datasheets. In the real world, you will not be able to consult Google
for a quick answer.

✄ Write code independently. It’s pointless to copy and paste code. Real
learning happens when you solve problems on your own. Some exam-
ples are provided to help kick start your work, but most problems are
yours to conquer. However, solutions are available online for you to consult
after you have given each problem a good try.

✄ A big picture of how each layer of a computer relates to the others,

from hardware to software.

✄ How to use Linux as a development environment and common tools

for low-level programming.

✄ How a program is structured so that an operating system can run it.

✄ How to debug a program running directly on hardware with gdb and

QEMU.

✄ Linking and loading on bare metal x86_64, with pure C. No standard

library. No runtime overhead.

What this book is not about

✄ Electrical Engineering: The book discusses some concepts from

electronics and electrical engineering only to the extent of how soft-
ware operates on bare metal.

✄ How to use Linux, or any other OS how-to book: Though Linux

is used as a development environment and as a medium to demonstrate
high-level operating system concepts, it is not the focus of this book.

✄ Linux Kernel development: There are already many high-quality

books out there on this subject.

✄ Operating system books focused on algorithms: This

book focuses more on an actual hardware platform, Intel x86_64, and
on how to write an OS that utilizes the OS support provided by that
hardware platform.

The organization of the book

Part 1 provides a foundation for learning about operating systems.

✄ Chapter 1 brieﬂy explains the importance of domain documents.

Documents are crucial for the learning experience, so they deserve
a chapter.

✄ Chapter 2 explains the layers of abstractions from hardware to soft-

ware. The idea is to provide insight into how code runs physically.

✄ Chapter 3 provides the general architecture of a computer, then in-

troduces a sample computer model that you will use to write an
operating system.

✄ Chapter 4 introduces the x86 assembly language through the use

of the Intel manuals, along with commonly used instructions. This
chapter gives detailed examples of how high-level syntax corresponds
to low-level assembly, enabling you to read generated assembly code
comfortably. It is necessary to read assembly code when debugging
an operating system.
✄ Chapter 5 dissects ELF in detail. Only by understanding how a
program is structured at the binary level can you build one that
runs on bare metal.
✄ Chapter 6 introduces the gdb debugger with extensive examples of
commonly used commands. After acquainting the reader with gdb, it
then provides insight into how a debugger works. This knowledge is
essential for building a debuggable program on bare metal.

Part 2 presents how to write a bootloader to bootstrap a kernel. Hence

the name “Groundwork”. After mastering this part, the reader can con-
tinue with the next part, which is a guide for writing an operating sys-
tem. However, if the reader does not like the presentation, he or she
can look elsewhere, such as OSDev Wiki: http://wiki.osdev.org/.

✄ Chapter 7 introduces what the bootloader is, how to write one in

assembly, and how to load it on QEMU, a hardware emulator. This
process involves typing repetitive and long commands, so GNU Make
is applied to improve productivity by automating the repetitive parts
and simplifying the interaction with the project. This chapter also
demonstrates the use of GNU Make in context.
✄ Chapter 8 introduces linking by explaining the relocation process
when combining object files. Together with a bootloader written in
Assembly and an operating system written in C, this is the last piece
of the puzzle required for building debuggable programs on bare metal.

Part 3 provides guidance on how to write an operating system, as you

should implement an operating system on your own and be proud of
your creation. The guidance consists of simple and coherent explanations
of the necessary concepts, from hardware to software, to implement
the features of an operating system. Without such guidance, you will

waste time gathering information spread through various documents
and the Internet. It then provides a plan on how to map the concepts
to code.

Acknowledgments

Thank you, my beloved family. Thank you, the contributors.

Part I

Preliminary
1
Domain documents

1.1 Problem domains

In the real world, software engineering is not only focused on software,

but also the problem domain it is trying to solve.

A problem domain is the part of the world where the computer is to
produce effects, together with the means available to produce them, directly
or indirectly. (Kovitz, 1999)

A problem domain is anything outside of programming that a software

engineer needs to understand to produce correct code that can achieve
the desired effects. “Directly” covers anything that the software
can control to produce the desired effects, e.g. keyboards, printers, monitors,
other software, etc. “Indirectly” covers anything that is not part of the
software but is relevant to the problem domain, e.g. the appropriate people to be
informed by the software when some event happens, or students who move
to the correct classrooms according to the schedule generated by the software.
To write a finance application, a software engineer needs to learn
enough finance concepts to understand a customer’s requirements
and implement those requirements correctly.

Requirements are the effects that the machine is to exert in the problem
domain by virtue of its programming.

Programming alone is not too complicated; programming to solve a problem
domain is.¹ Not only does a software engineer need to understand how
to implement the software, but also the problem domain that it tries to
solve, which might require in-depth expert knowledge. The software engineer
must also select the right programming techniques that apply to
the problem domain he is trying to solve, because many techniques that
are effective in one domain might not be in another. For example, many
types of applications do not require highly performant code, but a short
time to market. In this case, interpreted languages are widely popular
because they can satisfy such a need. However, for writing huge 3D games or
operating systems, compiled languages are dominant because they can produce
the efficient code required for such applications.

¹ We use “programming” here to mean being able to write code in a language,
without necessarily knowing any or all software engineering knowledge.

Often, it is too much for a software engineer to learn non-trivial domains
(which might require a bachelor’s degree or above to understand).
It is often easier for a domain expert to learn enough programming
to break down the problem domain into parts small enough for
software engineers to implement. Sometimes, domain experts implement
the software themselves.

Figure 1.1.1: Problem domains: Software and Non-software. (Diagram labels: Application Domain, Software Domain, Non-software Domains.)

One example of such a scenario is the domain presented in this
book: operating systems. A certain amount of electrical engineering (EE)
knowledge is required to implement an operating system. If a computer
science (CS) curriculum does not include a minimum of EE courses, students
in the curriculum have little chance of implementing a working operating
system. Even if they can implement one, either they need to invest a significant
amount of time studying on their own, or they fill in code in a predefined
framework just to understand high-level algorithms. For that reason,
EE students have an easier time implementing an OS, as they only
need to study a few core CS courses. In fact, “C programming” and
“Algorithms and Data Structures” classes alone are usually enough to get them
started writing code for device drivers, and later generalizing it into an
operating system.

Figure 1.1.2: Operating System domain. (Diagram labels: Data Structure and Algorithms, Operating System, Electrical Engineering.)

One thing to note is that software is its own problem domain. A problem
domain does not have to lie outside software. Compilers,
3D graphics, games, cryptography, artificial intelligence, etc., are
software engineering domains (actually, they are more computer science
domains than software engineering domains). In general, a software-exclusive
domain creates software to be used by other software. Operating systems
are also a domain, but one that overlaps with other domains such as electrical
engineering. To implement an operating system effectively, it is necessary
to learn enough of the external domain. How much learning is enough
for a software engineer? At a minimum, a software engineer should be
knowledgeable enough to understand the documents prepared by hardware
engineers for using (i.e. programming) their devices.
Learning a programming language, even C or Assembly, does not mean
a software engineer can automatically be good at hardware programming
or any related low-level programming domain. One can spend 10 years,
20 years or an entire life writing C/C++ code and still be unable to write
an operating system, simply for lack of the relevant domain
knowledge. Likewise, learning English does not mean a person automatically
becomes good at reading Math books written in English. Much
more than that is needed. Knowing one or two programming languages
is not enough. If a programmer writes software for a living, he had better
specialize in one or two problem domains outside of software if
he does not want his job taken by domain experts who learn programming
in their spare time.

1.2 Documents for implementing a problem domain

Documents are essential for learning a problem domain (and actually,
anything) since they pass information down in a reliable way.
Written text has been used for thousands of years to
pass knowledge from generation to generation. Documents are integral
parts of non-trivial projects. Without documents:

✄ New people will ﬁnd it much harder to join a project.

✄ It is harder to maintain a project because people may forget impor-

tant unresolved bugs or quirks in their system.

✄ It is challenging for customers to understand the product they are going
to use.

However, documents do not need to be written in book format.
They can be anything from HTML pages to database records
displayed by a graphical user interface. Important information must
be stored somewhere safe and readily accessible.

There are many types of documents. However, to facilitate the understanding
of a problem domain, these two documents need to be written:
a software requirement document and a software specification.

1.2.1 Software Requirement Document

A software requirement document includes both a list of requirements and
a description of the problem domain (Kovitz, 1999).

Software solves a business problem, but which problems to solve
is decided by a customer. These requests make up a list of requirements
that our software needs to fulfill. However, an enumerated
list of features is seldom useful in delivering software. As stated in the
previous section, the tricky part is not programming alone but programming
according to a problem domain. The bulk of software design and
implementation depends upon knowledge of the problem domain. The
better the domain is understood, the higher the quality of the software can be. For
example, building houses has been practiced for thousands of years and is well
understood, so it is easy to build a high-quality house; software is no
different. Code that is difficult to understand is usually due to the author’s
ignorance of the problem domain. In the context of this book, we
seek to understand the low-level workings of various hardware devices.

Because software quality depends upon the understanding of the problem
domain, most of a software requirement document should consist
of problem domain description.

Be aware that software requirements are not:

What vs How “What” and “how” are vague terms. What is the “what”?
Is it nouns only? If so, what if a customer requires his software to perform
specific steps of operations, such as a purchasing procedure for a
customer on a website? Does the “what” include verbs now? However, isn’t
the “how” supposed to be step-by-step operations? Anything can be
the “what” and anything can be the “how”.

Sketches A software requirement document is all about the problem domain.
It should not be a high-level description of an implementation.
Some problems might seem straightforward to map directly from their
domain description to the structure of an implementation. For example:

✄ Users are given a list of books in a drop-down menu to choose from.

✄ Books are stored in a linked list.

✄ etc.

In the future, instead of a drop-down menu, all books may be listed directly
on a page as thumbnails. Books might be reimplemented as a graph,
with each node a book, for finding related books, as a recommender
is going to be added in the next version. The requirement document
then needs updating again to remove all the outdated implementation details,
which takes additional effort to maintain the requirement document.
When the effort of syncing with the implementation becomes too
much, the developers give up on documentation, and everyone starts ranting
about how useless documentation is.

More often than not there is no straightforward one-to-one mapping.

For example, a regular computer user expects an OS to be something
that runs some program with GUI, or their favorite computer games.
But for such requirements, an operating system is implemented as mul-
tiple layers, each hiding the details from the upper layers. To imple-
ment an operating system, a large body of knowledge from multiple
ﬁelds is required, especially if the operating system runs on non-PC
devices.

It’s best to include information related to the problem domain in the

requirement document. A good way to test the quality of a requirement
document is to give it to a domain expert for proofreading,
to ensure he can understand the material thoroughly. A requirement
document is also useful later as a help document, or as a basis that makes
writing one much easier.

1.2.2 Software Speciﬁcation

Software specification document states rules relating desired behavior of
the output devices to all possible behavior of the input devices, as well
as any rules that other parts of the problem domain must obey. (Kovitz, 1999)

Simply put, a software specification is interface design, with constraints
for the problem domain to follow, e.g. the software may accept only certain
types of input, such as English but no other
language. For a hardware device, a specification is always needed, as software
depends on its hardwired behaviors. In fact, hardware specifications
are mostly well-defined, down to the tiniest details.
It needs to be that way because once hardware is physically manufactured,
there’s no going back, and if defects exist, the damage to the company’s
finances and reputation is devastating.

Note that, similar to a requirement document, a specification only concerns
interface design. If implementation details leak in, it becomes a burden
to keep the specification in sync with the actual implementation, and
the specification is soon abandoned.
Another important remark is that, though a specification document
is important, it does not have to be produced before the implementation.
It can be prepared in any order: before or after a complete implementation,
or at the same time as the implementation, when some part is
done and its interface is ready to be recorded in the specification. Regardless
of the method, what matters is a complete specification at the end.

1.3 Documents for writing an x86 Operating System

When the problem domain is different from the software domain, the requirement
document and the specification are usually separate. However, if the problem
domain is inside software, the specification most often includes both, and
the content of both can be mixed together. The previous sections
showed the importance of documents, so to implement an OS we will
need to collect relevant documents to gain sufficient domain knowledge.
These documents are as follows:

✄ Intel® 64 and IA-32 Architectures Software Developer’s Manual (Volumes 1, 2, 3)

✄ Intel® 3 Series Express Chipset Family Datasheet

✄ System V Application Binary Interface

Aside from Intel’s official website, the website of this book also hosts
the documents for convenience.²

² Intel may change the links to the documents as they update their website,
so this book doesn’t contain any link to the documents, to avoid confusing
readers.

Intel documents divide the requirement and specification sections clearly,
but call the sections by different names. The section corresponding to the
requirement document is called “Functional Description”, which
consists mostly of domain description; for the specification, the “Register Description”
section describes all programming interfaces. Both documents carry no
unnecessary implementation details.³ Intel documents are also great examples
of how to write good requirements/specifications, as explained in
this chapter.

³ As it should be: those details are trade secrets.

Other than the Intel documents, other documents will be introduced
in the relevant chapters.
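
To make the idea of a “Register Description” as a programming interface more concrete, here is a minimal sketch in C. The device, its CONTROL register and all names (`CTRL_ENABLE`, `device_configure`, and so on) are hypothetical and invented for illustration; they do not come from any Intel document. The sketch only shows how a described bit layout might be translated into code.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical "Register Description" for a made-up device (an assumption
 * of this sketch, not taken from any Intel document):
 *
 *   CONTROL register, 8 bits
 *     bit 0     ENABLE    1 = device on, 0 = device off
 *     bits 2:1  MODE      operating mode, 0..3
 *     bits 7:3  reserved  must be written as 0
 */
#define CTRL_ENABLE      (1u << 0)
#define CTRL_MODE_SHIFT  1u
#define CTRL_MODE_MASK   (0x3u << CTRL_MODE_SHIFT)

/* Turn the device on in the given mode. Out-of-range mode bits are masked
 * off so the reserved bits always stay 0, as the description demands. */
void device_configure(volatile uint8_t *ctrl, unsigned mode)
{
    uint8_t value = (uint8_t)((mode << CTRL_MODE_SHIFT) & CTRL_MODE_MASK);
    value |= (uint8_t)CTRL_ENABLE;
    *ctrl = value;
}

/* Read the ENABLE bit back from the register. */
bool device_is_enabled(const volatile uint8_t *ctrl)
{
    return (*ctrl & CTRL_ENABLE) != 0;
}

/* Read the MODE field back from the register. */
unsigned device_mode(const volatile uint8_t *ctrl)
{
    return (*ctrl & CTRL_MODE_MASK) >> CTRL_MODE_SHIFT;
}
```

On real hardware, `ctrl` would point at an address taken from the device’s documentation; when experimenting on a regular operating system, a plain `uint8_t` variable can stand in for the register. Note that the code depends only on the described interface, not on how the device implements it internally.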
2
From hardware to software:
Layers of abstraction

This chapter gives an intuition of how hardware and software are connected,
and how software is represented physically.

2.1 The physical implementation of a bit

All electronic devices, from simple to complex, manipulate the flow of
electrical current to achieve desired effects in the real world. Computers
are no exception. When
we write software, we indirectly manipulate electrical current at the physical
level, in such a way that the underlying machine produces the desired
effects. To understand the process, we consider a simple light bulb. A
light bulb can be switched between two states, on and off:
off means the number 0, and on means 1.

Figure 2.1.1: A lightbulb

However, one problem is that such a switch requires manual intervention
from a human. What is required is a switch that operates automatically based on
a voltage level. To enable automatic switching of
electrical signals, a device called the transistor was invented by William Shockley,
John Bardeen and Walter Brattain. This invention started the whole computer
industry.

At the core, a transistor is just a resistor whose value can vary based
on an input voltage value.

Figure 2.1.2: Modern transistor (pins labeled 1, 2, 3)

With this property, a transistor can be used as a current amplifier (more
voltage, less resistance) or to switch electrical signals off and on (block and
unblock an electron flow) based on a voltage level. At 0 V, no current can
pass through a transistor, so it acts like a circuit with an open switch
(light bulb off) because the resistance is high enough to block the electrical
flow. Similarly, at +3.5 V, current can flow through a transistor because
the resistance is lowered, effectively enabling electron flow, so it
acts like a circuit with a closed switch.

(Side note: If you want a deeper explanation of how electrons move, you should
look at the video “How semiconductors work” on YouTube, by Ben Eater.)

A bit has two states, 0 and 1, and is the building block of all digital
systems and software. Similar to a light bulb that can be turned on
and off, bits are made out of the electrical stream from the power source:
bit 0 is represented by 0 V (no electron flow), and bit 1 by +3.5 V to
+5 V (electron flow). A transistor implements a bit correctly, as it can regulate
the electron flow based on a voltage level.
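
The mapping from voltage levels to bits described above can be sketched in C. This is only an illustration: the voltage values come from the text, while the names `bit_t`, `voltage_to_bit` and `transistor_conducts`, and the choice to treat every other voltage as “undefined”, are assumptions of this sketch.

```c
#include <stdbool.h>

/* Possible readings of a wire. BIT_UNDEFINED is an assumption of this
 * sketch for voltages the text does not assign to either bit. */
typedef enum { BIT_UNDEFINED = -1, BIT_0 = 0, BIT_1 = 1 } bit_t;

/* Levels from the text: 0 V is bit 0, +3.5 V to +5 V is bit 1. */
bit_t voltage_to_bit(double volts)
{
    if (volts == 0.0)
        return BIT_0;
    if (volts >= 3.5 && volts <= 5.0)
        return BIT_1;
    return BIT_UNDEFINED;
}

/* A transistor used as a switch: it lets current through (closed switch)
 * only when its input voltage is at the bit-1 level. */
bool transistor_conducts(double input_volts)
{
    return voltage_to_bit(input_volts) == BIT_1;
}
```

For example, `transistor_conducts(0.0)` is false (open switch, light bulb off) and `transistor_conducts(3.5)` is true (closed switch, light bulb on).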

2.1.1 MOSFET transistors

The classic transistors opened a whole new world of micro digital
devices. Prior to their invention, vacuum tubes, which are just fancier
light bulbs, were used to represent 0 and 1, and required a human to turn
them on and off. The MOSFET, or Metal–Oxide–Semiconductor Field-Effect
Transistor, invented in 1959 by Dawon Kahng and Martin M. (John) Atalla
at Bell Labs, is an improved version of the classic transistor that is more
suitable for digital devices, as it switches faster between
the two states 0 and 1, is more stable, consumes less power and is easier
to produce.
There are also two types of MOSFETs, analogous to the two types of transistors:
n-MOSFET and p-MOSFET. n-MOSFET and p-MOSFET are
also called NMOS and PMOS transistors for short.

2.2 Beyond transistors: digital logic gates

All digital devices are designed with logic gates. A logic gate is a
device that implements a boolean function. Each logic gate includes a
number of inputs and an output. All computer operations are built from
combinations of logic gates, which are just combinations of boolean
functions.

Figure 2.2.1: Example: NAND gate (inputs A and B, output out)

2.2.1 The theory behind logic gates

Logic gates accept only binary inputs [1] and produce binary outputs. In
other words, logic gates are functions that transform binary values.
Fortunately, a branch of math that deals exclusively with binary values
already existed: Boolean Algebra, developed in the 19th century by George
Boole. With a sound mathematical theory as a foundation, logic gates were
created. As logic gates implement Boolean functions, a set of Boolean
functions is functionally complete if all other Boolean functions can be
constructed from this set. Later, Charles Sanders Peirce (during 1880 –
1881) proved that either the NOR or the NAND function alone is enough to
create all other Boolean logic functions. Thus NOR and NAND gates are
each functionally complete (Peirce, 1933). Gates are simply the
implementations of Boolean logic functions, therefore a NAND or NOR gate
is enough to implement all other logic gates. The simplest gates a CMOS
circuit can implement are inverters (NOT gates), and from the inverters
come NAND gates. With NAND gates, we can be confident of implementing
everything else. This is why the inventions of the transistor, then the
CMOS circuit, revolutionized the computer industry.

[1] Input that is either a 0 or 1.

We should realize and appreciate how powerful the boolean functions
available in all programming languages are.

(Margin note: If you want to understand why and how from NAND gate we can
create all Boolean functions and a computer, I suggest the course Build a
Modern Computer from First Principles: From Nand to Tetris available on
Coursera: https://www.coursera.org/learn/build-a-computer. To go even
further, after the course, you should take the series Computational
Structures on Edx.)

2.2.2 Logic Gate implementation: CMOS circuit

Underlying every logic gate is a circuit called CMOS (Complementary
Metal-Oxide-Semiconductor). CMOS consists of two complementary
transistors, NMOS and PMOS. The simplest CMOS circuit is an inverter or a
NOT gate:

Figure 2.2.2: Electron flows of an inverter. Input is on the left side
and output on the right side. The upper component is a PMOS and the lower
component is an NMOS; both connect to the input and output. (Source:
created with http://www.falstad.com/circuit/)
(a) When input is low (b) When input is high

From NOT gate, a NAND gate can be created:

Figure 2.2.3: Electron flows of a NAND gate.
(a) Input = 00, Output = 1 (b) Input = 01, Output = 1
(c) Input = 10, Output = 1 (d) Input = 11, Output = 0
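The behavior in the two figures can be approximated with a switch-level model in C. This is a sketch under assumptions, not the circuit itself: transistors are reduced to on/off switches, all function names are hypothetical, and the NAND wiring used (two PMOS in parallel to the supply, two NMOS in series to ground) is the conventional CMOS arrangement, which the text does not spell out.

```c
/* Switch-level sketch: 1 = high voltage, 0 = low voltage. */

/* A PMOS transistor conducts when its gate input is low. */
static int pmos_on(int gate) { return gate == 0; }

/* An NMOS transistor conducts when its gate input is high. */
static int nmos_on(int gate) { return gate == 1; }

/* The output is pulled high by the PMOS network or pulled low by
   the NMOS network; in a well-formed CMOS gate exactly one of the
   two networks conducts. */
static int drive(int pull_up, int pull_down)
{
    if (pull_up && !pull_down)
        return 1;
    if (pull_down && !pull_up)
        return 0;
    return -1; /* floating or shorted output: should not happen */
}

/* Inverter: one PMOS to the supply, one NMOS to ground. */
int cmos_not(int in)
{
    return drive(pmos_on(in), nmos_on(in));
}

/* NAND (assumed conventional wiring): either PMOS can pull the
   output up; both NMOS must conduct to pull it down. */
int cmos_nand(int a, int b)
{
    int pull_up = pmos_on(a) || pmos_on(b);
    int pull_down = nmos_on(a) && nmos_on(b);
    return drive(pull_up, pull_down);
}
```

Evaluating `cmos_nand` on the four input pairs reproduces the outputs listed in Figure 2.2.3: 1, 1, 1 and 0.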

From the NAND gate, we have all other gates. As demonstrated, such simple
circuitry performs the logical operators of day-to-day programming
languages, e.g. the NOT operator ~ is executed directly by an inverter
circuit, the operator & is executed by an AND circuit, and so on. Code
does not run on a magic black box. On the contrary, code execution is
precise and transparent, often as simple as running some hardwired
circuit. When we write software, we simply manipulate electrical current
at the physical level to run appropriate circuits to produce desired
outcomes. However, this whole process somehow does not relate to any
thought involving electrical current. That is the real magic and will be
explained soon.

One interesting property of CMOS is that a k-input gate uses k PMOS and k
NMOS transistors (Wakerly, 1999). All logic gates are built from pairs of
NMOS and PMOS transistors, and gates are the building blocks of all
digital devices from simple to complex, including any computer. Thanks to
this pattern, it is possible to separate the actual physical circuit
implementation from the logical implementation. Digital designs are done
by designing with logic gates, which are later “compiled” into physical
circuits. In fact, later we will see that logic gates become a language
that describes how circuits operate. Understanding how CMOS works is
important for understanding how a computer is designed, and as a
consequence, how a computer works [2].

[2] Again, if you want to understand how logic gates make a computer,
consider the suggested courses on Coursera and Edx earlier.

Finally, an implemented circuit with its wires and transistors is stored
physically in a package called a chip. A chip is a substrate that an
integrated circuit is etched onto. However, a chip also refers to a
completely packaged integrated circuit in the consumer market. Depending
on the context, it is understood differently.

Figure 2.2.4: 74HC00 chip physical view

Example 2.2.1. 74HC00 is a chip with four 2-input NAND gates. The chip
comes with 8 input pins and 4 output pins, 1 pin for connecting to a
voltage source and 1 pin for connecting to the ground. This device is the
physical implementation of NAND gates that we can physically touch and
use. But instead of just a single gate, the chip comes with 4 gates that
can be combined. Each combination enables a different logic function,
effectively creating other logic gates. This feature is what makes the
chip popular.

Each of the gates above is just a simple NAND circuit with the electron
flows, as demonstrated earlier. Yet, many of these NAND-gate chips
combined can build a simple computer. Software, at the physical level, is
just electron flows.

Figure 2.2.5: 74HC00 logic diagrams (Source: 74HC00 datasheet,
http://www.scrpdf.com/pdf/Semiconductors_new/Logic/74HCT/74HC_HCT00.pdf)
(a) Logic diagram of 74HC00 (b) Logic diagram of one NAND gate

Figure 2.2.6: Gates built from NAND gates, each accepts 2 input signals
and generates 1 output signal.
(a) NOT gate (b) AND gate (c) OR gate (d) NOR gate

How can the above gates be created with 74HC00? It is simple: as every
gate has 2 input pins and 1 output pin, we can wire the output of one
NAND gate to an input of another NAND gate, thus chaining NAND gates
together to produce the diagrams above.
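The chaining can be expressed in C, where calling a function on another function's result plays the role of wiring an output pin to an input pin. This is a minimal sketch with hypothetical function names; it uses the usual NAND constructions for the four gates of Figure 2.2.6, which may differ in detail from the wiring drawn in the figure.

```c
/* The only primitive: a 2-input NAND gate (values are 0 or 1). */
int nand_gate(int a, int b) { return !(a && b); }

/* NOT: feed the same signal to both inputs of one NAND gate. */
int not_gate(int a) { return nand_gate(a, a); }

/* AND: a NAND gate followed by a NOT (2 NAND gates in total). */
int and_gate(int a, int b) { return not_gate(nand_gate(a, b)); }

/* OR: invert both inputs, then NAND them (3 NAND gates). */
int or_gate(int a, int b) { return nand_gate(not_gate(a), not_gate(b)); }

/* NOR: an OR followed by a NOT (4 NAND gates). */
int nor_gate(int a, int b) { return not_gate(or_gate(a, b)); }
```

Since a single 74HC00 holds four NAND gates, each of these gates fits on one chip under this construction.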

2.3 Beyond Logic Gates: Machine Language

2.3.1 Machine language

Being built upon gates, and as gates only accept a series of 0 and 1, a
hardware device only understands 0 and 1. However, a device only takes 0
and 1 in a systematic way. Machine language is a collection of unique bit
patterns that a device can identify and for which it performs a
corresponding action. A machine instruction is a unique bit pattern that
a device can identify. In a computer system, a device with its language
is called a CPU (Central Processing Unit), which controls all activities
going on inside a computer. For example, in the x86 architecture, the
byte 00000100 (0x04) is the opcode of an instruction that adds a number
to the AL register, and 11110100 (0xF4) is the hlt instruction that halts
the CPU. In the early days of computers, people had to write completely
in binary.

Why does such a bit pattern cause a device to do something? The reason is
that underlying each instruction is a small circuit that implements the
instruction. Similar to how a function/subroutine in a computer program
is called by its name, a bit pattern is the name of a little function
inside a CPU that gets executed when the CPU encounters it.

Note that CPU is not the only device with its language. CPU is just
a name to indicate a hardware device that controls a computer system.
A hardware device may not be a CPU but still has its language. A de-
vice with its own machine language is a programmable device, since a user
can use the language to command the device to perform diﬀerent actions.
For example, a printer has its set of commands for instructing it how to
print a page.

Example 2.3.1. A user can use the 74HC00 chip without knowing its
internals, but only the interface for using the device. First, we need to
know its layout:

Figure 2.3.1: 74HC00 Pin Layout (Source: 74HC00 datasheet,
http://www.nxp.com/documents/data_sheet/74HC_HCT00.pdf)

1A   1     14  Vcc
1B   2     13  4B
1Y   3     12  4A
2A   4     11  4Y
2B   5     10  3B
2Y   6      9  3A
GND  7      8  3Y

Then, the functionality of each pin:

Table 2.3.1: Pin Description (Source: 74HC00 datasheet,
http://www.nxp.com/documents/data_sheet/74HC_HCT00.pdf)

Symbol      Pin             Description
1A to 4A    1, 4, 9, 12     data input
1B to 4B    2, 5, 10, 13    data input
1Y to 4Y    3, 6, 8, 11     data output
GND         7               ground (0 V)
Vcc         14              supply voltage

Finally, how to use the pins:

Table 2.3.2: Functional Description

Input       Output
nA    nB    nY
L     L     H
L     X     H
X     L     H
H     H     L

✄ n is a number, either 1, 2, 3, or 4

✄ H = HIGH voltage level; L = LOW voltage level; X = don’t care.

The functional description provides a truth table with all possible pin
inputs and outputs, which also describes the usage of all pins in the
device. A user need not know the implementation, but only such a table to
use the device. We can say that the truth table above is the machine
language of the device. Since the device is digital, its language is a
collection of binary strings:

✄ The device has 8 input pins, and this means it accepts binary strings
of 8 bits.

✄ The device has 4 output pins, and this means it produces binary strings
of 4 bits from the 8-bit inputs.

The set of input strings is what the device understands, and the set of
output strings is what the device can speak. Together, they make the
language of the device. Even though this device is simple, the language
it can accept contains quite many binary strings: 2^8 + 2^4 = 272.
However, the number is a tiny fraction of that of a complex device like a
CPU, with hundreds of pins.

When left as is, 74HC00 is simply a NAND device with two 4-bit inputs [3].

[3] Or simply 4-bit NAND gate, as it can only accept 4 bits of input at
the maximum.

        Input                              Output
Pin     1A  1B  2A  2B  3A  3B  4A  4B     1Y  2Y  3Y  4Y
Value   1   1   0   0   1   1   0   0      0   1   0   1
The inputs and outputs as visually presented:

Figure 2.3.2: Pins when receiving digital signals that correspond to a
binary string. Green signals are inputs; blue signals are outputs.

1A   1         Vcc
1B   1      0  4B
1Y   0      0  4A
2A   0      1  4Y
2B   0      1  3B
2Y   1      1  3A
GND         0  3Y
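Seen as a language, the chip is a function from 8-bit strings to 4-bit strings. The following C sketch models it that way. The function names are hypothetical, and the bit order (1A 1B 2A 2B 3A 3B 4A 4B for the input, 1Y 2Y 3Y 4Y for the output, most significant bit first) is taken from the table above.

```c
/* Models a 74HC00 as a function on binary strings.
   Input bits, most significant first:  1A 1B 2A 2B 3A 3B 4A 4B.
   Output bits, most significant first: 1Y 2Y 3Y 4Y. */
unsigned hc00(unsigned in8)
{
    unsigned out4 = 0;
    for (int gate = 0; gate < 4; gate++) {
        int shift = 6 - 2 * gate;               /* position of nB */
        unsigned a = (in8 >> (shift + 1)) & 1u; /* nA */
        unsigned b = (in8 >> shift) & 1u;       /* nB */
        unsigned y = !(a && b);                 /* nY = NAND(nA, nB) */
        out4 = (out4 << 1) | y;
    }
    return out4;
}

/* Size of the device's language as counted in the text:
   all 8-bit input strings plus all 4-bit output strings. */
unsigned language_size(void)
{
    return (1u << 8) + (1u << 4); /* 256 + 16 */
}
```

With the input string 11001100 from the table (`0xCC`), `hc00` yields 0101 (`0x5`), matching the output row.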

On the other hand, if an OR gate is implemented, we can only build a
2-input OR gate from one 74HC00, as it requires 3 NAND gates: 2 input
NAND gates and 1 output NAND gate. Each input NAND gate represents only a
1-bit input of the OR gate. In the following figure, the pins of each
input NAND gate are always set to the same value (either both inputs are
A or both inputs are B) to represent a single-bit input for the final OR
gate:

Figure 2.3.3: 2-bit OR gate implementation
(a) 2-bit OR gate logic diagram, built from 3 NAND gates with 4 pins just
for 2 bits of input: NAND1 takes (A, A) and outputs C, NAND2 takes (B, B)
and outputs D, NAND3 takes C and D and outputs Y.
(b) Pin 3A and 3B take the values from 1Y and 2Y.

1A   A         Vcc
1B   A         4B
1Y   C         4A
2A   B         4Y
2B   B      C  3B
2Y   D      D  3A
GND         Y  3Y

Table 2.3.3: Truth table of OR logic diagram.

A  B  C  D  Y
0  0  1  1  0
0  1  1  0  1
1  0  0  1  1
1  1  0  0  1

To implement a 4-bit OR gate, we need a total of four 74HC00 chips
configured as OR gates, packaged as a single chip as in figure 2.3.4.

Figure 2.3.4: 4-bit OR chip made from four 74HC00 devices. Chip i (i = 1
to 4) is wired like the 2-bit OR gate in figure 2.3.3: its pins 1A and 1B
take Ai, 2A and 2B take Bi, 1Y (Ci) and 2Y (Di) feed 3B and 3A, and 3Y
outputs Yi.
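The wiring of the OR chip can be checked with a short C model that keeps the intermediate signals C and D visible, as in Table 2.3.3. The names `or_signals`, `or_from_nand` and `or4` are hypothetical, and the 4-bit version simply assumes one chip per bit position, as figure 2.3.4 suggests.

```c
/* Signals of one 74HC00 wired as an OR gate (Table 2.3.3). */
struct or_signals {
    int c; /* 1Y: output of NAND1, fed with (A, A) */
    int d; /* 2Y: output of NAND2, fed with (B, B) */
    int y; /* 3Y: output of NAND3, fed with (C, D) */
};

static int nand2(int a, int b) { return !(a && b); }

struct or_signals or_from_nand(int a, int b)
{
    struct or_signals s;
    s.c = nand2(a, a);     /* NOT A */
    s.d = nand2(b, b);     /* NOT B */
    s.y = nand2(s.c, s.d); /* A OR B */
    return s;
}

/* 4-bit OR: four such chips, one per bit position. */
unsigned or4(unsigned a4, unsigned b4)
{
    unsigned y4 = 0;
    for (int bit = 3; bit >= 0; bit--) {
        struct or_signals s =
            or_from_nand((int)((a4 >> bit) & 1u), (int)((b4 >> bit) & 1u));
        y4 = (y4 << 1) | (unsigned)s.y;
    }
    return y4;
}
```

Calling `or_from_nand` on the four input pairs reproduces the C, D and Y columns of Table 2.3.3.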

2.3.2 Assembly Language

Assembly language is the symbolic representation of binary machine code,
giving bit patterns mnemonic names. It was a vast improvement over the
days when programmers had to write 0 and 1. For example, instead of
writing 11110100, a programmer simply writes hlt to stop a computer. Such
an abstraction makes the instructions executed by a CPU easier to
remember, and thus more instructions could be memorized, less time was
spent looking up the CPU manual to find instructions in bit form, and as
a result, code was written faster.

Understanding assembly language is crucial for low-level programming
domains, even to this day. The more instructions a programmer wants to
understand, the deeper an understanding of the machine architecture is
required.

Example 2.3.2. We can build a device with 2 assembly instructions:

or <op1>, <op2>
nand <op1>, <op2>

✄ or accepts two 4-bit operands. This corresponds to a 4-bit OR gate
device built from 4 74HC00 chips.

✄ nand accepts two 4-bit operands. This corresponds to a single 74HC00
chip, left as is.

Essentially, the gates in example 2.3.1 implement the instructions. Up to
this point, we only specify input and output and manually feed them to a
device. That is, to perform an operation:

✄ Pick a device by hand.

✄ Manually put electrical signals into pins.

First, we want to automate the process of device selection. That is, we
want to simply write an assembly instruction and have the device that
implements the instruction selected correctly. Solving this problem is
easy:

✄ Give each instruction an index in binary code, called operation code
or opcode for short, and embed it as part of input. The value for each
instruction is specified as in table 2.3.4.

Table 2.3.4: Instruction-Opcode mapping.

Instruction    Binary Code
nand           00
or             01

Each input now contains additional data at the beginning: an opcode. For
example, the instruction:

nand 1100, 1100

corresponds to the binary string: 0011001100. The first two bits 00
encode a nand instruction, as listed in the table above.

✄ Add another device to select a device, based on a binary code peculiar
to an instruction.

Such a device is called a decoder, an important component in a CPU that
decides which circuit to use. In the above example, when feeding
0011001100 to the decoder, because the opcode is 00, data are sent to the
NAND device for computing.
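The instruction format above can be sketched in C with a few bit operations. The layout (2 opcode bits followed by two 4-bit operands) and the opcode values come from the example and table 2.3.4; the identifiers `OP_NAND`, `OP_OR`, `encode`, `decode_opcode`, `decode_data` and `to_bits` are hypothetical names introduced for illustration.

```c
/* Opcodes from Table 2.3.4: nand = 00, or = 01. */
enum opcode { OP_NAND = 0x0, OP_OR = 0x1 };

/* A 10-bit instruction: [opcode: 2 bits][op1: 4 bits][op2: 4 bits]. */
unsigned encode(enum opcode op, unsigned op1, unsigned op2)
{
    return ((unsigned)op << 8) | ((op1 & 0xFu) << 4) | (op2 & 0xFu);
}

/* What the decoder looks at: the two leading bits. */
enum opcode decode_opcode(unsigned word)
{
    return (enum opcode)((word >> 8) & 0x3u);
}

/* What the decoder forwards to the selected device: 8 data bits. */
unsigned decode_data(unsigned word)
{
    return word & 0xFFu;
}

/* Writes the 10 bits of word, most significant first, as text. */
void to_bits(unsigned word, char out[11])
{
    for (int i = 0; i < 10; i++)
        out[i] = ((word >> (9 - i)) & 1u) ? '1' : '0';
    out[10] = '\0';
}
```

Under these assumptions, `encode(OP_NAND, 0xC, 0xC)` is `0x0CC`, which `to_bits` renders as 0011001100, the string given in the text.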

Finally, writing assembly code is just an easier way to write binary
strings that a device can understand. When we write assembly code and
save it in a text file, a program called an assembler translates the text
file into binary strings that a device can understand. So, how can an
assembler exist in the first place? Assume this is the first assembler in
the world; then it is written in binary code. In the next version, life
is easier: the programmers write the assembler in assembly code, then use
the first version to assemble it. The binary strings produced by an
assembler are then stored in another device, from which they can later be
retrieved and sent to a decoder. A storage device is the device that
stores machine instructions, which is an array of circuits for saving 0
and 1 states.
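A toy assembler for the two-instruction device can be sketched in C. It is only an illustration under assumptions: the function name `assemble` is hypothetical, the accepted syntax is exactly `mnemonic op1, op2` with 4-bit binary operands as in example 2.3.2, and error handling is reduced to returning -1.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Translates one line of assembly text into a 10-bit machine word:
   [opcode: 2 bits][op1: 4 bits][op2: 4 bits].
   Returns -1 if the line is not a valid instruction. */
int assemble(const char *line)
{
    char mnemonic[8], op1[5], op2[5];
    unsigned opcode;
    int used = -1;

    /* %n records how much of the line was consumed, so that
       trailing garbage can be rejected. */
    if (sscanf(line, " %7s %4[01] , %4[01] %n",
               mnemonic, op1, op2, &used) != 3
        || used < 0 || line[used] != '\0')
        return -1;
    if (strlen(op1) != 4 || strlen(op2) != 4)
        return -1;

    if (strcmp(mnemonic, "nand") == 0)
        opcode = 0x0; /* Table 2.3.4: nand = 00 */
    else if (strcmp(mnemonic, "or") == 0)
        opcode = 0x1; /* Table 2.3.4: or = 01 */
    else
        return -1;

    return (int)((opcode << 8)
                 | ((unsigned)strtoul(op1, NULL, 2) << 4)
                 | (unsigned)strtoul(op2, NULL, 2));
}
```

For example, `assemble("nand 1100, 1100")` returns `0x0CC`, i.e. the binary string 0011001100, while an unknown mnemonic such as `hlt` is rejected with -1.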

A decoder is built out of logic gates, similar to other digital devices.
However, a storage device can be anything that can store 0 and 1 and from
which they can be retrieved. A storage device can be a magnetized device
that uses magnetism to store information, or it can be made out of
electrical circuits that can change and remember states when a voltage is
applied. Regardless of the technology used, as long as the device can
store data and is accessible to retrieve data, it suffices. Indeed,
modern devices are so complex that it is impossible and unnecessary to
understand every implementation detail. Instead, we only need to learn
the interfaces, e.g. the pins, that the devices expose.

Figure 2.3.5: A decoder retrieves the current instruction pointed by the
arrow and selects the NAND device to execute the nand instruction. (The
storage holds the instructions 0011001100, 0111111111, 0111101100 and
0010101110; the decoder retrieves the current instruction, 0011001100,
and sends its data to the 4-bit NAND device rather than the 4-bit OR
device.)

A computer essentially implements this process:

✄ Fetch an instruction from a storage device.

✄ Decode the instruction.

✄ Execute the instruction.

Or in short, a fetch – decode – execute cycle. The above device is
extremely rudimentary, but it already represents a computer with a fetch
– decode – execute cycle. More instructions can be implemented by adding
more devices and allocating more opcodes for the instructions, then
updating the decoder accordingly. The Apollo Guidance Computer, a digital
computer produced for the Apollo space program from 1961 – 1972, was
built entirely with NOR gates, the other choice besides the NAND gate for
creating other logic gates. Similarly, if we keep improving our
hypothetical device, it eventually becomes a full-fledged computer.
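The whole cycle of the hypothetical device fits in a few lines of C. This sketch makes several assumptions that the text does not fix: the storage is a plain array holding the four words of figure 2.3.5, the NAND device pairs adjacent data bits the way the 74HC00 pins do, the OR device combines the two operands bit by bit, and each result is simply written to an output array. All names are hypothetical.

```c
#include <stddef.h>

/* Storage: the four instruction words shown in Figure 2.3.5. */
static const unsigned storage[] = {
    0x0CC, /* 0011001100 */
    0x1FF, /* 0111111111 */
    0x1EC, /* 0111101100 */
    0x0AE, /* 0010101110 */
};

/* NAND device: data bits go to pins 1A 1B 2A 2B 3A 3B 4A 4B,
   and each adjacent pair produces one output bit. */
static unsigned exec_nand(unsigned data8)
{
    unsigned out4 = 0;
    for (int shift = 6; shift >= 0; shift -= 2) {
        unsigned a = (data8 >> (shift + 1)) & 1u;
        unsigned b = (data8 >> shift) & 1u;
        out4 = (out4 << 1) | (unsigned)!(a && b);
    }
    return out4;
}

/* OR device (assumed): bitwise OR of the two 4-bit operands. */
static unsigned exec_or(unsigned data8)
{
    return ((data8 >> 4) | data8) & 0xFu;
}

/* Runs up to n instructions and stores one 4-bit result per
   instruction. Returns the number of instructions executed. */
size_t run(const unsigned *program, size_t n, unsigned *results)
{
    size_t pc;
    for (pc = 0; pc < n; pc++) {
        unsigned word = program[pc];          /* fetch */
        unsigned opcode = (word >> 8) & 0x3u; /* decode */
        unsigned data = word & 0xFFu;
        switch (opcode) {                     /* execute */
        case 0x0: results[pc] = exec_nand(data); break;
        case 0x1: results[pc] = exec_or(data); break;
        default:  return pc; /* unknown opcode: stop */
        }
    }
    return pc;
}
```

Adding an instruction to this model means adding one `exec_` function and one `case`, which mirrors adding a device and updating the decoder.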

2.3.3 Programming Languages

Assembly language is a step up from writing 0 and 1. As time went by,
people realized that many pieces of assembly code had repeating patterns
of usage. It would be nice if, instead of writing the repeating blocks of
code all over again in all places, we could simply refer to such blocks
of code with easier-to-use text forms. For example, a block of assembly
code checks whether one variable is greater than another and if so,
executes a block of code, else executes another block of code; in C, such
a block of assembly code is represented by an if statement that is close
to human language.

Figure 2.3.6: Repeated assembly patterns are generalized into a new
language. (Diagram: the assembly files source1.asm, source2.asm, ...,
source<n>.asm all contain the pattern that is written as follows in the
new language.)

    if (...) {
        .......
    } else {
        .......
    }

People created text forms to represent common blocks of assembly code,
such as the if syntax above, then wrote a program to translate the text
forms into assembly code. The program that translates such text forms to
machine code is called a compiler:

Figure 2.3.7: From high-level language back to low-level language. The
compiler translates the C code:

    if (argc) {
        i = 1;
    } else {
        i = 0;
    }

into the assembly code:

    cmp DWORD PTR [ebp+0x8],0x0
    je 80483f7 <main+0x1c>
    mov DWORD PTR [ebp-0x4],0x1
    jmp 80483fe <main+0x23>
    mov DWORD PTR [ebp-0x4],0x0

Any software logic a programming language can implement, hardware can
also implement. The reverse is also true: any hardware logic that is
implemented in a circuit can be reimplemented in a programming language.
The simple reason is that programming languages, assembly languages,
machine languages and logic gates are all just languages to express
computations. It is impossible for software to implement something the
hardware is incapable of, because a programming language is just a
simpler way to use the underlying hardware. At the end of the day,
programming languages are translated to machine instructions that are
valid to a CPU. Otherwise, the code is not runnable, and thus is useless
software. In reverse, software can do everything the hardware (that runs
the software) can, as programming languages are just an easier way to use
the hardware.

In reality, even though all languages are equivalent in power, not all
of them are capable of expressing each other's programs. Programming
languages vary between two ends of a spectrum: high level and low level.

The higher level a programming language is, the more distant it becomes
from the hardware. In some high-level programming languages, such as
Python, a programmer cannot manipulate the underlying hardware, despite
being able to deliver the same computations as low-level programming
languages. The reason is that high-level languages want to hide hardware
details to free programmers from dealing with details irrelevant to the
current problem domain. Such convenience, however, is not free: it
requires software to carry extra code for managing hardware details (e.g.
memory), thus making the code run slower, and it makes hardware
programming difficult or impossible. The more abstractions a programming
language imposes, the more difficult it is to write low-level software,
such as hardware drivers or an operating system. This is the reason why C
is usually the language of choice for writing an operating system: C is
just a thin wrapper over the underlying hardware, making it easy to
understand how exactly a hardware device runs when executing a certain
piece of C code.

Each programming language represents a way of thinking about programs.
Higher-level programming languages help to focus on problem domains that
are not related to hardware at all, and where programmer performance is
more important than computer performance. Lower-level programming
languages help to focus on the inner workings of a machine, and thus are
best suited for problem domains that are related to controlling hardware.
That is why so many languages exist. Use the right tools for the right
job to achieve the best results.

2.4 Abstraction

Abstraction is a technique for hiding complexity that is irrelevant to
the problem in context. For example, consider writing programs without
any other layer except the lowest layer: with circuits. Not only does a
person need an in-depth understanding of how circuits work, but designing
a circuit also becomes much more obscure, because the designer must look
at the raw circuits while thinking at a higher level, such as logic
gates. It is a distracting process, as the designer must constantly
translate ideas into circuits. With abstraction, a designer can simply
think through high-level ideas first, and later translate the ideas into
circuits. Not only is this more efficient, but it is also more accurate,
as the designer can focus all effort on verifying the design with
high-level thinking. When a new designer arrives, he can easily
understand the high-level designs, and thus can continue to develop or
maintain existing systems.

2.4.1 Why abstraction works

In all the layers, abstraction manifests itself:

✄ Logic gates abstract away the details of CMOS.

✄ Machine language abstracts away the details of logic gates.

✄ Assembly language abstracts away the details of machine languages.

✄ Programming language abstracts away the details of assembly languages.


We see repeating patterns of how lower layers build upper layers:

✄ A lower layer has a recurring pattern. Then, this recurring pattern is
taken out and a language is built on top of it.

✄ A higher layer strips away layer-speciﬁc (non-recurring) details to fo-

cus on the recurring details.

✄ The recurring details are given a new and simpler language than the
languages of the lower layers.

What to realize is that every layer is just a more convenient language to
describe the lower layer. Only after a description is fully created with
the language of the higher layer is it then implemented with the language
of the lower layer.

✄ CMOS layer has a recurring pattern that makes sure logic gates are
reliably translated to CMOS circuits: a k-input gate uses k PMOS
and k NMOS transistors (Wakerly, 1999). Since digital devices use
CMOS exclusively, a language arose to describe higher level ideas while
hiding CMOS circuits: Logic Gates.

✄ Logic Gates hide the language of circuits and focus on how to
implement primitive Boolean functions and combine them to create new
functions. All logic gates receive input and generate output as binary
numbers. Thanks to this recurring pattern, logic gates are hidden away
behind a new language: Assembly, which is a set of predefined binary
patterns that cause the underlying gates to perform an action.

✄ Soon, people realized that many recurring patterns arose from within
Assembly language. Repeated blocks of Assembly code that express the same
or similar ideas appear in Assembly source files. There were many such
ideas that can be reliably translated into Assembly code. Thus, the ideas
were extracted and built into the high-level programming languages that
every programmer learns today.

Recurring patterns are the key to abstraction. Recurring patterns are
why abstraction works. Without them, no language can be built, and thus
no abstraction. Fortunately, humans already developed a systematic
discipline for studying patterns: Mathematics. As quoted from the British
mathematician G. H. Hardy (2005):

A mathematician, like a painter or a poet, is a maker of patterns. If his

patterns are more permanent than theirs, it is because they are made
with ideas.

Isn’t a mathematical formula a representation of a pattern? Doesn’t a
variable represent values with the same properties given by constraints?
Mathematics provides a formal system to identify and describe existing
patterns in nature. For that reason, this system can certainly be applied
in the digital world, which is just a subset of the real world.
Mathematics can be used as a common language to make translation between
layers easier, and to help with the understanding of layers.

Figure 2.4.1: Mathematics as a universal language for all layers. Since
all layers can express mathematics with their technologies, each layer
can be translated into another layer. (Diagram: Mathematics and Problem
Domain placed alongside the layers Programming Language, Assembly
Language, Logic Gates and Circuit.)

2.4.2 Why abstraction reduces complexity

Abstraction by building languages certainly improves productivity by
stripping away details irrelevant to a problem. Imagine writing programs
without any other layer except the lowest layer: with circuits. This is
how complexity emerges: when high-level ideas are expressed with a
lower-level language, as the example above demonstrated. Unfortunately,
this is the case with software, as programming languages at the moment
put more emphasis on software than on the problem domains. That is,
without prior knowledge, code written in a language is unable to express
by itself the knowledge of its target domain. In other words, a language
is expressive if its syntax is designed to express the problem domain it
is trying to solve. That is, the what it will do rather than the how it
will do. Consider this example:

Example 2.4.1. Graphviz (http://www.graphviz.org/) is a visualization
software that provides a language, called dot, for describing graphs:

    digraph {
        a -> b;
        b -> c;
        a -> c;
        d -> c;
    }

Figure 2.4.2: From graph description to graph.

As can be seen, the code perfectly expresses how the graph is
connected. Even a non-programmer can understand and use such a language
easily. An implementation in C would be more troublesome, and that’s
assuming that the functions for drawing graphs are already available. To
draw a line, in C we might write something like:

draw_line(a, b);

However, it is still verbose compared with:

a -> b;

Also, a and b must be defined in C, compared to the implicit nodes in
the dot language. However, even if we do not factor in the verbosity, C
still has a limitation: it cannot change its syntax to suit the problem
domain. A domain-specific language might even be more verbose, but it
makes a domain more understandable. If a problem domain must be expressed
in C, then it is constrained by the syntax of C. Since C is not a
specialized language for a problem domain, but a general-purpose
programming language, the domain knowledge is buried within the
implementation details. As a result, a C programmer is needed to decipher
and extract the domain knowledge. If the domain knowledge cannot be
extracted, then the software cannot be further developed.
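To make the comparison concrete, here is one way the same four-edge graph might be described in C. It is a sketch only: the `edge` type and the `graph_to_dot` function are hypothetical, and instead of drawing anything the function just prints the graph back out in dot syntax.

```c
#include <stdio.h>

/* In C, nodes and edges must be defined explicitly. */
struct edge {
    const char *from;
    const char *to;
};

/* The graph of Example 2.4.1. */
static const struct edge edges[] = {
    {"a", "b"}, {"b", "c"}, {"a", "c"}, {"d", "c"},
};

/* Writes the graph in dot syntax into buf. Returns the number of
   characters written, or -1 if buf is too small. */
int graph_to_dot(const struct edge *e, size_t n, char *buf, size_t size)
{
    size_t used;
    int w = snprintf(buf, size, "digraph {\n");
    if (w < 0 || (size_t)w >= size)
        return -1;
    used = (size_t)w;

    for (size_t i = 0; i < n; i++) {
        w = snprintf(buf + used, size - used, "    %s -> %s;\n",
                     e[i].from, e[i].to);
        if (w < 0 || (size_t)w >= size - used)
            return -1;
        used += (size_t)w;
    }

    w = snprintf(buf + used, size - used, "}\n");
    if (w < 0 || (size_t)w >= size - used)
        return -1;
    return (int)(used + (size_t)w);
}
```

The graph itself occupies one line of data, while the type definition, buffer handling and error checks around it are implementation details that a reader must look past to see how the nodes are connected.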

Example 2.4.2. Linux is full of applications, such as web servers, that
are controlled by domain-specific languages whose files are placed in the
/etc directory. Instead of reprogramming the software, a domain-specific
language is made for it.

In general, code that can express a problem domain must be
understandable by a domain expert. Even within the software domain,
building a language out of repeated programming patterns is useful. It
makes people aware of the existence of such patterns in code and thus
makes software easier to maintain, as the software structure is visible
as a language. Only a programming language that is capable of morphing
itself to suit a problem domain can achieve that goal. Such a language is
called a programmable programming language. Unfortunately, this approach
of making software structure visible is not favored among programmers, as
a new language must be made out of it, along with a new toolchain to
support it. Thus, software structure and domain knowledge are buried
within code written in the syntax of a general-purpose language, and if a
programmer is not familiar with or even aware of the existence of a code
pattern, then it is hopeless to understand the code. A prime example is
reading C code that controls hardware, e.g. an operating system: if a
programmer knows absolutely nothing about hardware, then it is impossible
to read and write operating system code in C, even with 20 years of
experience writing application code in C.

With abstraction, a software engineer can also understand the inner
workings of a device without specialized knowledge of physical circuit
design, which enables the software engineer to write code that controls a
device. The separation between logical and physical implementation also
entails that gate designs can be reused even when the underlying
technologies change. For example, in some distant future a biological
computer could be a reality, and gates might not be implemented as CMOS
but as some kind of living cells; in either technology, electrical or
biological, as long as logic gates are physically realized, the same
computer design could be implemented.
3
Computer Architecture

To write lower level code, a programmer must understand the architecture
of a computer. It is similar to writing programs in a software framework:
one must know what kinds of problems the framework solves, and how to use
the framework through its provided software interfaces. But before
getting to the definition of what computer architecture is, we must
understand what exactly a computer is, as many people still think that a
computer is the regular machine we put on a desk, or at best, a server.
Computers come in various shapes and sizes, and include devices that
people never imagine are computers, and on which code can run.

3.1 What is a computer?

A computer is a hardware device that consists of at least a processor
(CPU), a memory device and input/output interfaces. All computers can be
grouped into two types:

Single-purpose computer is a computer built at the hardware level for
specific tasks. For example, dedicated application encoders/decoders,
timers, image/video/sound processors.

General-purpose computer is a computer that can be programmed (without
modifying its hardware) to emulate various features of single-purpose
computers.

3.1.1 Server

A server is a general-purpose high-performance computer with huge
resources to provide large-scale services for a broad audience. The
audience consists of people whose personal computers connect to the
server.

Figure 3.1.1: Blade servers. Each blade server is a computer with a
modular design optimized for the use of physical space and energy. The
enclosure of blade servers is called a chassis. (Source: Wikimedia,
author: Victorgrigas)

3.1.2 Desktop Computer

A desktop computer is a general-purpose computer with an input and output
system designed for a human user, with moderate resources sufficient for
regular use. The input system usually includes a mouse and a keyboard,
while the output system usually consists of a monitor that can display a
large number of pixels. The computer is enclosed in a chassis large enough
to hold various computer components such as a processor, a motherboard, a
power supply, a hard drive, etc.

Figure 3.1.2: A typical desktop computer.

3.1.3 Mobile Computer

A mobile computer is similar to a desktop computer but has fewer
resources and can be carried around.

Figure 3.1.3: Mobile computers. (a) A laptop (b) A tablet (c) A mobile
phone

3.1.4 Game Consoles

Game consoles are similar to desktop computers but are optimized for
gaming. Instead of a keyboard and a mouse, the input system of a game
console consists of game controllers, devices with a few buttons for
controlling on-screen objects; the output system is a television. The
chassis is similar to that of a desktop computer but smaller. Game
consoles use custom processors and graphics processors, but these are
similar to the ones in desktop computers. For example, the first Xbox uses
a custom Intel Pentium III processor.

Figure 3.1.4: Current-gen Game Consoles. (a) A PlayStation 4 (b) An Xbox
One (c) A Wii U

Handheld game consoles are similar to game consoles, but incorporate both
the input and output systems along with the computer in a single package.

Figure 3.1.5: Some Handheld Consoles. (a) A Nintendo DS (b) A PS Vita

3.1.5 Embedded Computer

An embedded computer is a single-board or single-chip computer with
limited resources designed for integrating into larger hardware devices.

Figure 3.1.6: An Intel 82815 Graphics and Memory Controller Hub embedded
on a PC motherboard. (Source: Wikimedia, author: Qurren)

A microcontroller is an embedded computer designed for controlling other
hardware devices. A microcontroller is mounted on a chip. Microcontrollers
are general-purpose computers, but with resources so limited that each is
only able to perform one or a few specialized tasks. These computers are
used for a single purpose, but they are still general-purpose since they
can be programmed to perform different tasks, depending on the
requirements, without changing the underlying hardware.

Another type of embedded computer is the system-on-chip. A system-on-chip
is a full computer on a single chip. Though a microcontroller is also
housed on a chip, its purpose is different: to control some hardware. A
microcontroller is usually simpler and more limited in hardware resources,
as it specializes in only one purpose when running, whereas a
system-on-chip is a general-purpose computer that can serve multiple
purposes. A system-on-chip can run like a regular desktop computer,
capable of loading an operating system and running various applications.
A system-on-chip is typically found in a smartphone, such as the Apple A5
SoC used in the iPad 2 and iPhone 4S, or the Qualcomm Snapdragon used in
many Android phones.

Figure 3.1.7: A PIC microcontroller. (Source: Microchip)

Figure 3.1.8: Apple A5 SoC
Be it a microcontroller or a system-on-chip, there must be an environment
where these devices can connect to other devices. This environment is a
circuit board called a PCB – Printed Circuit Board. A printed circuit
board is a physical board that contains lines and pads to enable electron
flow between electrical and electronic components. Without a PCB, devices
cannot be combined to create a larger device. As long as these devices are
hidden inside a larger device and contribute to its operation at a
higher-level layer for a higher-level purpose, they are embedded devices.
Writing a program for an embedded device is therefore called embedded
programming. Embedded computers are used in automatically controlled
devices including power tools, toys, implantable medical devices, office
machines, engine control systems, appliances, remote controls and other
types of embedded systems.

Figure 3.1.9: Raspberry Pi B+ Rev 1.2, a single-board computer that
includes both a system-on-chip and a microcontroller. (a) Functional view:
the SoC is a Broadcom BCM2835; the microcontroller is the Ethernet
controller LAN9514. (b) Physical view. (Source: Wikimedia, author: Efa2)

The line between a microcontroller and a system-on-chip is blurry. If
hardware keeps becoming more powerful, a microcontroller can get enough
resources to run a minimal operating system for multiple specialized
purposes. Conversely, a system-on-chip is powerful enough to handle the
job of a microcontroller. However, using a system-on-chip as a
microcontroller would not be a wise choice: the price rises significantly,
and hardware resources are wasted, since software written for a
microcontroller requires little computing power.

3.1.6 Field Programmable Gate Array

A Field Programmable Gate Array (FPGA) is a hardware array of
reconfigurable gates that makes circuit structure programmable after it
is shipped from the factory1. Recall that in the previous chapter, each
74HC00 chip can be configured as a gate, and a more sophisticated device
can be built by combining multiple 74HC00 chips. In a similar manner, each
FPGA device contains thousands of units called logic blocks, each more
complicated than a 74HC00 chip, that can be configured to implement a
Boolean logic function. These logic blocks can be chained together to
create a high-level hardware feature. This high-level feature is usually
a dedicated algorithm that needs high-speed processing.

1 This is why it is called a Field Programmable Gate Array: it is
changeable “in the field” where it is applied.
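To make the idea of a configurable logic block concrete, here is a minimal sketch in C. It assumes the block is modeled as a 4-input lookup table, which is one common way such blocks are built; the type `lut4` and the functions `lut4_configure`, `lut4_eval`, `nand2` and `xor4` are hypothetical names invented for this illustration, not part of any FPGA toolchain.

```c
#include <stdint.h>

/* Hypothetical model of one logic block: a 4-input lookup table (LUT).
 * The 16 configuration bits hold one output bit per input combination. */
typedef struct {
    uint16_t truth_table;
} lut4;

/* "Program" the block by filling its truth table from any 4-input function. */
lut4 lut4_configure(int (*fn)(int a, int b, int c, int d))
{
    lut4 lut = { 0 };
    for (unsigned i = 0; i < 16; i++) {
        int a = i & 1, b = (i >> 1) & 1, c = (i >> 2) & 1, d = (i >> 3) & 1;
        if (fn(a, b, c, d))
            lut.truth_table |= (uint16_t)(1u << i);
    }
    return lut;
}

/* Evaluate the block: the four inputs select one bit of the table. */
int lut4_eval(lut4 lut, int a, int b, int c, int d)
{
    unsigned index = (unsigned)((a & 1) | ((b & 1) << 1) |
                                ((c & 1) << 2) | ((d & 1) << 3));
    return (lut.truth_table >> index) & 1;
}

/* Two example Boolean functions that can be loaded into a block. */
int nand2(int a, int b, int c, int d) { (void)c; (void)d; return !(a && b); }
int xor4(int a, int b, int c, int d)  { return a ^ b ^ c ^ d; }
```

Calling `lut4_configure(nand2)` makes the block behave like one gate of a 74HC00, while `lut4_configure(xor4)` turns the same block into a parity checker; only the stored bits change, not the hardware model.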

Figure 3.1.10: FPGA Architecture (Source: National Instruments)

Digital devices can be designed by combining logic gates, without regard
to the actual circuit components, since the physical circuits are just
multiples of CMOS circuits. Digital hardware, including the various
components in a computer, is designed by writing code, as a regular
programmer does, in a language that describes how gates are wired
together. This language is called a Hardware Description Language. The
hardware description is later compiled to a description of connected
electronic components called a netlist, which is a more detailed
description of how gates are connected.

The difference between an FPGA and other embedded computers is that
programs in an FPGA are implemented at the digital logic level, while
programs in embedded computers like microcontrollers or system-on-chip
devices are implemented at the assembly code level. An algorithm written
for an FPGA device is a description of the algorithm in logic gates; the
FPGA device follows that description to configure itself to run the
algorithm. An algorithm written for a microcontroller is expressed in
assembly instructions that a processor can understand and act upon.

FPGAs are applied in cases where specialized operations are unsuitable or
too costly to run on a regular computer, such as real-time medical image
processing, cruise control systems, circuit prototyping, video
encoding/decoding, etc. These applications require high-speed processing
that is not achievable with a regular processor, because a processor
spends a significant amount of time executing many non-specialized
instructions (which might add up to thousands of instructions or more) to
implement one specialized operation, and thus needs more circuits at the
physical level to carry out the same operation. An FPGA device carries no
such overhead; instead, it runs a single specialized operation implemented
directly in hardware.

3.1.7 Application-Speciﬁc Integrated Circuit

An Application-Specific Integrated Circuit (or ASIC) is a chip designed
for a particular purpose rather than for general-purpose use. An ASIC does
not contain a generic array of logic blocks that can be reconfigured to
adapt to any operation like an FPGA; instead, every logic block in an ASIC
is made and optimized for the circuit itself. An FPGA can be considered
the prototyping stage of an ASIC, and the ASIC the final stage of circuit
production. An ASIC is even more specialized than an FPGA, so it can
achieve even higher performance. However, ASICs are very costly to
manufacture, and once the circuits are made, a design error means
everything is thrown away, unlike FPGA devices, which can simply be
reprogrammed thanks to the generic gate array.

3.2 Computer Architecture

The previous section examined various classes of computers. Regardless of
shape and size, every computer is designed according to an architecture,
from high level to low level.

Computer Architecture = Instruction Set Architecture + Computer Organization + Hardware

At the highest-level is the Instruction Set Architecture.

At the middle-level is the Computer Organization.

At the lowest-level is the Hardware.


3.2.1 Instruction Set Architecture

An instruction set is the basic set of commands and instructions that a
microprocessor understands and can carry out.

An Instruction Set Architecture, or ISA, is the design of an environment
that implements an instruction set. Essentially, it is a runtime
environment, similar to the interpreters of high-level languages. The
design includes all the instructions, registers, interrupts, memory models
(how memory is arranged for use by programs), addressing modes, I/O, etc.,
of a CPU. The more features (e.g. more instructions) a CPU has, the more
circuits are required to implement it.

3.2.2 Computer organization

Computer organization is the functional view of the design of a computer.
In this view, hardware components of a computer are presented as boxes
with inputs and outputs that connect to each other and form the design of
a computer. Two computers may have the same ISA but different
organizations. For example, both AMD and Intel processors implement the
x86 ISA, but the hardware components of each processor that make up the
environment for the ISA are not the same.

Computer organizations may vary depending on a manufacturer’s design, but
they all originate from the Von Neumann architecture2:

2 John von Neumann was a mathematician and physicist who invented a
computer architecture.
Figure 3.2.1: Von Neumann Architecture. (Diagram: the CPU, Memory, and
Input and Output are connected by the system bus, which is made up of the
control bus, the address bus and the data bus.)

CPU fetches instructions continuously from main memory and executes them.

Memory stores program code and data.

Bus is a set of electrical wires for sending raw bits between the above
components.

I/O Devices are devices that give input to a computer, e.g. keyboard,
mouse, sensor, etc., and take output from a computer, e.g. a monitor takes
information sent from the CPU and displays it, an LED turns on/off
according to a pattern computed by the CPU, etc.

The Von Neumann computer operates by storing its instructions in main
memory; the CPU repeatedly fetches those instructions into its internal
storage and executes them, one after another. Data are transferred over
the data bus between the CPU, memory and I/O devices, and the location at
which to store or read the data is sent by the CPU over the address bus.
This architecture completely implements the fetch – decode – execute
cycle.

The earliest computers were exact implementations of the Von Neumann
architecture, with the CPU, memory and I/O devices communicating through
the same bus. Today, a computer has more buses, each specialized in one
type of traffic. However, at the core, they are still Von Neumann
architectures. To write an OS for a Von Neumann computer, a programmer
needs to understand and write code that controls the core components: CPU,
memory, I/O devices, and bus.
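The fetch – decode – execute cycle described above can be sketched as a toy machine in C. This is only an illustration under stated assumptions: the one-register machine, its four opcodes and the names `toy_machine`, `toy_load_sample` and `toy_run` are all made up for this example and do not correspond to any real CPU.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical opcodes of a made-up one-register machine. */
enum { OP_HALT = 0, OP_LOAD = 1, OP_ADD = 2, OP_STORE = 3 };

typedef struct {
    uint8_t pc;          /* address of the next instruction */
    uint8_t acc;         /* the single accumulator register */
    uint8_t memory[256]; /* code and data share this one memory */
} toy_machine;

/* Load a sample program computing memory[0x12] = memory[0x10] + memory[0x11]. */
void toy_load_sample(toy_machine *m)
{
    static const uint8_t program[] = {
        OP_LOAD,  0x10,
        OP_ADD,   0x11,
        OP_STORE, 0x12,
        OP_HALT,  0x00,
    };
    memset(m, 0, sizeof *m);
    memcpy(m->memory, program, sizeof program);
    m->memory[0x10] = 2;
    m->memory[0x11] = 3;
}

/* Run until HALT (or max_steps). Each instruction is an opcode byte
 * followed by an address byte. Returns the number of instructions run. */
int toy_run(toy_machine *m, int max_steps)
{
    int steps = 0;
    while (steps < max_steps) {
        uint8_t opcode  = m->memory[m->pc];                 /* fetch */
        uint8_t operand = m->memory[(uint8_t)(m->pc + 1)];
        m->pc = (uint8_t)(m->pc + 2);
        steps++;
        switch (opcode) {                                   /* decode */
        case OP_LOAD:  m->acc = m->memory[operand]; break;  /* execute */
        case OP_ADD:   m->acc = (uint8_t)(m->acc + m->memory[operand]); break;
        case OP_STORE: m->memory[operand] = m->acc; break;
        default:       return steps; /* OP_HALT or an unknown opcode stops */
        }
    }
    return steps;
}
```

After `toy_load_sample(&m)` and `toy_run(&m, 100)`, the cell at address 0x12 holds the sum of the two data cells. Note that the program and its data live in the same `memory` array, which is the defining trait of the architecture.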
CPU, or Central Processing Unit, is the heart and brain of any computer
system. Understanding a CPU is essential to writing an OS from scratch:

✄ To use other devices, a programmer needs to control the CPU to use the
programming interfaces of those devices. The CPU is the only way, as it
is the only device a programmer can use directly and the only device
that understands code written by a programmer.

✄ In a CPU, many OS concepts are already implemented directly in
hardware, e.g. task switching, paging. A kernel programmer needs to know
how to use these hardware features, to avoid duplicating such concepts
in software and thus wasting computer resources.

✄ CPU built-in OS features boost both OS performance and developer
productivity, because those features are actual hardware, the lowest
possible level, and developers are freed from implementing them.

✄ To effectively use the CPU, a programmer needs to understand the
documentation provided by the CPU manufacturer, for example the Intel®
64 and IA-32 Architectures Software Developer Manuals.

✄ After understanding one CPU architecture well, it is easier to learn
other CPU architectures.

A CPU is an implementation of an ISA, effectively the implementation of
an assembly language (which varies with the CPU architecture). Assembly
language is one of the interfaces provided for software engineers to
control a CPU, and thus a computer. But how can every computer device be
controlled with access only to the CPU? The simple answer is that a CPU
can communicate with other devices, and thus command them, through these
two interfaces:

Registers are hardware components for high-speed data access and
communication with other hardware devices. Registers allow software to
control hardware directly by writing to the registers of a device, or to
receive information from a hardware device by reading from its
registers.

Not all registers are used for communication with other devices. In a
CPU, most registers are used as high-speed storage for temporary data.
Other devices that a CPU can communicate with always have a set of
registers for interfacing with the CPU.

Port is a specialized register in a hardware device used for
communication with other devices. When data are written to a port, the
hardware device performs some operation according to the values written.
The difference between a port and a register is that a port does not
store data, but delegates the data to some other circuit.

These two interfaces are extremely important, as they are the only
interfaces for controlling hardware with software. Writing device drivers
is essentially learning the functionality of each register and how to use
it properly to control the device.
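The distinction between a register and a port can be sketched with a simulated device in C. Everything here is an assumption made for illustration: `led_device`, its command values and the two functions are invented, and real hardware is reached through I/O instructions or memory-mapped addresses rather than a C struct.

```c
#include <stdint.h>

/* A made-up LED controller; it does not model any real chip. */
typedef struct {
    uint8_t status; /* register: holds a value that can be read back */
    uint8_t leds;   /* internal state driven by the device's own circuit */
} led_device;

enum { LED_CMD_ALL_OFF = 0x00, LED_CMD_ALL_ON = 0x01, LED_CMD_TOGGLE = 0x02 };
enum { LED_STATUS_OK = 0x00, LED_STATUS_BAD_CMD = 0x80 };

/* Writing to the command port stores nothing in the port itself: the value
 * is handed to the device logic, which acts on it. */
void led_write_command_port(led_device *dev, uint8_t value)
{
    dev->status = LED_STATUS_OK;
    switch (value) {
    case LED_CMD_ALL_OFF: dev->leds = 0x00; break;
    case LED_CMD_ALL_ON:  dev->leds = 0xff; break;
    case LED_CMD_TOGGLE:  dev->leds = (uint8_t)~dev->leds; break;
    default:              dev->status = LED_STATUS_BAD_CMD; break;
    }
}

/* Reading the status register simply returns the stored value. */
uint8_t led_read_status(const led_device *dev)
{
    return dev->status;
}
```

A driver for this imaginary device would consist of little more than knowing which values to write to the command port and how to interpret the status register, which is the point made in the paragraph above.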

Memory is a storage device that stores information. Memory consists of
many cells. Each cell is a byte with its own address number, so a CPU can
use that address to access an exact location in memory. Memory is where
software instructions (in the form of machine language) are stored and
retrieved to be executed by the CPU; memory also stores the data needed by
software. Memory in a Von Neumann machine does not distinguish between
bytes that are data and bytes that are software instructions. It is up to
the software to decide, and if data bytes are somehow fetched and executed
as instructions, the CPU still executes them if they represent valid
instructions, but produces undesirable results. To a CPU, there is no code
and data; both are merely different types of data for it to act on: one
tells it how to do something in a specific manner, and the other is the
material necessary to carry out that action.
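As a small illustration of bytes being neither code nor data by themselves, the sketch below reads four bytes as a number. The bytes chosen are the x86-64 encoding of `push rbp` followed by `mov rbp, rsp`; the names `prologue_bytes` and `read_u32_le` are hypothetical and exist only for this example.

```c
#include <stdint.h>
#include <stddef.h>

/* Four bytes that a CPU would execute as "push rbp; mov rbp, rsp". */
const uint8_t prologue_bytes[4] = { 0x55, 0x48, 0x89, 0xe5 };

/* Read the 4 bytes starting at `address` as one little-endian 32-bit
 * number, the way a data load would see them. */
uint32_t read_u32_le(const uint8_t *memory, size_t address)
{
    return (uint32_t)memory[address]
         | ((uint32_t)memory[address + 1] << 8)
         | ((uint32_t)memory[address + 2] << 16)
         | ((uint32_t)memory[address + 3] << 24);
}
```

Treated as data, `read_u32_le(prologue_bytes, 0)` yields the number 0xe5894855; fetched as instructions, the very same bytes set up a stack frame. Which interpretation applies depends only on how the software uses them.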

The RAM is controlled by a device called a memory controller. Currently,
most processors have this device embedded, so the CPU has a dedicated
memory bus connecting the processor to the RAM. On older CPUs3, however,
this device was located in a separate chip known as the MCH, or Memory
Controller Hub. In this case, the CPU does not communicate directly with
the RAM, but with the MCH chip, which then accesses the memory to read or
write data. The first option provides better performance since there is no
middleman in the communication between the CPU and the memory.

3 Prior to the CPUs produced in 2009.

Figure 3.2.2: CPU - Memory Communication. (a) Old CPU (b) Modern CPU

At the physical level, RAM is implemented as a grid of cells that each
contain a transistor and an electrical device called a capacitor, which
stores charge for short periods of time. The transistor controls access to
the capacitor; when switched on, it allows a small charge to be read from
or written to the capacitor. The charge on the capacitor slowly
dissipates, requiring the inclusion of a refresh circuit to periodically
read values from the cells and write them back after amplification from an
external power source.

Bus is a subsystem that transfers data between computer components or
between computers. Physically, buses are just electrical wires that
connect components together, and each wire transfers a single bit at a
time. The total number of wires is called the bus width, and depends on
how many wires a CPU can support. If a CPU can only accept 16 bits at a
time, then the bus has 16 wires connecting a component to the CPU, which
means the CPU can only retrieve 16 bits of data at a time.
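The effect of bus width can be worked out with simple arithmetic, sketched here in C. The function `bus_transfers` is a hypothetical helper, and the model is deliberately simplified: it assumes every transfer moves exactly one bus width of data and ignores timing, bursts and caching.

```c
#include <stddef.h>

/* Number of bus transfers needed to move `bytes` bytes over a data bus
 * that is `bus_width_bits` wires wide (one bit per wire per transfer). */
size_t bus_transfers(size_t bytes, unsigned bus_width_bits)
{
    size_t total_bits = bytes * 8;
    /* Round up: a partially filled transfer still occupies the bus. */
    return (total_bits + bus_width_bits - 1) / bus_width_bits;
}
```

Under these assumptions, moving 8 bytes takes `bus_transfers(8, 16)` = 4 transfers on a 16-bit bus but only `bus_transfers(8, 64)` = 1 transfer on a 64-bit bus.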

3.2.3 Hardware

Hardware is a specific implementation of a computer. A line of processors
implements the same instruction set architecture and uses nearly identical
organizations, but its models differ in hardware implementation. For
example, the Core i7 family provides a model for desktop computers that is
more powerful but consumes more energy, while another model for laptops is
less performant but more energy efficient. To write software for a
hardware device, we seldom need to understand its hardware implementation
if documents are available. Computer organization and especially the
instruction set architecture are more relevant to an operating system
programmer. For that reason, the next chapter is devoted to studying the
x86 instruction set architecture in depth.

3.3 x86 architecture

A chipset is a chip with multiple functions. Historically, a chipset was
an actual set of individual chips, each responsible for one function, e.g.
memory controller, graphics controller, network controller, power
controller, etc. As hardware progressed, the set of chips was incorporated
into a single chip, which is more space, energy, and cost efficient. In a
desktop computer, various hardware devices are connected to each other
through a PCB called a motherboard. Each CPU needs a compatible
motherboard that can host it. Each motherboard is defined by its chipset
model, which determines the environment that a CPU can control. This
environment typically consists of

✄ a slot or more for the CPU

✄ a chipset of two chips, the Northbridge and Southbridge chips

  – Northbridge chip is responsible for the high-performance
  communication between the CPU, main memory and the graphics card.

  – Southbridge chip is responsible for the communication with I/O
  devices and other devices that are not performance sensitive.

✄ slots for memory sticks

✄ a slot or more for graphics cards

✄ generic slots for other devices, e.g. network card, sound card

✄ ports for I/O devices, e.g. keyboard, mouse, USB

To write a complete operating system, a programmer needs to understand
how to program these devices. After all, an operating system manages
hardware automatically to free application programs from doing so.
However, of all the components, learning to program the CPU is the most
important, as it is the component present in any computer, regardless of
its type. For this reason, the primary focus of this book will be on how
to program an x86 CPU. Even when focusing solely on this device, a
reasonably good minimal operating system can be written. The reason is
that not all computers include all the devices of a normal desktop
computer. For example, an embedded computer might only have a CPU and
limited internal memory, with pins for getting input and producing output;
yet, operating systems have been written for such devices.

However, learning how to program an x86 CPU is a daunting task, with 3
primary manuals written for it: almost 500 pages for volume 1, over 2000
pages for volume 2 and over 1000 pages for volume 3. It is an impressive
feat for a programmer to master every aspect of x86 CPU programming.

Figure 3.3.1: Motherboard organization. (Diagram: the CPU and clock
generator sit above a chipset of two chips. The Northbridge (memory
controller hub) is linked to the CPU by the front-side bus, to the memory
slots by the memory bus, and to the graphics card slot by a high-speed
graphics bus (AGP or PCI Express). The Southbridge (I/O controller hub) is
linked to the Northbridge by an internal bus and serves the PCI bus and
PCI slots, IDE, SATA, USB, Ethernet, audio codec and CMOS memory, with
cables and ports leading off-board; over the LPC bus it also serves the
flash ROM (BIOS) and the Super I/O chip for the serial port, parallel
port, floppy disk, keyboard and mouse.)

3.4 Intel Q35 Chipset

Q35 is an Intel chipset released in September 2007. Q35 is used as an
example of a high-level computer organization because later we will use
QEMU to emulate a Q35 system, which is the latest Intel system that QEMU
can emulate. Though released in 2007, Q35 is relatively close to current
hardware, and the knowledge can still be reused for current chipset
models. With a Q35 chipset, the emulated CPU is also relatively
up-to-date, with the features present in current-day CPUs, so we can use
the latest software manuals from Intel.

Figure 3.3.1 shows a typical current-day motherboard organization, which
the Q35 organization closely resembles.

3.5 x86 Execution Environment

An execution environment is an environment that provides the facilities
to make code executable. The execution environment needs to address the
following questions:

✄ Supported operations? data transfer, arithmetic, control,
floating-point, etc.

✄ Where are operands stored? registers, memory, stack, accumulator

✄ How many explicit operands are there for each instruction? 0, 1, 2, or 3

✄ How is the operand location specified? register, immediate, indirect,
etc.

✄ What type and size of operands are supported? byte, int, float, double,
string, vector, etc.

✄ etc.

For the remainder of this chapter, please continue reading with chapter 3
of Intel Manual Volume 1, “Basic Execution Environment”.
4
x86 Assembly and C

In this chapter, we will explore assembly language and how it connects to
C. But why should we do so? Isn’t it better to trust the compiler, given
that no one writes assembly anymore?

Not quite. Surely, the compiler at its current state of the art is
trustworthy, and most of the time we do not need to write code in
assembly. A compiler can generate code, but as mentioned previously, a
high-level language is a collection of patterns of a lower-level language.
It does not cover everything that a hardware platform provides. As a
consequence, not every assembly instruction can be generated by a
compiler, so we still need to write assembly code in these circumstances
to access hardware-specific features. Since hardware-specific features
require writing assembly code, debugging requires reading it. We might
spend even more time reading than writing. When working with low-level
code that interacts directly with hardware, assembly code is unavoidable.
Also, understanding how a compiler generates assembly code can improve a
programmer’s productivity. For example, if a job or school assignment
requires us to write assembly code, we can simply write it in C, then let
gcc do the hard work of writing the assembly code for us. We merely
collect the generated assembly code, modify it as needed and are done with
the assignment.
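As a minimal sketch of letting gcc write the assembly, consider the small C function below. The file name sum.c and the function `sum_to` are hypothetical, chosen only for this illustration.

```c
/* sum.c: a hypothetical file used only to illustrate the workflow. */

/* Return 1 + 2 + ... + n (0 when n is less than 1). */
int sum_to(int n)
{
    int total = 0;
    for (int i = 1; i <= n; i++)
        total += i;
    return total;
}
```

Running `gcc -S sum.c` stops after compilation and writes the generated assembly to sum.s instead of producing a binary; on x86, adding `-masm=intel` makes gcc emit Intel syntax. The exact instructions depend on the gcc version and optimization level, so the output is best treated as a starting point to read and adapt.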

We will use objdump extensively, and learn how to use Intel documents to
aid in understanding x86 assembly code.

4.1 objdump

objdump is a program that displays information about object files. It will
be handy later for debugging an incorrect layout from manual linking. For
now, we use objdump to examine how high-level source code maps to assembly
code, ignoring the output and learning how to use the command first.
Suppose that we have an executable binary named hello compiled from a
hello.c that prints “Hello World”; objdump is simple to use:

$ objdump -d hello

The -d option only displays the disassembled contents of executable
sections. A section is a block of memory that contains either program
code or data. A code section is executable by the CPU, while a data
section is not executable. Non-executable sections, such as .data and
.bss (for storing program data), debug sections, etc., are not
displayed. We will learn more about sections when studying the ELF
binary file format in chapter 5. On the other hand:

$ objdump -D hello

Here the -D option displays the disassembled contents of all sections. If
-D is given, -d is implicitly assumed. objdump is mostly used for
inspecting assembly code, so -d is the option we will use most often.

The output overruns the terminal screen. To make it easier to read, pipe
all the output to less:

$ objdump -d hello | less

To intermix source code and assembly, the binary must be compiled with the
-g option to include debugging information, which objdump uses to locate
the matching source lines; then add the -S option:

$ objdump -S hello | less

The default syntax used by objdump is AT&T syntax. To change it to the
familiar Intel syntax:

$ objdump -M intel -D hello | less

When using the -M option, either -D or -d must be supplied explicitly.

Next, we will use objdump to examine how compiled C data and code are
represented in machine code.

Since we will write a 32-bit kernel, we will also need to compile a 32-bit
binary and examine it in 32-bit mode:

$ objdump -M i386,intel -D hello | less

-M i386 tells objdump to display assembly content using 32-bit layout.

Knowing the diﬀerence between 32-bit and 64-bit is crucial for writing
kernel code. We will examine this matter later on when writing our
kernel.

4.2 Reading the output

The output starts with the file format of the object file:

hello: file format elf64-x86-64

This line is followed by a series of disassembled sections:

Disassembly of section .interp:

...
Disassembly of section .note.ABI-tag:
...
Disassembly of section .note.gnu.build-id:
...
...
etc

Finally, each disassembled section displays its actual content, which is
a sequence of assembly instructions, in the following format:

4004d6: 55 push rbp

✄ The first column is the address of an assembly instruction. In the above
example, the address is 0x4004d6.

✄ The second column is the assembly instruction in raw hex values. In the
above example, the value is 0x55.

✄ The third column is the assembly instruction. Depending on the section,
the assembly instruction might be meaningful or meaningless. For example,
if the assembly instructions are in a .text section, then they are actual
program code. On the other hand, if the assembly instructions are
displayed in a .data section, then we can safely ignore them. The reason
is that objdump doesn’t know which hex values are code and which are
data, so it blindly translates every hex value into assembly
instructions. In the above example, the assembly instruction is push rbp.

✄ The optional fourth column is a comment, which appears when there is a
reference to an address, to show where the address originates. For
example, the comment after the # sign:

lea r12,[rip+0x2008ee] # 600e10 <__frame_dummy_init_array_entry>

informs us that the address referenced by [rip+0x2008ee] is 0x600e10,
where the variable __frame_dummy_init_array_entry resides.
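The address in that comment can be checked by hand: in RIP-relative addressing, the displacement is added to the address of the next instruction. The sketch below does this arithmetic in C. The function name `rip_relative_target` is hypothetical, and the instruction address 0x40051b and length of 7 bytes are assumptions inferred from the displacement and target, since the listing above does not show them.

```c
#include <stdint.h>

/* Compute the address referenced by [rip + disp]. While an instruction
 * executes, rip already points at the instruction that follows it. */
uint64_t rip_relative_target(uint64_t insn_addr, unsigned insn_len,
                             int32_t disp)
{
    uint64_t next_insn = insn_addr + insn_len;
    /* Sign-extend the 32-bit displacement before adding it. */
    return next_insn + (uint64_t)(int64_t)disp;
}
```

With the assumed values, `rip_relative_target(0x40051b, 7, 0x2008ee)` gives 0x400522 + 0x2008ee = 0x600e10, matching the address objdump prints in the comment.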

A disassembled section may also contain labels. A label is a name given to
the address of an assembly instruction. The label denotes the purpose of
an assembly block to a human reader, to make it easier to understand. For
example, the .text section carries many such labels to denote where the
pieces of code in a program start; the .text section below carries two
functions: _start and deregister_tm_clones. The _start function starts at
address 4003e0, which is annotated to the left of the function name. Right
below the _start label is the instruction at the same address, 4003e0.
This means that a label is simply a name for a memory address. The
function deregister_tm_clones shares the same format, as does every
function in the section.

00000000004003e0 <_start>:
4003e0: 31 ed xor ebp,ebp
4003e2: 49 89 d1 mov r9,rdx
4003e5: 5e pop rsi
...more assembly code....
0000000000400410 <deregister_tm_clones>:
400410: b8 3f 10 60 00 mov eax,0x60103f
400415: 55 push rbp
400416: 48 2d 38 10 60 00 sub rax,0x601038
...more assembly code....

4.3 Intel manuals

The best way to understand and use assembly language properly is to
understand precisely the underlying computer architecture and what each
machine instruction does. To do so, the most reliable sources are the
documents provided by vendors. After all, hardware vendors are the ones
who made their machines. To understand Intel’s instruction set, we need
the document “Intel 64 and IA-32 architectures software developer’s manual
combined volumes 2A, 2B, 2C, and 2D: Instruction set reference, A-Z”. The
document can be retrieved here:
https://software.intel.com/en-us/articles/intel-sdm.

✄ Chapter 1 provides brief information about the manual and the
notational conventions used in the book.

✄ Chapter 2 provides an in-depth explanation of the anatomy of an assembly
instruction, which we will investigate in the next section.

✄ Chapters 3 - 5 provide the details of every instruction of the x86_64
architecture.

✄ Chapter 6 provides information about safer mode extensions. We won’t
need to use this chapter.

The first volume, “Intel® 64 and IA-32 Architectures Software Developer’s
Manual Volume 1: Basic Architecture”, describes the basic architecture and
programming environment of Intel processors. In that book, Chapter 5 gives
a summary of all Intel instructions, listing them in different categories.
We only need to learn the general-purpose instructions listed in section
5.1 for our OS. Chapter 7 describes the purpose of each category.
Gradually, we will learn all of these instructions.

Exercise 4.3.1. Read section 1.3 in volume 2, excluding sections 1.3.5 and
1.3.7.

4.4 Experiment with assembly code

The subsequent sections examine the anatomy of an assembly instruction.
To understand it fully, it is necessary to write code and see the code
in its actual form, displayed as hex numbers. For this purpose, we use the
nasm assembler to write a few lines of assembly code and see the generated
code.

Example 4.4.1. Suppose we want to see the machine code generated
for this instruction:

jmp eax

Then, we use an editor, e.g. Emacs, to create a new file, write the code
and save it as, e.g., test.asm. Then, in the terminal, we run the command:

$ nasm -f bin test.asm -o test

The -f option specifies the file format, e.g. ELF, of the final output
file. But in this case, the format is bin, which means the file is just a
flat binary output without any extra information. That is, the written
assembly code is translated to machine code as is, without the overhead of
the metadata of a file format like ELF. Indeed, after assembling, we can
examine the output using this command:

$ hd test

hd (short for hexdump) is a program that displays the content of a
file in hex format. (Though its name is short for hexdump, hd can display
in bases other than hex, e.g. octal or decimal.) We get the following
output:

00000000 66 ff e0 |f..|
00000003

The file consists of only 3 bytes: 66 ff e0, which is equivalent to the
instruction jmp eax.
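To make the hd output less mysterious, here is a minimal sketch in Python of how such a dump can be produced. It is not how hd is implemented; the function name hexdump_lines and the exact spacing are assumptions made for illustration only.

```python
def hexdump_lines(data: bytes, width: int = 16) -> list[str]:
    """Format bytes roughly the way `hd` does: offset, hex bytes, ASCII."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Printable ASCII is shown as-is; everything else becomes a dot.
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part}  |{text}|")
    # Like hd, finish with the total length as a final offset line.
    lines.append(f"{len(data):08x}")
    return lines


# The three bytes nasm produced for `jmp eax` in 16-bit mode.
for line in hexdump_lines(bytes([0x66, 0xFF, 0xE0])):
    print(line)
```

The byte 0x66 happens to be the ASCII code of the letter f, which is why the right-hand column shows f followed by two dots.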

Example 4.4.2. If we were to use elf as ﬁle format:

$ nasm -f elf test.asm -o test

It would be more challenging to learn and understand assembly
instructions with all the added noise in the output from hd:

00000000 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00 |.ELF............|
00000010 01 00 03 00 01 00 00 00 00 00 00 00 00 00 00 00 |................|
00000020 40 00 00 00 00 00 00 00 34 00 00 00 00 00 28 00 |@.......4.....(.|
00000030 05 00 02 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00000060 00 00 00 00 00 00 00 00 01 00 00 00 01 00 00 00 |................|
00000070 06 00 00 00 00 00 00 00 10 01 00 00 02 00 00 00 |................|
00000080 00 00 00 00 00 00 00 00 10 00 00 00 00 00 00 00 |................|
00000090 07 00 00 00 03 00 00 00 00 00 00 00 00 00 00 00 |................|
000000a0 20 01 00 00 21 00 00 00 00 00 00 00 00 00 00 00 | ...!...........|
000000b0 01 00 00 00 00 00 00 00 11 00 00 00 02 00 00 00 |................|
000000c0 00 00 00 00 00 00 00 00 50 01 00 00 30 00 00 00 |........P...0...|
000000d0 04 00 00 00 03 00 00 00 04 00 00 00 10 00 00 00 |................|
000000e0 19 00 00 00 03 00 00 00 00 00 00 00 00 00 00 00 |................|
000000f0 80 01 00 00 0d 00 00 00 00 00 00 00 00 00 00 00 |................|
00000100 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000110 ff e0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000120 00 2e 74 65 78 74 00 2e 73 68 73 74 72 74 61 62 |..text..shstrtab|
00000130 00 2e 73 79 6d 74 61 62 00 2e 73 74 72 74 61 62 |..symtab..strtab|
00000140 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00000160 01 00 00 00 00 00 00 00 00 00 00 00 04 00 f1 ff |................|
00000170 00 00 00 00 00 00 00 00 00 00 00 00 03 00 01 00 |................|
00000180 00 74 65 73 74 2e 61 73 6d 00 00 00 00 00 00 00 |.test.asm.......|
00000190

Thus, it is better just to use ﬂat binary format in this case, to experiment
instruction by instruction.

With such a simple workflow, we are ready to investigate the structure
of every assembly instruction.

Note: Using the bin format puts nasm by default into 16-bit mode.
To enable 32-bit code to be generated, we must add this line at the
beginning of a nasm source file:

bits 32

4.5 Anatomy of an Assembly Instruction

Chapter 2 of the instruction reference manual provides an in-depth view
of the instruction format. However, there is so much information that it
can overwhelm beginners. This section provides an easier introduction
before reading the actual chapter in the manual.

Recall that an assembly instruction is simply a series of bits. The
length of an instruction varies and depends on how complicated the
instruction is. What every instruction shares is a common format, described
in figure 4.5.1, that divides the bits of an instruction into smaller
parts that encode different types of information. These parts are:

Instruction Prefixes appear at the beginning of an instruction. Prefixes
are optional. A programmer can choose to use a prefix or not because,
in practice, a so-called prefix is just another assembly instruction
inserted before another assembly instruction to which the prefix is
applicable. Instructions with 2- or 3-byte opcodes include the prefixes
by default.

Figure 4.5.1: Intel 64 and IA-32 Architectures Instruction Format

Field                 Size
Instruction Prefixes  Prefixes of 1 byte each (optional) (notes 1, 2)
Opcode                1-, 2-, or 3-byte opcode
ModR/M                1 byte (if required)
SIB                   1 byte (if required)
Displacement          Address displacement of 1, 2 or 4 bytes or none (note 3)
Immediate             Immediate data of 1, 2 or 4 bytes or none (note 3)

Byte    Bits 7-6  Bits 5-3    Bits 2-0
ModR/M  Mod       Reg/Opcode  R/M
SIB     Scale     Index       Base

1. The REX prefix is optional, but if used must be immediately before the opcode; see Section
2.2.1, “REX Prefixes” in the manual for additional information.
2. For VEX encoding information, see Section 2.3, “Intel® Advanced Vector Extensions (Intel®
AVX)” in the manual.
3. Some rare instructions can take an 8B immediate or 8B displacement.

Opcode is a unique number that identifies an instruction. Each opcode
is given a mnemonic name that is human readable, e.g. one of the
opcodes for the instruction add is 04. When a CPU sees the number 04
in its instruction cache, it sees the instruction add and executes
accordingly. An opcode can be 1, 2 or 3 bytes long and includes an additional
3-bit field in the ModR/M byte when needed.

Example 4.5.1. This instruction:

jmp [0x1234]

generates the machine code:

ff 26 34 12

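The very first byte, 0xff, is the opcode. This opcode byte is shared by
several instructions; the reg/opcode field of the ModR/M byte that follows
(here 100, written /4 in the manual) selects the jmp instruction.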

ModR/M specifies the operands of an instruction. An operand can be either a
register, a memory location or an immediate value. This component
of an instruction consists of 3 smaller parts:

✄ mod field, or modifier field, is combined with the r/m field for a total
of 5 bits of information to encode 32 possible values: 8 registers and
24 addressing modes.

✄ reg/opcode field encodes either a register operand, or extends the
Opcode field with 3 more bits.

✄ r/m field encodes either a register operand or can be combined with the
mod field to encode an addressing mode.

Tables 4.5.1 and 4.5.2 list all 256 possible values of the ModR/M byte
and how each value maps to an addressing mode and a register, in 16-bit
and 32-bit modes.

r8(/r) AL CL DL BL AH CH DH BH
r16(/r) AX CX DX BX SP BP(1) SI DI
r32(/r) EAX ECX EDX EBX ESP EBP ESI EDI
mm(/r) MM0 MM1 MM2 MM3 MM4 MM5 MM6 MM7
xmm(/r) XMM0 XMM1 XMM2 XMM3 XMM4 XMM5 XMM6 XMM7
(In decimal) /digit (Opcode) 0 1 2 3 4 5 6 7
(In binary) REG = 000 001 010 011 100 101 110 111
Eﬀective Address Mod R/M Values of ModR/M Byte (In Hexadecimal)
[BX + SI] 00 000 00 08 10 18 20 28 30 38
[BX + DI] 001 01 09 11 19 21 29 31 39
[BP + SI] 010 02 0A 12 1A 22 2A 32 3A
[BP + DI] 011 03 0B 13 1B 23 2B 33 3B
[SI] 100 04 0C 14 1C 24 2C 34 3C
[DI] 101 05 0D 15 1D 25 2D 35 3D
disp16 (2) 110 06 0E 16 1E 26 2E 36 3E
[BX] 111 07 0F 17 1F 27 2F 37 3F
[BX + SI] + disp8 (3) 01 000 40 48 50 58 60 68 70 78
[BX + DI] + disp8 001 41 49 51 59 61 69 71 79
[BP + SI] + disp8 010 42 4A 52 5A 62 6A 72 7A
[BP + DI] + disp8 011 43 4B 53 5B 63 6B 73 7B
[SI] + disp8 100 44 4C 54 5C 64 6C 74 7C
[DI] + disp8 101 45 4D 55 5D 65 6D 75 7D
[BP] + disp8 110 46 4E 56 5E 66 6E 76 7E
[BX] + disp8 111 47 4F 57 5F 67 6F 77 7F
[BX + SI] + disp16 10 000 80 88 90 98 A0 A8 B0 B8
[BX + DI] + disp16 001 81 89 91 99 A1 A9 B1 B9
[BP + SI] + disp16 010 82 8A 92 9A A2 AA B2 BA
[BP + DI] + disp16 011 83 8B 93 9B A3 AB B3 BB
[SI] + disp16 100 84 8C 94 9C A4 AC B4 BC
[DI] + disp16 101 85 8D 95 9D A5 AD B5 BD
[BP] + disp16 110 86 8E 96 9E A6 AE B6 BE
[BX] + disp16 111 87 8F 97 9F A7 AF B7 BF
EAX/AX/AL/MM0/XMM0 11 000 C0 C8 D0 D8 E0 E8 F0 F8
ECX/CX/CL/MM1/XMM1 001 C1 C9 D1 D9 E1 E9 F1 F9
EDX/DX/DL/MM2/XMM2 010 C2 CA D2 DA E2 EA F2 FA
EBX/BX/BL/MM3/XMM3 011 C3 CB D3 DB E3 EB F3 FB
ESP/SP/AH/MM4/XMM4 100 C4 CC D4 DC E4 EC F4 FC
EBP/BP/CH/MM5/XMM5 101 C5 CD D5 DD E5 ED F5 FD
ESI/SI/DH/MM6/XMM6 110 C6 CE D6 DE E6 EE F6 FE
EDI/DI/BH/MM7/XMM7 111 C7 CF D7 DF E7 EF F7 FF

1. The default segment register is SS for the eﬀective addresses containing a BP index, DS for other eﬀective
addresses.

2. The disp16 nomenclature denotes a 16-bit displacement that follows the ModR/M byte and that is added to the
index.

3. The disp8 nomenclature denotes an 8-bit displacement that follows the ModR/M byte and that is sign-extended
and added to the index.

Table 4.5.1: 16-Bit Addressing Forms with the ModR/M Byte

r8(/r) AL CL DL BL AH CH DH BH
r16(/r) AX CX DX BX SP BP SI DI
r32(/r) EAX ECX EDX EBX ESP EBP ESI EDI
mm(/r) MM0 MM1 MM2 MM3 MM4 MM5 MM6 MM7
xmm(/r) XMM0 XMM1 XMM2 XMM3 XMM4 XMM5 XMM6 XMM7
(In decimal) /digit (Opcode) 0 1 2 3 4 5 6 7
(In binary) REG = 000 001 010 011 100 101 110 111
Eﬀective Address Mod R/M Values of ModR/M Byte (In Hexadecimal)
[EAX] 00 000 00 08 10 18 20 28 30 38
[ECX] 001 01 09 11 19 21 29 31 39
[EDX] 010 02 0A 12 1A 22 2A 32 3A
[EBX] 011 03 0B 13 1B 23 2B 33 3B
[--][--] (1) 100 04 0C 14 1C 24 2C 34 3C
disp32 (2) 101 05 0D 15 1D 25 2D 35 3D
[ESI] 110 06 0E 16 1E 26 2E 36 3E
[EDI] 111 07 0F 17 1F 27 2F 37 3F
[EAX] + disp8 (3) 01 000 40 48 50 58 60 68 70 78
[ECX] + disp8 001 41 49 51 59 61 69 71 79
[EDX] + disp8 010 42 4A 52 5A 62 6A 72 7A
[EBX] + disp8 011 43 4B 53 5B 63 6B 73 7B
[--][--] + disp8 100 44 4C 54 5C 64 6C 74 7C
[EBP] + disp8 101 45 4D 55 5D 65 6D 75 7D
[ESI] + disp8 110 46 4E 56 5E 66 6E 76 7E
[EDI] + disp8 111 47 4F 57 5F 67 6F 77 7F
[EAX] + disp32 10 000 80 88 90 98 A0 A8 B0 B8
[ECX] + disp32 001 81 89 91 99 A1 A9 B1 B9
[EDX] + disp32 010 82 8A 92 9A A2 AA B2 BA
[EBX] + disp32 011 83 8B 93 9B A3 AB B3 BB
[--][--] + disp32 100 84 8C 94 9C A4 AC B4 BC
[EBP] + disp32 101 85 8D 95 9D A5 AD B5 BD
[ESI] + disp32 110 86 8E 96 9E A6 AE B6 BE
[EDI] + disp32 111 87 8F 97 9F A7 AF B7 BF
EAX/AX/AL/MM0/XMM0 11 000 C0 C8 D0 D8 E0 E8 F0 F8
ECX/CX/CL/MM1/XMM1 001 C1 C9 D1 D9 E1 E9 F1 F9
EDX/DX/DL/MM2/XMM2 010 C2 CA D2 DA E2 EA F2 FA
EBX/BX/BL/MM3/XMM3 011 C3 CB D3 DB E3 EB F3 FB
ESP/SP/AH/MM4/XMM4 100 C4 CC D4 DC E4 EC F4 FC
EBP/BP/CH/MM5/XMM5 101 C5 CD D5 DD E5 ED F5 FD
ESI/SI/DH/MM6/XMM6 110 C6 CE D6 DE E6 EE F6 FE
EDI/DI/BH/MM7/XMM7 111 C7 CF D7 DF E7 EF F7 FF

1. The [--][--] nomenclature means a SIB follows the ModR/M byte.

2. The disp32 nomenclature denotes a 32-bit displacement that follows the ModR/M byte (or the SIB byte if one is
present) and that is added to the index.

3. The disp8 nomenclature denotes an 8-bit displacement that follows the ModR/M byte (or the SIB byte if one is
present) and that is sign-extended and added to the index.

Table 4.5.2: 32-Bit Addressing Forms with the ModR/M Byte

How to read the table:

In an instruction, the byte next to the opcode is the ModR/M byte. Look up
the byte value in the table to get the corresponding operands from its row
and column.

Example 4.5.2. An instruction uses this addressing mode:

jmp [0x1234]

Then, the machine code is:

ff 26 34 12

0xff is the opcode. Next to it, 0x26 is the ModR/M byte. Looking it up in
the 16-bit table (remember, using the bin format generates 16-bit code by
default), the first operand is in the row, equivalent to a disp16,
which means a 16-bit offset. Since the instruction does not have a
second operand, the column can be ignored.

Example 4.5.3. An instruction uses this addressing mode:

add eax, ecx

Then the machine code is:

66 01 c8

The interesting feature of this instruction is that 0x66 is not the
opcode; 0x01 is. So then, what is 0x66? Recall that every assembly
instruction can have optional instruction prefixes, and that is what 0x66
is. According to the Intel manual, vol 1:

The operand-size override prefix allows a program to switch between 16-
and 32-bit operand sizes. Either size can be the default; use of the
prefix selects the non-default size.

If the CPU is in 32-bit mode, an instruction with the 0x66 prefix has its
operands limited to 16-bit width. On the other hand, if the CPU is in a
16-bit environment, 32-bit is the non-default size, so the operands of an
instruction with the prefix are temporarily widened to 32 bits, while
instructions without the prefix use 16-bit operands.

Next to it, c8 is the ModR/M byte. Looking up the value c8 in the 16-bit
table (remember, using the bin format generates 16-bit code by default),
the row tells us that the first operand is ax and the column tells us that
the second operand is cx; with the 0x66 prefix, these become eax and ecx.
The column can't be ignored, as the instruction has a second operand.

Why is the first operand in the row and the second in a column? Let's
break down the ModR/M byte, with an example value c8, into bits:

mod   reg/opcode   r/m
1 1   0 0 1        0 0 0

The mod field divides addressing modes into 4 different categories.
Combined with the r/m field, it selects exactly one of the 32 rows: 24
memory addressing modes and 8 registers. If an instruction only requires
one operand, then the column can be ignored. The reg/opcode field finally
provides an extra register operand or selects a different variant of the
instruction, if the instruction requires one.
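The bit breakdown above can be reproduced with shifts and masks. The sketch below is written in Python for brevity; the function name split_modrm and the REG16 list are our own illustrative names, not something taken from the Intel manual.

```python
def split_modrm(byte: int) -> tuple[int, int, int]:
    """Split a ModR/M byte into its (mod, reg/opcode, r/m) fields."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("a ModR/M value must fit in one byte")
    mod = (byte >> 6) & 0b11      # bits 7-6
    reg = (byte >> 3) & 0b111     # bits 5-3
    rm = byte & 0b111             # bits 2-0
    return mod, reg, rm


# 16-bit register names indexed by their 3-bit encoding (header row of the tables).
REG16 = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]

mod, reg, rm = split_modrm(0xC8)          # ModR/M byte of `add eax, ecx`
print(f"mod={mod:02b} reg={reg:03b} r/m={rm:03b}")
if mod == 0b11:
    # mod = 11 means the r/m field names a register, not a memory location.
    print("row (r/m):", REG16[rm], " column (reg):", REG16[reg])

print(split_modrm(0x26))                  # ModR/M byte of `jmp [0x1234]`
```

For 0x26 the fields come out as mod 00, reg/opcode 100 and r/m 110, which is the disp16 row and the /4 column of the 16-bit table.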

SIB is the Scale-Index-Base byte. This byte encodes ways to calculate the
memory position of an element of an array. The name SIB is
based on this formula for calculating an effective address:

Effective address = scale ∗ index + base

✄ Index is an offset into an array.

✄ Scale is a factor of Index. Scale is one of the values 1, 2, 4 or 8;
any other value is invalid. To scale with values other than 2, 4 or
8, the scale factor must be set to 1, and the offset must be calculated
manually. For example, suppose we want to get the address of the nth
element in an array and each element is 12 bytes long. Because each
element is 12 bytes long instead of 1, 2, 4 or 8, Scale is set to 1 and
a compiler needs to calculate the offset:

Effective address = 1 ∗ (12 ∗ n) + base

Why do we bother with SIB when we can manually calculate the
offset? The answer is that in the above scenario, an additional mul
instruction must be executed to get the offset, and the mul instruction
consumes more than 1 byte, while the SIB only consumes 1
byte. More importantly, if the element is accessed repeatedly
in a loop, e.g. millions of times, then the extra mul instruction
can hurt performance, as the CPU must spend time
executing millions of these additional mul instructions.

The values 2, 4 and 8 are not randomly chosen. They map to 16-bit
(or 2 bytes), 32-bit (or 4 bytes) and 64-bit (or 8 bytes) numbers that
are often used for intensive numeric calculations.

✄ Base is the starting address.
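The formula and the restriction on Scale can be tried out with a few lines of Python. This is only a sketch of the arithmetic; the function name effective_address and the array address 0x1000 are hypothetical values chosen for illustration.

```python
VALID_SCALES = (1, 2, 4, 8)


def effective_address(base: int, index: int, scale: int = 1) -> int:
    """Compute scale * index + base, rejecting scales a SIB byte cannot encode."""
    if scale not in VALID_SCALES:
        raise ValueError(f"scale must be one of {VALID_SCALES}, got {scale}")
    return scale * index + base


array_start = 0x1000   # hypothetical base address of an array

# 4-byte elements: the hardware scaling does the multiplication.
print(hex(effective_address(array_start, index=3, scale=4)))       # 0x100c

# 12-byte elements: 12 is not a valid scale, so the offset 12 * n is
# computed first and passed as the index with scale 1.
n = 3
print(hex(effective_address(array_start, index=12 * n, scale=1)))  # 0x1024
```

The second call corresponds to the 12-byte case described under Scale: the multiplication by 12 happens outside the addressing formula.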

Table 4.5.3 lists all 256 values of the SIB byte, with a lookup
rule similar to that of the ModR/M tables.

Example 4.5.4. This instruction:

jmp [eax*2 + ebx]

generates the following code:

00000000 67 ff 24 43

First of all, the first byte, 0x67, is not an opcode but a prefix: it is
the predefined value of the address-size override prefix. After the
prefix come the opcode 0xff and the ModR/M byte 0x24. The value of the
ModR/M byte indicates that a SIB byte follows. The SIB byte
is 0x43.

Looking it up in the SIB table, the row tells us that eax is scaled by 2,
and the column tells us that the base to be added is in ebx.
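The same lookup can be done by extracting the three bit fields of the SIB byte. The following Python sketch uses our own names (decode_sib, REG32) and deliberately simplifies two special cases, as noted in the docstring.

```python
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_sib(byte: int) -> tuple[int, str, str]:
    """Return (scale factor, index register, base register) for a SIB byte.

    Simplification: index 100 is reported as "none", and base 101 is always
    reported as "ebp" although with mod = 00 it means "disp32, no base".
    """
    scale = 1 << ((byte >> 6) & 0b11)            # bits 7-6: 00,01,10,11 -> 1,2,4,8
    index_bits = (byte >> 3) & 0b111             # bits 5-3
    base_bits = byte & 0b111                     # bits 2-0
    index = "none" if index_bits == 0b100 else REG32[index_bits]
    return scale, index, REG32[base_bits]


print(decode_sib(0x43))   # SIB byte of `jmp [eax*2 + ebx]`
```

For 0x43 the result is a scale of 2 with index eax and base ebx, matching the row and column found in the SIB table.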

Displacement is the offset from the start of the base index.

Example 4.5.5. This instruction:

jmp [0x1234]

generates the machine code:

ff 26 34 12

0x1234, which is generated as 34 12 in raw machine code, is the
displacement and stands right next to 0x26, which is the ModR/M byte.

r32(/r) EAX ECX EDX EBX ESP EBP ESI EDI
(In decimal) /digit (Opcode) 0 1 2 3 4 5 6 7
(In binary) REG = 000 001 010 011 100 101 110 111
Effective Address SS R/M Values of SIB Byte (In Hexadecimal)
[EAX] 00 000 00 01 02 03 04 05 06 07
[ECX] 001 08 09 0A 0B 0C 0D 0E 0F
[EDX] 010 10 11 12 13 14 15 16 17
[EBX] 011 18 19 1A 1B 1C 1D 1E 1F
none 100 20 21 22 23 24 25 26 27
[EBP] 101 28 29 2A 2B 2C 2D 2E 2F
[ESI] 110 30 31 32 33 34 35 36 37
[EDI] 111 38 39 3A 3B 3C 3D 3E 3F
[EAX*2] 01 000 40 41 42 43 44 45 46 47
[ECX*2] 001 48 49 4A 4B 4C 4D 4E 4F
[EDX*2] 010 50 51 52 53 54 55 56 57
[EBX*2] 011 58 59 5A 5B 5C 5D 5E 5F
none 100 60 61 62 63 64 65 66 67
[EBP*2] 101 68 69 6A 6B 6C 6D 6E 6F
[ESI*2] 110 70 71 72 73 74 75 76 77
[EDI*2] 111 78 79 7A 7B 7C 7D 7E 7F
[EAX*4] 10 000 80 81 82 83 84 85 86 87
[ECX*4] 001 88 89 8A 8B 8C 8D 8E 8F
[EDX*4] 010 90 91 92 93 94 95 96 97
[EBX*4] 011 98 99 9A 9B 9C 9D 9E 9F
none 100 A0 A1 A2 A3 A4 A5 A6 A7
[EBP*4] 101 A8 A9 AA AB AC AD AE AF
[ESI*4] 110 B0 B1 B2 B3 B4 B5 B6 B7
[EDI*4] 111 B8 B9 BA BB BC BD BE BF
[EAX*8] 11 000 C0 C1 C2 C3 C4 C5 C6 C7
[ECX*8] 001 C8 C9 CA CB CC CD CE CF
[EDX*8] 010 D0 D1 D2 D3 D4 D5 D6 D7
[EBX*8] 011 D8 D9 DA DB DC DD DE DF
none 100 E0 E1 E2 E3 E4 E5 E6 E7
[EBP*8] 101 E8 E9 EA EB EC ED EE EF
[ESI*8] 110 F0 F1 F2 F3 F4 F5 F6 F7
[EDI*8] 111 F8 F9 FA FB FC FD FE FF

1. The EBP column (bit pattern 101) is written as [*] in the manual. The [*] nomenclature means a disp32
with no base if the MOD is 00B. Otherwise, [*] means disp8 or disp32 + [EBP]. This provides the following
address modes:

MOD bits  Effective Address
00        [scaled index] + disp32
01        [scaled index] + disp8 + [EBP]
10        [scaled index] + disp32 + [EBP]

Table 4.5.3: 32-Bit Addressing Forms with the SIB Byte

Example 4.5.6. This instruction:

jmp [eax * 4 + 0x1234]

generates the machine code:

67 ff 24 85 34 12 00 00

✄ 0x67 is an address-size override prefix. Its meaning is that if an
instruction runs with a default address size, e.g. 16-bit, the use of the
prefix enables the instruction to use a non-default address size, e.g.
32-bit. Since the binary is supposed to be 16-bit, 0x67 changes
the instruction to use 32-bit addressing.

✄ 0xff is the opcode.

✄ 0x24 is the ModR/M byte. According to table 4.5.2, the value indicates
that a SIB byte follows.

✄ 0x85 is the SIB byte. According to table 4.5.3, the byte 0x85 can
be broken down into bits as follows:

SS    R/M     REG
1 0   0 0 0   1 0 1

The above values are obtained through the columns SS, R/M (the index)
and finally one of the 8 columns of REG (the base) respectively. The bits
combine into the value 10000101, which is 0x85 in hex. When no base
register is specified, the base bits are set to 101, and thus the 6th
column is chosen; as note 1 under table 4.5.3 explains, with MOD bits 00
this pattern means a 32-bit displacement with no base register. If the
example uses another register:

Example 4.5.7. For example:

jmp [eax * 4 + esi]

the SIB byte becomes 0x86 instead of 0x85, which is in the 7th column.
Try to verify with table 4.5.3 again.

✄ 34 12 00 00 is the displacement. As can be seen, the displacement
is 4 bytes in size, which is equivalent to 32-bit, due to the address-size
override prefix.
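The byte-by-byte reading of Example 4.5.6 can be checked with a short Python sketch. The field positions are hard-coded for this one instruction, and the helper name read_little_endian is our own; the point is only to show that the displacement bytes 34 12 00 00 are stored least significant byte first.

```python
def read_little_endian(raw: bytes) -> int:
    """Interpret raw bytes as an unsigned little-endian number."""
    return int.from_bytes(raw, byteorder="little")


code = bytes.fromhex("67 ff 24 85 34 12 00 00")   # jmp [eax * 4 + 0x1234]

# Field positions are hard-coded for this one instruction; a real decoder
# would derive them from the prefix, opcode and ModR/M values.
prefix, opcode, modrm, sib = code[0], code[1], code[2], code[3]
displacement = read_little_endian(code[4:8])

print(f"prefix={prefix:#04x} opcode={opcode:#04x} modrm={modrm:#04x} sib={sib:#04x}")
print(f"displacement={displacement:#x}")
```

Reading the four trailing bytes in little-endian order gives back 0x1234, the number written in the assembly source.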

Immediate When an instruction accepts a fixed value, e.g. 0x1234, as
an operand, this optional field holds the value. Note that this field is
different from displacement: the value is not necessarily used as an
offset; it can be an arbitrary value of any kind.

Example 4.5.8. This instruction:

mov eax, 0x1234

generates the code:

66 b8 34 12 00 00

✄ 0x66 is the operand-size override prefix. Similar to the address-size
override prefix, this prefix enables the operand size to be non-default.

✄ 0xb8 is one of the opcodes for the mov instruction.

✄ 0x1234, encoded as 34 12 00 00, is the value to be stored in register
eax. It is just a value for storing directly into a register, and nothing
more. On the other hand, a displacement value is an offset for some address
calculation.
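As a rough illustration of how the prefix, opcode and immediate fit together, the Python sketch below decodes only this family of mov instructions. It relies on a detail from the instruction reference that the text above does not spell out: the opcodes B8 to BF encode the destination register in their low 3 bits. The names decode_mov_imm, REG16 and REG32 are hypothetical.

```python
REG16 = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]
OPERAND_SIZE_PREFIX = 0x66


def decode_mov_imm(code: bytes, default_bits: int = 16) -> tuple[str, int]:
    """Decode `mov reg, imm` for the opcodes B8..BF (B8 + register number)."""
    position = 0
    bits = default_bits
    if code[position] == OPERAND_SIZE_PREFIX:
        # The prefix selects the non-default operand size.
        bits = 32 if default_bits == 16 else 16
        position += 1
    opcode = code[position]
    if not 0xB8 <= opcode <= 0xBF:
        raise ValueError("only the B8..BF forms of mov are handled here")
    register = (REG32 if bits == 32 else REG16)[opcode - 0xB8]
    size = bits // 8
    immediate = int.from_bytes(code[position + 1:position + 1 + size], "little")
    return register, immediate


print(decode_mov_imm(bytes.fromhex("66 b8 34 12 00 00")))   # mov eax, 0x1234
```

Without the 0x66 prefix, the same opcode in 16-bit code would take a 2-byte immediate and target ax instead of eax.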

Exercise 4.5.1. Read section 2.1 in Volume 2 for even more details.

Exercise 4.5.2. Skim through section 5.1 in volume 1. Read chapter
7 in volume 1. If there are terms that you don't understand, e.g.
segmentation, don't worry, as the terms will be explained in later
chapters or can be ignored.

4.6 Understand an instruction in detail

In the instruction reference manual (Volume 2), from chapter 3 onward,
every x86 instruction is documented in detail. Whenever the precise
behavior of an instruction is needed, we always consult this document first.
However, before using the document, we must know its writing
conventions. Every instruction has the following common structure for
organizing information:

Opcode table lists all possible opcodes of an assembly instruction.
Each table contains the following fields, and can have one or more rows:

Opcode   Instruction   Op/En   64/32-bit Mode   CPUID Feature Flag   Description

Opcode shows a unique hexadecimal number assigned to an instruction.
There can be more than one opcode for an instruction, each
encoding a variant of the instruction. For example, one variant
requires one operand, but another requires two. In this column, there
can be other notations aside from hexadecimal numbers. For example,
/r indicates that the ModR/M byte of the instruction contains
a reg operand and an r/m operand. The detailed listing is in sections
3.1.1.1 and 3.1.1.2 of Intel's manual, volume 2.

Instruction gives the syntax of the assembly instruction that a
programmer can use for writing code. Aside from the mnemonic
representation of the opcode, e.g. jmp, other symbols represent operands
with specific properties in the instruction. For example, rel8
represents a relative address from 128 bytes before the end of the
instruction to 127 bytes after the end of the instruction; similarly,
rel16/rel32 also represent relative addresses, but with an operand size of
16/32 bits instead of 8 bits like rel8. For a detailed listing, please
refer to section 3.1.1.3 of volume 2.

Op/En is short for Operand/Encoding. An operand encoding specifies
how a ModR/M byte encodes the operands that an instruction
requires. If a variant of an instruction requires operands, then an
additional table named “Instruction Operand Encoding” is added
for explaining the operand encoding, with the following structure:

Op/En   Operand 1   Operand 2   Operand 3   Operand 4

Most instructions require one to two operands. We make use of these
instructions for our OS and skip the instructions that require three
or four operands. The operands can be readable or writable or both.
The symbol (r) denotes a readable operand, and (w) denotes a writable
operand. For example, when the Operand 1 field contains ModRM:r/m
(r), it means the first operand is encoded in the r/m field of the ModR/M
byte, and is only readable.

64/32-bit mode indicates whether the opcode sequence is supported
in 64-bit mode and possibly 32-bit mode.

CPUID Feature Flag indicates that a particular CPU feature must be
available to enable the instruction. An instruction is invalid if
a CPU does not support the required feature. (In Linux, the command
cat /proc/cpuinfo lists the information of the available CPUs and their
features in the flags field.)

Compat/Leg Mode Many instructions do not have the CPUID Feature Flag
field; it is replaced with Compat/Leg Mode, which stands for Compatibility
or Legacy Mode. This field indicates whether the opcode is supported
when the CPU runs in compatibility mode or in legacy 16-bit or 32-bit
mode.

Table 4.6.1: Notations in Compat/Leg Mode

Notation   Description
Valid      Supported
I          Not supported
N.E.       The 64-bit opcode cannot be encoded as it overlaps with
           existing 32-bit opcode.

Description briefly explains the variant of an instruction in the
current row.

Description specifies the purpose of the instructions and how an
instruction works in detail.

Operation is pseudo-code that implements an instruction. If a description
is vague, this section is the next best source to understand an
assembly instruction. The syntax is described in section 3.1.1.9 in
volume 2.

Flags affected lists the possible changes to system flags in the EFLAGS
register.

Exceptions lists the possible errors that can occur when an instruction
cannot run correctly. This section is valuable for OS debugging. Exceptions
fall into one of the following categories:

✄ Protected Mode Exceptions

✄ Real-Address Mode Exception

✄ Virtual-8086 Mode Exception

✄ Floating-Point Exception

✄ SIMD Floating-Point Exception

✄ Compatibility Mode Exception

✄ 64-bit Mode Exception

For our OS, we only use Protected Mode Exceptions and Real-Address Mode
Exceptions. The details are in sections 3.1.1.13 and 3.1.1.14, volume 2.

4.7 Example: jmp instruction

Let’s look at our good old jmp instruction. First, the opcode table:

Opcode         Instruction    Op/En  64-bit Mode  Compat/Leg Mode  Description
EB cb          JMP rel8       D      Valid        Valid            Jump short, RIP = RIP + 8-bit displacement sign extended to 64-bits
E9 cw          JMP rel16      D      N.S.         Valid            Jump near, relative, displacement relative to next instruction. Not supported in 64-bit mode.
E9 cd          JMP rel32      D      Valid        Valid            Jump near, relative, RIP = RIP + 32-bit displacement sign extended to 64-bits
FF /4          JMP r/m16      M      N.S.         Valid            Jump near, absolute indirect, address = zero-extended r/m16. Not supported in 64-bit mode
FF /4          JMP r/m32      M      N.S.         Valid            Jump near, absolute indirect, address given in r/m32. Not supported in 64-bit mode
FF /4          JMP r/m64      M      Valid        N.E.             Jump near, absolute indirect, RIP = 64-Bit offset from register or memory
EA cd          JMP ptr16:16   D      Inv.         Valid            Jump far, absolute, address given in operand
EA cp          JMP ptr16:32   D      Inv.         Valid            Jump far, absolute, address given in operand
FF /5          JMP m16:16     D      Valid        Valid            Jump far, absolute indirect, address given in m16:16
FF /5          JMP m16:32     D      Valid        Valid            Jump far, absolute indirect, address given in m16:32
REX.W + FF /5  JMP m16:64     D      Valid        N.E.             Jump far, absolute indirect, address given in m16:64

Table 4.7.1: jmp opcode table

Each row lists a variant of the jmp instruction. The first row has the
opcode EB cb, with an equivalent symbolic form jmp rel8. Here, rel8
means an offset of -128 to +127 bytes, counting from the end of the
instruction. The end of an instruction is the next byte after the last
byte of the instruction. To make it more concrete, consider this assembly
code:

main:
jmp main
jmp main2
jmp main
main2:
jmp 0x1234

generates the machine code:

Table 4.7.2: Memory address of each opcode

          main              main2
          ↓                 ↓
Address   00 01 02 03 04 05 06 07 08 09
Opcode    eb fe eb 02 eb fa e9 2b 12 00

The first jmp main instruction is generated into eb fe and occupies
the addresses 00 and 01; the end of the first jmp main is at address 02,
past the last byte of the first jmp main, which is located at the address
01. The value fe is equivalent to -2, since the eb opcode uses only a byte
(8 bits) for relative addressing. The offset is -2, and the end address of
the first jmp main is 02; adding them together, we get 00, which is the
destination address for jumping to.

Similarly, the jmp main2 instruction is generated into eb 02, which
means the offset is +2; the end address of jmp main2 is at 04, and
adding it to the offset, we get the destination address 06,
which is the start of the instruction marked by the label main2.
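The rule "destination = end of the instruction + signed offset" can be verified for all three short jumps with a small Python sketch. The function names sign_extend_8 and short_jump_target are our own, and the sketch handles only the 2-byte EB cb form.

```python
def sign_extend_8(value: int) -> int:
    """Interpret an unsigned byte (0..255) as a signed 8-bit number."""
    return value - 0x100 if value & 0x80 else value


def short_jump_target(address: int, code: bytes) -> int:
    """Destination of a 2-byte `jmp rel8` (opcode EB) located at `address`."""
    if len(code) != 2 or code[0] != 0xEB:
        raise ValueError("expected the two bytes of an EB cb instruction")
    end_of_instruction = address + len(code)
    return end_of_instruction + sign_extend_8(code[1])


program = bytes.fromhex("eb fe eb 02 eb fa")   # the three short jumps above
for address in range(0, len(program), 2):
    target = short_jump_target(address, program[address:address + 2])
    print(f"jmp at {address:02x} -> {target:02x}")
```

The third jump, eb fa at address 04, ends at address 06 and has an offset of -6, so it also lands on main at address 00.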

The same rule can be applied to rel16 and rel32 encoding. In the
example code, jmp 0x1234 uses rel16 (which means a 2-byte offset) and
is generated into e9 2b 12. As table 4.7.1 shows, the e9 opcode takes a
cw operand, which is a 2-byte offset (section 3.1.1.1, volume 2). Notice
one strange issue here: the offset value is 2b 12, while it is supposed to
be 34 12. There is nothing wrong. Remember, rel8/rel16/rel32 is an
offset, not an address. An offset is a distance from a point. Since no label
is given but a number, the number is treated as an address counted from
the start of the program. In this case, the start of the program is the
address 00 and the end of jmp 0x1234 is the address 09 (which means 9
bytes were consumed, starting from address 0), so the offset is calculated
as 0x1234 - 0x9 = 0x122b. That solves the mystery!
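Going in the other direction, from a target address to the encoded bytes, is the calculation an assembler has to perform. The sketch below mimics it in Python for the E9 cw form only; encode_jmp_rel16 is a hypothetical helper, not a description of how nasm is implemented.

```python
def encode_jmp_rel16(address: int, target: int) -> bytes:
    """Encode `jmp target` as E9 cw for an instruction placed at `address`."""
    length = 3                                   # 1 opcode byte + 2 offset bytes
    offset = target - (address + length)
    if not -0x8000 <= offset <= 0x7FFF:
        raise ValueError("target is out of range for a 16-bit offset")
    return bytes([0xE9]) + offset.to_bytes(2, "little", signed=True)


# jmp 0x1234 placed at address 06, as in the example program.
print(encode_jmp_rel16(address=0x06, target=0x1234).hex(" "))
```

With the instruction at address 06, the end address is 09, so the function arrives at the same bytes as the listing: e9 2b 12.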

The jmp instructions with opcode FF /4 enable jumping to a near,
absolute address stored in a general-purpose register or a memory
location; or in short, as written in the description, absolute indirect.
The symbol /4 refers to the column with digit 4 in table 4.5.1, i.e. the
column with the following fields: AH, SP, ESP, MM4, XMM4, 4, 100. For
example:

jmp [0x1234]

is generated into:

ff 26 34 12

Since this is 16-bit code, we use table 4.5.1. Looking up the table,
ModR/M value 26 means disp16, which means a 16-bit offset from the
start of the current index (look at the note under the table), which is
the base address stored in the DS register. In this case, jmp [0x1234] is
implicitly understood as jmp [ds:0x1234], which means the destination
address is read from the memory location 0x1234 bytes away from the start
of the data segment.

The jmp instruction with opcode FF /5 enables jumping to a far,
absolute address stored in a memory location (unlike /4, the address cannot
be held in a register); in short, a far pointer. To generate such an
instruction, the keyword far is needed to tell nasm we are using a far
pointer:

jmp far [eax]

is generated into:

67 ff 28

Since 28 is the value in the column with digit 5 of table 4.5.2 that
refers to [eax], we successfully generate an instruction for a far jump.
(Remember, the prefix 67 indicates that the instruction uses 32-bit
addressing. The prefix is only added if the assembler assumes a 16-bit
default environment when generating code.) After the CPU runs the
instruction, the program counter eip and code segment register cs are set
to the far address stored in the memory location that eax points to, and
the CPU starts fetching code from the new address in cs and eip. To make
it more concrete, here is an example:

Address  00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
1000     34 12 00 00 78 56

eax = 0x00001000
cs  = 0x00005678
eip = 0x00001234

Figure 4.7.1: far jmp example, with the destination memory stored
at address 0x1000, which is stored in eax to be dereferenced. After the
CPU executes the instruction, code segment register cs and instruction
pointer eip hold the values shown.

The far address consumes a total of 6 bytes for a 16-bit segment
and a 32-bit address, which is encoded as m16:32 in table 4.7.1. As
can be seen from the figure above, the last two bytes, 78 56, are the
segment address, loaded into the cs register with the value 0x5678; the
first four bytes, 34 12 00 00, are the memory address within that segment,
loaded into the eip register with the value 0x1234, and the CPU starts
executing from there.

Finally, the jmp instructions with the EA opcode jump to a direct
absolute address. For example, the instruction:

jmp 0x5678:0x1234

is generated into:

ea 34 12 78 56

The address 0x5678:0x1234 is right next to the opcode, unlike the FF /5
instruction that needs an indirect address in the eax register.
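Both far jump forms store the far pointer the same way: the offset first, then the 2-byte segment, each in little-endian order. The Python sketch below splits such a byte sequence; parse_far_pointer is a name made up for this illustration.

```python
def parse_far_pointer(raw: bytes, offset_size: int) -> tuple[int, int]:
    """Split a far pointer into (segment, offset).

    The offset comes first in memory (2 bytes for 16:16, 4 bytes for 16:32),
    followed by the 2-byte segment; both are little-endian.
    """
    if len(raw) != offset_size + 2:
        raise ValueError("a far pointer is the offset plus a 2-byte segment")
    offset = int.from_bytes(raw[:offset_size], "little")
    segment = int.from_bytes(raw[offset_size:], "little")
    return segment, offset


# m16:32 operand stored at the address held in eax (the far jmp example).
cs, eip = parse_far_pointer(bytes.fromhex("34 12 00 00 78 56"), offset_size=4)
print(f"cs={cs:#06x} eip={eip:#010x}")

# ptr16:16 operand that follows the EA opcode in `jmp 0x5678:0x1234`.
cs, ip = parse_far_pointer(bytes.fromhex("34 12 78 56"), offset_size=2)
print(f"cs={cs:#06x} ip={ip:#06x}")
```

In both cases the segment comes out as 0x5678 and the offset as 0x1234; only the width of the offset differs.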

We skip the jump instruction with the REX prefix, as it is a 64-bit
instruction.

4.8 Examine compiled data

In this section, we will examine how data definitions in C map to their
assembly form. The generated code is extracted from the .data and .bss
sections. That means the assembly code displayed has no meaning, aside from
showing that such a value has an equivalent assembly opcode that represents
an instruction. (Actually, code is just a type of data, and is often used
for hijacking into a running program to execute such code. However, we have
no use for it in this book.)

The code-assembly listing is not random, but is based on Chapter 4 of
Volume 1, “Data Types”. The chapter lists the fundamental data types that
x86 hardware operates on. By studying the generated assembly code, we can
see how closely C maps its syntax to hardware, and why C is appropriate for
OS programming.
The specific objdump command used in this section will be:

$ objdump -z -M intel -S -D -j .data -j .bss <object file> | less

Note: by default, runs of zero bytes are hidden and shown as three dots: ...
To show all the zero bytes, we add the -z option.

4.8.1 Fundamental data types

The most basic types that x86 architecture works with are based on sizes,
each is twice as large as the previous one: 1 byte (8 bits), 2 bytes (16 bits),
4 bytes (32 bits), 8 bytes (64 bits) and 16 bytes (128 bits).

Figure 4.8.1: Fundamental Data Types. Unsigned and signed integers of byte
(bits 7-0), word (bits 15-0), doubleword (bits 31-0) and quadword (bits
63-0) size; in the signed forms, the most significant bit is the sign.

These types are the simplest: they are just chunks of memory of different
sizes that enable the CPU to access memory efficiently. From the manual,
section 4.1.1, volume 1:

Words, doublewords, and quadwords do not need to be aligned in memory on
natural boundaries. The natural boundaries for words, double words, and
quadwords are even-numbered addresses, addresses evenly divisible by four,
and addresses evenly divisible by eight, respectively. However, to improve
the performance of programs, data structures (especially stacks) should be
aligned on natural boundaries whenever possible. The reason for this is
that the processor requires two memory accesses to make an unaligned memory
access; aligned accesses require only one memory access. A word or
doubleword operand that crosses a 4-byte boundary or a quadword operand
that crosses an 8-byte boundary is considered unaligned and requires two
separate memory bus cycles for access.

Some instructions that operate on double quadwords require memory operands
to be aligned on a natural boundary. These instructions generate a
general-protection exception (#GP) if an unaligned operand is specified. A
natural boundary for a double quadword is any address evenly divisible by
16. Other instructions that operate on double quadwords permit unaligned
access (without generating a general-protection exception). However,
additional memory bus cycles are required to access unaligned data from
memory.

In C, the following primitive types (which require stdint.h) map to the
fundamental types:
Source
#include <stdint.h>

uint8_t byte = 0x12;

uint16_t word = 0x1234;
uint32_t dword = 0x12345678;
uint64_t qword = 0x123456789abcdef;
unsigned __int128 dqword1 = (__int128) 0x123456789abcdef;
unsigned __int128 dqword2 = (__int128) 0x123456789abcdef << 64;

int main(int argc, char *argv[]) {

return 0;
}

Assembly 0804a018 <byte>:

804a018: 12 00 adc al,BYTE PTR [eax]
0804a01a <word>:
804a01a: 34 12 xor al,0x12
0804a01c <dword>:
804a01c: 78 56 js 804a074 <_end+0x48>
804a01e: 34 12 xor al,0x12
0804a020 <qword>:
804a020: ef out dx,eax
804a021: cd ab int 0xab
804a023: 89 67 45 mov DWORD PTR [edi+0x45],esp
804a026: 23 01 and eax,DWORD PTR [ecx]
0000000000601040 <dqword1>:
601040: ef out dx,eax
601041: cd ab int 0xab
601043: 89 67 45 mov DWORD PTR [rdi+0x45],esp
601046: 23 01 and eax,DWORD PTR [rcx]

601048: 00 00 add BYTE PTR [rax],al

60104a: 00 00 add BYTE PTR [rax],al
60104c: 00 00 add BYTE PTR [rax],al
60104e: 00 00 add BYTE PTR [rax],al
0000000000601050 <dqword2>:
601050: 00 00 add BYTE PTR [rax],al
601052: 00 00 add BYTE PTR [rax],al
601054: 00 00 add BYTE PTR [rax],al
601056: 00 00 add BYTE PTR [rax],al
601058: ef out dx,eax
601059: cd ab int 0xab
60105b: 89 67 45 mov DWORD PTR [rdi+0x45],esp
60105e: 23 01 and eax,DWORD PTR [rcx]

gcc generates the variables byte, word, dword, qword, dqword1 and dqword2,
written earlier, with their respective values highlighted in the same
colors; variables of the same type are also highlighted in the same color.
Since this is a data section, the assembly listing carries no meaning. When
byte is declared with uint8_t, gcc guarantees that the size of byte is
always 1 byte. But an alert reader might notice the 00 value next to the 12
value in the byte variable. This is normal, as gcc avoids memory
misalignment by adding extra padding bytes. To make it easier to see, we
look at the readelf output of the .data section:

$ readelf -x .data hello

the output is (the colors mark which values belong to which variables):

Hex dump of section ’.data’:

0x00601020 00000000 00000000 00000000 00000000 ................
0x00601030 12003412 78563412 efcdab89 67452301 ..4.xV4.....gE#.
0x00601040 efcdab89 67452301 00000000 00000000 ....gE#.........
0x00601050 00000000 00000000 efcdab89 67452301 ............gE#.

As can be seen in the readelf output, variables are allocated storage space
according to their types and in the order declared by the programmer (the
colors correspond to the variables). Intel is a little-endian machine,
which means that within a multi-byte value, lower addresses hold the less
significant bytes and higher addresses hold the more significant bytes. For
example, 0x1234 is displayed as 34 12; that is, 34 appears first at address
0x601032, then 12 at 0x601033. The hexadecimal digits within a byte keep
their order, so we see 34 12 instead of 43 21. This is quite confusing at
first, but you will get used to it soon.
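The byte order can also be observed from inside a C program. The following is a minimal sketch, not part of the original listings; the function names bytes_in_memory and is_little_endian are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copy the in-memory bytes of a 32-bit value into out[0..3],
 * lowest address first. */
void bytes_in_memory(uint32_t value, uint8_t out[4])
{
    memcpy(out, &value, sizeof value);
}

/* Returns 1 on a little-endian machine such as x86, 0 otherwise. */
int is_little_endian(void)
{
    uint8_t b[4];
    bytes_in_memory(0x12345678, b);
    return b[0] == 0x78;
}

void endianness_example(void)
{
    uint8_t b[4];
    bytes_in_memory(0x12345678, b);
    if (is_little_endian()) {
        /* Least significant byte at the lowest address: 78 56 34 12. */
        assert(b[0] == 0x78 && b[1] == 0x56);
        assert(b[2] == 0x34 && b[3] == 0x12);
    }
}
```

On x86 the check inside endianness_example() runs and passes, which matches the 78 56 34 12 sequence seen in the readelf dump for dword.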

Also, isn’t it redundant to add int8_t when the char type is already 1
byte? The truth is, a char is always 1 byte in C terms, but a C byte is not
guaranteed to be 8 bits. In C, a byte is defined to be the size of a char,
and a char is defined to be the smallest addressable unit of the underlying
hardware platform, with a minimum of 8 bits. There are hardware devices
where the smallest addressable unit is 16 bits or even bigger, which means
a char is 16 bits wide and a “byte” on such platforms is actually 2 units
of 8-bit bytes. int8_t, in contrast, is exactly 8 bits wide wherever it is
provided.

Not all architectures support the double quadword type. Still, gcc does
provide support for 128-bit numbers and generates code for them when the
target supports it (that is, the target must be 64-bit). By specifying a
variable of type __int128 or unsigned __int128, we get a 128-bit variable.
If the target does not support 64-bit mode, gcc reports an error.

The data types in C that represent the fundamental data types are also
called unsigned numbers. Other than numerical calculations, unsigned
numbers are used as a tool for structuring data in memory; we will see this
application later in the book, when various data structures are organized
into bit groups.

In all the examples above, when the value of a variable with a smaller size
is assigned to a variable with a larger size, the value easily fits in the
larger variable. On the contrary, when the value of a variable with a
larger size is assigned to a variable with a smaller size, two scenarios
occur:

✄ The value is greater than the maximum value of the variable with the
smaller layout, so it is truncated to the size of that variable, producing
an incorrect value.

✄ The value is not greater than the maximum value of the variable with the
smaller layout, so it fits the variable.

However, the value might be unknown until runtime and can be any value, so
it is best not to leave such implicit conversions to the compiler, but to
have them explicitly controlled by the programmer. Otherwise they will
cause subtle bugs that are hard to catch, as the erroneous values might
rarely be used to reproduce the bugs.
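One way to keep such a conversion under the programmer's control is to check the range before narrowing. The sketch below is only an illustration; narrow_to_u8 is a hypothetical helper, not a standard function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: store value into *out only when it fits in 8 bits.
 * Returns false, leaving *out untouched, when it would be truncated. */
bool narrow_to_u8(uint32_t value, uint8_t *out)
{
    if (value > UINT8_MAX)
        return false;
    *out = (uint8_t)value;
    return true;
}

void narrowing_example(void)
{
    uint8_t small = 0;

    /* 0x12 fits in one byte, so the conversion is lossless. */
    assert(narrow_to_u8(0x12, &small) && small == 0x12);

    /* 0x1234 does not fit, so the helper refuses the conversion. */
    assert(!narrow_to_u8(0x1234, &small));

    /* An unchecked cast silently keeps only the low byte. */
    assert((uint8_t)0x1234 == 0x34);
}
```

The design choice here is to report failure to the caller instead of truncating, so that the out-of-range case is handled where it occurs rather than surfacing later as a wrong value.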

4.8.2 Pointer Data Types

Pointers are variables that hold memory addresses. x86 works with 2 types
of pointers:

Near pointer is a 16-bit/32-bit offset within a segment, also called an
effective address.

Far pointer is also an offset like a near pointer, but with an explicit
segment selector.

Figure 4.8.2: Numeric Data Types. Near pointer: offset (bits 31-0). Far
pointer or logical address: segment selector (bits 47-32) and offset (bits
31-0).

C only provides support for near pointers, since far pointers are specific
to platforms such as x86. In application code, you can assume that the
current segment starts at address 0, so the offset is simply a memory
address between 0 and the maximum address.
Source
#include <stdint.h>

int8_t i = 0;
int8_t *p1 = (int8_t *) 0x1234;
int8_t *p2 = &i;

int main(int argc, char *argv[]) {

return 0;
}

Assembly 0000000000601030 <p1>:

601030: 34 12 xor al,0x12
601032: 00 00 add BYTE PTR [rax],al
601034: 00 00 add BYTE PTR [rax],al
601036: 00 00 add BYTE PTR [rax],al
0000000000601038 <p2>:
601038: 41 10 60 00 adc BYTE PTR [r8+0x0],spl
60103c: 00 00 add BYTE PTR [rax],al
60103e: 00 00 add BYTE PTR [rax],al
Disassembly of section .bss:
0000000000601040 <__bss_start>:
601040: 00 00 add BYTE PTR [rax],al
0000000000601041 <i>:
601041: 00 00 add BYTE PTR [rax],al
601043: 00 00 add BYTE PTR [rax],al
601045: 00 00 add BYTE PTR [rax],al
601047: 00 .byte 0x0

The pointer p1 holds a direct address with the value 0x1234. The pointer
p2 holds the address of the variable i. Note that both the pointers are
8 bytes in size (or 4-byte, if 32-bit).

4.8.3 Bit Field Data Type

A bit field is a contiguous sequence of bits. Bit fields allow data to be
structured at the bit level. For example, a 32-bit value can hold multiple
bit fields that represent multiple different pieces of information: bits
0-4 specify the size of a data structure, bits 5-6 specify permissions, and
so on. Data structures at the bit level are common in low-level
programming.
Figure 4.8.3: Numeric Data Types (Source: Figure 4-6, Volume 1). A bit
field, showing the field length and the least significant bit.

Source
struct bit_field {
    int data1:8;
    int data2:8;
    int data3:8;
    int data4:8;
};

struct bit_field2 {
int data1:8;
int data2:8;
int data3:8;
int data4:8;
char data5:4;
};

struct normal_struct {
int data1;
int data2;
int data3;
int data4;
};

struct normal_struct ns = {
.data1 = 0x12345678,
.data2 = 0x9abcdef0,
.data3 = 0x12345678,
.data4 = 0x9abcdef0,
};

int i = 0x12345678;

struct bit_field bf = {
.data1 = 0x12,
.data2 = 0x34,
.data3 = 0x56,
.data4 = 0x78
};

struct bit_field2 bf2 = {

.data1 = 0x12,
.data2 = 0x34,
.data3 = 0x56,
.data4 = 0x78,
.data5 = 0xf
};

int main(int argc, char *argv[]) {

return 0;
}

Assembly Each variable and its value are given a unique color in the as-
sembly listing below:

0804a018 <ns>:
804a018: 78 56 js 804a070 <_end+0x34>
804a01a: 34 12 xor al,0x12
804a01c: f0 de bc 9a 78 56 34 lock fidivr WORD PTR [edx+ebx*4+0x12345678]
804a023: 12
804a024: f0 de bc 9a 78 56 34 lock fidivr WORD PTR [edx+ebx*4+0x12345678]
804a02b: 12
0804a028 <i>:
804a028: 78 56 js 804a080 <_end+0x44>
804a02a: 34 12 xor al,0x12
0804a02c <bf>:
804a02c: 12 34 56 adc dh,BYTE PTR [esi+edx*2]

804a02f: 78 12 js 804a043 <_end+0x7>

0804a030 <bf2>:
804a030: 12 34 56 adc dh,BYTE PTR [esi+edx*2]
804a033: 78 0f js 804a044 <_end+0x8>
804a035: 00 00 add BYTE PTR [eax],al
804a037: 00 .byte 0x0

The sample code creates 4 variables: ns, i, bf and bf2. The definitions of
the normal_struct and bit_field structs both specify 4 integers. bit_field
specifies additional information next to each member name, separated by a
colon, e.g. .data1 : 8. This extra information is the bit width of each bit
group. It means that, even though it is defined as an int, .data1 only
consumes 8 bits of information. If additional data members are specified
after .data1, two scenarios happen:

✄ If the new data members fit within the remaining bits after .data1, which
are 24 bits (since .data1 is declared as an int, 32 bits are still
allocated, but .data1 can only access 8 bits of information), then the
total size of the bit_field struct is still 4 bytes, or 32 bits.

✄ If the new data members don’t fit, then the remaining 24 bits (3 bytes)
are still allocated. However, the new data members are allocated brand new
storage, without using the previous 24 bits.

In the example, the 4 data members .data1, .data2, .data3 and .data4 can
each access 8 bits of information, and together they can access all 4 bytes
of the integer first declared by .data1. As can be seen in the generated
assembly code, the values of bf follow the natural order as written in the
C code: 12 34 56 78, since each value is a separate member. In contrast,
the value of i is a number as a whole, so it is subject to the rule of
little endianness and thus is stored as 78 56 34 12. Note that 804a02f is
the address of the final byte in bf, but next to it is the number 12, even
though 78 is the last number in bf. This extra number 12 does not belong to
the value of bf. objdump is just confused into treating 78 as an opcode; 78
corresponds to the js instruction, which requires an operand. For that
reason, objdump grabs whatever byte comes after 78 and puts it there.
objdump is a tool for displaying assembly code, after all. A better tool to
use is gdb, which we will learn in the next chapter. But for this chapter,
objdump suffices.

Unlike bf, each data member in ns is allocated fully as an integer, 4 bytes
each, 16 bytes in total. As we can see, a bit field and a normal struct are
different: a bit field structures data at the bit level, while a normal
struct works at the byte level.

Finally, the struct of bf2 (bit_field2) is the same as that of bf
(bit_field), except that it contains one more data member, .data5, which is
defined as a char. For this reason, another 4 bytes are allocated just for
.data5, even though it can only access 4 bits of information, and the final
value of bf2 is: 12 34 56 78 0f 00 00 00. The remaining 3 bytes must be
accessed by means of a pointer, or by casting to another data type that can
fully access all 4 bytes.
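The sizes discussed above can be checked with sizeof. This is a sketch under an assumption stated loudly: the layout of bit fields is implementation-defined, and the numbers below are those of gcc on x86 with a 4-byte int, as in the listing. Other compilers may give different sizes.

```c
#include <assert.h>

struct bit_field {
    int data1:8;
    int data2:8;
    int data3:8;
    int data4:8;
};

struct bit_field2 {
    int data1:8;
    int data2:8;
    int data3:8;
    int data4:8;
    char data5:4;   /* char bit fields are accepted by gcc as an extension */
};

struct normal_struct {
    int data1;
    int data2;
    int data3;
    int data4;
};

/* Assumes gcc on x86 with a 4-byte int; not guaranteed by the C standard. */
void bit_field_size_example(void)
{
    assert(sizeof(struct bit_field) == sizeof(int));          /* 4 bytes  */
    assert(sizeof(struct bit_field2) == 2 * sizeof(int));     /* 8 bytes  */
    assert(sizeof(struct normal_struct) == 4 * sizeof(int));  /* 16 bytes */
}
```

The three results correspond to the 4, 8 and 16 bytes occupied by bf, bf2 and ns in the assembly listing.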

Exercise 4.8.1. What happens when the definitions of the bit_field struct
and the bf variable are changed to:

struct bit_field {
int data1:8;
};
struct bit_field bf = {
.data1 = 0x1234,
};

What will be the value of .data1?

Exercise 4.8.2. What happens when the definition of the bit_field2 struct
is changed to:

struct bit_field2 {
int data1:8;
int data5:32;
};

What is the layout of a variable of type bit_field2?

4.8.4 String Data Types

Although they share the same name, a string as defined by x86 is different
from a string in C. x86 defines strings as “continuous sequences of bits,
bytes, words, or doublewords”. On the other hand, C defines a string as an
array of 1-byte characters with a zero as the last element of the array,
making a null-terminated string. This implies that strings in x86 are
arrays, not C strings. A programmer can define an array of bytes, words or
doublewords with char or uint8_t, short or uint16_t, and int or uint32_t,
but not an array of bits. However, such a feature can be easily
implemented, as an array of bits is essentially an array of bytes, words or
doublewords that is operated on at the bit level.
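As a rough illustration of that last point, an array of bits can be layered on top of an array of bytes. The helper names bit_set, bit_clear and bit_test are hypothetical, and numbering bits from the least significant bit of each byte is an arbitrary choice made for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Treat an array of bytes as an array of bits.
 * Bit n lives in byte n / 8, at position n % 8 within that byte. */
void bit_set(uint8_t *bits, size_t n)
{
    bits[n / 8] |= (uint8_t)(1u << (n % 8));
}

void bit_clear(uint8_t *bits, size_t n)
{
    bits[n / 8] &= (uint8_t)~(1u << (n % 8));
}

int bit_test(const uint8_t *bits, size_t n)
{
    return (bits[n / 8] >> (n % 8)) & 1;
}

void bit_array_example(void)
{
    uint8_t bits[4] = {0};      /* room for 32 bits */

    bit_set(bits, 0);
    bit_set(bits, 9);
    assert(bits[0] == 0x01);    /* bit 0 is the lowest bit of byte 0 */
    assert(bits[1] == 0x02);    /* bit 9 is bit 1 of byte 1 */
    assert(bit_test(bits, 9) == 1);

    bit_clear(bits, 9);
    assert(bit_test(bits, 9) == 0);
}
```

The same idea works with uint16_t or uint32_t as the underlying element type; only the divisor 8 changes to the element width.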
The following code demonstrates how to deﬁne array (string) data types:
Source
#include <stdint.h>

uint8_t a8[2] = {0x12, 0x34};

uint16_t a16[2] = {0x1234, 0x5678};
uint32_t a32[2] = {0x12345678, 0x9abcdef0};
uint64_t a64[2] = {0x123456789abcdef0, 0x123456789abcdef0};

int main(int argc, char *argv[])

{
return 0;
}

Assembly 0804a018 <a8>:

804a018: 12 34 00 adc dh,BYTE PTR [eax+eax*1]
804a01b: 00 34 12 add BYTE PTR [edx+edx*1],dh
0804a01c <a16>:
804a01c: 34 12 xor al,0x12
804a01e: 78 56 js 804a076 <_end+0x3a>
0804a020 <a32>:
804a020: 78 56 js 804a078 <_end+0x3c>
804a022: 34 12 xor al,0x12
804a024: f0 de bc 9a f0 de bc lock fidivr WORD PTR [edx+ebx*4-0x65432110]
804a02b: 9a
0804a028 <a64>:
804a028: f0 de bc 9a 78 56 34 lock fidivr WORD PTR [edx+ebx*4+0x12345678]
804a02f: 12
804a030: f0 de bc 9a 78 56 34 lock fidivr WORD PTR [edx+ebx*4+0x12345678]

804a037: 12

Although a8 is an array with 2 elements, each 1 byte long, it is still
allocated 4 bytes. Again, to ensure natural alignment for best performance,
gcc pads it with extra zero bytes. As shown in the assembly listing, the
actual value of a8 is 12 34 00 00, with a8[0] equal to 12 and a8[1] equal
to 34.

Then comes a16 with 2 elements, each 2 bytes long. Since 2 elements are 4
bytes in total, which is already naturally aligned, gcc adds no padding
bytes. The value of a16 is 34 12 78 56, with a16[0] equal to 34 12 and
a16[1] equal to 78 56.

Next is a32, with 2 elements, 4 bytes each. Similar to the arrays above,
the value of a32[0] is 78 56 34 12 and the value of a32[1] is f0 de bc 9a,
exactly what is assigned in the C code. Note that objdump is confused
again: de is the opcode for the instruction fidivr (reverse integer
divide), which requires another operand, so objdump grabs whatever
following bytes make sense to it for creating “an operand”. Only the
highlighted values belong to a32.

Finally comes a64, also with 2 elements, but 8 bytes each. The total size
of a64 is 16 bytes, which is naturally aligned, so no padding bytes are
added. The values of both a64[0] and a64[1] are the same: f0 de bc 9a 78 56
34 12, which gets misinterpreted as a fidivr instruction.

Figure 4.8.4: a8, a16, a32 and a64 memory layouts
a8: 12 | 34
a16: 34 12 | 78 56
a32: 78 56 34 12 | f0 de bc 9a
a64: f0 de bc 9a 78 56 34 12 | f0 de bc 9a 78 56 34 12

However, beyond one-dimensional arrays that map directly to the hardware
string type, C provides its own syntax for multi-dimensional arrays:
Source
#include <stdint.h>

uint8_t a2[2][2] = {
{0x12, 0x34},
{0x56, 0x78}
};

uint8_t a3[2][2][2] = {
{{0x12, 0x34},
{0x56, 0x78}},
{{0x9a, 0xbc},
{0xde, 0xff}},
};

int main(int argc, char *argv[]) {

return 0;
}

Assembly 0804a018 <a2>:

804a018: 12 34 56 adc dh,BYTE PTR [esi+edx*2]
804a01b: 78 12 js 804a02f <_end+0x7>
0804a01c <a3>:
804a01c: 12 34 56 adc dh,BYTE PTR [esi+edx*2]
804a01f: 78 9a js 8049fbb <_DYNAMIC+0xa7>
804a021: bc .byte 0xbc
804a022: de ff fdivrp st(7),st

Technically, multi-dimensional arrays are like normal arrays: in the end,
the total size is translated into flat allocated bytes. A 2 × 2 array is
allocated 4 bytes; a 2 × 2 × 2 array is allocated 8 bytes, as can be seen
in the assembly listings of a2 and a3. (Again, objdump is confused and puts
the number 12 next to 78 in the a2 listing.) In low-level assembly code,
the representation of a[4] and a[2][2] is the same. However, in high-level
C code, the difference is tremendous. The syntax of multi-dimensional
arrays enables a programmer to think with higher-level concepts, instead of
manually translating them to low-level code while keeping the high-level
concepts in mind at the same time.

Example 4.8.1. The following two-dimensional array can hold a list of 2
names with the length of 10:

char names[2][10] = {
    "John Doe",
    "Jane Doe"
};

To access a name, we simply adjust the left index, e.g. names[0], names[1].
(The left index is conventionally called the row index, since each value of
it selects one row, here one name.) To access an individual character
within a name, we use the right index, e.g. names[0][0] gives the character
“J”, names[0][1] gives the character “o” and so on. (The right index is
conventionally called the column index, since it selects a position within
the row.)

Without such syntax, we would need to create a 20-byte array, e.g.
names[20], and whenever we want to access a character, e.g. to check
whether a name contains a digit, we would need to calculate the index
manually. It would be distracting, since we would constantly need to switch
our thinking between the actual problem and the translation problem.
Since this is a repeating pattern, C abstracts away this problem with the
syntax for defining and manipulating multi-dimensional arrays. Through this
example, we can clearly see the power that abstraction through language can
give us. It would be ideal if a programmer were equipped with such power to
define whatever syntax is suitable for the problem at hand. Not many
languages provide such a capacity. Fortunately, through C macros, we can
partially achieve that goal.

In all cases, an array is guaranteed to generate contiguous bytes of
memory, regardless of the dimensions it has.
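The manual index calculation mentioned above can be sketched as follows. The flat array and the helper flat_char_at are hypothetical names introduced only for this comparison; they hold the same 20 bytes as names.

```c
#include <assert.h>

#define NAME_COUNT  2
#define NAME_LENGTH 10

/* The same 20 bytes, declared once as a 2 x 10 array and once flat. */
char names[NAME_COUNT][NAME_LENGTH] = {"John Doe", "Jane Doe"};
char flat[NAME_COUNT * NAME_LENGTH] = "John Doe\0\0Jane Doe";

/* Manual translation of a (name, character) pair into a flat index:
 * skip `name` whole names, then move `character` bytes further. */
char flat_char_at(int name, int character)
{
    return flat[name * NAME_LENGTH + character];
}

void flat_index_example(void)
{
    /* Every element of the 2D array matches the hand-computed index. */
    for (int n = 0; n < NAME_COUNT; n++)
        for (int c = 0; c < NAME_LENGTH; c++)
            assert(names[n][c] == flat_char_at(n, c));

    assert(names[1][2] == 'n');              /* third letter of "Jane" */
    assert(sizeof names == sizeof flat);     /* both occupy 20 bytes   */
}
```

With the two-dimensional syntax, the multiplication by NAME_LENGTH is generated by the compiler, so the programmer writes names[1][2] instead of flat[1 * 10 + 2].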

Exercise 4.8.3. What is the difference between a multi-dimensional array
and an array of pointers, or even pointers of pointers?

4.9 Examine compiled code

This section will explore how the compiler transforms high-level code into
assembly code that the CPU can execute, and how common assembly patterns
help to create higher-level syntax. The -S option is added to objdump to
better demonstrate the connection between high- and low-level code. In this
section, the option --no-show-raw-insn is also added to the objdump command
to omit the opcodes for clarity:

$ objdump --no-show-raw-insn -M intel -S -D <object file> | less

4.9.1 Data Transfer

The previous section explored how various types of data are created and how
they are laid out in memory. Once memory storage is allocated for
variables, it must be readable and writable. Data transfer instructions
move data (bytes, words, doublewords or quadwords) between memory and
registers, and between registers, effectively reading from one storage
location and writing to another.
Source
#include <stdint.h>

int32_t i = 0x12345678;

int main(int argc, char *argv[]) {

int j = i;
int k = 0xabcdef;

return 0;
}

Assembly 080483db <main>:

#include <stdint.h>
int32_t i = 0x12345678;
int main(int argc, char *argv[]) {
80483db: push ebp
80483dc: mov ebp,esp
80483de: sub esp,0x10
int j = i;
80483e1: mov eax,ds:0x804a018
80483e6: mov DWORD PTR [ebp-0x8],eax
int k = 0xabcdef;
80483e9: mov DWORD PTR [ebp-0x4],0xabcdef
return 0;
80483f0: mov eax,0x0
}
80483f5: leave

80483f6: ret
80483f7: xchg ax,ax
80483f9: xchg ax,ax
80483fb: xchg ax,ax
80483fd: xchg ax,ax
80483ff: nop

General data movement is performed with the mov instruction. Note that
despite the instruction being called mov, it actually copies data from a
source to a destination.

The red instruction (mov ebp,esp) copies data from the register esp to the
register ebp. This mov instruction moves data between registers and is
assigned the opcode 89.

The blue instructions (mov eax,ds:0x804a018 and mov DWORD PTR
[ebp-0x8],eax) copy data from one memory location (the i variable) to
another (the j variable). There is no mov form that moves data from memory
to memory; it requires two mov instructions, one for copying the data from
a memory location to a register, and one for copying the data from the
register to the destination memory location.

The pink instruction (mov DWORD PTR [ebp-0x4],0xabcdef) copies an immediate
value into memory. Finally, the green instruction (mov eax,0x0) copies
immediate data into a register.

4.9.2 Expressions
Source
int expr(int i, int j)
{
int add = i + j;
int sub = i - j;
int mul = i * j;
int div = i / j;
int mod = i % j;
int neg = -i;
int and = i & j;
int or = i | j;
int xor = i ^ j;
int not = ~i;
int shl = i << 8;

int shr = i >> 8;

char equal1 = (i == j);
int equal2 = (i == j);
char greater = (i > j);
char less = (i < j);
char greater_equal = (i >= j);
char less_equal = (i <= j);
int logical_and = i && j;
int logical_or = i || j;
++i;
--i;
int i1 = i++;
int i2 = ++i;
int i3 = i--;
int i4 = --i;

return 0;
}

int main(int argc, char *argv[]) {

return 0;
}

Assembly The full assembly listing is really long. For that reason, we
examine it expression by expression.

Expression: int add = i + j;

80483e1: mov edx,DWORD PTR [ebp+0x8]

80483e4: mov eax,DWORD PTR [ebp+0xc]
80483e7: add eax,edx
80483e9: mov DWORD PTR [ebp-0x34],eax

The assembly code is straightforward: the variables i and j are loaded into
edx and eax respectively, then added together with the add instruction,
which leaves the final result in eax. Then, the result is saved into the
local variable add, which is at the location [ebp-0x34].

Expression: int sub = i - j;

80483ec: mov eax,DWORD PTR [ebp+0x8]

80483ef: sub eax,DWORD PTR [ebp+0xc]
80483f2: mov DWORD PTR [ebp-0x30],eax

Similar to the add instruction, x86 provides a sub instruction for
subtraction. Hence, gcc translates a subtraction into a sub instruction.
eax is first reloaded with i, as it still carries the result of the
previous expression. Then, j is subtracted from i. After the subtraction,
the value is saved into the variable sub, at location [ebp-0x30].

Expression: int mul = i * j;

80483f5: mov eax,DWORD PTR [ebp+0x8]

80483f8: imul eax,DWORD PTR [ebp+0xc]
80483fc: mov DWORD PTR [ebp-0x34],eax

Similar to the sub instruction, only eax is reloaded, since it carries the
result of the previous calculation. imul performs signed multiplication
(unsigned multiplication is performed by the mul instruction). eax is first
loaded with i, then multiplied by j, with the result stored back into eax,
and finally stored into the variable mul at location [ebp-0x34].

Expression: int div = i / j;

80483ff: mov eax,DWORD PTR [ebp+0x8]

8048402: cdq
8048403: idiv DWORD PTR [ebp+0xc]
8048406: mov DWORD PTR [ebp-0x30],eax

Similar to imul, idiv performs signed division. But, unlike imul above,
idiv only takes one operand:

1. First, i is reloaded into eax.

2. Then, cdq converts the doubleword value in eax into a quadword value
stored in the pair of registers edx:eax, by copying the sign bit (bit 31)
of the value in eax into every bit position in edx. The pair edx:eax is the
dividend, which is the variable i, and the operand to idiv is the divisor,
which is the variable j.

3. After the calculation, the result is stored in the pair of registers
edx:eax, with the quotient in eax and the remainder in edx. The quotient is
stored in the variable div, at location [ebp-0x30].

Expression: int mod = i % j;

8048409: mov eax,DWORD PTR [ebp+0x8]

804840c: cdq
804840d: idiv DWORD PTR [ebp+0xc]
8048410: mov DWORD PTR [ebp-0x2c],edx

The same idiv instruction also performs the modulo operation, since it also
calculates a remainder, left in edx, which is stored in the variable mod at
location [ebp-0x2c].
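The relationship between the quotient and the remainder can be sketched in C. The struct div_result and the function divide are hypothetical names used only for this illustration; the caller is assumed to pass a non-zero divisor.

```c
#include <assert.h>

/* Hypothetical pair of results: one division yields both, just as a
 * single idiv leaves the quotient in eax and the remainder in edx. */
struct div_result {
    int quotient;
    int remainder;
};

/* Assumes j is not 0; dividing by zero is undefined in C. */
struct div_result divide(int i, int j)
{
    struct div_result r = { i / j, i % j };
    return r;
}

void divide_example(void)
{
    struct div_result r = divide(-7, 2);

    /* C truncates toward zero, so the remainder takes the dividend's sign. */
    assert(r.quotient == -3);
    assert(r.remainder == -1);

    /* The two results always rebuild the dividend. */
    assert(r.quotient * 2 + r.remainder == -7);
}
```

Because both values come from the same instruction, an optimizing compiler may compute i / j and i % j with a single idiv when both appear together, although the unoptimized listing above issues one idiv per expression.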

Expression: int neg = -i;

8048413: mov eax,DWORD PTR [ebp+0x8]

8048416: neg eax
8048418: mov DWORD PTR [ebp-0x28],eax

neg replaces the value of its operand (the destination operand) with its
two’s complement (this operation is equivalent to subtracting the operand
from 0). In this example, the value i in eax is replaced with -i using the
neg instruction. Then, the new value is stored in the variable neg at
[ebp-0x28].
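Two's complement negation can be written out by hand to see what neg computes. This is a small sketch; negate_bits is a hypothetical name, and unsigned arithmetic is used so that wrap-around is well defined in C.

```c
#include <assert.h>
#include <stdint.h>

/* Two's complement negation by hand: invert every bit, then add 1. */
uint32_t negate_bits(uint32_t value)
{
    return ~value + 1u;
}

void negate_example(void)
{
    /* Same result as subtracting the operand from 0. */
    assert(negate_bits(5) == 0u - 5u);

    /* The bit pattern of -5 in a 32-bit register. */
    assert(negate_bits(5) == 0xfffffffbu);

    /* Zero is its own negation. */
    assert(negate_bits(0) == 0);
}
```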

Expression: int and = i & j;

804841b: mov eax,DWORD PTR [ebp+0x8]

804841e: and eax,DWORD PTR [ebp+0xc]
8048421: mov DWORD PTR [ebp-0x24],eax

and performs a bitwise AND operation on two operands, and stores

the result in the destination operand, which is the variable and at
[ebp-0x24].

Expression: int or = i | j;

8048424: mov eax,DWORD PTR [ebp+0x8]

8048427: or eax,DWORD PTR [ebp+0xc]
804842a: mov DWORD PTR [ebp-0x20],eax

Similar to and instruction, or performs a bitwise OR operation on

two operands, and stores the result in the destination operand, which
is the variable or at [ebp-0x20] in this case.

Expression: int xor = i ^ j;


804842d: mov eax,DWORD PTR [ebp+0x8]

8048430: xor eax,DWORD PTR [ebp+0xc]
8048433: mov DWORD PTR [ebp-0x1c],eax

Similar to and/or instruction, xor performs a bitwise XOR opera-

tion on two operands, and stores the result in the destination operand,
which is the variable xor at [ebp-0x1c].

Expression: int not = ~i;

8048436: mov eax,DWORD PTR [ebp+0x8]

8048439: not eax
804843b: mov DWORD PTR [ebp-0x18],eax

not performs a bitwise NOT operation (each 1 is set to 0, and each

0 is set to 1) on the destination operand and stores the result in
the destination operand location, which is the variable not at [ebp-0x18].

Expression: int shl = i << 8;

804843e: mov eax,DWORD PTR [ebp+0x8]

8048441: shl eax,0x8
8048444: mov DWORD PTR [ebp-0x14],eax

shl (shift logical left) shifts the bits in the destination operand to the
left by the number of bits specified in the source operand. In this case,
eax stores i and shl shifts eax by 8 bits to the left. A different name for
shl is sal (shift arithmetic left); the two can be used synonymously.
Finally, the result is stored in the variable shl at [ebp-0x14]. Figure
4.9.1 gives a visual demonstration of the shl/sal and shr instructions.
After shifting to the left, the last bit shifted out of the left end (the
most significant end) is placed in the Carry Flag of the EFLAGS register.

Expression: int shr = i >> 8;

8048447: mov eax,DWORD PTR [ebp+0x8]

804844a: sar eax,0x8
804844d: mov DWORD PTR [ebp-0x10],eax

sar is similar to shl/sal, but shifts bits to the right and extends the
sign bit. For right shifts, shr and sar are two different instructions: shr
differs from sar in that it does not extend the sign bit. Finally, the
result is stored in the variable shr at [ebp-0x10].
Figure 4.9.1: Shift Instructions (red is the start bit, blue is the end
bit). (a) SHL/SAL (Source: Figure 7-6, Volume 1). (b) SHR (Source: Figure
7-7, Volume 1). Each panel shows the operand and CF in the initial state
and after the shifts.

In figure 4.9.1(b), notice that initially the sign bit is 1, but after the
1-bit and 10-bit shifts, the vacated bits are filled with zeros.

Figure 4.9.2: SAR Instruction Operation (Source: Figure 7-8, Volume 1). It
shows a positive operand before and after a 1-bit SAR, and a negative
operand before and after a 10-bit SAR.

With sar, the sign bit (the most signiﬁcant bit) is preserved. That
is, if the sign bit is 0, the new bits always get the value 0; if the sign
bit is 1, the new bits always get the value 1.
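The difference between shr and sar can be reproduced in C by shifting unsigned and signed values. This sketch rests on an assumption: right-shifting a negative signed value is implementation-defined in C, and the behavior shown is that of gcc on x86, which uses sar. The function names are hypothetical, and count is assumed to be less than 32.

```c
#include <assert.h>
#include <stdint.h>

/* Logical right shift: the vacated high bits are filled with 0 (shr). */
uint32_t shift_right_logical(uint32_t value, unsigned count)
{
    return value >> count;
}

/* Arithmetic right shift: the vacated high bits copy the sign bit (sar).
 * Implementation-defined for negative values; gcc on x86 is assumed. */
int32_t shift_right_arithmetic(int32_t value, unsigned count)
{
    return value >> count;
}

void shift_example(void)
{
    /* The top bit moves down and zeros come in from the left. */
    assert(shift_right_logical(0x80000000u, 8) == 0x00800000u);

    /* A negative value stays negative: ones come in from the left. */
    assert(shift_right_arithmetic(-256, 8) == -1);

    /* A positive value behaves the same under both kinds of shift. */
    assert(shift_right_arithmetic(0x12345678, 8) == 0x00123456);
}
```

This is why the listing for int shr = i >> 8 uses sar: i is a signed int, so the compiler preserves its sign.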

Expression: char equal1 = (i == j);

8048450: mov eax,DWORD PTR [ebp+0x8]

8048453: cmp eax,DWORD PTR [ebp+0xc]
8048456: sete al
8048459: mov BYTE PTR [ebp-0x41],al

cmp and the variants of the set instruction make up all the logical
comparisons. In this expression, cmp compares the variables i and j; then
sete stores the value 1 in the al register if the comparison from cmp
earlier is equal, or stores 0 otherwise. The general name for the variants
of the set instruction is SETcc. The suffix cc denotes the condition being
tested for in the EFLAGS register. Appendix B in volume 1, “EFLAGS
Condition Codes”, lists the conditions it is possible to test for with this
instruction. Finally, the result is stored in the variable equal1 at
[ebp-0x41].

Expression: int equal2 = (i == j);

804845c: mov eax,DWORD PTR [ebp+0x8]

804845f: cmp eax,DWORD PTR [ebp+0xc]
8048462: sete al
8048465: movzx eax,al
8048468: mov DWORD PTR [ebp-0xc],eax

Like the previous expression, this expression also compares for equality,
except that the result is stored in an int. For that reason, one more
instruction is added: movzx, a variant of mov that copies the source into a
wider destination operand and fills the remaining bytes with 0. In this
case, since eax is 4 bytes wide, after copying the first byte from al, the
remaining bytes of eax are filled with 0 to ensure that eax carries the
same value as al.

Figure 4.9.3: movzx instruction. (a) eax before movzx: 12 34 56 78. (b)
after movzx eax, al: 00 00 00 78.
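Zero extension also appears in C whenever an unsigned byte is widened. The sketch below uses hypothetical function names; it contrasts zero extension with sign extension, which x86 performs with the separate movsx instruction (not used in the listing above).

```c
#include <assert.h>
#include <stdint.h>

/* Widening an unsigned byte zero-extends it, like movzx. */
uint32_t zero_extend(uint8_t low_byte)
{
    return low_byte;
}

/* Widening a signed byte sign-extends it instead (movsx on x86). */
int32_t sign_extend(int8_t low_byte)
{
    return low_byte;
}

void extend_example(void)
{
    uint32_t eax = 0x12345678;

    /* Keep only the low byte, then widen: the upper three bytes become 0. */
    assert(zero_extend((uint8_t)eax) == 0x00000078);

    /* The byte f0 stays 240 when zero-extended. */
    assert(zero_extend(0xf0) == 0x000000f0);

    /* The same bit pattern read as a signed byte is -16, and sign
     * extension fills the upper bytes with 1 bits. */
    assert(sign_extend(-16) == -16);
    assert((uint32_t)sign_extend(-16) == 0xfffffff0u);
}
```

The first assertion mirrors the figure: eax holding 12 34 56 78 becomes 00 00 00 78 once only al is kept and widened.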

Expression: char greater = (i > j);

804846b: mov eax,DWORD PTR [ebp+0x8]

804846e: cmp eax,DWORD PTR [ebp+0xc]
8048471: setg al
8048474: mov BYTE PTR [ebp-0x40],al

Similar to the equality comparison, but setg is used for the greater-than
comparison instead.

Expression: char less = (i < j);

8048477: mov eax,DWORD PTR [ebp+0x8]


804847a: cmp eax,DWORD PTR [ebp+0xc]

804847d: setl al
8048480: mov BYTE PTR [ebp-0x3f],al

setl is applied for the less-than comparison.

Expression: char greater_equal = (i >= j);

8048483: mov eax,DWORD PTR [ebp+0x8]

8048486: cmp eax,DWORD PTR [ebp+0xc]
8048489: setge al
804848c: mov BYTE PTR [ebp-0x3e],al

setge is used for the greater-than-or-equal comparison.

Expression: char less_equal = (i <= j);

804848f: mov eax,DWORD PTR [ebp+0x8]

8048492: cmp eax,DWORD PTR [ebp+0xc]
8048495: setle al
8048498: mov BYTE PTR [ebp-0x3d],al

setle is used for the less-than-or-equal comparison.

Expression: int logical_and = (i && j);

804849b: cmp DWORD PTR [ebp+0x8],0x0

804849f: je 80484ae <expr+0xd3>
80484a1: cmp DWORD PTR [ebp+0xc],0x0
80484a5: je 80484ae <expr+0xd3>
80484a7: mov eax,0x1
80484ac: jmp 80484b3 <expr+0xd8>
80484ae: mov eax,0x0
80484b3: mov DWORD PTR [ebp-0x8],eax

The logical AND operator && is one of the syntaxes that is implemented
entirely in software (that is, there is no equivalent assembly instruction
implemented in hardware), using simpler instructions. The algorithm in the
assembly code is simple:

1. First, check if i is 0 with the instruction at 0x804849b.

(a) If true, jump to 0x80484ae and set eax to 0.

(b) Set the variable logical_and to 0, as it is the next instruc-
tion after 0x80484ae.

2. If i is not 0, check if j is 0 with the instruction at 0x80484a1.

(a) If true, jump to 0x80484ae and set eax to 0.

(b) Set the variable logical_and to 0, as it is the next instruc-
tion after 0x80484ae.

3. If both i and j are not 0, the result is certainly 1, or true.

(a) Set it accordingly with the instruction at 0x80484a7.

(b) Then jump to the instruction at 0x80484b3 to set the vari-
able logical_and at [ebp-0x8] to 1.
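
The steps above can be sketched in C with goto statements that follow the
jumps of the listing one by one. This is only an illustration: the function
name logical_and_sketch and the label names (taken from the addresses in the
listing) are assumptions made for this example.

```c
#include <assert.h>

/* Mirrors the jump structure of the assembly for (i && j).
 * Each label is named after the address it stands for. */
int logical_and_sketch(int i, int j)
{
    int eax;

    if (i == 0) goto L80484ae;   /* cmp [ebp+0x8],0x0 ; je 80484ae */
    if (j == 0) goto L80484ae;   /* cmp [ebp+0xc],0x0 ; je 80484ae */
    eax = 1;                     /* mov eax,0x1 */
    goto L80484b3;               /* jmp 80484b3 */
L80484ae:
    eax = 0;                     /* mov eax,0x0 */
L80484b3:
    assert(eax == (i && j));     /* same result as the C operator */
    return eax;                  /* stands in for mov [ebp-0x8],eax */
}
```

Note that j is never examined when i is 0, which is the short-circuit
behavior of && in C.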

Expression: int logical_or = (i || j);

80484b6: cmp DWORD PTR [ebp+0x8],0x0

80484ba: jne 80484c2 <expr+0xe7>
80484bc: cmp DWORD PTR [ebp+0xc],0x0
80484c0: je 80484c9 <expr+0xee>
80484c2: mov eax,0x1
80484c7: jmp 80484ce <expr+0xf3>
80484c9: mov eax,0x0
80484ce: mov DWORD PTR [ebp-0x4],eax

The logical OR operator || is similar to the logical AND above.
Understanding the algorithm is left as an exercise for readers.

Expression: ++i; and --i; (or i++ and i--)

80484d1: add DWORD PTR [ebp+0x8],0x1

80484d5: sub DWORD PTR [ebp+0x8],0x1

Increment and decrement are similar to logical AND and logical OR in that
they are built from existing instructions, namely add and sub. The
difference is that the CPU actually does have built-in instructions for
them, inc and dec, but gcc chooses not to use them because inc and dec
cause a partial flag register stall, which occurs when an instruction
modifies a part of the flag register and the following instruction depends
on the outcome of the flags (section 3.5.2.6, Intel Optimization Manual,
2016b). The manual even suggests that inc and dec should be replaced with
add and sub instructions (section 3.5.1.1, Intel Optimization Manual,
2016b).

Expression: int i1 = i++;


80484d9: mov eax,DWORD PTR [ebp+0x8]

80484dc: lea edx,[eax+0x1]
80484df: mov DWORD PTR [ebp+0x8],edx
80484e2: mov DWORD PTR [ebp-0x10],eax

First, i is copied into eax at 80484d9. Then, the value of eax + 0x1 is
copied into edx as an effective address at 80484dc. The lea (load effective
address) instruction copies a memory address into a register. According to
Volume 2, the source operand is a memory address specified with one of the
processor's addressing modes. This means the source operand must be
specified by the addressing modes defined in the 16-bit/32-bit ModR/M Byte
tables, 4.5.1 and 4.5.2. After the incremented value is loaded into edx, it
is stored back to i at 80484df, increasing i by 1. Finally, the previous
value of i, still held in eax, is stored in i1 at [ebp-0x10] by the
instruction at 80484e2.

Expression: int i2 = ++i;

80484e5: add DWORD PTR [ebp+0x8],0x1

80484e9: mov eax,DWORD PTR [ebp+0x8]
80484ec: mov DWORD PTR [ebp-0xc],eax

The primary differences between this increment syntax and the previous one
are:

✄ add is used instead of lea to increase i directly.

✄ the newly incremented i is stored into i2 instead of the old value.
✄ the expression only costs 3 instructions instead of 4.

In this listing, the prefix-increment syntax is faster than the postfix one
used previously. It might not matter much which version is used if the
increment is executed only once or a few hundred times in a small loop, but
it matters when a loop runs millions of times or more. Also, depending on
the circumstances, one form is more convenient than the other, e.g. if i is
an index for accessing an array, we use the old value for accessing the
previous array element and the newly incremented i for the current element.
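
The difference between the two forms can be sketched in C by spelling out
each instruction of the two listings as one statement. The function names
post_increment and pre_increment are made up for this illustration; the
local names eax and edx merely stand in for the registers.

```c
#include <assert.h>

/* int i1 = i++;  returns the old value; afterwards *i holds old + 1 */
int post_increment(int *i)
{
    assert(i);
    int eax = *i;        /* mov eax,[ebp+0x8]            */
    int edx = eax + 1;   /* lea edx,[eax+0x1]            */
    *i = edx;            /* mov [ebp+0x8],edx            */
    return eax;          /* mov [ebp-0x10],eax (old i)   */
}

/* int i2 = ++i;  increments *i first, then returns the new value */
int pre_increment(int *i)
{
    assert(i);
    *i += 1;             /* add [ebp+0x8],0x1            */
    return *i;           /* mov eax,[ebp+0x8] ; mov [ebp-0xc],eax */
}
```

Starting from i = 5, post_increment returns 5 and leaves i at 6, while a
following pre_increment returns 7 and leaves i at 7.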

Expression: int i3 = i--;

80484ef: mov eax,DWORD PTR [ebp+0x8]


80484f2: lea edx,[eax-0x1]

80484f5: mov DWORD PTR [ebp+0x8],edx
80484f8: mov DWORD PTR [ebp-0x8],eax

This is similar to the i++ syntax and is left as an exercise for readers.

Expression: int i4 = --i;

80484fb: sub DWORD PTR [ebp+0x8],0x1

80484ff: mov eax,DWORD PTR [ebp+0x8]
8048502: mov DWORD PTR [ebp-0x4],eax

This is similar to the ++i syntax and is left as an exercise for readers.

Exercise 4.9.1. Read section 3.5.2.4, “Partial Register Stalls” to un-

derstand register stalls in general.

Exercise 4.9.2. Read the sections from 7.3.1 to 7.3.7 in volume 1.

4.9.3 Stack

A stack is a contiguous array of memory locations that holds a collection

of discrete data. When a new element is added, a stack grows down in
memory toward lesser addresses, and shrinks up toward greater addresses
when an element is removed. x86 uses the esp register to point to the
top of the stack, at the newest element. A stack can originate anywhere in
main memory, as esp can be set to any memory address. x86 provides two
operations for manipulating stacks:

✄ The push instruction and its variants add a new element on top of the
stack.

✄ The pop instruction and its variants remove the top-most element from
the stack.

Address   (a)          (b)          (c)
0x10000   00           00           00
0x10001   00           00           00
0x10002   00           78 ← esp     00
0x10003   00           56           00
0x10004   12 ← esp     12           12 ← esp
(a) Initial state at address 0x10004 (b) After executing push word 0x5678 (c) After executing pop word
Figure 4.9.4: Stack operations
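
The two operations can be modeled in C on a small byte array. The sketch
below is an assumption-laden toy, not how the CPU is implemented: the type
Stack, the functions push_word and pop_word, and the constants are invented
for this example, and only the five addresses of the figure are modeled.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_BASE 0x10000u   /* lowest modeled address            */
#define STACK_TOP  0x10004u   /* initial esp, as in panel (a)      */

typedef struct {
    uint8_t  mem[5];   /* bytes at addresses 0x10000..0x10004 */
    uint32_t esp;      /* address of the top of the stack     */
} Stack;

/* push word: esp moves down by 2, then the value is stored little endian */
void push_word(Stack *s, uint16_t value)
{
    assert(s->esp >= STACK_BASE + 2u);
    s->esp -= 2u;
    s->mem[s->esp - STACK_BASE]      = (uint8_t)(value & 0xffu); /* low byte  */
    s->mem[s->esp - STACK_BASE + 1u] = (uint8_t)(value >> 8);    /* high byte */
}

/* pop word: the value at esp is read, then esp moves up by 2 */
uint16_t pop_word(Stack *s)
{
    assert(s->esp + 2u <= STACK_TOP);
    uint16_t low  = s->mem[s->esp - STACK_BASE];
    uint16_t high = s->mem[s->esp - STACK_BASE + 1u];
    s->esp += 2u;
    return (uint16_t)(low | (uint16_t)(high << 8));
}
```

Pushing 0x5678 from the initial state places 78 at 0x10002 and 56 at
0x10003 with esp at 0x10002; popping returns 0x5678 and moves esp back to
0x10004.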

4.9.4 Automatic variables

Local variables are variables that exist within a scope. A scope is
delimited by a pair of braces: {..}. The most common scope in which to
define local variables is function scope. However, a scope can be unnamed,
and variables created inside an unnamed scope exist only within that scope
and its inner scopes.

Example 4.9.1. Function scope:

void foo() {
int a;
int b;
}

a and b are variables local to the function foo.

Example 4.9.2. Unnamed scope:

int foo() {
int i;

{
int a = 1;
int b = 2;
{
return i = a + b;
}
}
}

a and b are local to the scope where they are defined and are also visible
in its inner child scope, which contains return i = a + b. However, they do
not exist at the function scope that creates i.

When a local variable is created, it is pushed on the stack; when a local
variable goes out of scope, it is popped off the stack, and thus destroyed.
When an argument is passed from a caller to a callee, it is pushed on the
stack; when a callee returns to the caller, the arguments are popped off
the stack. The local variables and arguments are automatically allocated
upon entering a function and destroyed after exiting it, which is why they
are called automatic variables.

A base frame pointer points to the start of the current function frame, and
is kept in the ebp register. Whenever a function is called, it is allocated
its own dedicated storage on the stack, called a stack frame. A stack frame
is where all local variables and arguments of a function are placed on a
stack (data and only data are exclusively allocated on the stack for every
stack frame; no code resides here).

When a function needs a local variable or an argument, it uses ebp to
access the variable:

✄ All local variables are allocated after the ebp pointer. Thus, to access
a local variable, a number is subtracted from ebp to reach the location of
the variable.

✄ All arguments are allocated before the ebp pointer. To access an
argument, a number is added to ebp to reach the location of the argument.

✄ ebp itself points to the saved ebp of the caller (Old ebp); the return
address into the caller sits right next to it, at ebp+0x4.

Previous Frame: Function Arguments | Current Frame: Local variables
A1 A2 A3 ........ An | Return Address | Old ebp (← ebp) | L1 L2 L3 ........ Ln
Figure 4.9.5: Function arguments and local variables
A = Argument
L = Local Variable

Here is an example to make it more concrete:

Source
int add(int a, int b) {
int i = a + b;

return i;
}

Assembly 080483db <add>:

#include <stdint.h>
int add(int a, int b) {
80483db: push ebp

80483dc: mov ebp,esp

80483de: sub esp,0x10
int i = a + b;
80483e1: mov edx,DWORD PTR [ebp+0x8]
80483e4: mov eax,DWORD PTR [ebp+0xc]
80483e7: add eax,edx
80483e9: mov DWORD PTR [ebp-0x4],eax
return i;
80483ec: mov eax,DWORD PTR [ebp-0x4]
}
80483ef: leave
80483f0: ret

In the assembly listing, [ebp-0x4] is the local variable i, since it is
allocated after ebp, with a length of 4 bytes (an int). On the other hand,
a and b are arguments and can be accessed with ebp:

✄ [ebp+0x8] accesses a.

✄ [ebp+0xc] accesses b.

For accessing arguments, the rule is that the closer an argument on the
stack is to ebp, the closer it is to the function name in the source.

ebp+0xc: b | ebp+0x8: a | ebp+0x4: Return Address | ebp: Old ebp   (row starting at 0x10000)
ebp-0x8: N | ebp-0x4: i   (row starting at 0xffe0)
Figure 4.9.6: Function arguments and local variables in memory
N = Next local variable starts here

From the figure, we can see that a and b are laid out in memory with
the exact order as written in C, relative to the return address.
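
The offsets used in the listing can be checked with a C struct that lists
the slots of the frame from the lowest address upward. This is a
hypothetical picture made for this example (the type AddFrame and the
function ebp_offset are not part of any library), and it leaves out the
remaining bytes reserved by sub esp,0x10.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Slots of the frame of add(), lowest address first. */
typedef struct {
    uint32_t i;         /* [ebp-0x4]  local variable        */
    uint32_t old_ebp;   /* [ebp]      saved frame pointer   */
    uint32_t ret_addr;  /* [ebp+0x4]  pushed by call        */
    uint32_t a;         /* [ebp+0x8]  first argument        */
    uint32_t b;         /* [ebp+0xc]  second argument       */
} AddFrame;

/* Converts a member offset inside AddFrame into an offset relative to
 * the slot that ebp points at (the saved ebp). */
int ebp_offset(size_t member_offset)
{
    assert(member_offset < sizeof(AddFrame));
    return (int)member_offset - (int)offsetof(AddFrame, old_ebp);
}
```

With this layout, ebp_offset(offsetof(AddFrame, a)) is 8 and
ebp_offset(offsetof(AddFrame, i)) is -4, the same displacements that appear
in the assembly.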

4.9.5 Function Call and Return


Source
#include <stdio.h>

int add(int a, int b) {

int local = 0x12345;

return a + b;
}

int main(int argc, char *argv[]) {

add(1,2);

return 0;
}

Assembly For every function call, gcc pushes the arguments on the stack in
reverse order with push instructions. That is, the arguments are pushed on
the stack in the reverse of the order in which they are written in high
level C code, to preserve the relative order between arguments seen in the
previous section on how function arguments and local variables are laid
out. Then, gcc generates a call instruction, which implicitly pushes a
return address before transferring control to the add function:

080483f2 <main>:
int main(int argc, char *argv[]) {
80483f2: push ebp
80483f3: mov ebp,esp
add(1,2);
80483f5: push 0x2
80483f7: push 0x1
80483f9: call 80483db <add>
80483fe: add esp,0x8
return 0;
8048401: mov eax,0x0
}
8048406: leave

8048407: ret

Upon finishing the call to the add function, the stack is restored by
adding 0x8 to the stack pointer esp (which is equivalent to 2 pop
instructions). Finally, a leave instruction is executed and main returns
with a ret instruction. A ret instruction transfers program execution back
to the caller, to the instruction right after the call instruction, here
the add instruction at 80483fe. The reason ret can return to that location
is that the return address, which is the address right after the call
instruction, was implicitly pushed by the call instruction; whenever the
CPU executes a ret instruction, it retrieves the return address that sits
right after all the arguments on the stack.

At the end of a function, gcc places a leave instruction to clean up all
the space allocated for local variables and restore the frame pointer to
the frame pointer of the caller.

080483db <add>:
#include <stdio.h>
int add(int a, int b) {
80483db: push ebp
80483dc: mov ebp,esp
80483de: sub esp,0x10
int local = 0x12345;
80483e1: mov DWORD PTR [ebp-0x4],0x12345
return a + b;
80483e8: mov edx,DWORD PTR [ebp+0x8]
80483eb: mov eax,DWORD PTR [ebp+0xc]
80483ee: add eax,edx
}
80483f0: leave
80483f1: ret
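
The whole call sequence can be replayed on a toy stack in C. The sketch
below is a simplified model under stated assumptions: the stack is an array
of 4-byte slots indexed instead of byte-addressed (so an offset of 0x8
becomes 2 slots), and the names Cpu, push, pop and call_add are invented
for this example.

```c
#include <assert.h>
#include <stdint.h>

#define SLOTS 32u   /* size of the toy stack, in 4-byte slots */

typedef struct {
    uint32_t mem[SLOTS];
    uint32_t esp;   /* index of the top slot; the stack grows toward index 0 */
    uint32_t ebp;   /* index of the slot holding the saved ebp */
} Cpu;

static void push(Cpu *c, uint32_t v)
{
    assert(c->esp > 0u);
    c->mem[--c->esp] = v;
}

static uint32_t pop(Cpu *c)
{
    assert(c->esp < SLOTS);
    return c->mem[c->esp++];
}

/* Replays main calling add(a, b); returns eax and reports the final esp. */
uint32_t call_add(uint32_t a, uint32_t b, uint32_t *final_esp)
{
    Cpu c = { .esp = SLOTS, .ebp = SLOTS };
    uint32_t eax, edx;

    push(&c, b);                 /* push 0x2: last argument first       */
    push(&c, a);                 /* push 0x1                            */
    push(&c, 0x80483feu);        /* call: pushes the return address     */

    push(&c, c.ebp);             /* push ebp                            */
    c.ebp = c.esp;               /* mov ebp,esp                         */
    c.esp -= 4u;                 /* sub esp,0x10 (16 bytes = 4 slots)   */
    c.mem[c.ebp - 1u] = 0x12345u;/* mov DWORD PTR [ebp-0x4],0x12345     */
    edx = c.mem[c.ebp + 2u];     /* mov edx,[ebp+0x8]                   */
    eax = c.mem[c.ebp + 3u];     /* mov eax,[ebp+0xc]                   */
    eax += edx;                  /* add eax,edx                         */

    c.esp = c.ebp;               /* leave: mov esp,ebp ...              */
    c.ebp = pop(&c);             /*        ... pop ebp                  */
    (void)pop(&c);               /* ret: pops the return address        */
    c.esp += 2u;                 /* add esp,0x8: caller drops two args  */

    *final_esp = c.esp;
    return eax;
}
```

For call_add(1, 2, &esp) the function returns 3 and esp ends at its
starting value, which is the balance that leave, ret and add esp,0x8
restore between them.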

Exercise 4.9.3. The code above that gcc generated for function calling is
actually the standard method x86 defines. Read chapter 6, “Procedure
Calls, Interrupts, and Exceptions”, Intel manual volume 1.

4.9.6 Loop

A loop simply resets the instruction pointer to an already executed
instruction and starts from there all over again. A loop is just one
application of the jmp instruction. However, because looping is a pervasive
pattern, it earned its own syntax in C.
Source
#include <stdio.h>

int main(int argc, char *argv[]) {

for (int i = 0; i < 10; i++) {
}

return 0;
}

Assembly 080483db <main>:

#include <stdio.h>
int main(int argc, char *argv[]) {
80483db: push ebp
80483dc: mov ebp,esp
80483de: sub esp,0x10
for (int i = 0; i < 10; i++) {
80483e1: mov DWORD PTR [ebp-0x4],0x0
80483e8: jmp 80483ee <main+0x13>
80483ea: add DWORD PTR [ebp-0x4],0x1
80483ee: cmp DWORD PTR [ebp-0x4],0x9
80483f2: jle 80483ea <main+0xf>
}
return 0;
80483f4: mov eax,0x0
}
80483f9: leave
80483fa: ret
80483fb: xchg ax,ax
80483fd: xchg ax,ax
80483ff: nop

The colors mark which assembly code corresponds to which high level code:

1. The red instruction (at 80483e1) initializes i to 0.

2. The green instructions (at 80483ee and 80483f2) test i < 10 by comparing
i to 9 with cmp and branching with jle. If i is less than or equal to 9,
jle jumps to 80483ea for another iteration.

3. The blue instruction (at 80483ea) increases i by 1, so that the loop can
terminate once the termination condition is satisfied.
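
The control flow of the listing can be written back in C with goto, one
statement per instruction. This is only a sketch: the function name
count_iterations and the counter iterations are additions made for this
example so that the number of passes can be observed; they have no
counterpart in the listing.

```c
#include <assert.h>

/* Follows the jumps of the compiled for loop and returns how many
 * times the (empty) loop body would have run. */
int count_iterations(void)
{
    int i;
    int iterations = 0;         /* not in the listing: counts passes      */

    i = 0;                      /* mov [ebp-0x4],0x0                      */
    goto L80483ee;              /* jmp 80483ee                            */
L80483ea:
    iterations++;               /* a non-empty loop body would sit before the add */
    i += 1;                     /* add [ebp-0x4],0x1                      */
L80483ee:
    if (i <= 9) goto L80483ea;  /* cmp [ebp-0x4],0x9 ; jle 80483ea        */

    assert(i == 10);            /* the loop exits as soon as i reaches 10 */
    return iterations;
}
```

The function returns 10, the same number of iterations as the original
for (int i = 0; i < 10; i++) loop.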

Exercise 4.9.4. Why does the increment instruction (the blue instruction)
appear before the compare instructions (the green instructions)?

Exercise 4.9.5. What assembly code can be generated for while and
do...while?

4.9.7 Conditional

Again, a conditional in C with the if...else... construct is just another
application of the jmp instruction under the hood. It is also a pervasive
pattern that earned its own syntax in C.
Source
#include <stdio.h>

int main(int argc, char *argv[]) {

int i = 0;

if (argc) {
i = 1;
} else {
i = 0;
}

return 0;
}

Assembly int main(int argc, char *argv[]) {

80483db: push ebp

80483dc: mov ebp,esp

80483de: sub esp,0x10
int i = 0;
80483e1: mov DWORD PTR [ebp-0x4],0x0
if (argc) {
80483e8: cmp DWORD PTR [ebp+0x8],0x0
80483ec: je 80483f7 <main+0x1c>
i = 1;
80483ee: mov DWORD PTR [ebp-0x4],0x1
80483f5: jmp 80483fe <main+0x23>
} else {
i = 0;
80483f7: mov DWORD PTR [ebp-0x4],0x0
}
return 0;
80483fe: mov eax,0x0
}
8048403: leave
8048404: ret

The generated assembly code follows the same order as the corresponding
high level syntax:

✄ The red instructions (80483e8 to 80483f5) represent the if branch.

✄ The blue instruction (80483f7) represents the else branch.
✄ The green instruction (80483fe) is the exit point for both the if and
else branches.

The if branch first checks whether argc is false (equal to 0) with the cmp
instruction. If true, execution proceeds to the else branch at 80483f7.
Otherwise, the if branch continues with the code of its own branch, which
is the next instruction at 80483ee, copying 1 to i. Finally, it skips over
the else branch and proceeds to 80483fe, which is the next instruction past
the if...else... construct.

The else branch is entered when the comparison made by the cmp instruction
of the if branch is true, that is, when argc equals 0. The else branch
starts at 80483f7, its first instruction. The instruction copies 0 to i,
and execution proceeds naturally to the next instruction past the
if...else... construct without any jump.
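
The same jump structure can be sketched in C with goto. The function name
if_else_sketch is made up for this example, and unlike the listing (which
returns 0) it returns i so that the chosen branch can be observed.

```c
#include <assert.h>

/* Follows the jumps of the compiled if...else... construct. */
int if_else_sketch(int argc)
{
    int i = 0;                     /* mov [ebp-0x4],0x0                */

    if (argc == 0) goto L80483f7;  /* cmp [ebp+0x8],0x0 ; je 80483f7   */
    i = 1;                         /* mov [ebp-0x4],0x1  (if branch)   */
    goto L80483fe;                 /* jmp 80483fe: skip the else branch */
L80483f7:
    i = 0;                         /* mov [ebp-0x4],0x0  (else branch) */
L80483fe:
    assert(i == (argc ? 1 : 0));   /* same outcome as the C source     */
    return i;                      /* the listing returns 0 here instead */
}
```

The else branch needs no jump of its own because the exit point is simply
the next instruction after it.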
5
The Anatomy of a Program

Every program consists of code and data, and only those two components make
up a program. However, if a program consisted purely of its own code and
data, then an operating system (as well as a human) could not tell which
block of binary in the program is code and which is just raw data, where in
the program to start execution, or which region of memory should be
protected and which is free to modify. For that reason, each program
carries extra metadata to tell the operating system how to handle the
program.

When a source file is compiled, the generated machine code is stored in an
object file, which is just a block of binary. One or more object files can
be combined to produce an executable binary, which is a complete program
runnable in an operating system.

readelf is a program that recognizes and displays the ELF metadata of a
binary file, be it an object file or an executable binary. ELF, or
Executable and Linkable Format, is the content at the very beginning of an
executable that provides an operating system with the information necessary
to load the executable into main memory and run it. ELF can be thought of
as similar to the table of contents of a book. In a book, a table of
contents lists the page numbers of the main sections, subsections,
sometimes even figures and tables for easy lookup. Similarly, ELF lists the
various sections used for code and data, and the memory addresses of each
symbol along with other information.

An ELF binary is composed of:

✄ An ELF header: the very first section of an executable, which describes
the file’s organization.

✄ A program header table: an array of fixed-size structures that describes
the segments of an executable.

✄ A section header table: an array of fixed-size structures that describes
the sections of an executable.

✄ Segments and sections: the main content of an ELF binary, which are the
code and data, divided into chunks of different purposes.

A segment is a composition of zero or more sections and is directly loaded
by an operating system at runtime.

A section is a block of binary that is either:

– actual program code and data that is available in memory when a program
runs.

– metadata about other sections that is used only in the linking process
and disappears from the final executable.

The linker uses sections to build segments.

Figure 5.0.1: ELF - Linking View vs Executable View (Source: Wikipedia)
ELF header
Program header table
.text
.rodata
...
.data
Section header table

Later we will compile our kernel as an ELF executable with GCC, and
explicitly specify how segments are created and where they are loaded in
memory through the use of a linker script, a text file that instructs a
linker how to generate a binary. For now, we will examine the anatomy of an
ELF executable in detail.

5.1 Reference documents:

The ELF specification is bundled as a man page in Linux:

$ man elf

It is a useful resource to understand and implement ELF. However, it will
be much easier to use after you finish this chapter, as the specification
mixes implementation details in it.

The default specification is a generic one, which every ELF implementation
follows. However, each platform provides extra features unique to it. The
ELF specification for x86 is currently maintained on Github by H.J. Lu:
https://github.com/hjl-tools/x86-psABI/wiki/X86-psABI.

Platform-dependent details are referred to as “processor specific” in the
generic ELF specification. We will not explore these details, but study the
generic details, which are enough for crafting an ELF binary image for our
operating system.

5.2 ELF header

To see the information of an ELF header:

$ readelf -h hello

The output:

Output ELF Header:

Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2’s complement, little endian

Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC (Executable file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x400430
Start of program headers: 64 (bytes into file)
Start of section headers: 6648 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 56 (bytes)
Number of program headers: 9
Size of section headers: 64 (bytes)
Number of section headers: 31
Section header string table index: 28

Let’s go through each field:

Magic Displays the raw bytes that uniquely identify a file as an ELF
binary. Each byte gives a brief piece of information.

In the example, we have the following magic bytes:

Output Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00

Examine byte by byte:


Byte                      Description

7f 45 4c 46               Predefined values. The first byte is always 7F,
                          the remaining 3 bytes represent the string “ELF”.

02                        See Class field below.

01                        See Data field below.

01                        See Version field below.

00                        See OS/ABI field below.

00 00 00 00 00 00 00 00   Padding bytes. These bytes are unused and are
                          always set to 0. Padding bytes are added for
                          proper alignment, and are reserved for future use
                          when more information is needed.
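
Reading the Magic field amounts to checking four signature bytes and then
picking single bytes at fixed positions. The sketch below illustrates this
in C on an in-memory copy of the 16 bytes; the type MagicInfo and the
function parse_magic are names made up for this example, not part of
readelf or any library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical summary of the first 16 bytes of an ELF file. */
typedef struct {
    int     is_elf;   /* 1 if the bytes start with 7f 'E' 'L' 'F' */
    uint8_t class_;   /* 1 = 32-bit objects, 2 = 64-bit objects    */
    uint8_t data;     /* 1 = little endian, 2 = big endian         */
    uint8_t version;  /* 1 = current version                       */
    uint8_t osabi;    /* target operating system ABI               */
} MagicInfo;

MagicInfo parse_magic(const uint8_t magic[16])
{
    static const uint8_t signature[4] = { 0x7f, 'E', 'L', 'F' };
    MagicInfo info;

    assert(magic != NULL);
    info.is_elf  = memcmp(magic, signature, sizeof signature) == 0;
    info.class_  = magic[4];   /* fifth byte: Class     */
    info.data    = magic[5];   /* sixth byte: Data      */
    info.version = magic[6];   /* seventh byte: Version */
    info.osabi   = magic[7];   /* eighth byte: OS/ABI   */
    return info;
}
```

Applied to the example bytes 7f 45 4c 46 02 01 01 00 ..., the result is a
valid signature, class 2 (64-bit), data 1 (little endian), version 1 and
OS/ABI 0.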

Class A byte in Magic field. It specifies the class or capacity of a file.

Possible values:
Value Description

0 Invalid class
1 32-bit objects
2 64-bit objects

Data A byte in Magic field. It specifies the data encoding of the processor-
specific data in the object file.

Possible values:
Value Description

0 Invalid data encoding

1 Little endian, 2’s complement
2 Big endian, 2’s complement

Version A byte in Magic. It specifies the ELF header version number.

Possible values:

Value Description

0 Invalid version
1 Current version

OS/ABI A byte in Magic field. It specifies the target operating system
ABI. Originally, it was a padding byte.

Possible values: Refer to the latest ABI document, as it is a long list of
different operating systems.

Type Identifies the object file type.

Value Description

0 No file type
1 Relocatable file
2 Executable file
3 Shared object file
4 Core file
0xff00 Processor specific, lower bound
0xffff Processor specific, upper bound
The values from 0xff00 to 0xffff are reserved for a processor to de-
fine additional file types meaningful to it.

Machine Specifies the required architecture for an ELF file, e.g. x86_64,
MIPS, SPARC, etc. In the example, the machine is of x86_64 architecture.

Possible values: Please refer to the latest ABI document, as it is a long
list of different architectures.

Version Specifies the version number of the current object file (not the
version of the ELF header, as the above Version field specified).

Entry point address Specifies the memory address of the very first code to
be executed. In a normal application program built with gcc, the default
entry point is the _start routine provided by the C runtime, which in turn
calls main, but the entry point can be any function if its name is
explicitly specified when linking. For the operating system we are going to
write, this is the single most important field that we need to retrieve to
bootstrap our kernel, and everything else can be ignored.

Start of program headers The offset of the program header table, in bytes.
In the example, this number is 64 bytes, which means the 65th byte, or
<start address> + 64, is the start address of the program header table.
That is, if a program is loaded at address 0x10000 in memory, then the
start address is 0x10000 (the very first byte of the Magic field, where the
value 0x7f resides) and the start address of the program header table is
0x10000 + 0x40 = 0x10040.

Start of section headers The offset of the section header table in bytes,
similar to the start of program headers. In the example, it is 6648 bytes
into the file.

Flags Holds processor-specific flags associated with the file. The x86
architecture defines no such flags, so in the example the value is 0x0.

Size of this header Specifies the total size of the ELF header in bytes.
In the example, it is 64 bytes, which is equal to Start of program headers.
Note that these two numbers are not necessarily equal, as the program
header table might be placed far away from the ELF header. The only fixed
component in the ELF executable binary is the ELF header, which appears at
the very beginning of the file.

Size of program headers Specifies the size of each program header in
bytes. In the example, it is 56 bytes.

Number of program headers Specifies the total number of program headers.
In the example, the file has a total of 9 program headers.

Size of section headers Specifies the size of each section header in
bytes. In the example, it is 64 bytes.

Number of section headers Specifies the total number of section headers.
In the example, the file has a total of 31 section headers. In a section
header table, the first entry in the table is always an empty section.

Section header string table index Specifies the index of the header in the
section header table that points to the section that holds all
null-terminated strings. In the example, the index is 28, which means it is
the entry at index 28 of the table (counting from 0).
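
The fields above map onto a fixed-layout record, and the offsets they hold
are plain arithmetic. The sketch below is a hand-written approximation for
a 64-bit ELF file: the struct name Elf64Header and the two helper functions
are invented for this example, the member names follow the e_* members
described in man elf without the prefix, and the 64-byte size assumes a
compiler that inserts no padding between these members.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of the 64-bit ELF header, field by field. */
typedef struct {
    uint8_t  ident[16];   /* Magic                             */
    uint16_t type;        /* Type                              */
    uint16_t machine;     /* Machine                           */
    uint32_t version;     /* Version                           */
    uint64_t entry;       /* Entry point address               */
    uint64_t phoff;       /* Start of program headers          */
    uint64_t shoff;       /* Start of section headers          */
    uint32_t flags;       /* Flags                             */
    uint16_t ehsize;      /* Size of this header               */
    uint16_t phentsize;   /* Size of program headers           */
    uint16_t phnum;       /* Number of program headers         */
    uint16_t shentsize;   /* Size of section headers           */
    uint16_t shnum;       /* Number of section headers         */
    uint16_t shstrndx;    /* Section header string table index */
} Elf64Header;

/* Address of the program header table once the file is loaded. */
uint64_t program_header_table_address(uint64_t load_address, const Elf64Header *h)
{
    assert(h);
    return load_address + h->phoff;
}

/* Offset of the first byte past a table of `count` fixed-size entries. */
uint64_t table_end(uint64_t start, uint16_t entry_size, uint16_t count)
{
    return start + (uint64_t)entry_size * count;
}
```

With the example values, a file loaded at 0x10000 has its program header
table at 0x10040, the program header table ends at 64 + 9 × 56 = 568 bytes
into the file, and the section header table ends at 6648 + 31 × 64 = 8632.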

5.3 Section header table

As we know already, code and data compose a program. However, not

all types of code and data have the same purpose. For that reason, in-
stead of a big chunk of code and data, they are divided into smaller chunks,
and each chunk must satisfy these conditions (according to gABI):

✄ Every section in an object file has exactly one section header
describing it. But section headers may exist that do not have a section.

✄ Each section occupies one contiguous (possibly empty) sequence of bytes
within a file. That means no section is split across two separate regions
of bytes.

✄ Sections in a file may not overlap. No byte in a file resides in more
than one section.

✄ An object file may have inactive space. The various headers and the
sections might not “cover” every byte in an object file. The contents
of the inactive data are unspecified.

To get all the headers from an executable binary e.g. hello, use the fol-
lowing command:

$ readelf -S hello

Here is a sample output (do not worry if you don’t understand the
output. Just skim to get your eyes familiar with it. We will dissect it
soon enough):

Output There are 31 section headers, starting at offset 0x19c8:

Section Headers:
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align

[ 0] NULL 0000000000000000 00000000

0000000000000000 0000000000000000 0 0 0
[ 1] .interp PROGBITS 0000000000400238 00000238
000000000000001c 0000000000000000 A 0 0 1
[ 2] .note.ABI-tag NOTE 0000000000400254 00000254
0000000000000020 0000000000000000 A 0 0 4
[ 3] .note.gnu.build-i NOTE 0000000000400274 00000274
0000000000000024 0000000000000000 A 0 0 4
[ 4] .gnu.hash GNU_HASH 0000000000400298 00000298
000000000000001c 0000000000000000 A 5 0 8
[ 5] .dynsym DYNSYM 00000000004002b8 000002b8
0000000000000048 0000000000000018 A 6 1 8
[ 6] .dynstr STRTAB 0000000000400300 00000300
0000000000000038 0000000000000000 A 0 0 1
[ 7] .gnu.version VERSYM 0000000000400338 00000338
0000000000000006 0000000000000002 A 5 0 2
[ 8] .gnu.version_r VERNEED 0000000000400340 00000340
0000000000000020 0000000000000000 A 6 1 8
[ 9] .rela.dyn RELA 0000000000400360 00000360
0000000000000018 0000000000000018 A 5 0 8
[10] .rela.plt RELA 0000000000400378 00000378
0000000000000018 0000000000000018 AI 5 24 8
[11] .init PROGBITS 0000000000400390 00000390
000000000000001a 0000000000000000 AX 0 0 4
[12] .plt PROGBITS 00000000004003b0 000003b0
0000000000000020 0000000000000010 AX 0 0 16
[13] .plt.got PROGBITS 00000000004003d0 000003d0
0000000000000008 0000000000000000 AX 0 0 8
[14] .text PROGBITS 00000000004003e0 000003e0
0000000000000192 0000000000000000 AX 0 0 16
[15] .fini PROGBITS 0000000000400574 00000574
0000000000000009 0000000000000000 AX 0 0 4
[16] .rodata PROGBITS 0000000000400580 00000580
0000000000000004 0000000000000004 AM 0 0 4

[17] .eh_frame_hdr PROGBITS 0000000000400584 00000584

000000000000003c 0000000000000000 A 0 0 4
[18] .eh_frame PROGBITS 00000000004005c0 000005c0
0000000000000114 0000000000000000 A 0 0 8
[19] .init_array INIT_ARRAY 0000000000600e10 00000e10
0000000000000008 0000000000000000 WA 0 0 8
[20] .fini_array FINI_ARRAY 0000000000600e18 00000e18
0000000000000008 0000000000000000 WA 0 0 8
[21] .jcr PROGBITS 0000000000600e20 00000e20
0000000000000008 0000000000000000 WA 0 0 8
[22] .dynamic DYNAMIC 0000000000600e28 00000e28
00000000000001d0 0000000000000010 WA 6 0 8
[23] .got PROGBITS 0000000000600ff8 00000ff8
0000000000000008 0000000000000008 WA 0 0 8
[24] .got.plt PROGBITS 0000000000601000 00001000
0000000000000020 0000000000000008 WA 0 0 8
[25] .data PROGBITS 0000000000601020 00001020
0000000000000010 0000000000000000 WA 0 0 8
[26] .bss NOBITS 0000000000601030 00001030
0000000000000008 0000000000000000 WA 0 0 1
[27] .comment PROGBITS 0000000000000000 00001030
0000000000000034 0000000000000001 MS 0 0 1
[28] .shstrtab STRTAB 0000000000000000 000018b6
000000000000010c 0000000000000000 0 0 1
[29] .symtab SYMTAB 0000000000000000 00001068
0000000000000648 0000000000000018 30 47 8
[30] .strtab STRTAB 0000000000000000 000016b0
0000000000000206 0000000000000000 0 0 1
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings), l (large)
I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
O (extra OS processing required) o (OS specific), p (processor specific)

The first line:

There are 31 section headers, starting at offset 0x19c8

summarizes the total number of sections in the file and the offset at which
the section header table starts. Then comes the listing, section by
section, under the following header, which is also the format of each
section's output:

Output
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align

Each section has two lines with different fields:

Nr The index of each section.

Name The name of each section.

Type This field (in a section header) identifies the type of each section.
Types are used to classify sections.

Address The starting virtual address of each section. Note that the
addresses are virtual only when a program runs in an OS with virtual memory
support enabled. In our OS, which runs on bare metal, the addresses will
all be physical.

Offset is a distance in bytes, from the first byte of a file to the start of
an object, such as a section or a segment in the context of an ELF bi-
nary file.

Size The size in bytes of each section.

EntSize Some sections hold a table of fixed-size entries, such as a symbol
table. For such a section, this member gives the size in bytes of each
entry. The member contains 0 if the section does not hold a table of
fixed-size entries.

Flags Describes the attributes of a section. Flags together with a type
define the purpose of a section. Two sections can be of the same type but
serve different purposes. For example, even though .data and .text share
the same type, .data holds the initialized data of a program while .text
holds the executable instructions of a program. For that reason, .data is
given read and write permission, but not execute permission. Any attempt to
execute code in .data is denied by the running OS: in Linux, such invalid
section usage results in a segmentation fault.

ELF gives an OS the information needed to enable such a protection
mechanism. However, when running on bare metal, nothing prevents a program
from doing anything. Our OS can execute code in a data section and, vice
versa, write to a code section.

Table 5.3.1: Section Flags

Flag Descriptions

W Bytes in this section are writable during execution.

A Memory is allocated for this section during process execution. Some control sections do
not reside in the memory image of an object file; this attribute is off for those sections.
X The section contains executable instructions.
M The data in the section may be merged to eliminate duplication. Each element in the
section is compared against other elements in sections with the same name, type and flags.
Elements that would have identical values at program run-time may be merged.
S The data elements in the section consist of null-terminated character strings. The size of
each character is specified in the section header’s EntSize field.
l Specific large section for x86_64 architecture. This flag is not specified in the Generic ABI
but in x86_64 ABI.
I The Info field of this section header holds an index of a section header. Otherwise, the
number is the index of something else.
L Preserve section ordering when linking. If this section is combined with other sections in
the output file, it must appear in the same relative order with respect to those sections, as
the linked-to section appears with respect to sections the linked-to section is combined
with. Apply when the Link field of this section’s header references another section (the
linked-to section)
G This section is a member (perhaps the only one) of a section group.
T This section holds Thread-Local Storage, meaning that each thread has its own distinct
instance of this data. A thread is a distinct execution flow of code. A program can have
multiple threads that pack different pieces of code and execute separately, at the same time.
We will learn more about threads when writing our kernel.

E Link editor is to exclude this section from executable and shared library that it builds when
those objects are not to be further relocated.
x Unknown flag to readelf. It happens because the linking process can be done manually
with a linker like GNU ld (we will learn it later). That is, section flags can be specified
manually, and some flags are for a customized ELF that the open-source readelf doesn't
know of.
O This section requires special OS-specific processing (beyond the standard linking rules) to
avoid incorrect behavior. If a link editor encounters sections whose headers contain
OS-specific values that it does not recognize by the Type or Flags values defined by the ELF
standard, the link editor should combine those sections.
o All bits included in this flag are reserved for operating system-specific semantics.
p All bits included in this flag are reserved for processor-specific semantics. If meanings are
specified, the processor supplement explains them.
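
The flag letters in Table 5.3.1 correspond to individual bits of the flags field in a section header. The following is a minimal sketch of how a tool could turn such a bit field into the letters that readelf prints, assuming the bit values defined by the ELF generic ABI. The macro names and the helper section_flags_to_letters are hypothetical names chosen for this illustration, not part of readelf.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit values of the section flags field, as defined by the ELF generic ABI.
 * The macro names are made up for this sketch. */
#define FLAG_WRITE      0x1u    /* W */
#define FLAG_ALLOC      0x2u    /* A */
#define FLAG_EXECINSTR  0x4u    /* X */
#define FLAG_MERGE      0x10u   /* M */
#define FLAG_STRINGS    0x20u   /* S */
#define FLAG_INFO_LINK  0x40u   /* I */
#define FLAG_LINK_ORDER 0x80u   /* L */
#define FLAG_OS_NONCONF 0x100u  /* O */
#define FLAG_GROUP      0x200u  /* G */
#define FLAG_TLS        0x400u  /* T */

/* Hypothetical helper: writes one letter per set flag into `out`, in bit
 * order, and returns `out`. `out` must hold at least 11 bytes. */
char *section_flags_to_letters(uint64_t flags, char *out)
{
    static const struct { uint64_t bit; char letter; } table[] = {
        { FLAG_WRITE, 'W' },      { FLAG_ALLOC, 'A' },
        { FLAG_EXECINSTR, 'X' },  { FLAG_MERGE, 'M' },
        { FLAG_STRINGS, 'S' },    { FLAG_INFO_LINK, 'I' },
        { FLAG_LINK_ORDER, 'L' }, { FLAG_OS_NONCONF, 'O' },
        { FLAG_GROUP, 'G' },      { FLAG_TLS, 'T' },
    };
    size_t n = 0;

    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (flags & table[i].bit)
            out[n++] = table[i].letter;
    }
    out[n] = '\0';
    return out;
}
```

With this sketch, a flags value of 0x6 (allocatable and executable) yields AX, the letters shown for .text, and 0x3 yields WA, the letters shown for .data.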

Link and Info are numbers that reference the indexes of sections, symbol
table entries, or hash table entries. The Link field only holds the index
of a section, while the Info field holds an index of a section, a symbol
table entry or a hash table entry, depending on the type of a section.

Later when writing our OS, we will handcraft the kernel image by ex-
plicitly linking the object files (produced by gcc) through a linker script.
We will specify the memory layout of sections by specifying at what
addresses they will appear in the final image. But we will not assign
any section flag and let the linker take care of it. Nevertheless, know-
ing which flag does what is useful.

Align is a value that the address of a section must be divisible by.

Only 0 and positive integral powers of two are allowed.
Values 0 and 1 mean the section has no alignment constraint.
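
The alignment rule can be checked with a few bit operations. This is a small sketch under the rule stated above (0 and 1 mean no constraint, otherwise the value is a power of two); the function names align_is_valid and addr_is_aligned are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* An Align value is valid if it is 0 or a power of two.
 * A power of two has exactly one bit set, so clearing its lowest set bit
 * with (align & (align - 1)) gives 0. The same expression is 0 for align 0. */
bool align_is_valid(uint64_t align)
{
    return (align & (align - 1)) == 0;
}

/* Returns true if `addr` satisfies the alignment `align`.
 * Assumes `align` already passed align_is_valid(). */
bool addr_is_aligned(uint64_t addr, uint64_t align)
{
    if (align <= 1)
        return true;                   /* 0 and 1: no alignment constraint */
    return (addr & (align - 1)) == 0;  /* same as addr % align == 0 */
}
```

For instance, an address of 0x4003e0 with Align 16 passes the check, because 0x3e0 is a multiple of 0x10.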

Example 5.3.1. Output of .interp section:

Output
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
[ 1] .interp PROGBITS 0000000000400238 00000238
000000000000001c 0000000000000000 A 0 0 1

Nr is 1.

Type is PROGBITS, which means this section is part of the program.

Address is 0x0000000000400238, which means the section is loaded

at this virtual memory address at runtime.

Offset is 0x00000238 bytes into ﬁle.

Size is 0x000000000000001c in bytes.

EntSize is 0, which means this section does not have any ﬁxed-size
entry.

Flags are A (Allocatable), which means this section consumes mem-

ory at runtime.

Info and Link are 0 and 0, which means this section links to no sec-
tion or entry in any table.

Align is 1, which means no alignment.

Example 5.3.2. Output of the .text section:

Output
[14] .text PROGBITS 00000000004003e0 000003e0
0000000000000192 0000000000000000 AX 0 0 16

Nr is 14.

Type is PROGBITS, which means this section is part of the program.

Address is 0x00000000004003e0, which means the section is loaded

at this virtual memory address at runtime.

Offset is 0x000003e0 bytes into ﬁle.

Size is 0x0000000000000192 in bytes.

EntSize is 0, which means this section does not have any ﬁxed-size
entry.

Flags are A (Allocatable) and X (Executable), which means this sec-

tion consumes memory and can be executed as code at runtime.

Info and Link are 0 and 0, which means this section links to no sec-
tion or entry in any table.

Align is 16, which means the starting address of the section should
be divisible by 16, or 0x10. Indeed, it is: 0x3e0/0x10 = 0x3e.

5.4 Understand Section in-depth

In this section, we will learn diﬀerent details of section types and the pur-
poses of special sections e.g. .bss, .text, .data, etc, by looking at each
section one by one. We will also examine the content of each section as
a hexdump with the commands:

$ readelf -x <section name|section number> <file>

For example, if you want to examine the content of the section with index
25 (the .data section in the sample output) in the file hello:

$ readelf -x 25 hello

Equivalently, using name instead of index works:

$ readelf -x .data hello

If a section contains strings e.g. string symbol table, the ﬂag -x can
be replaced with -p.

NULL marks a section header as inactive; it does not have an associated
section. The NULL section is always the first entry of the section header
table. This means that any useful section starts from index 1.

Example 5.4.1. The sample output of NULL section:

Output
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
[ 0] NULL 0000000000000000 00000000
0000000000000000 0000000000000000 0 0 0

Examining the content, the section is empty:

Output Section '' has no data to dump.


NOTE marks a section with special information that other programs will
check for conformance, compatibility, etc, by a vendor or a system builder.

Example 5.4.2. In the sample output, we have 2 NOTE sections:

Output
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
[ 2] .note.ABI-tag NOTE 0000000000400254 00000254
0000000000000020 0000000000000000 A 0 0 4
[ 3] .note.gnu.build-i NOTE 0000000000400274 00000274
0000000000000024 0000000000000000 A 0 0 4

Examine 2nd section with the command:

$ readelf -x 2 hello

we have:

Output Hex dump of section ’.note.ABI-tag’:

0x00400254 04000000 10000000 01000000 474e5500 ............GNU.
0x00400264 00000000 02000000 06000000 20000000 ............ ...

PROGBITS indicates a section holding the main content of a program, ei-

ther code or data.

Example 5.4.3. There are many PROGBITS sections:

Output
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
[ 1] .interp PROGBITS 0000000000400238 00000238
000000000000001c 0000000000000000 A 0 0 1
...
[11] .init PROGBITS 0000000000400390 00000390
000000000000001a 0000000000000000 AX 0 0 4
[12] .plt PROGBITS 00000000004003b0 000003b0
0000000000000020 0000000000000010 AX 0 0 16
[13] .plt.got PROGBITS 00000000004003d0 000003d0

0000000000000008 0000000000000000 AX 0 0 8
[14] .text PROGBITS 00000000004003e0 000003e0
0000000000000192 0000000000000000 AX 0 0 16
[15] .fini PROGBITS 0000000000400574 00000574
0000000000000009 0000000000000000 AX 0 0 4
[16] .rodata PROGBITS 0000000000400580 00000580
0000000000000004 0000000000000004 AM 0 0 4
[17] .eh_frame_hdr PROGBITS 0000000000400584 00000584
000000000000003c 0000000000000000 A 0 0 4
[18] .eh_frame PROGBITS 00000000004005c0 000005c0
0000000000000114 0000000000000000 A 0 0 8
...
[23] .got PROGBITS 0000000000600ff8 00000ff8
0000000000000008 0000000000000008 WA 0 0 8
[24] .got.plt PROGBITS 0000000000601000 00001000
0000000000000020 0000000000000008 WA 0 0 8
[25] .data PROGBITS 0000000000601020 00001020
0000000000000010 0000000000000000 WA 0 0 8
[27] .comment PROGBITS 0000000000000000 00001030
0000000000000034 0000000000000001 MS 0 0 1

For our operating system, we only need the following sections:

.text This section holds all the compiled code of a program.

.data This section holds the initialized data of a program. Since the
data are initialized with actual values, gcc allocates actual bytes for
the section in the executable binary.

.rodata This section holds read-only data, such as ﬁxed-size strings

in a program, e.g. “Hello World”, and others.

.bss This section, short for Block Started by Symbol, holds uninitialized
data of a program. Unlike other sections, no space is allocated for this
section in the image of the executable binary on disk. The section is
allocated only when the program is loaded into main memory.

Other sections are mainly needed for dynamic linking, that is, linking
code at runtime so that it can be shared between many programs. To
enable such a feature, an OS must be present as a runtime environment.
Since we run our OS on bare metal, we are effectively creating such an
environment. For simplicity, we won't add dynamic linking to our
OS.

SYMTAB and DYNSYM These sections hold symbol tables. A symbol table
is an array of entries that describe symbols in a program. A symbol
is a name assigned to an entity in a program. The types of these
entities are also the types of symbols; the possible types are listed
under the Type field below.

Example 5.4.4. In the sample output, section 5 and 29 are symbol

tables:

Output
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
[ 5] .dynsym DYNSYM 00000000004002b8 000002b8
0000000000000048 0000000000000018 A 6 1 8
...
[29] .symtab SYMTAB 0000000000000000 00001068
0000000000000648 0000000000000018 30 47 8

To show the symbol table:

$ readelf -s hello

Output consists of 2 symbol tables, corresponding to the two sections

above, .dynsym and .symtab:

Output Symbol table ’.dynsym’ contains 4 entries:

Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000000000 0 FUNC GLOBAL DEFAULT UND puts@GLIBC_2.2.5 (2)
2: 0000000000000000 0 FUNC GLOBAL DEFAULT UND __libc_start_main@GLIBC_2.2.5 (2)
3: 0000000000000000 0 NOTYPE WEAK DEFAULT UND __gmon_start__

Symbol table ’.symtab’ contains 67 entries:

Num: Value Size Type Bind Vis Ndx Name
..........................................
59: 0000000000601040 0 NOTYPE GLOBAL DEFAULT 26 _end
60: 0000000000400430 42 FUNC GLOBAL DEFAULT 14 _start
61: 0000000000601038 0 NOTYPE GLOBAL DEFAULT 26 __bss_start
62: 0000000000400526 32 FUNC GLOBAL DEFAULT 14 main
63: 0000000000000000 0 NOTYPE WEAK DEFAULT UND _Jv_RegisterClasses
64: 0000000000601038 0 OBJECT GLOBAL HIDDEN 25 __TMC_END__
65: 0000000000000000 0 NOTYPE WEAK DEFAULT UND _ITM_registerTMCloneTable
66: 00000000004003c8 0 FUNC GLOBAL DEFAULT 11 _init

Num is the index of an entry in a table.

Value is the virtual memory address where the symbol is located.

Size is the size of the entity associated with a symbol.

Type is the type of a symbol, which is one of the following:

NOTYPE The type of a symbol is not specified.

OBJECT The symbol is associated with a data object. In C, any
variable definition is of OBJECT type.
FUNC The symbol is associated with a function or other executable
code.
SECTION The symbol is associated with a section, and exists
primarily for relocation.
FILE The symbol is the name of a source file associated with an
executable binary.
COMMON The symbol labels an uninitialized variable, that is, a
variable in C that is defined as a global variable without an
initial value, or as an external variable using the extern keyword.
In other words, these variables stay in the .bss section.
TLS The symbol is associated with a Thread-Local Storage entity.

Bind is the scope of a symbol.

LOCAL are symbols that are only visible in the object files that
defined them. In C, the static modifier marks a symbol (e.g.
a variable/function) as local to only the file that defines it.

Example 5.4.5. If we define variables and functions with the static

modifier:

hello.c

static int global_static_var = 0;

static void local_func() {

}

int main(int argc, char *argv[])

{
static int local_static_var = 0;

return 0;
}

Then we get the static variables listed as local symbols after

compiling:

$ gcc -m32 hello.c -o hello

$ readelf -s hello

Output Symbol table ’.dynsym’ contains 5 entries:

Num: Value Size Type Bind Vis Ndx Name
0: 00000000 0 NOTYPE LOCAL DEFAULT UND
1: 00000000 0 FUNC GLOBAL DEFAULT UND puts@GLIBC_2.0 (2)
2: 00000000 0 NOTYPE WEAK DEFAULT UND __gmon_start__
3: 00000000 0 FUNC GLOBAL DEFAULT UND __libc_start_main@GLIBC_2.0 (2)
4: 080484bc 4 OBJECT GLOBAL DEFAULT 16 _IO_stdin_used
Symbol table ’.symtab’ contains 72 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 00000000 0 NOTYPE LOCAL DEFAULT UND

......... output omitted .........

38: 0804a020 4 OBJECT LOCAL DEFAULT 26 global_static_var

39: 0804840b 6 FUNC LOCAL DEFAULT 14 local_func

40: 0804a024 4 OBJECT LOCAL DEFAULT 26 local_static_var.1938

......... output omitted .........

GLOBAL are symbols that are accessible by other object files when
linking together. These symbols are primarily non-static func-
tions and non-static global data. The extern modifier marks
a symbol as externally defined elsewhere but is accessible in the
final executable binary, so an extern variable is also considered
GLOBAL.

Example 5.4.6. Similar to the LOCAL example above, the out-

put lists many GLOBAL symbols such as main:

Num: Value Size Type Bind Vis Ndx Name

......... output omitted .........
66: 080483e1 10 FUNC GLOBAL DEFAULT 14 main
......... output omitted .........

WEAK are symbols whose definitions can be redefined. Normally,

a symbol with multiple definitions is reported as an error.
However, this constraint is relaxed when a definition is
explicitly marked as weak, which means the default implementation
can be replaced by a different definition at link time.

Example 5.4.7. Suppose we have a default implementation of

the function add:

hello.c

#include <stdio.h>

__attribute__((weak)) int add(int a, int b) {
    printf("warning: function is not implemented.\n");
    return 0;
}

int main(int argc, char *argv[])
{
    printf("add(1,2) is %d\n", add(1,2));

    return 0;
}

__attribute__((weak)) is a function attribute. A function

attribute is extra information for a compiler to handle a function
differently from a normal function. In this example, the weak attribute
makes the function add a weak function, which means the default
implementation can be replaced by a different definition at link
time. Function attributes are a feature of a compiler, not of standard
C.
If we do not supply a different function definition in a different
file (it must be in a different file, otherwise gcc reports an
error), then the default implementation is used. When the function
add is called, it only prints the message "warning: function is
not implemented." and returns 0:

$ ./hello
warning: function is not implemented.
add(1,2) is 0

However, if we supply a different definition in another file e.g. math.c:

math.c

int add(int a, int b) {

return a + b;
}

and compile the two ﬁles together:

$ gcc math.c hello.c -o hello

Then, when running hello, no warning message is printed and

the correct value is returned.
A weak symbol is a mechanism to provide a default implementation
that can be replaced at link time when a better implementation
(e.g. more specialized or optimized) is available.

Vis is the visibility of a symbol. The following values are available:

Table 5.4.1: Symbol Visibility

Value Description

DEFAULT The visibility is specified by the binding type of a symbol.

✄ Global and weak symbols are visible outside of their defining component (executable
file or shared object).
✄ Local symbols are hidden. See HIDDEN below.

HIDDEN A symbol is hidden when its name is not visible to any other program outside of its
running program.
PROTECTED A symbol is protected when it is shared outside of its running program or shared library
and cannot be overridden. That is, there can only be one definition for this symbol
across running programs that use it. No program can define its own definition of the
same symbol.
INTERNAL Visibility is processor-specific and is defined by the processor-specific ABI.

Ndx is the index of a section that the symbol is in. Aside from ﬁxed
index numbers that represent section indexes, index has these spe-
cial values:

Table 5.4.2: Symbol Index

Value Description

ABS The index will not be changed by any symbol relocation.

COM The index refers to an unallocated common block.
UND The symbol is undefined in the current object file, which means the symbol depends on the
actual definition in another file. Undefined symbols appear when the object file refers to
symbols that are available at runtime, from a shared library.
LORESERVE LORESERVE is the lower boundary of the reserved indexes. Its value is 0xff00.
HIRESERVE HIRESERVE is the upper boundary of the reserved indexes. Its value is 0xffff.
The operating system reserves exclusive indexes between LORESERVE and HIRESERVE,
which do not map to any actual section header.
XINDEX The index is larger than LORESERVE. The actual value will be contained in the section
SYMTAB_SHNDX, where each entry is a mapping between a symbol, whose Ndx field is an
XINDEX value, and the actual index value.

Others Sometimes, values such as ANSI_COM, LARGE_COM, SCOM, SUND appear. This means that the
index is processor-speciﬁc.
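
The special values in Table 5.4.2 are ordinary numbers stored in the same field as a section index. The sketch below shows one way to render the Ndx column, assuming the numeric values from the ELF generic ABI (0 for UND, 0xfff1 for ABS, 0xfff2 for COM). The macro names and the function ndx_to_text are hypothetical, and other reserved or processor-specific values are simply printed as numbers here.

```c
#include <stdint.h>
#include <stdio.h>

/* Special section index values from the ELF generic ABI.
 * The macro names are made up for this sketch. */
#define NDX_UNDEF  0x0000u
#define NDX_ABS    0xfff1u
#define NDX_COMMON 0xfff2u

/* Hypothetical helper: returns the text for the Ndx column.
 * Ordinary indexes are formatted as decimal numbers into `buf`. */
const char *ndx_to_text(uint16_t ndx, char *buf, size_t len)
{
    switch (ndx) {
    case NDX_UNDEF:
        return "UND";
    case NDX_ABS:
        return "ABS";
    case NDX_COMMON:
        return "COM";
    default:
        snprintf(buf, len, "%u", (unsigned)ndx);
        return buf;
    }
}
```

An entry such as main, which lives in section 14, is rendered as 14, while an unresolved symbol such as puts is rendered as UND.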

Name is the symbol name.

Example 5.4.8. A C application program always starts from sym-

bol main. The entry for main in the symbol table in .symtab section
is:

Output Num: Value Size Type Bind Vis Ndx Name

62: 0000000000400526 32 FUNC GLOBAL DEFAULT 14 main

The entry shows that:

✄ main is the 62nd entry in the table.

✄ main starts at address 0x0000000000400526.

✄ main consumes 32 bytes.

✄ main is a function.

✄ main is in global scope.

✄ main is visible to other object ﬁles that use it.

✄ main is inside the 14th section, which is .text. This is logical, since
.text holds all program code.
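
In the ELF file itself, the Bind and Type columns are not stored as text. According to the ELF generic ABI, both are packed into a single info byte of a symbol table entry: the upper 4 bits hold the binding and the lower 4 bits hold the type. The following sketch decodes that byte; the function names are hypothetical, and only the values discussed in this section are named.

```c
#include <stdint.h>

/* Upper 4 bits of the info byte: the binding. */
unsigned sym_bind(uint8_t info) { return info >> 4; }

/* Lower 4 bits of the info byte: the type. */
unsigned sym_type(uint8_t info) { return info & 0xf; }

/* Binding values defined by the ELF generic ABI. */
const char *sym_bind_name(unsigned bind)
{
    switch (bind) {
    case 0:  return "LOCAL";
    case 1:  return "GLOBAL";
    case 2:  return "WEAK";
    default: return "?";      /* OS- or processor-specific */
    }
}

/* Type values defined by the ELF generic ABI. */
const char *sym_type_name(unsigned type)
{
    switch (type) {
    case 0:  return "NOTYPE";
    case 1:  return "OBJECT";
    case 2:  return "FUNC";
    case 3:  return "SECTION";
    case 4:  return "FILE";
    case 5:  return "COMMON";
    case 6:  return "TLS";
    default: return "?";      /* OS- or processor-specific */
    }
}
```

Under this encoding, a GLOBAL FUNC symbol such as main has an info byte of 0x12: binding 1 in the upper half and type 2 in the lower half.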

STRTAB holds a table of null-terminated strings, called a string table. The

first and last bytes of this section are always NULL characters. A string
table section exists because a string can be reused by more than one
section to represent symbol and section names, so a program like readelf
or objdump can display various objects in a program, e.g. variables,
functions, section names, as human-readable text instead of raw hex
addresses.

Example 5.4.9. In the sample output, section 28 and 30 are of STRTAB

type:

Output
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
[28] .shstrtab STRTAB 0000000000000000 000018b6
000000000000010c 0000000000000000 0 0 1
[30] .strtab STRTAB 0000000000000000 000016b0
0000000000000206 0000000000000000 0 0 1

.shstrtab holds all the section names.

.strtab holds the names of symbols, e.g. variable names, function names, struct
names, etc., in a C program, but not the null-terminated C string literals;
those C strings are kept in the .rodata section.

Example 5.4.10. Strings in those sections can be inspected with the

command:

$ readelf -p 28 hello

The output shows all the section names, with the offset (also the string
index) into the .shstrtab table to the left:

Output String dump of section ’.shstrtab’:

[ 1] .symtab
[ 9] .strtab
[ 11] .shstrtab
[ 1b] .interp
[ 23] .note.ABI-tag
[ 31] .note.gnu.build-id
[ 44] .gnu.hash
[ 4e] .dynsym
[ 56] .dynstr
[ 5e] .gnu.version
[ 6b] .gnu.version_r
[ 7a] .rela.dyn
[ 84] .rela.plt

[ 8e] .init
[ 94] .plt.got
[ 9d] .text
[ a3] .fini
[ a9] .rodata
[ b1] .eh_frame_hdr
[ bf] .eh_frame
[ c9] .init_array
[ d5] .fini_array
[ e1] .jcr
[ e6] .dynamic
[ ef] .got.plt
[ f8] .data
[ fe] .bss
[ 103] .comment

The actual implementation of a string table is a contiguous array of

null-terminated strings. The index of a string is the position of its first
character in the array. For example, in the above string table, .symtab
is at index 1 in the array (a NULL character is at index 0). The length
of .symtab is 7, plus the NULL character, which occupies 8 bytes in
total. So, .strtab starts at index 9, and so on.

          00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f
00000000  \0  .  s  y  m  t  a  b \0  .  s  t  r  t  a  b
00000010  \0  .  s  h  s  t  r  t  a  b \0  .  i  n  t  e
.... and so on ....

Figure 5.4.1: String table in memory of .shstrtab. A red number is the
starting index of a string.

Similarly, the output of .strtab:

Output String dump of section ’.strtab’:

[ 1] crtstuff.c
[ c] __JCR_LIST__
[ 19] deregister_tm_clones

[ 2e] __do_global_dtors_aux
[ 44] completed.7585
[ 53] __do_global_dtors_aux_fini_array_entry
[ 7a] frame_dummy
[ 86] __frame_dummy_init_array_entry

[ a5] hello.c
[ ad] __FRAME_END__
[ bb] __JCR_END__
[ c7] __init_array_end
[ d8] _DYNAMIC
[ e1] __init_array_start
[ f4] __GNU_EH_FRAME_HDR
[ 107] _GLOBAL_OFFSET_TABLE_
[ 11d] __libc_csu_fini
[ 12d] _ITM_deregisterTMCloneTable
[ 149] j
[ 14b] _edata
[ 152] __libc_start_main@@GLIBC_2.2.5
[ 171] __data_start
[ 17e] __gmon_start__
[ 18d] __dso_handle
[ 19a] _IO_stdin_used
[ 1a9] __libc_csu_init
[ 1b9] __bss_start
[ 1c5] main
[ 1ca] _Jv_RegisterClasses
[ 1de] __TMC_END__
[ 1ea] _ITM_registerTMCloneTable
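
The index scheme of a string table described above can be sketched in a few lines of C. This is an illustration only: the function names strtab_get and strtab_next are hypothetical, and the table is assumed to be already loaded into memory as a byte array of known size.

```c
#include <stddef.h>
#include <string.h>

/* Returns the string that starts at `index` in the string table, or NULL
 * if the index is out of range or the string is not terminated inside
 * the table. */
const char *strtab_get(const char *table, size_t size, size_t index)
{
    if (index >= size)
        return NULL;
    if (memchr(table + index, '\0', size - index) == NULL)
        return NULL;
    return table + index;
}

/* Returns the index of the string that follows the one at `index`:
 * the length of the current string plus one byte for its NULL character.
 * Returns `size` if there is no valid string at `index`. */
size_t strtab_next(const char *table, size_t size, size_t index)
{
    const char *s = strtab_get(table, size, index);

    if (s == NULL)
        return size;
    return index + strlen(s) + 1;
}
```

Applied to the first bytes of .shstrtab, the string at index 1 is .symtab, and the next string starts at index 1 + 7 + 1 = 9, which is .strtab.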

HASH holds a symbol hash table, which supports symbol table access.

DYNAMIC holds information for dynamic linking.

NOBITS is similar to PROGBITS but occupies no space.

Example 5.4.11. The .bss section holds uninitialized data, which means
the file has no initial values to store for it. Until an operating system
actually loads the section into main memory, there is no need to
allocate space for it in the binary image on disk, which reduces the size
of the binary file. Here are the details of .bss from the example output:

Output
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
[26] .bss NOBITS 0000000000601038 00001038
0000000000000008 0000000000000000 WA 0 0 1
[27] .comment PROGBITS 0000000000000000 00001038
0000000000000034 0000000000000001 MS 0 0 1

In the above output, the size of .bss is 8 bytes, yet .bss and .comment
start at the same file offset, which means .bss consumes no bytes of the
executable binary on disk.

Notice that the .comment section has no starting address. This means
that this section is not loaded into memory when the executable binary
runs.

REL holds relocation entries without explicit addends. This type will be
explained in detail in 8.1.

RELA holds relocation entries with explicit addends. This type will be
explained in detail in 8.1.

INIT_ARRAY is an array of function pointers for program initialization.

When an application program runs, before getting to main(), the
initialization code in .init and in this section is executed first. The
first element in this array is an ignored function pointer.

It might not seem to make sense when we can include initialization code in

the main() function. However, for shared object files, where there is
no main(), this section ensures that the initialization code from an
object file executes before any other code, to ensure a proper
environment for the main code to run. It also makes an object file more
modular, as the main application code need not be responsible
for initializing a proper environment for using a particular object file;
the object file itself is. Such a clear division makes code cleaner.

However, we will not use any .init and INIT_ARRAY sections in our
operating system, for simplicity, as initializing an environment is part
of the operating-system domain.
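
Conceptually, running an INIT_ARRAY section amounts to walking an array of function pointers and calling each one in order. The sketch below illustrates that idea only; it is not the actual startup code of any C library. The names init_fn, run_init_functions and the demo functions are hypothetical.

```c
#include <stddef.h>

/* An initialization function takes no argument and returns nothing. */
typedef void (*init_fn)(void);

/* Hypothetical runner: calls every function pointer in [begin, end),
 * in array order, skipping empty slots. */
void run_init_functions(init_fn *begin, init_fn *end)
{
    for (init_fn *fn = begin; fn != end; fn++) {
        if (*fn != NULL)
            (*fn)();
    }
}

/* Demo data: two initialization functions that record the call order. */
int call_log[4];
size_t call_count;

void demo_init1(void) { call_log[call_count++] = 1; }
void demo_init2(void) { call_log[call_count++] = 2; }

init_fn demo_array[] = { demo_init1, demo_init2 };
```

Calling run_init_functions(demo_array, demo_array + 2) invokes demo_init1 and then demo_init2, which mirrors the order of the entries in the array.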

Example 5.4.12. To use the INIT_ARRAY, we simply mark a func-

tion with the attribute constructor:

hello.c

#include <stdio.h>

__attribute__((constructor)) static void init1(){
    printf("%s\n", __FUNCTION__);
}

__attribute__((constructor)) static void init2(){
    printf("%s\n", __FUNCTION__);
}

int main(int argc, char *argv[])
{
    printf("hello world\n");

    return 0;
}

The program automatically calls the constructor without explicitly

invoking it:

$ gcc -m32 hello.c -o hello

$ ./hello
init1
init2
hello world

Example 5.4.13. Optionally, a constructor can be assigned a

priority from 101 onward. The priorities from 0 to 100 are reserved
for gcc. Constructors with lower priority numbers run first, so if we
want init2 to run before init1, we give init2 a lower number:

hello.c

#include <stdio.h>

__attribute__((constructor(102))) static void init1(){
    printf("%s\n", __FUNCTION__);
}

__attribute__((constructor(101))) static void init2(){
    printf("%s\n", __FUNCTION__);
}

int main(int argc, char *argv[])
{
    printf("hello world\n");

    return 0;
}

The call order should be exactly as speciﬁed:

$ gcc -m32 hello.c -o hello

$ ./hello
init2
init1
hello world

Example 5.4.14. We can add initialization functions using another

method:

hello.c

#include <stdio.h>

void init1() {
    printf("%s\n", __FUNCTION__);
}

void init2() {
    printf("%s\n", __FUNCTION__);
}

/* Without typedef, init is a definition of a function pointer.
   With typedef, init is a declaration of a type. */
typedef void (*init)();

__attribute__((section(".init_array"))) init init_arr[2]
    = {init1, init2};

int main(int argc, char *argv[])
{
    printf("hello world!\n");

    return 0;
}

The attribute section("...") puts a function or a variable into a particular

section rather than the default one. In this example, the array init_arr is
placed in .init_array. The section name is not necessarily the same as a
standard section name in an ELF file (such as .text or .init_array), but
can be anything. Non-standard section names are often used for controlling
the final binary layout of a compiled program. We will explore this technique
in more detail when learning the GNU ld linker and the linking process.
Again, the program automatically calls the constructors without
explicitly invoking them:

$ gcc -m32 hello.c -o hello

$ ./hello
init1
init2
hello world!

FINI_ARRAY is an array of function pointers for program termination,

called after exiting main(). If the application terminates abnormally,
such as through an abort() call or a crash, the .fini_array is ignored.

Example 5.4.15. A destructor is automatically called after exiting

main(), if one or more are available:

hello.c

#include <stdio.h>

__attribute__((destructor)) static void destructor(){
    printf("%s\n", __FUNCTION__);
}

int main(int argc, char *argv[])
{
    printf("hello world\n");

    return 0;
}

$ gcc -m32 hello.c -o hello

$ ./hello
hello world
destructor

PREINIT_ARRAY is an array of function pointers that are invoked before

all other initialization functions in INIT_ARRAY.

Example 5.4.16. To use the .preinit_array, the only way to put

functions into this section is to use the attribute section():

hello.c

#include <stdio.h>

void preinit1() {
printf("%s\n", __FUNCTION__);
}

void preinit2() {
printf("%s\n", __FUNCTION__);
}

void init1() {
printf("%s\n", __FUNCTION__);
}

void init2() {
printf("%s\n", __FUNCTION__);
}

typedef void (*preinit)();

typedef void (*init)();

__attribute__((section(".preinit_array"))) preinit
    preinit_arr[2] = {preinit1, preinit2};
__attribute__((section(".init_array"))) init
    init_arr[2] = {init1, init2};

int main(int argc, char *argv[])
{
    printf("hello world!\n");

    return 0;
}

$ gcc -m32 hello.c -o hello

$ ./hello
preinit1
preinit2
init1
init2
hello world!

GROUP defines a section group, which is the same section that appears
in different object files but when merged into the final executable bi-
nary file, only one copy is kept and the rest in other object files are
discarded. This section is only relevant in C++ object files, so we will
not examine further.

SYMTAB_SHNDX is a section containing extended section indexes, that are

associated with a symbol table. This section only appears when the
Ndx value of an entry in the symbol table exceeds the LORESERVE value.
This section then maps between a symbol and an actual index value
of a section header.

Upon understanding section types, we can understand the number in Link

and Info ﬁelds:

Exercise 5.4.1. Verify that the value of the Link ﬁeld of a SYMTAB sec-
tion is the index of a STRTAB section.

Exercise 5.4.2. Verify that the value of the Info ﬁeld of a SYMTAB sec-
tion is the index of last local symbol + 1. It means, in the symbol table,
from the index listed by Info ﬁeld onward, no local symbol appears.

Exercise 5.4.3. Verify that the value of the Link field of a REL section
is the index of the SYMTAB section.

Exercise 5.4.4. Verify that the value of the Info field of a REL section
is the index of the section where relocation is applied. For example, if
the section is .rel.text, then the relocating section should be .text.

Type           Link                                    Info

DYNAMIC        Entries in this section use the         0
               section index of the dynamic string
               table.
HASH,          The section index of the symbol table   0
GNU_HASH       to which the hash table applies.
REL,           The section index of the associated     The section index to which the
RELA           symbol table.                           relocation applies.
SYMTAB,        The section index of the associated     One greater than the symbol table
DYNSYM         string table.                           index of the last local symbol.
GROUP          The section index of the associated     The symbol index of an entry in the
               symbol table.                           associated symbol table. The name of
                                                       the specified symbol table entry
                                                       provides a signature for the section
                                                       group.
SYMTAB_SHNDX   The section header index of the
               associated symbol table.

Table 5.4.3: The meanings of Link and Info depend on section types.

5.5 Program header table

A program header table is an array of program headers that defines the

memory layout of a program at runtime.
A program header is a description of a program segment.
A program segment is a collection of related sections. A segment
contains zero or more sections. When loading a program, an operating
system only uses segments, not sections. To see the information of a
program header table, we use the -l option with readelf:

$ readelf -l <binary file>

Similar to a section, a program header also has types:

PHDR speciﬁes the location and size of the program header table itself,
both in the ﬁle and in the memory image of the program

INTERP speciﬁes the location and size of a null-terminated path name

to invoke as an interpreter for linking runtime libraries.

LOAD speciﬁes a loadable segment. That is, this segment is loaded into
main memory.

DYNAMIC speciﬁes dynamic linking information.

NOTE speciﬁes the location and size of auxiliary information.

TLS speciﬁes the Thread-Local Storage template, which is formed from

the combination of all sections with the ﬂag TLS.

GNU_STACK indicates whether the program's stack should be made

executable or not. Linux kernel uses this type.

A segment also has permission, which is a combination of these 3 values:

✄ Read (R)

✄ Write (W)

✄ Execute (E)

Table 5.5.1: Segment Permission

Permission Description
R Readable
W Writable
E Executable
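
In a program header, the three permissions are stored as bits of a flags field. The sketch below renders such a field the way the Flags column of readelf -l looks (for example "R E" or "RW "), assuming the bit values from the ELF generic ABI: 1 for execute, 2 for write and 4 for read. The macro names and segment_perm_to_text are hypothetical.

```c
#include <stdint.h>

/* Permission bits of a program header, per the ELF generic ABI.
 * The macro names are made up for this sketch. */
#define PERM_X 0x1u  /* Execute */
#define PERM_W 0x2u  /* Write   */
#define PERM_R 0x4u  /* Read    */

/* Hypothetical helper: fills `out` with three characters, one per
 * permission, using a space for a permission that is not granted. */
char *segment_perm_to_text(uint32_t flags, char out[4])
{
    out[0] = (flags & PERM_R) ? 'R' : ' ';
    out[1] = (flags & PERM_W) ? 'W' : ' ';
    out[2] = (flags & PERM_X) ? 'E' : ' ';
    out[3] = '\0';
    return out;
}
```

A flags value of 5 (read and execute) is rendered as "R E", and a value of 6 (read and write) as "RW ".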

Example 5.5.1. The command to get the program header table:

$ readelf -l hello

Output:

Output
Elf file type is EXEC (Executable file)
Entry point 0x400430
There are 9 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
PHDR 0x0000000000000040 0x0000000000400040 0x0000000000400040
0x00000000000001f8 0x00000000000001f8 R E 8
INTERP 0x0000000000000238 0x0000000000400238 0x0000000000400238
0x000000000000001c 0x000000000000001c R 1
[Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
LOAD 0x0000000000000000 0x0000000000400000 0x0000000000400000
0x000000000000070c 0x000000000000070c R E 200000

LOAD 0x0000000000000e10 0x0000000000600e10 0x0000000000600e10

0x0000000000000228 0x0000000000000230 RW 200000
DYNAMIC 0x0000000000000e28 0x0000000000600e28 0x0000000000600e28
0x00000000000001d0 0x00000000000001d0 RW 8
NOTE 0x0000000000000254 0x0000000000400254 0x0000000000400254
0x0000000000000044 0x0000000000000044 R 4
GNU_EH_FRAME 0x00000000000005e4 0x00000000004005e4 0x00000000004005e4
0x0000000000000034 0x0000000000000034 R 4
GNU_STACK 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000000000000000 0x0000000000000000 RW 10
GNU_RELRO 0x0000000000000e10 0x0000000000600e10 0x0000000000600e10
0x00000000000001f0 0x00000000000001f0 R 1
Section to Segment mapping:
Segment Sections...
00
01 .interp
02 .interp .note.ABI-tag .note.gnu.build-id .gnu.hash .dynsym .dynstr
.gnu.version .gnu.version_r .rela.dyn .rela.plt .init .plt .plt.got .text .fini
.rodata .eh_frame_hdr .eh_frame
03 .init_array .fini_array .jcr .dynamic .got .got.plt .data .bss
04 .dynamic
05 .note.ABI-tag .note.gnu.build-id
06 .eh_frame_hdr
07
08 .init_array .fini_array .jcr .dynamic .got

In the sample output, LOAD segment appears twice:

Output
LOAD 0x0000000000000000 0x0000000000400000 0x0000000000400000
0x000000000000070c 0x000000000000070c R E 200000
LOAD 0x0000000000000e10 0x0000000000600e10 0x0000000000600e10
0x0000000000000228 0x0000000000000230 RW 200000

Why? Notice the permission:

✄ the upper LOAD has Read and Execute permission. This is a text segment.
A text segment contains read-only instructions and read-only data.

✄ the lower LOAD has Read and Write permission. This is a data segment.
It means that this segment can be read and written to, but is not allowed
to be used as executable code, for security reasons.

Then, the two LOAD segments contain the following sections:

Output
02 .interp .note.ABI-tag .note.gnu.build-id .gnu.hash .dynsym .dynstr
.gnu.version .gnu.version_r .rela.dyn .rela.plt .init .plt .plt.got .text .fini
.rodata .eh_frame_hdr .eh_frame
03 .init_array .fini_array .jcr .dynamic .got .got.plt .data .bss

The first number is the index of a program header in the program header
table, and the remaining text is the list of all sections within that
segment. Unfortunately, readelf does not print the index in the program
header listing itself, so a user needs to keep track manually of which
segment has which index. The first segment starts at index 0, the second
at index 1 and so on. The LOAD segments are at index 2 and 3. As can be
seen from the two lists of sections, most sections are loadable and are
available at runtime.

5.6 Segments vs sections

As mentioned earlier, an operating system loads program segments, not
sections. However, a question arises: why doesn't the operating system
use sections instead? After all, a section contains information similar
to that of a program segment, such as the type, the virtual memory address
to be loaded at, the size, the attributes, the flags and the alignment. As
explained before, a segment is the perspective of an operating system, while
a section is the perspective of a linker. To understand why, look into the
structure of a segment:

✄ A segment is a collection of sections. It means that sections are
logically grouped together by their attributes. For example, all sections
in a LOAD segment are always loaded by the operating system; all sections
in a segment have the same permission, either RE (Read + Execute) for
executable sections, or RW (Read + Write) for data sections.

✄ By grouping sections into a segment, it is easier for an operating
system to batch load sections just once by loading from the start to the
end of a segment, instead of loading section by section.

✄ Since a segment is for loading a program and a section is for linking
a program, all the sections in a segment lie within the start and end
virtual memory addresses of that segment.

To see the last point more clearly, consider an example of linking two
object files. Suppose we have two source files:

hello.c

#include <stdio.h>

int main(int argc, char *argv[])
{
    printf("Hello World\n");
    return 0;
}

and:

math.c

int add(int a, int b) {
    return a + b;
}

Now, compile the two source files as object files:

$ gcc -m32 -c math.c

$ gcc -m32 -c hello.c

Then, we check the sections of math.o:

$ readelf -S math.o

Output
There are 11 section headers, starting at offset 0x1a8:

Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 0] NULL 00000000 000000 000000 00 0 0 0
[ 1] .text PROGBITS 00000000 000034 00000d 00 AX 0 0 1
[ 2] .data PROGBITS 00000000 000041 000000 00 WA 0 0 1
[ 3] .bss NOBITS 00000000 000041 000000 00 WA 0 0 1
[ 4] .comment PROGBITS 00000000 000041 000035 01 MS 0 0 1
[ 5] .note.GNU-stack PROGBITS 00000000 000076 000000 00 0 0 1
[ 6] .eh_frame PROGBITS 00000000 000078 000038 00 A 0 0 4
[ 7] .rel.eh_frame REL 00000000 00014c 000008 08 I 9 6 4
[ 8] .shstrtab STRTAB 00000000 000154 000053 00 0 0 1
[ 9] .symtab SYMTAB 00000000 0000b0 000090 10 10 8 4
[10] .strtab STRTAB 00000000 000140 00000c 00 0 0 1
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings)
I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
O (extra OS processing required) o (OS specific), p (processor specific)

As shown in the output, the virtual memory addresses of all sections are
set to 0. At this stage, each object file is simply a block of binary that
contains code and data. It exists to serve as a material container for the
final product, which is the executable binary. As such, the virtual
addresses in math.o are all zeroes.

No segment exists at this stage:

$ readelf -l math.o
There are no program headers in this file.

The same happens to the other object file:

$ readelf -S hello.o

Output
There are 13 section headers, starting at offset 0x224:

Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 0] NULL 00000000 000000 000000 00 0 0 0
[ 1] .text PROGBITS 00000000 000034 00002e 00 AX 0 0 1
[ 2] .rel.text REL 00000000 0001ac 000010 08 I 11 1 4
[ 3] .data PROGBITS 00000000 000062 000000 00 WA 0 0 1
[ 4] .bss NOBITS 00000000 000062 000000 00 WA 0 0 1
[ 5] .rodata PROGBITS 00000000 000062 00000c 00 A 0 0 1
[ 6] .comment PROGBITS 00000000 00006e 000035 01 MS 0 0 1
[ 7] .note.GNU-stack PROGBITS 00000000 0000a3 000000 00 0 0 1
[ 8] .eh_frame PROGBITS 00000000 0000a4 000044 00 A 0 0 4
[ 9] .rel.eh_frame REL 00000000 0001bc 000008 08 I 11 8 4
[10] .shstrtab STRTAB 00000000 0001c4 00005f 00 0 0 1
[11] .symtab SYMTAB 00000000 0000e8 0000b0 10 12 9 4
[12] .strtab STRTAB 00000000 000198 000013 00 0 0 1
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings)
I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
O (extra OS processing required) o (OS specific), p (processor specific)

$ readelf -l hello.o
There are no program headers in this file.
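readelf can tell that an object file has no program headers by looking at the ELF header at the very start of the file. As a rough sketch (not readelf's implementation), the functions below read two fields from a 32-bit little-endian ELF header held in a byte buffer: e_type, which is 1 (ET_REL) for a relocatable object file and 2 (ET_EXEC) for an executable, and e_phnum, the number of program headers. The byte offsets 16 and 44 come from the ELF32 header layout; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ELF32_HEADER_SIZE 52u /* size of a 32-bit ELF header in bytes */

/* Read a 16-bit little-endian value (x86 ELF files are little-endian). */
static uint16_t read_le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Returns 1 if the buffer is large enough and starts with the ELF magic. */
static int looks_like_elf32(const unsigned char *hdr, size_t len)
{
    assert(hdr != NULL);
    return len >= ELF32_HEADER_SIZE &&
           hdr[0] == 0x7f && hdr[1] == 'E' && hdr[2] == 'L' && hdr[3] == 'F';
}

/* e_type at byte offset 16: 1 = relocatable (ET_REL), 2 = executable (ET_EXEC).
 * Returns -1 if the buffer does not look like an ELF32 header. */
static int elf32_file_type(const unsigned char *hdr, size_t len)
{
    return looks_like_elf32(hdr, len) ? read_le16(hdr + 16) : -1;
}

/* e_phnum at byte offset 44: the number of program headers.
 * Returns -1 if the buffer does not look like an ELF32 header. */
static int elf32_program_header_count(const unsigned char *hdr, size_t len)
{
    return looks_like_elf32(hdr, len) ? read_le16(hdr + 44) : -1;
}
```

Applied to the first 52 bytes of math.o or hello.o, elf32_program_header_count would be expected to return 0, which matches the message printed by readelf -l.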

Only when object files are combined into a final executable binary are
sections fully realized:

$ gcc -m32 math.o hello.o -o hello

$ readelf -S hello

Output
There are 31 section headers, starting at offset 0x1804:

Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 0] NULL 00000000 000000 000000 00 0 0 0
[ 1] .interp PROGBITS 08048154 000154 000013 00 A 0 0 1
[ 2] .note.ABI-tag NOTE 08048168 000168 000020 00 A 0 0 4
[ 3] .note.gnu.build-i NOTE 08048188 000188 000024 00 A 0 0 4
[ 4] .gnu.hash GNU_HASH 080481ac 0001ac 000020 04 A 5 0 4
[ 5] .dynsym DYNSYM 080481cc 0001cc 000050 10 A 6 1 4
[ 6] .dynstr STRTAB 0804821c 00021c 00004a 00 A 0 0 1
[ 7] .gnu.version VERSYM 08048266 000266 00000a 02 A 5 0 2
[ 8] .gnu.version_r VERNEED 08048270 000270 000020 00 A 6 1 4
[ 9] .rel.dyn REL 08048290 000290 000008 08 A 5 0 4
[10] .rel.plt REL 08048298 000298 000010 08 AI 5 24 4
[11] .init PROGBITS 080482a8 0002a8 000023 00 AX 0 0 4
[12] .plt PROGBITS 080482d0 0002d0 000030 04 AX 0 0 16
[13] .plt.got PROGBITS 08048300 000300 000008 00 AX 0 0 8
[14] .text PROGBITS 08048310 000310 0001a2 00 AX 0 0 16
[15] .fini PROGBITS 080484b4 0004b4 000014 00 AX 0 0 4
[16] .rodata PROGBITS 080484c8 0004c8 000014 00 A 0 0 4
[17] .eh_frame_hdr PROGBITS 080484dc 0004dc 000034 00 A 0 0 4
[18] .eh_frame PROGBITS 08048510 000510 0000ec 00 A 0 0 4
[19] .init_array INIT_ARRAY 08049f08 000f08 000004 00 WA 0 0 4
[20] .fini_array FINI_ARRAY 08049f0c 000f0c 000004 00 WA 0 0 4
[21] .jcr PROGBITS 08049f10 000f10 000004 00 WA 0 0 4
[22] .dynamic DYNAMIC 08049f14 000f14 0000e8 08 WA 6 0 4
[23] .got PROGBITS 08049ffc 000ffc 000004 04 WA 0 0 4
[24] .got.plt PROGBITS 0804a000 001000 000014 04 WA 0 0 4
[25] .data PROGBITS 0804a014 001014 000008 00 WA 0 0 4
[26] .bss NOBITS 0804a01c 00101c 000004 00 WA 0 0 1
[27] .comment PROGBITS 00000000 00101c 000034 01 MS 0 0 1
[28] .shstrtab STRTAB 00000000 0016f8 00010a 00 0 0 1
[29] .symtab SYMTAB 00000000 001050 000470 10 30 48 4
[30] .strtab STRTAB 00000000 0014c0 000238 00 0 0 1
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings)
I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
O (extra OS processing required) o (OS specific), p (processor specific)

Every loadable section is assigned an address, shown in the Addr column.
Each section gets its own address because, in reality, gcc does not combine
object files by itself, but invokes the linker ld. The linker ld uses the
default linker script that it can find in the system to build the executable
binary. In the default script, a segment is assigned the starting address
0x8048000 and the sections belong to it. Then:

✄ 1st section address = starting segment address + section offset = 0x8048000 + 0x154 = 0x08048154

✄ 2nd section address = starting segment address + section offset = 0x8048000 + 0x168 = 0x08048168

✄ and so on until the last loadable section.
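The same calculation can be written as a small helper. This is only a sketch, under the assumption that a section sits at the same distance from the start of its segment in memory as it does in the file; for the first LOAD segment the file offset is 0, which gives exactly the formula above. The name section_vaddr is ours, not part of any tool.

```c
#include <assert.h>
#include <stdint.h>

/* Virtual address of a section, given the virtual address and file offset
 * of the segment that contains it, and the section's own file offset. */
static uint32_t section_vaddr(uint32_t seg_vaddr, uint32_t seg_offset,
                              uint32_t sec_offset)
{
    /* The section must not start before its segment in the file. */
    assert(sec_offset >= seg_offset);
    return seg_vaddr + (sec_offset - seg_offset);
}
```

With the values from the section listing, section_vaddr(0x08048000, 0x0, 0x154) gives the address of .interp. The helper also covers a segment that does not start at file offset 0: for a segment at file offset 0xf08 and address 0x08049f08, the .data section at file offset 0x1014 lands at 0x0804a014, as in the listing.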

Indeed, the end address of a segment is also the end address of its final
section. We can see this by listing all the segments:

$ readelf -l hello

And check, for example, the first LOAD segment, which starts at 0x08048000
and ends at 0x08048000 + 0x005fc = 0x080485fc:

Output
Elf file type is EXEC (Executable file)
Entry point 0x8048310
There are 9 program headers, starting at offset 52
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
PHDR 0x000034 0x08048034 0x08048034 0x00120 0x00120 R E 0x4
INTERP 0x000154 0x08048154 0x08048154 0x00013 0x00013 R 0x1
[Requesting program interpreter: /lib/ld-linux.so.2]
LOAD 0x000000 0x08048000 0x08048000 0x005fc 0x005fc R E 0x1000
LOAD 0x000f08 0x08049f08 0x08049f08 0x00114 0x00118 RW 0x1000
DYNAMIC 0x000f14 0x08049f14 0x08049f14 0x000e8 0x000e8 RW 0x4
NOTE 0x000168 0x08048168 0x08048168 0x00044 0x00044 R 0x4
GNU_EH_FRAME 0x0004dc 0x080484dc 0x080484dc 0x00034 0x00034 R 0x4
GNU_STACK 0x000000 0x00000000 0x00000000 0x00000 0x00000 RW 0x10
GNU_RELRO 0x000f08 0x08049f08 0x08049f08 0x000f8 0x000f8 R 0x1

Section to Segment mapping:
Segment Sections...
00
01 .interp
02 .interp .note.ABI-tag .note.gnu.build-id .gnu.hash .dynsym .dynstr
.gnu.version .gnu.version_r .rel.dyn .rel.plt .init .plt .plt.got .text .fini
.rodata .eh_frame_hdr .eh_frame
03 .init_array .fini_array .jcr .dynamic .got .got.plt .data .bss
04 .dynamic
05 .note.ABI-tag .note.gnu.build-id
06 .eh_frame_hdr
07
08 .init_array .fini_array .jcr .dynamic .got

The last section in the first LOAD segment is .eh_frame. The .eh_frame
section starts at 0x08048510 because the start address of the segment is
0x08048000 and the offset of the section into the file is 0x510. The end
address of .eh_frame should be 0x08048000 + 0x510 + 0xec = 0x080485fc,
because the section size is 0xec. This is exactly the same as the end
address of the first LOAD segment above: 0x08048000 + 0x5fc = 0x080485fc.
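The containment rule can also be checked mechanically. The sketch below uses a hypothetical struct range and function section_in_segment to test whether the address range of a section lies inside the address range of a segment; it is an illustration of the rule, not how any loader is actually written.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* An address range: where it starts and how many bytes it covers. */
struct range {
    uint32_t start;
    uint32_t size;
};

/* True when every byte of the section lies inside the segment. */
static bool section_in_segment(struct range sec, struct range seg)
{
    /* Neither range may wrap around the end of the 32-bit address space. */
    assert(sec.size <= UINT32_MAX - sec.start);
    assert(seg.size <= UINT32_MAX - seg.start);
    return sec.start >= seg.start &&
           sec.start + sec.size <= seg.start + seg.size;
}
```

With .eh_frame as {0x08048510, 0xec} and the first LOAD segment as {0x08048000, 0x5fc}, the function returns true, and both ranges end at the same address. With .init_array as {0x08049f08, 0x4} it returns false for the first LOAD segment, since that section belongs to the second one.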

Chapter 8 will explore this whole process in detail.

6
Runtime inspection and debug

A debugger is a program that allows inspection of a running program.
A debugger can start and run a program, then stop at a specific line for
examining the state of the program at that point. The point where the
debugger stops (but does not halt) the program is called a breakpoint.

We will be using GDB, the GNU Debugger, for debugging our kernel. gdb
is the program name. gdb can do four main kinds of things:

✄ Start your program, specifying anything that might affect its behavior.

✄ Make your program stop on specified conditions.

✄ Examine what has happened when your program has stopped.

✄ Change things in your program, so you can experiment with correcting
the effects of one bug and go on to learn about another.

6.1 A sample program

There must be an existing program for debugging. The good old “Hello
World” program suffices for the educational purpose in this chapter:

hello.c

#include <stdio.h>

int main(int argc, char *argv[])
{
    printf("Hello World!\n");
    return 0;
}

We compile it with debugging information with the option -g:

$ gcc -m32 -g hello.c -o hello

Finally, we start gdb with the program as argument:

$ gdb hello

6.2 Static inspection of a program

Before inspecting a program at runtime, gdb loads it first. Once the
program is loaded into memory (but not yet running), a lot of useful
information can be retrieved for inspection. The commands in this section
can be used before the program runs. However, they are also usable while
the program runs, and can display even more information then.

6.2.1 Command: info target/info file/info files

This command prints the information of the target being debugged. A
target is the program being debugged.

Example 6.2.1. The output of the command for the hello program, a
local target, in detail:

(gdb) info target

Output
Symbols from "/tmp/hello".
Local exec file:
‘/tmp/hello’, file type elf32-i386.
Entry point: 0x8048310
0x08048154 - 0x08048167 is .interp
0x08048168 - 0x08048188 is .note.ABI-tag
0x08048188 - 0x080481ac is .note.gnu.build-id
0x080481ac - 0x080481cc is .gnu.hash
0x080481cc - 0x0804821c is .dynsym
0x0804821c - 0x08048266 is .dynstr
0x08048266 - 0x08048270 is .gnu.version
0x08048270 - 0x08048290 is .gnu.version_r
0x08048290 - 0x08048298 is .rel.dyn
0x08048298 - 0x080482a8 is .rel.plt
0x080482a8 - 0x080482cb is .init
0x080482d0 - 0x08048300 is .plt
0x08048300 - 0x08048308 is .plt.got
0x08048310 - 0x080484a2 is .text
0x080484a4 - 0x080484b8 is .fini
0x080484b8 - 0x080484cd is .rodata
0x080484d0 - 0x080484fc is .eh_frame_hdr
0x080484fc - 0x080485c8 is .eh_frame
0x08049f08 - 0x08049f0c is .init_array
0x08049f0c - 0x08049f10 is .fini_array
0x08049f10 - 0x08049f14 is .jcr
0x08049f14 - 0x08049ffc is .dynamic
0x08049ffc - 0x0804a000 is .got
0x0804a000 - 0x0804a014 is .got.plt
0x0804a014 - 0x0804a01c is .data
0x0804a01c - 0x0804a020 is .bss

The output displayed reports:

✄ Path of a symbol file. A symbol file is the file that contains the
debugging information. Usually, this is the same file as the binary, but
it is common to separate an executable binary and its debugging
information into 2 files, especially for remote debugging. In the example,
it is this line:

Symbols from "/tmp/hello".

✄ The path of the program being debugged and its file type. In the
example, it is these lines:

Local exec file:
‘/tmp/hello’, file type elf32-i386.

✄ The entry point of the program being debugged. That is, the address of
the very first code the program runs. In the example, it is this line:

Entry point: 0x8048310

✄ A list of sections with their starting and ending addresses. In the
example, it is the remaining output.

Example 6.2.2. If the program being debugged runs on a different machine,
it is a remote target and gdb only prints brief information:

(gdb) info target

Output
Remote serial target in gdb-specific protocol:
Debugging a target over a serial line.

6.2.2 Command: maint info sections

This command is similar to info target but gives extra information about
program sections, specifically the file offset and the flags of each section.

Example 6.2.3. Here is the output when running against the hello program:

(gdb) maint info sections

Output
Exec file:
‘/tmp/hello’, file type elf64-x86-64.
[0] 0x00400238->0x00400254 at 0x00000238: .interp ALLOC LOAD READONLY DATA HAS_CONTENTS
[1] 0x00400254->0x00400274 at 0x00000254: .note.ABI-tag ALLOC LOAD READONLY DATA HAS_CONTENTS
[2] 0x00400274->0x00400298 at 0x00000274: .note.gnu.build-id ALLOC LOAD READONLY DATA HAS_CONTENTS
[3] 0x00400298->0x004002b4 at 0x00000298: .gnu.hash ALLOC LOAD READONLY DATA HAS_CONTENTS
[4] 0x004002b8->0x00400318 at 0x000002b8: .dynsym ALLOC LOAD READONLY DATA HAS_CONTENTS
[5] 0x00400318->0x00400355 at 0x00000318: .dynstr ALLOC LOAD READONLY DATA HAS_CONTENTS
[6] 0x00400356->0x0040035e at 0x00000356: .gnu.version ALLOC LOAD READONLY DATA HAS_CONTENTS
[7] 0x00400360->0x00400380 at 0x00000360: .gnu.version_r ALLOC LOAD READONLY DATA HAS_CONTENTS
....remaining output omitted....

The output is similar to info target, but with more details. Next to
the section names are the section flags, which are attributes of a section.
Here, we can see that the sections with the LOAD flag are from a LOAD
segment. The command accepts the following arguments to filter the output:

ALLOBJ displays sections for all loaded object files, including shared
libraries. Shared libraries are only displayed when the program is
already running.

section names displays only named sections.

Example 6.2.4. The command:

(gdb) maint info sections .text .data .bss

only displays .text, .data and .bss sections:

Output
Exec file:
‘/tmp/hello’, file type elf64-x86-64.
[13] 0x00400430->0x004005c2 at 0x00000430: .text ALLOC LOAD READONLY CODE HAS_CONTENTS
[24] 0x00601028->0x00601038 at 0x00001028: .data ALLOC LOAD DATA HAS_CONTENTS
[25] 0x00601038->0x00601040 at 0x00001038: .bss ALLOC

section-flags displays only sections with the specified section flags.
Note that these section flags are specific to gdb, though they are based
on the section attributes defined previously. Currently, gdb understands
the following flags:

ALLOC Section will have space allocated in the process when loaded.
Set for all sections except those containing debug information.

LOAD Section will be loaded from the file into the child process memory.
Set for pre-initialized code and data, clear for .bss sections.

RELOC Section needs to be relocated before loading.

READONLY Section cannot be modified by the child process.

CODE Section contains executable code only.

DATA Section contains data only (no executable code).

ROM Section will reside in ROM.

CONSTRUCTOR Section contains data for constructor/destructor lists.

HAS_CONTENTS Section is not empty.

NEVER_LOAD An instruction to the linker to not output the section.

COFF_SHARED_LIBRARY A notification to the linker that the section
contains COFF shared library information. COFF is an object file format,
similar to ELF, that is used for both object files and executable binaries.

IS_COMMON Section contains common symbols.

Example 6.2.5. We can restrict the output to only display sections

that contain code with the command:

(gdb) maint info sections CODE

The output:

Output
Exec file:
‘/tmp/hello’, file type elf64-x86-64.
[10] 0x004003c8->0x004003e2 at 0x000003c8: .init ALLOC LOAD READONLY CODE HAS_CONTENTS
[11] 0x004003f0->0x00400420 at 0x000003f0: .plt ALLOC LOAD READONLY CODE HAS_CONTENTS
[12] 0x00400420->0x00400428 at 0x00000420: .plt.got ALLOC LOAD READONLY CODE HAS_CONTENTS
[13] 0x00400430->0x004005c2 at 0x00000430: .text ALLOC LOAD READONLY CODE HAS_CONTENTS
[14] 0x004005c4->0x004005cd at 0x000005c4: .fini ALLOC LOAD READONLY CODE HAS_CONTENTS
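Conceptually, filtering by section flags is a bitmask test over the list of sections. The sketch below illustrates the idea only: the flag bit values, struct section and filter_sections are all invented for this example and are not gdb's internal representation. It also assumes that a section is shown when it carries at least one of the requested flags.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative bit values only; gdb uses its own internal encoding. */
enum section_flag {
    FLAG_ALLOC        = 1 << 0,
    FLAG_LOAD         = 1 << 1,
    FLAG_READONLY     = 1 << 2,
    FLAG_CODE         = 1 << 3,
    FLAG_DATA         = 1 << 4,
    FLAG_HAS_CONTENTS = 1 << 5
};

struct section {
    const char *name;
    unsigned flags; /* bitwise OR of enum section_flag values */
};

/* Stores pointers to the sections that carry at least one of the wanted
 * flags in out (which must have room for count entries) and returns how
 * many sections matched. */
static size_t filter_sections(const struct section *in, size_t count,
                              unsigned wanted, const struct section **out)
{
    size_t matched = 0;

    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < count; i++) {
        if (in[i].flags & wanted)
            out[matched++] = &in[i];
    }
    return matched;
}
```

Given the .text, .data and .bss sections with the flags listed in Example 6.2.4, filtering on FLAG_CODE keeps only .text, while filtering on FLAG_ALLOC keeps all three.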

6.2.3 Command: info functions

This command lists all function names and their loaded addresses. The
names can be filtered with a regular expression.

Example 6.2.6. Running the command, we get the following output:

(gdb) info functions

Output
All defined functions:

File hello.c:
int main(int, char **);
Non-debugging symbols:
0x00000000004003c8 _init
0x0000000000400400 puts@plt
0x0000000000400410 __libc_start_main@plt
0x0000000000400430 _start
0x0000000000400460 deregister_tm_clones
0x00000000004004a0 register_tm_clones
0x00000000004004e0 __do_global_dtors_aux
0x0000000000400500 frame_dummy
0x0000000000400550 __libc_csu_init
0x00000000004005c0 __libc_csu_fini
0x00000000004005c4 _fini

6.2.4 Command: info variables

This command lists all global and static variable names, optionally
filtered with a regular expression.

Example 6.2.7. If we add a global variable int i into the sample source
program, recompile, and then run the command, we get the following output:

(gdb) info variables

Output
All defined variables:

File hello.c:
int i;
Non-debugging symbols:
0x00000000004005d0 _IO_stdin_used
0x00000000004005e4 __GNU_EH_FRAME_HDR
0x0000000000400708 __FRAME_END__
0x0000000000600e10 __frame_dummy_init_array_entry
0x0000000000600e10 __init_array_start
0x0000000000600e18 __do_global_dtors_aux_fini_array_entry
0x0000000000600e18 __init_array_end
0x0000000000600e20 __JCR_END__
0x0000000000600e20 __JCR_LIST__
0x0000000000600e28 _DYNAMIC
0x0000000000601000 _GLOBAL_OFFSET_TABLE_
0x0000000000601028 __data_start
0x0000000000601028 data_start
0x0000000000601030 __dso_handle
0x000000000060103c __bss_start
0x000000000060103c _edata
0x000000000060103c completed
0x0000000000601040 __TMC_END__
0x0000000000601040 _end

6.2.5 Command: disassemble/disas

This command displays the assembly code of the executable file.

Example 6.2.8. gdb can display the assembly code of a function:

(gdb) disassemble main

Output
Dump of assembler code for function main:
0x0804840b <+0>: lea ecx,[esp+0x4]
0x0804840f <+4>: and esp,0xfffffff0
0x08048412 <+7>: push DWORD PTR [ecx-0x4]
0x08048415 <+10>: push ebp
0x08048416 <+11>: mov ebp,esp
0x08048418 <+13>: push ecx
0x08048419 <+14>: sub esp,0x4
0x0804841c <+17>: sub esp,0xc
0x0804841f <+20>: push 0x80484c0
0x08048424 <+25>: call 0x80482e0 <puts@plt>
0x08048429 <+30>: add esp,0x10
0x0804842c <+33>: mov eax,0x0
0x08048431 <+38>: mov ecx,DWORD PTR [ebp-0x4]
0x08048434 <+41>: leave
0x08048435 <+42>: lea esp,[ecx-0x4]
0x08048438 <+45>: ret
End of assembler dump.

Example 6.2.9. It is more useful if the source is included:

(gdb) disassemble /s main

Output
Dump of assembler code for function main:
5 printf("Hello World!\n");
0x0804841c <+17>: sub esp,0xc
0x0804841f <+20>: push 0x80484c0
0x08048424 <+25>: call 0x80482e0 <puts@plt>
0x08048429 <+30>: add esp,0x10
6 return 0;
0x0804842c <+33>: mov eax,0x0
7 }
0x08048431 <+38>: mov ecx,DWORD PTR [ebp-0x4]
0x08048434 <+41>: leave
0x08048435 <+42>: lea esp,[ecx-0x4]
0x08048438 <+45>: ret
End of assembler dump.

Now the high-level source is included as part of the assembly dump. Each
source line is followed by its corresponding assembly code.

Example 6.2.10. If the option /r is added, raw instructions in hex are

included, just like how objdump displays assembly code by default:

(gdb) disassemble /rs main

Output
Dump of assembler code for function main:
0x0804841f <+20>: 68 c0 84 04 08 push 0x80484c0
0x08048424 <+25>: e8 b7 fe ff ff call 0x80482e0 <puts@plt>
0x08048429 <+30>: 83 c4 10 add esp,0x10
6 return 0;
0x0804842c <+33>: b8 00 00 00 00 mov eax,0x0
7 }
0x08048431 <+38>: 8b 4d fc mov ecx,DWORD PTR [ebp-0x4]
0x08048434 <+41>: c9 leave
0x08048435 <+42>: 8d 61 fc lea esp,[ecx-0x4]
0x08048438 <+45>: c3 ret
End of assembler dump.

Example 6.2.11. A function in a specific file can also be specified:

(gdb) disassemble /sr 'hello.c'::main

Output
Dump of assembler code for function main:
hello.c:
4 {
0x0804840b <+0>: 8d 4c 24 04 lea ecx,[esp+0x4]
0x0804840f <+4>: 83 e4 f0 and esp,0xfffffff0
0x08048412 <+7>: ff 71 fc push DWORD PTR [ecx-0x4]
0x08048415 <+10>: 55 push ebp
0x08048416 <+11>: 89 e5 mov ebp,esp
0x08048418 <+13>: 51 push ecx
0x08048419 <+14>: 83 ec 04 sub esp,0x4
5 printf("Hello World!\n");
0x0804841c <+17>: 83 ec 0c sub esp,0xc
0x0804841f <+20>: 68 c0 84 04 08 push 0x80484c0
0x08048424 <+25>: e8 b7 fe ff ff call 0x80482e0 <puts@plt>
0x08048429 <+30>: 83 c4 10 add esp,0x10
6 return 0;
0x0804842c <+33>: b8 00 00 00 00 mov eax,0x0
7 }
0x08048431 <+38>: 8b 4d fc mov ecx,DWORD PTR [ebp-0x4]
0x08048434 <+41>: c9 leave
0x08048435 <+42>: 8d 61 fc lea esp,[ecx-0x4]
0x08048438 <+45>: c3 ret
End of assembler dump.

The filename must be enclosed in single quotes, and the function must
be prefixed by double colons, e.g. 'hello.c'::main, to specify
disassembling of the function main in the file hello.c.

6.2.6 Command: x

This command examines the content of a given memory range.

Example 6.2.12. We can examine the raw content in main:

(gdb) x main

Output
0x804840b <main>: 0x04244c8d

By default, without any argument, the command only prints the content at
a single memory address. In this case, that is the starting memory
address of main.

Example 6.2.13. With format arguments, the command can print a
range of memory in a specific format.

(gdb) x/20b main

Output
0x804840b <main>: 0x8d 0x4c 0x24 0x04 0x83 0xe4 0xf0 0xff
0x8048413 <main+8>: 0x71 0xfc 0x55 0x89 0xe5 0x51 0x83 0xec
0x804841b <main+16>: 0x04 0x83 0xec 0x0c

The /20b main argument means that the command prints 20 bytes, starting
where main starts in memory.

The general form of the format argument is:
/<repeat count><format letter><unit size>

If the repeat count is not supplied, gdb uses a count of 1. The unit size
is one of b (byte), h (halfword, 2 bytes), w (word, 4 bytes) or g (giant
word, 8 bytes); in /20b, the letter b selects 1-byte units. The format
letter is one of the following values:

Letter  Description
o       Print the memory content in octal format.
x       Print the memory content in hex format.
d       Print the memory content in signed decimal format.
u       Print the memory content in unsigned decimal format.
t       Print the memory content in binary format.
f       Print the memory content in float format.
a       Print the memory content as memory addresses.
i       Print the memory content as a series of assembly instructions, similar to the disassemble command.
c       Print the memory content as an array of ASCII characters.
s       Print the memory content as a string.

Depending on the circumstance, a certain format is more advantageous than
the others. For example, if a memory region contains floating-point
numbers, then it is better to use the format f than to view the numbers
as separate 1-byte hex numbers.
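The two outputs of x above show the same memory in two views: x main prints one 4-byte word, 0x04244c8d, while x/20b main prints the individual bytes 0x8d 0x4c 0x24 0x04 first. The difference is byte order: x86 is little-endian, so the byte at the lowest address is the least significant byte of the word. The function below is a minimal sketch of that rule; the name load_le32 is chosen for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Combine 4 bytes, given in increasing address order, into the 32-bit
 * word a little-endian machine such as x86 reads from that address. */
static uint32_t load_le32(const unsigned char *bytes)
{
    assert(bytes != NULL);
    return (uint32_t)bytes[0] |
           ((uint32_t)bytes[1] << 8) |
           ((uint32_t)bytes[2] << 16) |
           ((uint32_t)bytes[3] << 24);
}
```

Calling load_le32 on the first four bytes of main, 0x8d 0x4c 0x24 0x04, returns 0x04244c8d, the value printed by x main.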

6.2.7 Command: print/p

Examining raw memory is useful, but usually it is better to have a more
human-readable output. This command does precisely that: it pretty-prints
an expression. An expression can be a global variable, a local variable
in the current stack frame, a function, a register, a number, etc.

6.3 Runtime inspection of a program

The main use of a debugger is to examine the state of a program while
it is running. gdb provides a set of commands for retrieving useful
runtime information.

6.3.1 Command: run

This command starts running the program.

Example 6.3.1. Run the hello program:

(gdb) r

Output
Starting program: /tmp/hello
Hello World!
[Inferior 1 (process 1002) exited normally]

The program ran successfully and printed the message “Hello World!”.
However, it would not be useful if all gdb could do were run a program.

6.3.2 Command: break/b

This command sets a breakpoint at a location in the high-level source
code. When execution reaches a location marked by a breakpoint, gdb stops
the program so that a programmer can inspect its current state.

Example 6.3.2. A breakpoint can be set on a line as displayed by an

editor. Suppose we want to set a breakpoint at line 3 of the program, which
is the start of main function:

hello.c

1 #include <stdio.h>
2
3 int main(int argc, char *argv[])
4 {
5 printf("Hello World!\n");
6 return 0;
7 }

When running the program, instead of running from start to finish, gdb
stops at the breakpoint:

(gdb) b 3

Output
Breakpoint 1 at 0x400535: file hello.c, line 3.

(gdb) r

Output
Starting program: /tmp/hello
Breakpoint 1, main (argc=1, argv=0x7fffffffdfb8) at hello.c:5
5 printf("Hello World!\n");

The breakpoint is at line 3, but gdb stopped at line 5. The reason is that
line 3 does not contain code, but a function signature; gdb only stops where
it can execute code. The code in the function starts at line 5, the call to
printf, so gdb stops there.

Example 6.3.3. A line number is not always a reliable way to specify
a breakpoint, as the source code can change. What if gdb should always
stop at the main function? In this case, a better method is to use the
function name directly:

b main

Then, regardless of how the source code changes, gdb always stops at
the main function.

Example 6.3.4. Sometimes, the program being debugged does not contain
debug info, or gdb is debugging assembly code. In that case, a memory
address can be specified as a stop point. To get the function address, the
print command can be used:

(gdb) print main

Output
$3 = {int (int, char **)} 0x400526 <main>

Knowing the address of main, we can easily set a breakpoint with a
memory address:

b *0x400526

Example 6.3.5. gdb can also set a breakpoint in any source file. Suppose
that the hello program is composed of not just one file but many files,
e.g. hello1.c, hello2.c, hello3.c... In that case, simply add the filename
before a line number:

b hello.c:3

Example 6.3.6. A breakpoint can also be set on a function name in a
specific file:

b hello.c:main

6.3.3 Command: next/n

This command executes the current line and stops at the next line. When
the current line is a function call, gdb steps over it.

Example 6.3.7. After setting a breakpoint at main, run the program; gdb
stops at the first printf:

(gdb) r

Output
Starting program: /tmp/hello
Breakpoint 1, main (argc=1, argv=0x7fffffffdfb8) at hello.c:5
5 printf("Hello World!\n");

Then, to proceed to the next statement, we use the next command:

(gdb) n

Output
Hello World!
6 return 0;

In the output, the first line shows the output produced after executing
line 5; then, the next line shows where gdb stops currently, which is
line 6.

6.3.4 Command: step/s

This command executes the current line and stops at the next line. When
the current line is a function call, gdb steps into it and stops at the
first line of the called function.

Example 6.3.8. Suppose we have a new function add: [1]

hello.c

#include <stdio.h>

int add(int a, int b) {
    return a + b;
}

int main(int argc, char *argv[])
{
    add(1, 2);
    printf("Hello World!\n");
    return 0;
}

[1] Why should we add a new function and function call instead of using
the existing printf call? Stepping into shared library functions is tricky
because, to make debugging work, the debug info must be installed and
loaded. It is not worth the trouble for demonstrating this simple command.

If the step command is used instead of next on the function call add,
gdb steps inside the function:

(gdb) r

Output
Starting program: /tmp/hello
Breakpoint 1, main (argc=1, argv=0xffffd154) at hello.c:11
11 add(1, 2);
(gdb) s

Output
add (a=1, b=2) at hello.c:6
6 return a + b;

After executing the command s, gdb stepped into the add function, where
the first statement is a return.

6.3.5 Command: ni

At the core, gdb operates on assembly instructions. Source line-by-line
debugging is simply an enhancement to make it friendlier for programmers.
Each statement in C translates to one or more assembly instructions, as
shown with objdump and the disassemble command. With the debug info
available, gdb knows how many instructions belong to one line of high-level
code; line-by-line debugging is just the execution of all the assembly
instructions of a line when moving from the current line to the next.

This command executes one assembly instruction that belongs to the
current line. Until all assembly instructions of the current line are
executed, gdb will not move to the next line. If the current instruction
is a call, gdb steps over it to the next instruction.

Example 6.3.9. When a breakpoint is set on the printf call and ni is used,
gdb steps through each assembly instruction:

(gdb) disassemble /s main

Output
Dump of assembler code for function main:

hello.c:
4 {
0x0804840b <+0>: lea ecx,[esp+0x4]
0x0804840f <+4>: and esp,0xfffffff0
0x08048412 <+7>: push DWORD PTR [ecx-0x4]
0x08048415 <+10>: push ebp
0x08048416 <+11>: mov ebp,esp
0x08048418 <+13>: push ecx
0x08048419 <+14>: sub esp,0x4
5 printf("Hello World!\n");
0x0804841c <+17>: sub esp,0xc
0x0804841f <+20>: push 0x80484c0
0x08048424 <+25>: call 0x80482e0 <puts@plt>
0x08048429 <+30>: add esp,0x10
6 return 0;
=> 0x0804842c <+33>: mov eax,0x0
7 }
0x08048431 <+38>: mov ecx,DWORD PTR [ebp-0x4]
0x08048434 <+41>: leave
0x08048435 <+42>: lea esp,[ecx-0x4]
0x08048438 <+45>: ret
End of assembler dump.

(gdb) r

Output
Starting program: /tmp/hello
Breakpoint 1, main (argc=1, argv=0xffffd154) at hello.c:5
5 printf("Hello World!\n");

(gdb) ni

Output
0x0804841f 5 printf("Hello World!\n");

(gdb) ni

Output
0x08048424 5 printf("Hello World!\n");

(gdb) ni

Output
Hello World!
0x08048429 5 printf("Hello World!\n");

(gdb)

Output
6 return 0;

Upon entering ni, gdb executes the current instruction and displays the
next instruction. That is why gdb only displays 3 addresses in the output:
0x0804841f, 0x08048424 and 0x08048429. The instruction at 0x0804841c,
which is the first instruction of the printf line, is not displayed because
it is the instruction that gdb first stopped at. While gdb is stopped at
0x0804841c, the current instruction can be displayed using the x command:

(gdb) x/i $eip

Output
=> 0x804841c <main+17>: sub esp,0xc

6.3.6 Command: si

Similar to ni, this command executes one assembly instruction that belongs
to the current line. But if the current instruction is a call, si steps into
it and stops at the first instruction of the called function.

Example 6.3.10. Recall that the assembly code generated from printf
contains a call instruction:

(gdb) disassemble /s main

Output
Dump of assembler code for function main:

hello.c:
4 {
0x0804840b <+0>: lea ecx,[esp+0x4]
0x0804840f <+4>: and esp,0xfffffff0
0x08048412 <+7>: push DWORD PTR [ecx-0x4]
0x08048415 <+10>: push ebp
0x08048416 <+11>: mov ebp,esp
0x08048418 <+13>: push ecx
0x08048419 <+14>: sub esp,0x4
5 printf("Hello World!\n");
0x0804841c <+17>: sub esp,0xc
0x0804841f <+20>: push 0x80484c0
0x08048424 <+25>: call 0x80482e0 <puts@plt>
0x08048429 <+30>: add esp,0x10
6 return 0;
=> 0x0804842c <+33>: mov eax,0x0
7 }
0x08048431 <+38>: mov ecx,DWORD PTR [ebp-0x4]
0x08048434 <+41>: leave
0x08048435 <+42>: lea esp,[ecx-0x4]
0x08048438 <+45>: ret
End of assembler dump.

We try instruction-by-instruction stepping again, but this time we run si
at 0x08048424, where the call resides:

(gdb) si

Output
0x0804841f 5 printf("Hello World!\n");

(gdb) si

Output
0x08048424 5 printf("Hello World!\n");

(gdb) x/i $eip

Output
=> 0x8048424 <main+25>: call 0x80482e0 <puts@plt>

(gdb) si

Output
0x080482e0 in puts@plt ()

The next instruction executed after the call at 0x8048424 is the first
instruction at 0x080482e0 in puts@plt, the entry through which puts is
called. In other words, gdb stepped into puts instead of stepping over it.
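To see why si lands at 0x080482e0, it helps to decode the call by hand. The objdump listing of this program shows the call encoded as e8 b7 fe ff ff: opcode E8 followed by a 32-bit little-endian displacement that is relative to the address of the next instruction. The following is a minimal sketch of that arithmetic; the function name call_target is our own and is not part of gdb.

```c
#include <assert.h>
#include <stdint.h>

/* Compute the destination of a 5-byte near call (opcode E8 + rel32).
 * addr  : address of the call instruction itself
 * bytes : the 5 encoded bytes, starting with 0xE8 */
uint32_t call_target(uint32_t addr, const uint8_t bytes[5])
{
    assert(bytes[0] == 0xE8);   /* only the E8 rel32 form is handled */

    /* rel32 is stored little-endian right after the opcode */
    uint32_t rel = (uint32_t)bytes[1]
                 | (uint32_t)bytes[2] << 8
                 | (uint32_t)bytes[3] << 16
                 | (uint32_t)bytes[4] << 24;

    /* The displacement is relative to the NEXT instruction (addr + 5).
     * Unsigned wrap-around handles negative displacements correctly. */
    return addr + 5 + rel;
}
```

Calling call_target with address 0x08048424 and the bytes e8 b7 fe ff ff gives 0x08048429 + 0xfffffeb7, which wraps around to 0x080482e0, the address gdb reports after si.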

6.3.7 Command: until

This command continues execution until a source line greater than the
current line is reached.

Example 6.3.11. Suppose we have a function that executes a long loop:

hello.c

#include <stdio.h>

int add1000() {
int total = 0;

for (int i = 0; i < 1000; ++i){

total += i;
}

printf("Done adding!\n");
return total;
}

int main(int argc, char *argv[])

{
add1000(1, 2);
printf("Hello World!\n");
return 0;
}

Using the next command, we would have to repeat it for each of the 1000
iterations to finish the loop. A faster way is to use until:

(gdb) b add1000

Output
Breakpoint 1 at 0x8048411: file hello.c, line 4.

(gdb) r

Output
Starting program: /tmp/hello
Breakpoint 1, add1000 () at hello.c:4
4 int total = 0;

(gdb) until

Output
5 for (int i = 0; i < 1000; ++i){

(gdb) until

Output
6 total += i;

(gdb) until

Output
5 for (int i = 0; i < 1000; ++i){

(gdb) until

Output
8 printf("Done adding!\n");

Executing the first until, gdb stopped at line 5, since line 5 is greater
than line 4.
Executing the second until, gdb stopped at line 6, since line 6 is greater
than line 5.
Executing the third until, gdb stopped at line 5, since the loop still
continues. With the fourth until, gdb did not stop in the loop body again:
re-entering the body is a backward jump, and until keeps executing until it
reaches a line beyond the loop. It therefore ran the remaining iterations
and stopped at line 8. This is a great way to skip over a loop in the middle
of a function, instead of setting an unneeded breakpoint.

Example 6.3.12. until can be supplied with an argument to explicitly
execute to a specific line:

(gdb) r

Output
Starting program: /tmp/hello
Breakpoint 1, add1000 () at hello.c:4
4 int total = 0;

(gdb) until 8

Output
add1000 () at hello.c:8
8 printf("Done adding!\n");

6.3.8 Command: finish

This command executes until the current function returns and displays the
return value. finish is effectively a more convenient version of until for
leaving a function.

Example 6.3.13. Using the add1000 function from the previous example,
we use finish instead of until:

(gdb) r

Output
Starting program: /tmp/hello
Breakpoint 1, add1000 () at hello.c:4
4 int total = 0;

(gdb) finish

Output
Run till exit from #0 add1000 () at hello.c:4
Done adding!
0x08048466 in main (argc=1, argv=0xffffd154) at hello.c:15
15 add1000(1, 2);
Value returned is $1 = 499500

6.3.9 Command: bt

This command prints the backtrace of all stack frames. A backtrace is a
list of currently active functions.

Example 6.3.14. Suppose we have a chain of function calls:

hello.c

void d(int d) { };
void c(int c) { d(0); }
void b(int b) { c(1); }
void a(int a) { b(2); }

int main(int argc, char *argv[])

{
a(3);
return 0;
}

bt can visualize such a chain in action:

(gdb) b a

Output
Breakpoint 1 at 0x8048404: file hello.c, line 9.

(gdb) r

Output
Starting program: /tmp/hello
Breakpoint 1, a (a=3) at hello.c:9
9 void a(int a) { b(2); }

(gdb) s

Output
b (b=2) at hello.c:7
7 void b(int b) { c(1); }

(gdb) s

Output
c (c=1) at hello.c:5
5 void c(int c) { d(0); }

(gdb) s

Output
d (d=0) at hello.c:3
3 void d(int d) { };

(gdb) bt

Output
#0 d (d=0) at hello.c:3
#1 0x080483eb in c (c=1) at hello.c:5
#2 0x080483fb in b (b=2) at hello.c:7
#3 0x0804840b in a (a=3) at hello.c:9
#4 0x0804841b in main (argc=1, argv=0xffffd154) at hello.c:13

The most recent calls are placed at the top and the least recent calls near
the bottom. In this case, d is the most recently called active function, so
it has index 0. Next is c, the second active function, with index 1, and so
on with function b, function a, and finally function main at the bottom, the
least recent function. That is how we read a backtrace.
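A program can also produce a similar list of active frames for itself. The following is a minimal sketch, assuming a glibc-based Linux system where execinfo.h is available; the function name dump_backtrace and the limit MAX_FRAMES are our own choices. As with bt, the innermost frame is printed first.

```c
#include <assert.h>
#include <execinfo.h>   /* glibc-specific: backtrace(), backtrace_symbols_fd() */
#include <unistd.h>     /* STDERR_FILENO */

#define MAX_FRAMES 16   /* arbitrary limit chosen for this sketch */

/* Collect the return addresses of the currently active frames, print them
 * to stderr (innermost frame first), and return how many were captured. */
int dump_backtrace(void)
{
    void *frames[MAX_FRAMES];
    int n = backtrace(frames, MAX_FRAMES);

    assert(n >= 0 && n <= MAX_FRAMES);
    backtrace_symbols_fd(frames, n, STDERR_FILENO);
    return n;
}
```

Calling dump_backtrace() from inside function d of the example would list d, c, b, a and main in that order, although the exact text of each line depends on how the program was compiled and linked.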

6.3.10 Command: up

This command moves up one frame, to the frame that called the current
frame.

Example 6.3.15. Instead of staying in function d, we can go up to function
c and look at its state:

(gdb) bt

(gdb) up

Output
#1 0x080483eb in c (c=1) at hello.c:5
5 void c(int c) { d(0); }

The output shows that the current frame has moved to c, along with the
line in c where execution currently is: the call to d at line 5.

6.3.11 Command: down

Similar to up, this command moves down one frame, to the frame called by
the current frame.

Example 6.3.16. After inspecting c function, we can go back to d:

(gdb) bt

(gdb) up

Output
#1 0x080483eb in c (c=1) at hello.c:5
5 void c(int c) { d(0); }

(gdb) down

Output
#0 d (d=0) at hello.c:3
3 void d(int d) { };

6.3.12 Command: info registers

This command lists the current values in commonly used registers. This
command is useful when debugging assembly and operating system code,
as we can inspect the current state of the machine.

Example 6.3.17. Executing the command, we can see the commonly

used registers:

(gdb) info registers

Output
eax 0xf7faddbc -134554180
ecx 0xffffd0c0 -12096
edx 0xffffd0e4 -12060
ebx 0x0 0
esp 0xffffd0a0 0xffffd0a0
ebp 0xffffd0a8 0xffffd0a8
esi 0xf7fac000 -134561792
edi 0xf7fac000 -134561792
eip 0x804841c 0x804841c <main+17>
eflags 0x286 [ PF SF IF ]
cs 0x23 35
ss 0x2b 43
ds 0x2b 43
es 0x2b 43
fs 0x0 0
gs 0x63 99

The above registers suffice for writing our operating system in later
parts.
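The eflags line is worth a closer look: gdb prints the raw value 0x286 and then the names of the flags that are set. The following is a small sketch of that decoding, using the standard x86 bit positions of the most common flags; the table and the function name decode_eflags are our own and do not come from gdb.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Selected EFLAGS bits and their names, in ascending bit order. */
static const struct { unsigned bit; const char *name; } eflags_bits[] = {
    { 0, "CF" }, { 2, "PF" }, { 4, "AF" }, { 6, "ZF" }, { 7, "SF" },
    { 8, "TF" }, { 9, "IF" }, { 10, "DF" }, { 11, "OF" },
};

/* Write the names of the set flags into out, separated by single spaces. */
void decode_eflags(uint32_t eflags, char *out, size_t outsz)
{
    assert(out != NULL && outsz > 0);
    out[0] = '\0';

    for (size_t i = 0; i < sizeof eflags_bits / sizeof eflags_bits[0]; ++i) {
        if (eflags & (1u << eflags_bits[i].bit)) {
            if (out[0] != '\0')
                strncat(out, " ", outsz - strlen(out) - 1);
            strncat(out, eflags_bits[i].name, outsz - strlen(out) - 1);
        }
    }
}
```

For 0x286, bits 2, 7 and 9 are set, so the result is "PF SF IF", matching the output above. Bit 1 is also set, but it is a reserved bit that always reads as 1 and has no name.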

6.4 How debuggers work: A brief introduction

6.4.1 How breakpoints work

When a programmer places a breakpoint somewhere in his code, what
actually happens is that the first byte of the first instruction of a
statement is replaced with the opcode of another instruction, int 3, whose
opcode is CCh:

83 ec 0c       →    cc ec 0c
sub esp,0xc         int 3

Figure 6.4.1: Opcode replacement with int 3

int 3 only costs a single byte, making it efficient for debugging. When
the int 3 instruction is executed, the operating system calls its breakpoint
interrupt handler. The handler then checks which process reached a
breakpoint, pauses it and notifies the debugger that it has paused a
debugged process. The debugged process is only paused, which means a
debugger is free to inspect its internal state, like a surgeon operating on
an anesthetized patient. Then, the debugger replaces the int 3 opcode with
the original opcode and executes the original instruction normally.

cc ec 0c       →    83 ec 0c
int 3               sub esp,0xc

Figure 6.4.2: Restoring the original opcode after int 3 was executed

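The replacement and restoration steps can be sketched on a plain byte array. This is only an illustration under simplifying assumptions: a real debugger writes into the memory of the debugged process (on Linux, for example, through ptrace), and after the trap it also has to move the instruction pointer back by one byte before re-executing the restored instruction. The struct breakpoint and the functions bp_insert and bp_remove are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INT3_OPCODE 0xCC

/* Hypothetical bookkeeping record for one software breakpoint. */
struct breakpoint {
    size_t  offset;   /* where in the code buffer the breakpoint sits */
    uint8_t saved;    /* original byte that int 3 overwrote */
    int     enabled;  /* nonzero while int 3 is in place */
};

/* Overwrite the first byte of an instruction with int 3 and remember
 * the original byte so that it can be restored later. */
void bp_insert(struct breakpoint *bp, uint8_t *code, size_t offset)
{
    bp->offset  = offset;
    bp->saved   = code[offset];
    bp->enabled = 1;
    code[offset] = INT3_OPCODE;
}

/* Put the original byte back so the instruction can execute normally. */
void bp_remove(struct breakpoint *bp, uint8_t *code)
{
    assert(bp->enabled);
    code[bp->offset] = bp->saved;
    bp->enabled = 0;
}
```

Applied to the bytes 83 ec 0c, bp_insert yields cc ec 0c and bp_remove brings back 83 ec 0c, which is exactly the pair of transitions shown in the two figures.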
Example 6.4.1. It is simple to see int 3 in action. First, we add an
int 3 instruction where we need gdb to stop:

hello.c

#include <stdio.h>

int main(int argc, char *argv[])

{
asm("int 3");
printf("Hello World\n");
return 0;
}

int 3 precedes printf, so gdb is expected to stop at printf. Next, we
compile with debug info enabled and with Intel syntax:

$ gcc -masm=intel -m32 -g hello.c -o hello

Finally, start gdb:

$ gdb hello

Running without setting any breakpoint, gdb stops at the printf call, as
expected:

(gdb) r

Output
Starting program: /tmp/hello
Program received signal SIGTRAP, Trace/breakpoint trap.
main (argc=1, argv=0xffffd154) at hello.c:6
6 printf("Hello World\n");

The line "Program received signal SIGTRAP, Trace/breakpoint trap." indicates
that gdb encountered a breakpoint, and indeed it stopped at the right place:
the printf call, which int 3 precedes.
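The SIGTRAP in the output is an ordinary signal, so a program can observe it without a debugger attached. The sketch below installs a handler and then raises SIGTRAP with raise(), which here merely stands in for executing int 3 so that the example does not depend on x86 inline assembly. The names on_trap and trap_is_delivered are our own.

```c
#include <assert.h>
#include <signal.h>

/* Set by the handler; sig_atomic_t is safe to write from a signal handler. */
static volatile sig_atomic_t trap_seen = 0;

/* Runs when the process receives SIGTRAP; it only records the event. */
static void on_trap(int signo)
{
    (void)signo;
    trap_seen = 1;
}

/* Install the handler, raise SIGTRAP, and report whether it arrived. */
int trap_is_delivered(void)
{
    void (*previous)(int) = signal(SIGTRAP, on_trap);
    assert(previous != SIG_ERR);

    trap_seen = 0;
    raise(SIGTRAP);               /* stand-in for executing int 3 */
    signal(SIGTRAP, previous);    /* restore the earlier disposition */
    return trap_seen;
}
```

When the same program runs under gdb, the debugger sees the signal first and stops the program, which is what the session above shows.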

6.4.2 Single stepping

Once breakpoints are implemented, it is easy to implement single stepping:
a debugger simply places another int 3 opcode at the next instruction.
So, when a programmer sets a breakpoint at an instruction, a breakpoint at
the next instruction is set automatically by the debugger, thus enabling
instruction-by-instruction debugging. Similarly, source line-by-line
debugging is just the replacement of the very first opcodes of two
consecutive statements with two int 3 opcodes.
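The idea of planting a temporary int 3 on the next instruction can be sketched as follows. The sketch assumes the instruction boundaries are already known, for example from a disassembler, and it ignores jumps and calls, where the next instruction to execute is not the next one in memory. Debuggers may also rely on the processor's own single-step support instead of this technique. The struct insn_map and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INT3_OPCODE 0xCC
#define NO_INSN ((size_t)-1)

/* Start offsets of consecutive instructions in a code buffer,
 * in ascending order, as a disassembler would report them. */
struct insn_map {
    const size_t *starts;
    size_t        count;
};

/* Return the offset of the instruction that follows the one at cur, or
 * NO_INSN if cur is not a known start or is the last instruction. */
size_t next_insn(const struct insn_map *map, size_t cur)
{
    for (size_t i = 0; i + 1 < map->count; ++i)
        if (map->starts[i] == cur)
            return map->starts[i + 1];
    return NO_INSN;
}

/* One step: plant a temporary int 3 on the next instruction, hand back the
 * byte it replaced through *saved, and return the patched offset. */
size_t plant_step_breakpoint(uint8_t *code, const struct insn_map *map,
                             size_t cur, uint8_t *saved)
{
    size_t next = next_insn(map, cur);

    assert(next != NO_INSN);
    *saved = code[next];
    code[next] = INT3_OPCODE;
    return next;
}
```

With the four instructions of the printf line (lengths 3, 5, 5 and 3 bytes, so starts 0, 3, 8 and 13), stepping from offset 0 patches offset 3, the first byte of push 0x80484c0.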

6.4.3 How a debugger understands high level source code

DWARF is a debugging file format used by many compilers and debuggers to
support source-level debugging. DWARF contains information that maps
between entities in the executable binary and the source files. A program
entity can either be data or code. A DIE, or Debugging Information Entry,
is a description of a program entity. A DIE consists of a tag, which
specifies the entity that the DIE describes, and a list of attributes that
describe the entity. Of all the attributes, these two kinds enable
source-level debugging:

✄ Where the entity appears in the source files: in which file and at
which line the entity appears.

✄ Where the entity appears in the executable binary: at which memory
address the entity is loaded at runtime. With the precise address, gdb can
retrieve the correct value for a data entity, or place a correct breakpoint
and stop accordingly for a code entity. Without these addresses, gdb would
not know where the entities are in order to inspect them.
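As a rough mental model, a DIE can be pictured as a record that carries both kinds of location. The struct below is a deliberately simplified stand-in, not the actual DWARF encoding; the type names, field names and the function find_function are our own. The sample values used in the usage note come from this chapter: main in hello.c at line 3, located at 0x804840b.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for the kind of entity a DIE describes. */
enum die_tag { TAG_COMPILE_UNIT, TAG_SUBPROGRAM, TAG_VARIABLE };

/* A tag plus the two kinds of attributes that enable source-level debugging. */
struct die {
    enum die_tag tag;
    const char  *name;       /* entity name, e.g. "main" */
    const char  *decl_file;  /* source file where the entity appears */
    unsigned     decl_line;  /* source line where the entity appears */
    uint32_t     low_pc;     /* address of the entity in the binary */
};

/* Find the DIE describing the function called name; NULL if there is none.
 * This is the lookup a debugger needs for a command such as "b main". */
const struct die *find_function(const struct die *dies, size_t n,
                                const char *name)
{
    assert(dies != NULL && name != NULL);
    for (size_t i = 0; i < n; ++i)
        if (dies[i].tag == TAG_SUBPROGRAM && strcmp(dies[i].name, name) == 0)
            return &dies[i];
    return NULL;
}
```

Given a table with one compile-unit entry for hello.c and one subprogram entry for main, find_function(dies, 2, "main") returns the entry whose low_pc is 0x804840b, which is where a breakpoint on main would be placed.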

hello.c                                          DIE
  Line 1  #include <stdio.h>                     ....
  Line 2                                         ....
⇒ Line 3  int main(int argc, char *argv[])   →   main in hello.c is at
  Line 5  ..........                             0x804840b in hello
  Line 6  ..........                             ....

                        ↓↑

hello (at 0x804840b)
...8d 4c 24 04 83 e4 f0 ff 71 fc ....

Figure 6.4.3: Source-binary mapping with DIE

In addition to DIEs, another binary-to-source mapping is the line number
table. The line number table maps each line in the source code to the
memory address where that line starts in the executable binary.

In sum, to successfully enable source-level debugging, a debugger needs
to know the precise location of the source files and the load addresses
at runtime. Address matching, between the image layout of the ELF binary
and the address where it is loaded, is extremely important since debug
information relies on the correct load address at runtime. That is, it
assumes the addresses recorded in the binary image at compile time are
the same as at runtime. For example, if the load address for the .text
section is recorded in the executable binary as 0x800000, then when the
binary actually runs, .text should really be loaded at 0x800000 for gdb to
be able to correctly match running instructions with high-level code
statements. Address mismatching makes debug information useless, as actual
code at one address is displayed as code at another address. Without this
knowledge, we will not be able to build an operating system that can be
debugged with gdb.

Example 6.4.2. When an executable binary contains debug info, readelf
can display such information in a readable format. Using the good old
hello world program:

hello.c

#include <stdio.h>

int main(int argc, char *argv[])

{
printf("Hello World\n");

return 0;
}

and compile with debug info:

$ gcc -m32 -g hello.c -o hello

With the binary ready, we can look at the line number table with the
command:

$ readelf -wL hello

The -w option prints all the debug information. In combination with a
sub-option, only specific information is displayed. For example, with -L,
only the line number table is displayed:

Output
Decoded dump of debug contents of section .debug_line:

CU: hello.c:
File name    Line number    Starting address
hello.c                6           0x804840b
hello.c                7           0x804841c
hello.c                9           0x804842c
hello.c               10           0x8048431

From the above output:

CU is short for Compilation Unit, a separately compiled source file. In
the example, we only have one file, hello.c.

File name displays the ﬁlename of the current compilation unit.

Line number is the number of a line in the source file that is not an
empty line. In the example, line 8 is an empty line, so it does not
appear.

Starting address is the memory address where the line actually starts
in the executable binary.
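The two directions of this mapping can be sketched with a small lookup table. The rows are copied from the readelf output above; the struct and the function names line_to_addr and addr_to_line are our own, and a real debugger reads the table from the .debug_line section instead of hard-coding it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct line_entry {
    unsigned line;   /* source line number */
    uint32_t addr;   /* address where the code of that line starts */
};

/* Rows copied from the readelf output above, sorted by address. */
static const struct line_entry hello_lines[] = {
    { 6, 0x804840b }, { 7, 0x804841c }, { 9, 0x804842c }, { 10, 0x8048431 },
};
static const size_t hello_line_count =
    sizeof hello_lines / sizeof hello_lines[0];

/* Line -> address: where a breakpoint on that line would be placed.
 * Returns 0 if the line has no entry (for example, an empty line). */
uint32_t line_to_addr(unsigned line)
{
    for (size_t i = 0; i < hello_line_count; ++i)
        if (hello_lines[i].line == line)
            return hello_lines[i].addr;
    return 0;
}

/* Address -> line: the last row whose starting address is <= addr.
 * Returns 0 if addr lies before the first row. */
unsigned addr_to_line(uint32_t addr)
{
    unsigned line = 0;

    for (size_t i = 0; i < hello_line_count; ++i)
        if (hello_lines[i].addr <= addr)
            line = hello_lines[i].line;
    return line;
}
```

For instance, line_to_addr(7) gives 0x804841c, and addr_to_line(0x8048424), the address of the call instruction, gives 7, because that address falls between the starts of lines 7 and 9.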

With such clear information, gdb is able to set a breakpoint on a line
easily. For placing breakpoints on variables and functions, it is time to
look at the DIEs. To get the DIE information from an executable binary,
run the command:

$ readelf -wi hello

The -wi option lists all the DIEs. This is one typical DIE:

<0><b>: Abbrev Number: 1 (DW_TAG_compile_unit)

<c> DW_AT_producer : (indirect string, offset: 0xe): GNU C11 5.4.0 20160609 -masm=intel -m32
<10> DW_AT_language : 12 (ANSI C99)
<11> DW_AT_name : (indirect string, offset: 0xbe): hello.c
<15> DW_AT_comp_dir : (indirect string, offset: 0x97): /tmp
<19> DW_AT_low_pc : 0x804840b
<1d> DW_AT_high_pc : 0x2e
<21> DW_AT_stmt_list : 0x0

Red (the left-most number, <0>) This number indicates the current nesting
level of a DIE. 0 is the outermost level, whose entity is the compilation
unit. This means subsequent DIEs with a higher nesting level are all
children of this tag, the compilation unit. It makes sense, as all the
entities must originate from a source file.

Blue (the other numbers in angle brackets, such as <b> and <c>) These
numbers in hex format indicate the offsets into the .debug_info section.
Each piece of information is displayed along with its offset. When an
attribute refers to another attribute, the offset is used to precisely
identify the referenced attribute.

Green (the names starting with DW_AT_) These names with the DW_AT_ prefix
are the attributes attached to a DIE that describe an entity. Notable
attributes:

DW_AT_name
DW_AT_comp_dir The filename of the compilation unit and the directory
where compilation occurred. Without the filename and the path, gdb would
not be able to display the high-level source, despite the availability of
the debug info. Debug info only contains the mapping between source and
binary, not the source code itself.

DW_AT_low_pc
DW_AT_high_pc The start and end of the current entity, which is the
compilation unit, in the executable binary. The value in DW_AT_low_pc is
the starting address. DW_AT_high_pc is the size of the compilation unit,
which, when added to DW_AT_low_pc, gives the end address of the entity. In
this example, code compiled from hello.c starts at 0x804840b and ends at
0x804840b + 0x2e = 0x8048439. To really make sure, we verify with objdump:

Output
int main(int argc, char *argv[])
{
804840b: 8d 4c 24 04 lea ecx,[esp+0x4]
804840f: 83 e4 f0 and esp,0xfffffff0
8048412: ff 71 fc push DWORD PTR [ecx-0x4]
8048415: 55 push ebp
8048416: 89 e5 mov ebp,esp
8048418: 51 push ecx
8048419: 83 ec 04 sub esp,0x4
printf("Hello World\n");
804841c: 83 ec 0c sub esp,0xc
804841f: 68 c0 84 04 08 push 0x80484c0
8048424: e8 b7 fe ff ff call 80482e0 <puts@plt>
8048429: 83 c4 10 add esp,0x10
return 0;
804842c: b8 00 00 00 00 mov eax,0x0
}
8048431: 8b 4d fc mov ecx,DWORD PTR [ebp-0x4]
8048434: c9 leave
8048435: 8d 61 fc lea esp,[ecx-0x4]
8048438: c3 ret
8048439: 66 90 xchg ax,ax
804843b: 66 90 xchg ax,ax
804843d: 66 90 xchg ax,ax
804843f: 90 nop

It is true: main starts at 804840b and ends at 8048439, right after the
ret instruction at 8048438. The instructions from 8048439 onward are just
padding bytes inserted by gcc for alignment, and they do not belong to
main. Note that the full output from objdump shows much more code past
main. It is not counted, as that code is outside of hello.c, added by gcc
for the operating system. hello.c contains only one function, main, and
this is why hello.c starts and ends at the same addresses as main.
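The range arithmetic above is small enough to write down as code. The sketch below assumes, as in this readelf output, that DW_AT_high_pc holds a size rather than an absolute address; the struct pc_range and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Address range covered by an entity, in the form shown by this readelf
 * output, where DW_AT_high_pc is an offset from DW_AT_low_pc. */
struct pc_range {
    uint32_t low_pc;   /* DW_AT_low_pc: first address of the entity */
    uint32_t size;     /* DW_AT_high_pc: size of the entity in bytes */
};

/* Address one past the last byte of the entity. */
uint32_t range_end(struct pc_range r)
{
    return r.low_pc + r.size;
}

/* Nonzero if addr falls inside [low_pc, low_pc + size). */
int range_contains(struct pc_range r, uint32_t addr)
{
    return addr >= r.low_pc && addr < range_end(r);
}
```

With low_pc = 0x804840b and size = 0x2e, range_end gives 0x8048439. The ret at 0x8048438 is inside the range, while the padding that starts at 0x8048439 is not.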

Pink (the number after Abbrev Number) This number identifies the
abbreviation used by a tag. An abbreviation describes the form of a DIE:
its tag and the attributes it carries, without their values. When debug
info is displayed with -wi, the DIEs are displayed with their values. The
-wa option shows the abbreviations in the .debug_abbrev section:

Output
Contents of the .debug_abbrev section:

Number TAG (0x0)
1 DW_TAG_compile_unit [has children]
DW_AT_producer DW_FORM_strp
DW_AT_language DW_FORM_data1
DW_AT_name DW_FORM_strp
DW_AT_comp_dir DW_FORM_strp
