Operating Systems 1

Chapter 1 introduces operating systems (OS) as an interface between users and hardware, managing resources like memory and processors, and providing services for various management tasks. It discusses the evolution of OS from early hardware interactions to multiprogramming and time-sharing systems, highlighting the importance of OS in simplifying programming. Chapter 2 focuses on process scheduling, explaining process states, the role of schedulers, performance criteria, and various scheduling algorithms, including First-Come-First-Served (FCFS).


EE442 Operating Systems Ch. 1 Introduction to OS

Chapter 1
Introduction to Operating Systems

1.1 General Definition


An OS is a program which acts as an interface between computer system users and the
computer hardware. It provides a user-friendly environment in which a user may easily
develop and execute programs. Otherwise, hardware knowledge would be mandatory for
computer programming. So, it can be said that an OS hides the complexity of hardware from
uninterested users.

In general, a computer system has some resources which may be utilized to solve a
problem. They are

• Memory
• Processor(s)
• I/O
• File System
etc.

The OS manages these resources and allocates them to specific programs and users. With
the management of the OS, a programmer is relieved of difficult hardware considerations. An OS
provides services for

• Processor Management
• Memory Management
• File Management
• Device Management
• Concurrency Control

Another aspect of OS usage is that the OS serves as a predefined library for hardware-software
interaction; this is why system programs call on the installed OS, since they cannot reach the
hardware directly.

Lecture Notes by Uğur Halıcı



Since we have an already written library, namely the OS, to add two numbers we simply write
the following line in our program:

c = a + b;

whereas in a system where no OS is installed, we would have to do some hardware-level work
ourselves (assuming MC6800 computer hardware):

LDAA $80   Load the number at memory location 80 into accumulator A
LDAB $81   Load the number at memory location 81 into accumulator B
ABA        Add the two accumulators (sum left in A)
STAA $55   Store the sum to memory location 55

As seen, we had to consider memory locations and use our hardware knowledge of the system.

[Figure: layered system structure, from top to bottom: Application Programs / System
Programs / Operating System / Machine Language / Hardware]

In a machine with an OS installed, since we have an intermediate layer, our programs gain
some portability by not dealing with the hardware directly. For example, the assembly
segment above would not work on an 8086 machine, whereas the “c = a + b ;” syntax is
suitable for both.

[Figure: a simple program segment with no hardware consideration goes through the OS,
while a more sophisticated program segment with hardware consideration talks to the
machine language and gets the hardware response directly.]

With the advantage of easier programming provided by the OS, the hardware, its machine
language and the OS constitute a new combination called a virtual (extended) machine.

[Figure: the machine language running on the hardware forms the bare machine; adding the
OS on top of them yields the virtual (extended) machine.]

In fact, the OS itself is a program, but it has a privilege that application programs don’t have:
the OS uses the kernel mode of the microprocessor, whereas other programs use the user
mode. The difference between the two is that all hardware instructions are valid in kernel
mode, while some of them cannot be used in user mode.


1.2 History of Operating Systems


It all started with computer hardware around 1945. Computers used vacuum tube
technology, and programs were loaded into memory manually using switches, punched cards, or
paper tapes.

As time went on, card readers, printers, and magnetic tape units were developed as
additional hardware elements. Assemblers, loaders and simple utility libraries were
developed as software tools. Later, off-line spooling and channel program methods were
developed in turn.

Finally, the idea of multiprogramming came. Multiprogramming means sharing resources
among more than one process. CPU time was no longer wasted: while one process carries
out some I/O work, the OS picks another process to execute until that one, in turn, passes to
an I/O operation.

With the development of interactive computation in the 1970s, time-sharing systems emerged.
In these systems, multiple users have terminals (not computers) connected to a main
computer, and their tasks execute in the main computer.

[Figure: a main computer, whose CPU executes processes by means of the OS (e.g. UNIX),
with terminals connected to it. Terminals are used for input and output only; no processing is
done on them.]

Another computer system is the multiprocessor system, which has multiple processors sharing
memory and peripheral devices. With this configuration, such systems have greater computing
power and higher reliability. Multiprocessor systems are classified into two types: tightly-coupled
and loosely-coupled (distributed). In the former, each processor is assigned a specific duty,
but the processors work in close association, possibly sharing memory. In the latter, each
processor has its own memory and its own copy of the OS.


The use of networks required OSs appropriate for them. In network systems, each process
runs on its own machine, but the OS can access other machines. In this way, file sharing,
messaging, etc. became possible. In networks, users are aware that they are working in a
network, and when information is exchanged, the user handles the transfer explicitly.

[Figure: several machines connected in a network, each a computer with its own CPU, RAM,
etc.; an OS supporting networks is installed on each of them.]

Distributed systems are similar to networks. However, in such systems there is no need to
exchange information explicitly; it is handled by the OS itself whenever necessary.

With continuing innovations, new architectures and compatible OSs are developed, but their
details are beyond the scope of this text, since the objective here is only to give a general
view of developments in the OS concept.


Chapter 2
Processor Scheduling

2.1 Processes
A process is an executing program, including the current values of the program counter,
registers, and variables. The subtle difference between a process and a program is that the
program is a group of instructions, whereas the process is the activity of executing them.

In multiprogramming systems, processes are performed in a pseudoparallelism as if each


process has its own processor. In fact, there is only one processor but it switches back and
forth from process to process.

Henceforth, by the execution of a process we mean the processor’s operations on the
process (changing its variables, etc.), and by I/O work we mean the interaction of the
process with I/O devices (reading from or writing to somewhere). These may also be named
the “processor (CPU) burst” and the “I/O burst”, respectively. According to these definitions,
we classify programs as

• Processor-bound program: a program having long processor bursts.
• I/O-bound program: a program having short processor bursts.

Assume we have two processes A and B. Both execute for 1 second and do some I/O work
for 1 second. This pattern is repeated 3 times for process A and 2 times for process B.

If we have no multiprogramming, the processes are executed sequentially as below.

A enters B enters Time

ExecA1 I/OA1 ExecA2 I/OA2 ExecA3 I/OA3 ExecB1 I/OB1 ExecB2 I/OB2

A leaves B leaves

So, the processor executes these two processes in a total time of 10 seconds. However, it is
idle during the I/O periods of the processes: it is idle for 5 seconds and utilized for 5 seconds.

Then the processor utilization is (5 / 10) × 100 = 50%

Now let’s consider multiprogramming case:


A enters B enters B leaves A leaves

CPU A1 B1 A2 B2 A3 Idle

I/O idle A1 B1 A2 B2 A3


In this case, when process A passes to some I/O work (i.e. does not use the processor), the
processor uses that time to execute process B instead of sitting idle.

Here the processor utilization is (5 / 6) × 100 ≈ 83%
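The two utilization figures can be reproduced with a one-line helper (a trivial sketch, just to check the arithmetic):

```python
def utilization(busy_time, total_time):
    """Processor utilization as a percentage of total elapsed time."""
    return 100 * busy_time / total_time

print(round(utilization(5, 10)))  # sequential case: 50
print(round(utilization(5, 6)))   # multiprogrammed case: 83
```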

2.2 Process States

[State diagram: START → READY → RUNNING → HALTED; RUNNING passes to WAITING
when I/O is requested, and WAITING returns to READY when the I/O is completed.]

Start : The process has just arrived.
Ready : The process is waiting to get the processor.
Running : The process has been allocated the processor.
Waiting : The process is doing I/O work or is blocked.
Halted : The process has finished and is about to leave the system.

In the OS, each process is represented by its PCB (Process Control Block). The PCB
generally contains the following information:

• Process State,
• Process ID,
• Program Counter (PC) value,
• Register values
• Memory Management Information (page tables, base/bound registers etc.)
• Processor Scheduling Information ( priority, last processor burst time etc.)
• I/O Status Info (outstanding I/O requests, I/O devices held, etc.)
• List of Open Files
• Accounting Info.

If we have a single processor in our system, there is only one running process at a time.
Other ready processes wait for the processor. The system keeps these ready processes
(their PCBs) on a list called Ready Queue which is generally implemented as a linked-list
structure.
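The PCB and the ready queue can be sketched as follows (a minimal illustration; the field names here are our own choice, not those of any particular OS):

```python
from collections import deque
from dataclasses import dataclass, field

@dataclass
class PCB:
    pid: int
    state: str = "ready"       # start / ready / running / waiting / halted
    pc: int = 0                # saved program counter value
    registers: dict = field(default_factory=dict)
    open_files: list = field(default_factory=list)

# The ready queue holds the PCBs of processes waiting for the processor.
ready_queue = deque([PCB(pid=1), PCB(pid=2)])

# The short-term scheduler picks the PCB at the head of the queue (FCFS).
running = ready_queue.popleft()
running.state = "running"
print(running.pid)  # → 1
```

A linked-list-style FIFO (here `deque`) makes both enqueueing a newly ready PCB and dequeueing the next one to run constant-time operations.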

When a process is allocated by the processor, it executes and after some time it either
finishes or passes to waiting state (for I/O). The list of processes waiting for a particular I/O
device is called a Device Queue. Each device has its own device queue.

In multiprogramming systems, the processor can be switched from one process to another.
Note that when an interrupt occurs, the PC and register contents of the running process (which
is being interrupted) must be saved so that the process can be continued correctly
afterwards. Switching between processes occurs as depicted below.


[Figure: context switching. While A executes, an interrupt occurs: PC_A and A's registers are
saved, then PC_B and B's registers are loaded, and B executes. Later PC_B and B's
registers are saved, PC_A and A's registers are reloaded, and the execution of A resumes.]

2.3 Scheduler

If we consider batch systems, there will often be more processes submitted than the number
of processes that can be executed immediately. So incoming processes are spooled (to a
disk). The long-term scheduler selects processes from this process pool and loads selected
processes into memory for execution.

The short-term scheduler selects the process to get the processor from among the
processes which are already in memory.

The short-term scheduler executes frequently (often at least once every 10 milliseconds), so
it has to be very fast in order to achieve good processor utilization. The short-term scheduler,
like all other OS programs, has to execute on the processor. If it takes 1 millisecond to
choose a process, that means 1 / (10 + 1) ≈ 9% of the processor time is being used for
short-term scheduling, and only 91% may be used by processes for execution.
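This overhead figure can be checked directly (the 10 ms and 1 ms values are the example numbers from the paragraph above, not universal constants):

```python
quantum_ms = 10     # time between two scheduling decisions
scheduler_ms = 1    # time the short-term scheduler itself consumes

overhead_pct = 100 * scheduler_ms / (quantum_ms + scheduler_ms)
print(round(overhead_pct))  # → 9, so about 91% is left for the processes
```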

The long-term scheduler on the other hand executes much less frequently. It controls the
degree of multiprogramming (no. of processes in memory at a time). If the degree of
multiprogramming is to be kept stable (say 10 processes at a time), then the long-term
scheduler may only need to be invoked when a process finishes execution.

The long-term scheduler must select a good process mix of I/O-bound and processor bound
processes. If most of the processes selected are I/O-bound, then the ready queue will
almost be empty while the device queue(s) will be very crowded. If most of the processes
are processor-bound, then the device queue(s) will almost be empty while the ready queue
is very crowded and that will cause the short-term scheduler to be invoked very frequently.

Time-sharing systems (mostly) have no long-term scheduler. The stability of these systems
either depends upon a physical limitation (no. of available terminals) or the self-adjusting
nature of users (if you can't get response, you quit).

It can sometimes be good to reduce the degree of multiprogramming by removing processes


from memory and storing them on disk. These processes can then be reintroduced into
memory by the medium-term scheduler. This operation is also known as swapping.
Swapping may be necessary to improve the process mix or to free memory.

2.4 Performance Criteria


In order to achieve efficient processor management, the OS tries to select the most
appropriate process from the ready queue. For this selection, the relative importance of the
following may be considered as performance criteria.


Processor Utilization: The ratio of the processor’s busy time to the total time that passes until
the processes finish. We would like to keep the processor as busy as possible.

Processor Utilization = (Processor busy time) / (Processor busy time + Processor idle time)

Throughput: The measure of work done in a unit time interval.

Throughput = (Number of processes completed) / (Time Unit)

Turnaround Time (tat): The sum of time spent waiting to get into the ready queue,
execution time and I/O time.

tat = t(process completed) – t(process submitted)

Waiting Time (wt): Time spent in ready queue. Processor scheduling algorithms only affect
the time spent waiting in the ready queue. So, considering only waiting time instead of
turnaround time is generally sufficient.

Response Time (rt): The amount of time it takes to start responding to a request. This
criterion is important for interactive systems.

rt = t(first response) – t(submission of request)

We normally want to maximize the processor utilization and throughput, and to minimize tat,
wt, and rt. However, sometimes other combinations may be required, depending on the
processes.
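As a small numeric illustration of these definitions (the event times below are hypothetical): turnaround time comes from submission and completion, waiting time from the ready-queue intervals, and response time from the first moment the process gets the processor:

```python
# Hypothetical event times for one process (all in seconds).
t_submitted = 0
t_completed = 34
t_first_response = 0
# (time it entered the ready queue, time it got the processor) pairs:
waiting_intervals = [(0, 0), (8, 15), (23, 30)]

tat = t_completed - t_submitted
wt = sum(got - entered for entered, got in waiting_intervals)
rt = t_first_response - t_submitted
print(tat, wt, rt)  # → 34 14 0
```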

2.5 Processor Scheduling Algorithms

Now let’s discuss some processor scheduling algorithms, again stating that the goal is to
select the most appropriate process in the ready queue. For the sake of simplicity, we will
assume that we have a single I/O server with a single device queue, that the device queue is
always implemented in FIFO manner, and that the switching time between processes
(context switching) is negligible.

2.5.1 First-Come-First-Served (FCFS)

In this algorithm, the process to be selected is the one that requests the processor first: the
process whose PCB is at the head of the ready queue. Despite its simplicity, its performance
may often be poor compared to other algorithms.

FCFS may cause processes with short processor bursts to wait for a long time. If one
process with a long processor burst gets the processor, all the others wait for it to release
the processor, and the ready queue fills up. This is called the convoy effect.

Example 2.1

Consider the following information and draw the timing (Gantt) chart for the processor and
the I/O server using the FCFS algorithm for processor scheduling.


Process Arrival time 1st exec 1st I/O 2nd exec 2nd I/O 3rd exec
A 0 4 4 4 4 4
B 2 8 1 8 - -
C 3 2 1 2 - -
D 7 1 1 1 1 1

A B C D B C A D

CPU A1 B1 C1 D1 A2 B2 C2 D2 A3 D3
0 2 3 4 7 12 14 15 19 27 29 30 34 35

I/O A1 B1 C1 D1 A2 D2
0 4 8 12 13 14 15 16 19 23 30 31

Processor utilization = (35 / 35) * 100 = 100 %

Throughput = 4 / 35 ≈ 0.11

tatA = 34 – 0 = 34
tatB = 27 – 2 = 25
tatC = 29 – 3 = 26
tatD = 35 – 7 = 28

tatAVG = (34 + 25 + 26 + 28) / 4 = 28.25

wtA = (0 – 0) + (15 – 8) + (30 – 23) = 14


wtB = (4 – 2) + (19 – 13) = 12
wtC = (12 – 3) + (27 – 15) = 21
wtD = (14 – 7) + (29 – 16) + (34 – 31) = 23

wtAVG = (14 + 12 + 21 + 23) / 4 = 17.5

rtA = 0 – 0 = 0
rtB = 4 – 2 = 2
rtC = 12 – 3 = 9
rtD = 14 – 7 = 7

rtAVG = (0 + 2 + 9 + 7) / 4 = 4.5
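For CPU-only jobs (no I/O), FCFS waiting times can be sketched in a few lines. This is our own illustration, not code from the notes; the three jobs reuse the arrival times and first bursts of A, B and C above:

```python
def fcfs(jobs):
    """jobs: list of (name, arrival, burst) sorted by arrival time.
    Returns {name: waiting_time}, assuming no I/O."""
    time, waits = 0, {}
    for name, arrival, burst in jobs:
        time = max(time, arrival)      # CPU may sit idle until the job arrives
        waits[name] = time - arrival   # time spent in the ready queue
        time += burst                  # run the whole burst, non-preemptively
    return waits

print(fcfs([("A", 0, 4), ("B", 2, 8), ("C", 3, 2)]))  # → {'A': 0, 'B': 2, 'C': 9}
```

Note how C’s short burst waits 9 units behind B’s long one: a small instance of the convoy effect.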

2.5.2 Shortest-Process-First (SPF)

In this method, the processor is assigned to the process with the smallest execution
(processor burst) time. This requires knowledge of the execution times, which are given as a
table in our examples; in reality these burst times are not known by the OS, so the OS makes
a prediction. One approach to this prediction is to use the previous processor burst times of
the processes in the ready queue; the algorithm then selects the shortest predicted next
processor burst time.


Example 2.2 :

Consider the same process table as in Example 2.1 and draw the timing charts of the processor
and the I/O device, assuming SPF is used for processor scheduling. (Assume FCFS for I/O.)

A B C D C D B A

CPU A1 C1 B1 D1 C2 D2 A2 D3 B2 A3
0 2 3 4 6 14 15 17 18 22 23 31 35

I/O A1 C1 B1 D1 D2 A2
0 4 8 9 14 15 16 18 19 22 26 35

Processor utilization = (35 / 35) * 100 = 100 %

Throughput = 4 / 35 = 0.11

tatA = 35 – 0 = 35
tatB = 31 – 2 = 29
tatC = 17 – 3 = 14
tatD = 23 – 7 = 16

tatAVG = (35 + 29 + 14 + 16) / 4 = 23.5

wtA = (0 – 0) + (18 – 8) + (31 – 26) = 15


wtB = (6 – 2) + (23 – 15) = 12
wtC = (4 – 3) + (15 – 9) = 7
wtD = (14 – 7) + (17 – 16) + (22 – 19) = 11

wtAVG = (15 + 12 + 7 + 11) / 4 = 11.25

rtA = 0 – 0 = 0
rtB = 6 – 2 = 4
rtC = 4 – 3 = 1
rtD = 14 – 7 = 7

rtAVG = (0 + 4 + 1 + 7) / 4 = 3
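Ignoring I/O, and assuming the burst times are known exactly (which, as noted, the OS can only predict), non-preemptive SPF amounts to: whenever the CPU is free, pick the arrived job with the shortest next burst. A sketch with hypothetical CPU-only jobs:

```python
def spf(jobs):
    """jobs: list of (name, arrival, burst). Returns the completion order."""
    pending = sorted(jobs, key=lambda j: j[1])   # sort by arrival time
    time, order = 0, []
    while pending:
        ready = [j for j in pending if j[1] <= time]
        if not ready:                            # CPU idle until next arrival
            time = min(j[1] for j in pending)
            continue
        job = min(ready, key=lambda j: j[2])     # shortest next burst wins
        pending.remove(job)
        time += job[2]                           # run it to completion
        order.append(job[0])
    return order

print(spf([("A", 0, 7), ("B", 2, 4), ("C", 3, 1), ("D", 6, 6)]))  # → ['A', 'C', 'B', 'D']
```

B arrives before C but C’s shorter burst lets it jump ahead once A finishes.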

2.5.3 Shortest-Remaining-Time-First (SRTF)

The scheduling algorithms we have discussed so far are all non-preemptive: once a process
grabs the processor, it keeps the processor until it terminates or requests I/O, no matter how
urgent the other waiting processes are.

To deal with this problem, preemptive algorithms were developed. In these algorithms, the
process being executed may at some instant be preempted in order to execute a newly
selected process. The preemption conditions are up to the algorithm design.

SPF algorithm can be modified to be preemptive. Assume while one process is executing on
the processor, another process arrives. The new process may have a predicted next
processor burst time shorter than what is left of the currently executing process. If the SPF
algorithm is preemptive, the currently executing process will be preempted from the


processor and the new process will start executing. The modified SPF algorithm is called the
Shortest-Remaining-Time-First (SRTF) algorithm.

Example 2.3

Consider the same process table as in Example 2.1 and draw the timing charts of the processor
and the I/O device, assuming SRTF is used for processor scheduling.

A B C D C D A B

CPU A1 C1 B1 D1 A2 C2 D2 A2 D3 A2 B1 A3 B2
0 2 3 4 6 7 8 9 11 12 13 14 16 23 27 35

I/O A1 C1 D1 D2 A2 B1
0 4 8 9 10 12 13 16 20 23 24 35

Processor utilization = (35 / 35) * 100 = 100 %

Throughput = 4 / 35 = 0.11

tatA = 27 – 0 = 27
tatB = 35 – 2 = 33
tatC = 11 – 3 = 8
tatD = 14 – 7 = 7

tatAVG = (27 + 33 + 8 + 7) / 4 = 18.75

wtA = (0 – 0) + (8 – 8) + (12 - 9) + (14 – 13) + (23 - 20) = 7


wtB = (6 – 2) + (16 – 7) + (27-24) = 16
wtC = (4 – 3) + (9 – 9) = 1
wtD = (7 – 7) + (11 – 10) + (13 – 13) = 1

wtAVG = (7 + 16 + 1 + 1) / 4 = 6.25

rtA = 0 – 0 = 0
rtB = 6 – 2 = 4
rtC = 4 – 3 = 1
rtD = 7 – 7 = 0

rtAVG = (0 + 4 + 1 + 0) / 4 = 1.25
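A minimal preemptive sketch of SRTF (CPU-only jobs, simulated one time unit at a time; at every tick the job with the shortest remaining time gets the processor, so a newcomer can preempt the current job). The job data is hypothetical:

```python
def srtf(jobs):
    """jobs: dict name -> (arrival, burst). Returns {name: completion_time}."""
    remaining = {n: burst for n, (arrival, burst) in jobs.items()}
    done, time = {}, 0
    while remaining:
        ready = [n for n in remaining if jobs[n][0] <= time]
        if not ready:              # no job has arrived yet: idle tick
            time += 1
            continue
        n = min(ready, key=lambda n: remaining[n])   # shortest remaining time
        remaining[n] -= 1          # run the chosen job for one time unit
        time += 1
        if remaining[n] == 0:
            del remaining[n]
            done[n] = time
    return done

print(srtf({"A": (0, 7), "B": (2, 4), "C": (3, 1), "D": (6, 6)}))
```

Here B preempts A at time 2 and C preempts B at time 3; the completion times come out as C at 4, B at 7, A at 12 and D at 18.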

2.5.4 Round-Robin Scheduling (RRS)

In RRS algorithm the ready queue is treated as a FIFO circular queue. The RRS traces the
ready queue allocating the processor to each process for a time interval which is smaller
than or equal to a predefined time called time quantum (slice).

The OS using RRS, takes the first process from the ready queue, sets a timer to interrupt
after one time quantum and gives the processor to that process. If the process has a
processor burst time smaller than the time quantum, then it releases the processor


voluntarily, either by terminating or by issuing an I/O request. The OS then proceeds with the
next process in the ready queue.

On the other hand, if the process has a processor burst time greater than the time quantum,
then the timer will go off after one time quantum expires, and it interrupts (preempts) the
current process and puts its PCB to the end of the ready queue.

The performance of RRS depends heavily on the selected time quantum.

• Time quantum → ∞ : RRS becomes FCFS.

• Time quantum → 0 : RRS becomes processor sharing (it acts as if each of the n
processes has its own processor running at 1/n of the processor speed).

For a good time quantum, it can be selected to be longer than about 80% of the processor
bursts and much greater than the context-switching time.
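The mechanism above can be sketched for CPU-only jobs (our own illustration with hypothetical jobs; by the convention used here, jobs arriving during a slice are enqueued before the preempted job, whose PCB goes to the end of the queue):

```python
from collections import deque

def round_robin(jobs, quantum):
    """jobs: list of (name, arrival, burst) sorted by arrival (no I/O).
    Returns {name: completion_time}."""
    queue, done, time, i = deque(), {}, 0, 0
    while i < len(jobs) or queue:
        if not queue:                    # CPU idle until the next arrival
            time = max(time, jobs[i][1])
        while i < len(jobs) and jobs[i][1] <= time:
            queue.append([jobs[i][0], jobs[i][2]])   # [name, remaining time]
            i += 1
        name, rem = queue.popleft()
        run = min(quantum, rem)          # run one quantum, or less if it ends
        time += run
        while i < len(jobs) and jobs[i][1] <= time:  # arrivals during the slice
            queue.append([jobs[i][0], jobs[i][2]])
            i += 1
        if rem > run:
            queue.append([name, rem - run])   # preempted: back of the queue
        else:
            done[name] = time                 # finished its burst
    return done

print(round_robin([("A", 0, 7), ("B", 1, 5), ("C", 2, 3), ("D", 5, 1)], quantum=3))
```

With these jobs and quantum 3, completion times come out as C: 9, D: 13, B: 15, A: 16.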

Example 2.4

Consider the following information and draw the timing chart for the processor and the I/O
server using the RRS algorithm with a time quantum of 3 for processor scheduling.

A B C D C D B A

CPU A1 B1 C1 A1 B1 D1 C2 B1 A2 D2 B2 A2 D3 B2 A3 B2 A3
0 3 6 8 9 12 13 15 17 20 21 24 25 26 29 32 34 35

I/O C1 A1 D1 B1 D2 A2
0 8 9 13 14 17 18 21 22 25 29 35

Processor utilization = (35 / 35) * 100 = 100 %

Throughput = 4 / 35 ≈ 0.11

tatA = 35 – 0 = 35
tatB = 34 – 2 = 32
tatC = 15 – 3 =12
tatD = 26 – 7 = 19

tatAVG = (35 + 32 + 12 + 19) / 4 = 24.5

wtA = (0 – 0) + (8 – 3) + (17 – 13) + (24 – 20) + (29 – 29) + (34 – 32) = 15

wtB = (3 – 2) + (9 – 6) + (15 – 12) + (21 – 18) + (26 – 24) + (32 – 29) = 15
wtC = (6 – 3) + (13 – 9) = 7
wtD = (12 – 7) + (20 – 14) + (25 – 22) = 14

wtAVG = (15 + 15 + 7 + 14) / 4 = 12.75

rtA = 0 – 0 = 0
rtB = 3 – 2 = 1
rtC = 6 – 3 = 3
rtD = 12 – 7 = 5

rtAVG = (0 + 1 + 3 + 5) / 4 = 2.25


         FCFS    SPF     SRTF    RRS (q = 3)
tatavg   28.25   23.5    18.75   24.5
wtavg    17.5    11.25   6.25    12.75
rtavg    4.5     3       1.25    2.25

FCFS is easy to implement. SPF and SRTF are not directly implementable, since the next
CPU burst cannot be known exactly and can only be predicted. RRS is implementable, and
its bounded maximum response time is important for interactive systems.

2.5.5 Priority Scheduling

In this type of algorithms a priority is associated with each process and the processor is
given to the process with the highest priority. Equal priority processes are scheduled with
FCFS method.

To illustrate, SPF is a special case of priority scheduling algorithm where

Priority(i) = 1 / next processor burst time of process i

Priorities can be fixed externally or they may be calculated by the OS from time to time.
Externally, if all users have to code time limits and maximum memory for their programs,
priorities are known before execution. Internally, a next processor burst time prediction such
as that of SPF can be used to determine priorities dynamically.

A priority scheduling algorithm can leave some low-priority processes in the ready queue
indefinitely: if the system is heavily loaded, there is a high probability that some higher-priority
process is ready to grab the processor. This is called the starvation problem. One solution to
the starvation problem is to gradually increase the priority of processes that stay in the
system for a long time (aging).
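The aging idea, gradually raising the priority of long-waiting processes, can be illustrated with a toy effective-priority function (the base values and the rate of 0.5 per time unit are arbitrary choices for this sketch):

```python
def aged_priority(base, time_in_queue, rate=0.5):
    """Effective priority (higher = scheduled first); it grows while waiting."""
    return base + rate * time_in_queue

low_but_old = aged_priority(base=1, time_in_queue=30)   # waited 30 time units
high_but_new = aged_priority(base=10, time_in_queue=0)  # just arrived
print(low_but_old > high_but_new)  # → True: 16.0 beats 10.0, no starvation
```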

Example 2.5

The following may be used as a priority-defining function:

Priority(n) = 10 + tnow – ts(n) – tr(n) – cpu(n)

where
ts(n) : the time process n was submitted to the system
tr(n) : the time process n last entered the ready queue
cpu(n) : the next processor burst length of process n
tnow : the current time

QUESTIONS

1. Construct an example to compare the Shortest Job First strategy of processor scheduling
with a Longest Job First strategy, in terms of processor utilization, turnaround time and
throughput.

2. The following jobs are in the given order in the ready queue:


Job CPU Burst(msec) Priority


A 6 3
B 1 1
C 2 3
D 1 4
E 3 2

None of these jobs have any I/O requirement.

a. What is the turnaround time of each job with First-Come-First-Served, Shortest Job First,
Round Robin (time quantum = 1) and non-preemptive priority scheduling? Assume that the
operating system has a context switching overhead of 1 msec for saving and another 1
msec for loading the registers of a process.

b. What is the waiting time for each job with each of the four scheduling techniques and the
assumption as in part a?

3. The following data is given for a computer system employing Round-Robin processor
scheduling with time slice = 5. If two processes are to enter the ready queue at the same
time, they enter in the alphabetical order of their names:

process arrival CPU I/O CPU


A 0 4 5 6
B 3 3 - -
C 10 2 7 7

a. Assuming that context switch time is 0, draw the Gantt chart for the above processes,
and calculate the average waiting time and CPU utilization.

b. Assuming context switch time is 1, repeat part 'a'.

c. Discuss briefly how the time slice should be selected to increase the system performance,
given average CPU burst time, average I/O time, and the context switch time.

4. Consider the following job arrival and CPU burst times given:

Job Arrival time CPU burst


A 0 7
B 2 4
C 3 1
D 6 6

a. Using the shortest job first scheduling method, draw the Gantt chart (showing the order of
execution of these jobs), and calculate the average waiting time for this set of jobs.

b. Repeat a. using the shortest remaining time first scheduling method.

c. What is the main difference between these two methods ?

5. Explain the following briefly:

a. What is an I/O bound job?


b. What is CPU bound job?

c. Suppose there is one I/O-bound job and one CPU-bound job in a ready queue. Which one
should get the processor in order to achieve a better CPU utilization?

d. Repeat c. for a better I/O device utilization.

6. A processor scheduling algorithm determines an order of execution of active processes in


a computer system.

a. If there are n processes to be scheduled on one processor, how many possible different
schedules are there? Give a formula in terms of n.

b. Repeat part a. for n processes to be scheduled on m processors.

7. Explain the following terms :

a. Long-term scheduler

b. Short-term scheduler

Which one controls the degree of multiprogramming?

8. a. Find and draw the Gantt chart for the following processes assuming a preemptive
shortest-remaining-time-first processor scheduling algorithm.

process arrival next CPU


A 0 12
B 2 9
C 5 2
D 5 7
E 9 3
F 10 1

Clearly indicate every preemption on your Gantt chart.

b. Calculate the turnaround times for each process, and find the average turnaround time for
the above processes.

9. Consider a priority-based processor scheduling algorithm in which priorities are computed
as the ratio of processor execution time to real time (total elapsed time).

a. Does this algorithm give higher priority to processor bound processes, or to I/O bound
processes? Explain.

b. If priorities are recomputed every second, and if there is no I/O, to which algorithm does
this processor scheduling algorithm degenerate to?

10. Show the execution of the following processes on Gantt charts for the processor and
the I/O device if the Shortest Process First algorithm is used for the processor. Assume that
the I/O queue serves processes in First-Come-First-Served manner. Find also the average
waiting time in the ready queue.


CPU and I/O Bursts

Arrival Process 1st CPU 1st I/O 2nd CPU 2nd I/O 3rd CPU
0 A 4 4 4 4 4
2 B 6 2 6 2 -
3 C 2 1 2 1 -
7 D 1 1 1 1 1

11. Show the execution of the following processes on Gantt charts for the processor and
the I/O device if the Shortest Remaining Time First algorithm is used for the processor. Find
also the average waiting time in the ready queue. If processes are to enter the ready
queue at the same time due to a. preemption of the processor, b. new submission, or c.
completion of an I/O operation, then the process of case a. will be in front of the process of
case b., which will be in front of the process of case c. Assume that the I/O queue works in
First-Come-First-Served manner.

CPU and I/O Bursts

Arrival Process 1st CPU 1st I/O 2nd CPU 2nd I/O 3rd CPU
0 A 4 4 4 4 4
2 B 8 1 8 1 -
3 C 1 1 1 1 -

12. Consider a priority-based processor scheduling algorithm such that whenever there is a
process in the ready queue having higher priority than the one executing in the CPU, it
forces the executing process to release the CPU, and itself grabs the CPU. If there is
more than one such process having the highest priority, then among these the one that
entered the ready queue first is chosen. Assume that all the processes have a single CPU
burst and no I/O operation, and consider the following table:

Process id Submit (i) msec Burst(i) msec


P1 0 8
P2 2 4
P3 3 2

where Submit(i) denotes when process Pi is submitted into the system (assume processes
are submitted just before the tick of the clock) and Burst(i) denotes the length of the CPU
burst of process Pi. Let Tnow denote the present time and Execute(i) denote the total
execution of process Pi on the CPU until Tnow. Assuming that the priorities are calculated
every msec (just after the tick of the clock), draw the corresponding Gantt charts for the
following priority formulas, which are valid once a process has been submitted but has not
yet terminated:

a. Priority(i)= Tnow-Submit(i)

b. Priority(i)= Burst(i)-Execute(i)

c. Priority(i) = (Burst(i) – Execute(i))^(–1)

13. Consider the following job arrival and CPU and I/O burst times. Assume that context
switching time is negligible in the system and there is a single I/O device, which operates in
First-Come-First-Served manner:


process arrival t. CPU-1 I/O-1 CPU-2


A 0 2 4 4
B 1 1 5 2
C 8 2 1 3

Draw the Gantt charts both for the CPU and the I/O device, and then find the average
turnaround time and CPU utilization for

a. first-come-first-served

b. round-robin

processor scheduling algorithms.

14. Consider the following job arrival:

process arrival time next CPU burst


A 0 7
B 1 5
C 2 3
D 5 1

a. Draw the Gantt chart for the CPU if nonpreemptive SPF scheduling is used
b. Repeat part a. if preemptive SPF scheduling is used
c. Repeat part a. if nonpreemptive priority-based CPU scheduling is used with

priority (i)= 10 + tnow - tr(i) – cpu(i)

where ts(i) = the time process i is submitted to the system,

tr(i) = the time process i last entered the ready queue,
cpu(i) = the next CPU burst length of process i,
tnow = the current time.

The process having the highest priority grabs the CPU. If more than one process has the
same priority, the one that comes first in alphabetical order is chosen.

d. For each of above cases indicate if starvation may occur or not

EE442 Operating Systems Ch. 3 Memory Management

Chapter 3
Memory Management

In a multiprogramming system, in order to share the processor, a number of processes must


be kept in memory. Memory management is achieved through memory management
algorithms. Each memory management algorithm requires its own hardware support. In this
chapter, we shall see the partitioning, paging and segmentation methods.

In order to be able to load programs anywhere in memory, the compiler must generate relocatable object code. We must also make sure that a program in memory addresses only its own area, and no other program's area. Therefore, some protection mechanism is also needed.

3.1 Fixed Partitioning

In this method, memory is divided into partitions whose sizes are fixed. The OS is placed into the lowest bytes of memory. Processes are classified on entry to the system according to their memory requirements. We need one Process Queue (PQ) for each class of process.

If a process is selected to be allocated memory, then it goes into memory and competes for the processor. The number of fixed partitions gives the degree of multiprogramming. Since each queue has its own memory region, there is no competition between queues for memory.

Memory
+-----------------+
|       OS        |
+-----------------+
| Small   (n KB)  |
+-----------------+
| Medium  (3n KB) |
+-----------------+
| Large   (6n KB) |
+-----------------+

Fixed Partitioning with Swapping

This is a version of fixed partitioning that uses round robin scheduling (RRS) with some time quantum. When the time quantum of a process expires, it is swapped out of memory to disk, and the next process in the corresponding process queue is swapped into memory.

OS

2K P1 P3

P2
6K
P4 P5

empty
12K empty


Normally, a process swapped out will eventually be swapped back into the same partition.
But this restriction can be relaxed with dynamic relocation.

In some cases, an executing process may request more memory than its partition size. Say we have a 6 KB process running in a 6 KB partition, and it now requires 1 KB more memory. Then, the following policies are possible:

• Return control to the user program. Let the program decide either to quit or to modify its operation so that it can run (possibly more slowly) in less space.

• Abort the process. (The user states the maximum amount of memory that the process
will need, so it is the user’s responsibility to stick to that limit)

• If dynamic relocation is being used, move the process to the queue of the next larger partition class, and swap it into such a partition when its turn comes.

The main problem with the fixed partitioning method is how to determine the number of
partitions, and how to determine their sizes.

If a whole partition is currently unused, this waste is called external fragmentation. If a partition is being used by a process requiring less memory than the partition size, the unused part within the partition is called internal fragmentation.

OS
2K  | P1 (2K)
6K  | Empty (6K)   <- external fragmentation
12K | P2 (9K)
    | Empty (3K)   <- internal fragmentation

In this composition of memory, if a new process, P3, requiring 8 KB of memory arrives, it cannot be loaded because of fragmentation, even though there is enough total free space in memory.

3.2 Variable Partitioning


With fixed partitions we have to deal with the problem of determining the number and sizes
of partitions to minimize internal and external fragmentation. If we use variable partitioning
instead, then partition sizes may vary dynamically.

In the variable partitioning method, we keep a table (linked list) indicating used/free areas in
memory. Initially, the whole memory is free and it is considered as one large block. When a
new process arrives, the OS searches for a block of free memory large enough for that
process. We keep the rest available (free) for the future processes. If a block becomes free,
then the OS tries to merge it with its neighbors if they are also free.

There are three algorithms for searching the list of free blocks for a specific amount of
memory.

First Fit : Allocate the first free block that is large enough for the new process. This is a fast
algorithm.


Best Fit : Allocate the smallest block among those that are large enough for the new
process. In this method, the OS has to search the entire list, or it can keep it sorted and stop
when it hits an entry which has a size larger than the size of new process. This algorithm
produces the smallest left over block. However, it requires more time for searching all the list
or sorting it.

Worst Fit : Allocate the largest block among those that are large enough for the new process. Again, a search of the entire list, or keeping it sorted, is needed. This algorithm produces the largest leftover block.

Example 3.1
Consider the following memory map and assume a new process P4 comes with a memory
requirement of 3 KB. Locate this process.

OS
P1
<free> 10 KB
P2
<free> 16 KB
P3
<free> 4 KB

a. The first fit algorithm allocates from the 10 KB block.
b. The best fit algorithm allocates from the 4 KB block.
c. The worst fit algorithm allocates from the 16 KB block.

New memory arrangements with respect to each algorithm will be as follows:

OS OS OS
P1 P1 P1
P4 <free> 10 KB <free> 10 KB
<free> 7 KB P2 P2
P2 <free> 16 KB P4
<free> 16 KB P3 <free> 13 KB
P3 P4 P3
<free> 4 KB <free> 1 KB <free> 4 KB
First Fit Best Fit Worst Fit

At this point, if a new process, P5, of 14 KB arrives, then it would have to wait if the worst fit algorithm were used, whereas it could be placed in memory with either of the others.

Compaction: Compaction is a method to overcome the external fragmentation problem. All


free blocks are brought together as one large block of free space. Compaction requires
dynamic relocation. Certainly, compaction has a cost and selection of an optimal compaction
strategy is difficult. One method for compaction is swapping out those processes that are to
be moved within the memory, and swapping them into different memory locations.

OS
P1 OS
<free> 20 KB P1
P2 Compaction P2
<free> 7 KB P3
P3 <free> 37 KB
<free> 10 KB
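Compaction as pictured above can be sketched as sliding all allocated blocks to the low end of memory, so that the free space coalesces into one block. The process sizes in the check below are assumed for illustration (the figure does not give them); they are chosen so that the three free holes of 20, 7 and 10 KB merge into a single 37 KB block.

```python
# Minimal compaction sketch: slide allocated blocks to the low end of memory.

def compact(blocks, total):
    """blocks: list of (name, size) for allocated regions, in address order.
    Returns ({name: new_base}, size_of_single_free_block)."""
    relocation, base = {}, 0
    for name, size in blocks:
        relocation[name] = base  # new base address after the slide
        base += size
    return relocation, total - base
```

Note that every moved process needs dynamic relocation: its base address changes, so all of its addresses must be re-bound.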


3.3 Paging
Paging permits a program to allocate noncontiguous blocks of memory. The OS divides programs into pages, which are blocks of small, fixed size. Then, it divides the physical memory into frames, which are blocks of the same size as pages. The OS uses a page table to map program pages to memory frames. The page size (S) is defined by the hardware. Generally the page size is chosen as a power of 2, such as 512 words/page or 4096 words/page.

With this arrangement, the words in the program have an address called the logical address. Every logical address consists of

• A page number p where p = logical address div S


• An offset d where d = logical address mod S

When a logical address <p, d> is generated by the processor, first the frame number f
corresponding to page p is determined by using the page table and then the physical
address is calculated as (f*S+d) and the memory is accessed.
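The translation just described can be sketched directly:

```python
# Logical-to-physical address translation in paging.
# S is the page size; page_table maps page number -> frame number.

def translate(logical, page_table, S):
    p, d = divmod(logical, S)   # p = logical div S, d = logical mod S
    f = page_table[p]           # look up the frame for this page
    return f * S + d            # physical address
```

Using the numbers of Example 3.2 (S = 8 words, P0 in f3, P1 in f6, P2 in f4): word 9 is page 1, offset 1, so it maps to 6*8 + 1 = 49, in agreement with the binary table of the example.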

Logical memory Physical memory

P0 page frame Attributes f0


0 4
P1 1 3 P2 f1
2 1
P2 3 5 f2

P3 Page Table P1 f3

P0 f4

P3 f5

The address translation in paging is shown below

Logical address Physical address

p d f d

d page frame Attributes d


p f
p f

Logical Memory Page Table Physical Memory


Example 3.2

Consider the following information to form a physical memory map.


Page Size = 8 words
Physical Memory Size = 128 words
A program of 3 pages where P0 → f3; P1 → f6; P2 → f4

Logical Program Physical Memory

Word 0 … …
Word 1 Page 0 Word 0
… (P0) Page Table Word 1 Frame 3
Word 7 … (f3)
Word 8 Page Frame Word 7
Word 9 Page 1 0 3 Word 16
… (P1) 1 6 Word 17 Frame 4
Word 15 2 4 … (f4)
Word 16 Word 23
Word 17 Page 2
… (P2) … …
Word 23
Word 8
Word 9 Frame 6
… (f6)
Word 15
… …

Program Logical Offset Page Frame Physical


Line Address Number Number Address
Word 0 00 000 000 00 011 011 000
Word 1 00 001 001 00 011 011 001
… … … … … …
Word 7 00 111 111 00 011 011 111
Word 8 01 000 000 01 110 110 000
Word 9 01 001 001 01 110 110 001
… … … … … …
Word 15 01 111 111 01 110 110 111
Word 16 10 000 000 10 100 100 000
Word 17 10 001 001 10 100 100 001

Word 23 10 111 111 10 100 100 111

How to Implement The Page Table?

Every access to memory should go through the page table. Therefore, it must be
implemented in an efficient way.

a. Using fast dedicated registers

Keep the page table in fast dedicated registers. Only the OS is able to modify these registers. However, if the page table is large, this method becomes very expensive, since it requires too many registers.


Logical address
PTLR: Page Table Length Register
p d

Physical address
f d
YES Access PT Access
p<PTLR? in Registers memory

rat mat
NO

ERROR

Given a logical address to access the word in physical memory, first access the PT stored in
registers, which requires register access time (rat), and then find out the physical address
and access the physical memory, which requires memory access time (mat). Therefore
effective memory access time (emat) becomes:

emat= rat + mat

b. Keep the page table in main memory

In this method, the OS keeps the page table in main memory. But this is a time-consuming method, because for every logical memory reference two memory accesses are required:
1. One to access the page table in memory, in order to find the corresponding frame number.
2. One to access the memory word in that frame.

Logical address
PTBR: Page Table Base Register
p d PTLR: Page Table Length Register

Physical address
f d
Access PT Access
p<PTLR? entry memory
in Memory
at address mat
PTBR + p
NO
mat
. ERROR

In this approach emat is:

emat= 2 * mat


c. Use content-addressable associative registers

These are small, high-speed registers built in a special way so that they permit an associative search over their contents. That is, all registers may be searched simultaneously in one machine cycle. However, associative registers are quite expensive, so only a small number of them should be used.

Logical address
PTBR: Page Table Base Register
p d PTLR: Page Table Length Register

Physical address
f d
YES Search PT
p<PTLR? in AR FOUND?
YES (hit)
rat
NO
NO (miss) Physical address
f d
ERRO Access PT Access
entry memory
in Memory
at address mat
PTBR + p
mat

When a logical memory reference is made, first the corresponding page number is searched in the associative registers. If that page number is found in one of the associative registers (hit), then the corresponding frame number is retrieved; otherwise (miss), the page table in memory is accessed to find the frame number, and that <page number, frame number> pair is stored into the associative registers. Once the frame number is obtained, the memory word is accessed.

The hit ratio is defined as the percentage of times that a page number is found in the associative registers. The hit ratio is important for system performance since it affects the effective memory access time. If the page number is found in the associative registers, only one memory access is required, whereas if it is not found, two memory accesses are needed. So, the greater the hit ratio, the smaller the effective memory access time, which is calculated as follows:

emat= h *ematHIT + (1-h) * ematMISS

where

h = The hit ratio


ematHIT = effective memory access time when there is a hit = rat + mat
ematMISS = effective memory access time when there is a miss = rat + mat + mat
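The formula can be written out as a small function; the check reuses the figures of Example 3.3 (rat = 30 ns, mat = 470 ns, h = 0.9), which give emat = 547 ns.

```python
# Effective memory access time with associative registers, per the formula above.

def emat(h, rat, mat):
    emat_hit = rat + mat       # register search + one memory access
    emat_miss = rat + 2 * mat  # register search + PT access + word access
    return h * emat_hit + (1 - h) * emat_miss
```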


Example 3.3

Assume we have a paging system which uses associative registers. These associative
registers have an access time of 30 ns, and the memory access time is 470 ns. The system
has a hit ratio of 90 %.

Now, if the page number is found in one of the associative registers, then the effective
access time:

ematHIT = 30 + 470 = 500 ns.

Because one access to associative registers and one access to the main memory is
sufficient.

On the other hand, if the page number is not found in associative registers, then the effective
access time:
ematMISS = 30 + (2 * 470) = 970 ns.

Since one access to the associative registers and two accesses to the main memory are required.

Then, the emat is calculated as follows:

emat = 0.9 * 500 + 0.1 * 970


= 450 + 97 = 547 ns

Sharing Pages

Sharing pages is possible in a paging system, and is an important advantage of paging. It is


possible to share system procedures or programs, user procedures or programs, and
possibly data area. Sharing pages is especially advantageous in time-sharing systems. A
reentrant program (non-self-modifying code = read only) never changes during execution.
So, more than one process can execute the same code at the same time. Each process will
have its own data storage and its own copy of registers to hold the data for its own execution
of the shared program.

Example 3.4

Consider a system having page size=30 MB. There are 3 users executing an editor program
which is 90 MB (3 pages) in size, with a 30 MB (1 page) data space.

To support these 3 users, the OS must allocate 3 * (90+30) = 360 MB space. However, if the
editor program is reentrant (non-self-modifying code = read only), then it can be shared
among the users, and only one copy of the editor program is sufficient. Therefore, only 90 +
30 * 3 = 180 MB of memory space is enough for this case.


User-1 PT-1 Physical


P0 e1 Page# Frame# Memory
P1 e2 0 8 f0
P2 e3 1 4 f1
P3 data1 2 5 f2
3 7 f3
f4 e2
User-2 PT-2 f5 e3
P0 e1 Page# Frame# f6
P1 e2 0 8 f7 data1
P2 e3 1 4 f8 e1
P3 data2 2 5 f9
3 12 f10 data3
f11
User-3 PT-3 f12 data 2
P0 e1 Page# Frame# f13
P1 e2 0 8 f14
P2 e3 1 4 f15
P3 data3 2 5
3 10

3.4 Segmentation
In segmentation, programs are divided into variable size segments, instead of fixed size
pages. Every logical address is formed of a segment name and an offset within that
segment. In practice, segments are numbered. Programs are segmented automatically by
the compiler or assembler.

For example, a C compiler will create separate segments for:


1. the code of each function
2. the local variables for each function
3. the global variables.

Main

Func 1 Func 2

Data 1
Data 2

Data 3


For logical to physical address mapping, a segment table is used. When a logical address
<s, d> is generated by the processor:
1. Base and limit values corresponding to segment s are determined using the
segment table
2. The OS checks whether d is within the limit (0 ≤ d < limit)
3. If so, then the physical address is calculated as (base + d), and the memory is
accessed.

logical address
s d

base

d
acess the Segment S
YES word at
0 ≤ d < limit physical
Seg. # Limit base Attr. address =
NO base + d

ERROR
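The mapping steps above can be sketched as:

```python
# Segmentation address translation with the limit check.
# segment_table maps segment number -> (limit, base).

def translate(s, d, segment_table):
    limit, base = segment_table[s]
    if not (0 <= d < limit):
        raise ValueError("segmentation violation")  # offset outside the segment
    return base + d
```

With the segment table of Example 3.5, logical address <1,123> passes the limit check (123 < 200) and maps to 5500 + 123 = 5623.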

Example 3.5

Generate the memory map according to the given segment table. Assume the generated
logical address is <1,123>; find the corresponding physical address.

Segment Limit Base physical


memory
0 1500 1000
0
1 200 5500
2 700 6000 1000
3 2000 3500 s0
2500
3500
Now, check segment table entry for segment 1. s3
The limit for segment 1 is 200. Since 123 < 200,
we carry on. The physical address is calculated as s1 5500
5500 + 123 = 5623, and the memory word 5623 is 5700
accessed. s2 6000
6700

Segment tables are also implemented in the main memory or in associative registers, in the
same way it is done for page tables.


Sharing Segments

Sharing of segments is possible, as in paging. Shared segments should be read only and should be assigned the same segment number.

Example 3.6:

Consider a system in which 3 users execute an editor program which is 1500 KB in size, each having their own data space.

user-1 ST-1
seg lim base
editor 0 1500 1000
1 2000 3500 physical
data-1 memory
0
1000
editor

user-2 2500
ST-2
3500
seg lim base
editor data-1
0 1500 1000
1 200 5500
data-2 5500
data-2 5700
data-3 6000
6700

user-3 ST-3
seg lim base
editor 0 1500 1000
1 700 6000

data-3


3.5 Paged Segmentation


The idea is to page the segments and so eliminate the external fragmentation problem. In paged segmentation, the logical address is a <s,p,d> triplet. The ST entry for segment S now contains:

• the length of segment S


• the base address of the PT for segment S.

There is a separate PT for every segment. On the average, now there is half a page of
internal fragmentation per segment. However, more table space is needed. In the worst
case, again three memory accesses are needed for each memory reference.

The flowchart for accessing a word with logical address <s,p,d> is shown below.

0 pd limit PT for
s pd p d segment S
ST
STBR NO
+
ERROR f
limit base
+
f d

d
f

STBR: Segment Table Base Register
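The two-level lookup of the flowchart can be sketched as below. The table contents in the check are illustrative only (they do not come from the text); the segment table maps each segment to its length and its own page table.

```python
# <s,p,d> translation in paged segmentation.
# seg_table maps segment number -> (segment length, page table);
# each page table maps page number -> frame number; S is the page size.

def translate(s, p, d, seg_table, S):
    length, page_table = seg_table[s]
    if p * S + d >= length:
        raise ValueError("offset beyond segment length")
    f = page_table[p]   # second lookup: page -> frame
    return f * S + d    # physical address
```

In the worst case (ST and PT both in memory) this costs three memory accesses per reference: one for the ST entry, one for the PT entry, and one for the word itself.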

QUESTIONS
1. Why do we need memory management and CPU scheduling algorithms for a multiuser
system ? Can we do without these algorithms? Explain briefly.

2. a. Explain the terms: internal fragmentation and external fragmentation.

b. List the memory management methods discussed, and indicate the types of fragmentation
caused by each method.


3. Consider a multiuser computer system which uses paging. The system has four
associative registers. the content of these registers and page table for user_12 are given
below:

Page table for user_12 associative registers


0 9 user # page # frame #
1 6 12 3 7
2 15 5 2 18
3 7 12 4 42
4 42 9 0 10

PTLR[12]:5 PTBR[12]:50000
PAGE SIZE :1024 words

For the following logical addresses generated by user_12's program, calculate the physical
addresses, explain how those physical addresses are found, and state the number of
physical memory accesses needed to find the corresponding word in memory. If a given
logical address is invalid, explain the reason.

i. <2,1256> ii. <3,290>


iii. <4,572> iv. <5,290>
v. <0,14>

4. The following memory map is given for a computer system with variable partitioning
memory management.

0 Job Requested memory


J1 (50K)
1) J4 arrives 10K
free (45K) 2) J5 arrives 20K
3) J6 arrives 15K
J2 (40K) 4) J7 arrives 20K
free (10K) 5) J3 leaves
J3 (20K) 6) J8 arrives 50K

free (30K)

Find and draw the resulting memory maps, after each step of the above job sequence is
processed, for :

a. first-fit b. best-fit c. worst-fit

5. Consider a computer system which uses paging. Assume that the system also has
associative registers.

a. Explain how logical addresses are translated to physical addresses.


b. Calculate the effective memory access time given:

assoc. register access time = 50 nanosec.


memory access time = 250 nanosec.
hit ratio = 80%

c. With the above associative register and memory access times, calculate the minimum hit
ratio to give an effective memory access time less than 320 nanoseconds.

6. A system which utilizes segmentation gives you the ability to share segments. If the
segment numbers are fixed, then a shared segment table is needed, and the ST must be
modified. We shall assume that the system is capable of dynamic relocation, but to reduce
the overhead, we want to avoid it unless it is absolutely necessary. The following example is
given for such a system :

ST-6 ST-9
s# base size shares s# base size shares
0 - - 256 0 190 100 -
1 0 100 - 1 - - 256
2 100 90 - 2 290 10 -
3 600 15 -

SST
s# base size no. of sh.
256 400 200 2

Assume maximum number of segments per process is 256, and segments are numbered
starting with 0.

a. What would be done when a segment previously unshared, becomes a shared segment?

b. When do we need dynamic relocation in this system?

c. Assume segment-2 of process 6 is being executed. A reference is made to segment-0 of


process 6. How is the corresponding physical address going to be found?

d. How would self-references within shared segments be handled?

e. What is the no. of sharers field in SST used for?

7. In the X-2700 computer, logical addresses are 24 bits long. The machine implements
paged segmentation with a maximum segment size of 64K words and 4K-word pages:

a. Show the logical address structure indicating the segment, page and displacement bits.

b. How many segments can a user process contain?

c. If a process has to be fully loaded into memory to execute, what is the minimum physical
memory capacity?

d. If the memory unit contains 65536 frames, show the physical address structure.


e. Show the functional block structure of a suitable architecture for implementing paged
segmentation in the X-2700. Indicate the sizes of all necessary tables.

8. Given the memory map in the figure, where areas not shaded indicate free regions,
assume that the following events occur:

Step event required contiguous P1


memory size (K)
----------------------------------------------- <free> 30 K
i) process 5 arrives 16
P2
ii) process 6 arrives 40
iii) process 7 arrives 20 <free> 20 K
iv) process 8 arrives 14
v) process 5 leaves - P3
vi) process 9 arrives 30
<free> 50 K

P4

a. Draw the memory maps after step (iv) and (vi) using first fit, best-fit and worst-fit allocation
techniques, without compaction

b. Draw the same memory maps as in part (a) if compaction is performed whenever
required. Also show the maps after each compaction.

9. In a paging system , the logical address is formed of 20 bits. the most significant 8 bits
denote the page number, the least significant 12 bits denote the offset. Memory size is 256K
bits.

a. What is the page size (in bits)?

b. What is the maximum number of pages per process?

c. How many frames does this system have?

d. Give the structure of the page table of a process executing in this system. Assume 2 bits
are reserved for attributes.

e. How many bits are required for each page table entry?

f. If physical memory is upgraded to 1024 K bits, how will your answer to c and e change?

10. Consider a segmentation system with the following data given:

STBR=1000
STLR=5
Associative Registers access time = 50 nsec
Memory Access time = 500 nsec


ST AR
s# base limit s# base limit
0 10000 1000 0 10000 1000
1 12000 2000 1 12000 2000
2 25000 4000
3 15000 8000
4 38000 4000

Assume data can be written into associative registers in parallel with a memory read or write
operation. For replacement of data in associative registers, LRU policy is used.

For each of the following logical addresses, find the corresponding physical address to be
accessed, and the total execution time required for that access, including the time spent for
address translation operations. Also indicate which memory locations are accessed during
address translation. Clearly show all changes made in associative registers.

a. <0,150> b. <0,3700> c. <2,900>


d. <2,3780> e. <5,200> f. <1,200>

11.
P1 Consider the memory map given in the figure. If worst fit
policy is to be applied, then what will be the memory map
<free> 9K after arrival of the processes
P2
P5=3K, P6=5K, P7=7K P8=6K.
<free> 20 K
Indicate if compaction is needed.
P3

<free> 14 K

P4

12. The following memory map is given for a computer system with variable partitioning
memory management.

P1 9K event required contiguous memory size (K)


i) P4 arrives 16
<free> 20 K ii) P5 arrives 40
iii) P6 arrives 20
P2 11K
iv) P7 arrives 14
<free> 10 K
Find and draw the resulting memory maps after the above job
P3 18K sequence is processed completely for

<free> 30 K a. first fit b. best fit c. worst fit

indicating whenever a compaction is needed.

EE442 Operating Systems Ch. 4 Virtual Memory

Chapter 4
Virtual Memory

All the memory management policies we have discussed so far, try to keep a number of
processes in the memory simultaneously to allow multiprogramming. But they require an
entire process to be loaded in the memory before it can execute.

With the virtual memory technique, we can execute a process which is only partially loaded
in memory. Thus, the logical address space may be larger than physical memory, and we
can have more processes executing in memory at a time, hence a greater degree of
multiprogramming.

4.1 Demand Paging


Demand paging is the most common virtual memory management system. It is based on the
locality model of program execution. As a program executes, it moves from locality to
locality.

Locality is defined as a set of pages actively used together. A program is composed of several different localities, which may overlap. For example, a small procedure, when called, defines a new locality; a while-do loop, while being executed, defines another.

Procedural languages with while-do, repeat-until and for-do structures (e.g. Pascal, Algol, C) have less frequently changing localities than high-level languages with go-to structure (e.g. Basic, Fortran).

In demand paging, programs reside on a swapping device commonly known as the backing
store. The backing store, for most of today’s operating systems is a disk.

When the operating system decides to start a new process, it swaps only a small part of this
new process (a few pages) into memory. The page table of this new process is prepared and
loaded into memory, and the valid/invalid bits of all pages that are brought into memory are
set to “valid”. All the other valid/invalid bits are set to “invalid” showing that those pages are
not brought into memory.

If the process currently executing tries to access a page which is not in the memory, a page
fault occurs, and the OS brings that page into the memory. If a process causes page fault
then the following procedure is applied:

1. Trap to the OS.


2. Save registers and process state for the current process.
3. Check if the trap was caused because of a page fault and whether the page reference is
legal.


4. If yes, determine the location of the required page on the backing store
5. Find a free frame.
6. Read (swap in) the required page from the backing store into the free frame. (During
this I/O, the processor may be scheduled to some other process)
7. When I/O is completed, restore registers and process state for the process which
caused the page fault and save state of the currently executing process.
8. Modify the corresponding page table entry to show that the recently copied page is now
in memory.
9. Resume execution with the instruction that caused the page fault.

While executing a process, in the case of a page fault, the OS finds the desired page on the backing store, and if there are no free frames, the OS must choose a page in memory (one not currently being used) as a victim, and must swap it out (slow) to the backing store.

Then, the OS changes the valid/invalid bit for the victim page in the page table to indicate
that it is no longer in memory. It swaps the desired page into newly freed frame, modifies the
frame table, and sets the valid/invalid bit of the new page to valid. The executing process
may now go on.

These operations can be summarized as:

1. Checking the address and finding a free frame or victim page (fast)
2. Swap out the victim page to secondary storage (slow)
3. Swap in the page from secondary storage (slow)
4. Context switch for the process and resume its execution (fast)

In servicing a page fault, the time is spent mainly for swap-out and swap-in. The time required for the other operations is negligible.

Virtual memory can be implemented in:

• paging systems
• paged segmentation systems
• segmentation systems (However, segment replacement is much more sophisticated
than page replacement since segments are of variable size.)

4.2. Performance Calculation of Demand Paging System


If there is no page fault, the effective access time is just the effective memory access time:

eatNO-PF = emat

If there is a page fault, we have

eatPF = pfst + emat ≈ pfst

where pfst is the page fault service time. Since emat is very small compared to pfst, it can be ignored.

Now, let p be the probability of a page fault (0 ≤ p ≤ 1). Then, the effective access time is calculated as follows:


eat = p * eatPF + (1 – p) * eatNO-PF
    ≈ p * pfst + (1 – p) * emat

Example 4.1

Effective memory access time for a system is given as 1 microsecond, and the average page fault service time is given as 10 milliseconds. Let p = 0.001. Then, the effective access time is

eat = 0.001 * 10 msec + (1 – 0.001) * 1 μsec

    = 0.001 * 10000 μsec + 0.999 * 1 μsec
    ≈ 10 μsec + 1 μsec = 11 μsec

4.3 Dirty bit

In order to reduce the page fault service time, a special bit called the dirty bit can be
associated with each page.

The dirty bit is set to "1" by the hardware whenever the page is modified (written into).

When we select a victim by using a page replacement algorithm, we examine its dirty bit. If it
is set, that means the page has been modified since it was swapped in. In this case we have
to write that page into the backing store.

However if the dirty bit is reset, that means the page has not been modified since it was
swapped in, so we don't have to write it into the backing store. The copy in the backing store
is valid.

Let the probability of a page being dirty be d. In this case effective access time becomes:

eat = p * ( (1-d) * swap_in + d * (swap_in + swap_out) ) + (1 – p) * emat
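The formula above can be evaluated directly; here swap_in/swap_out costs replace the single pfst of the earlier formula. All the numbers in the check are assumed for illustration (p = 0.001, d = 0.5, 10 ms per swap, emat = 1 μs, all in μsec).

```python
# Effective access time with a dirty bit, following the formula above.

def eat(p, d, swap_in, swap_out, emat):
    # a clean victim needs only the swap-in; a dirty one also needs a swap-out
    pf_cost = (1 - d) * swap_in + d * (swap_in + swap_out)
    return p * pf_cost + (1 - p) * emat
```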

4.4 Page Replacement Algorithms


A page replacement algorithm determines how the victim page (the page to be replaced) is
selected when a page fault occurs. The aim is to minimize the page fault rate.

The efficiency of a page replacement algorithm is evaluated by running it on a particular


string of memory references and computing the number of page faults.

Reference strings are either generated randomly, or by tracing the paging behavior of a
system and recording the page number for each logical memory reference.


Consecutive references to the same page may be reduced to a single reference, because
they won't cause any page fault except possibly the first one:

(1,4,1,6,1,1,1,3) → (1,4,1,6,1,3).


We have to know the number of page frames available in order to be able to decide on the
page replacement scheme of a particular reference string. Optionally, a frame allocation
policy may be followed.

4.4.1 Optimal Page Replacement Algorithm (OPT)

In this algorithm, the victim is the page which will not be used for the longest period. For a fixed number of frames, OPT has the lowest page fault rate among all page replacement algorithms. However, OPT cannot be implemented in practice, because it requires future knowledge of the reference string. It is nevertheless used for performance comparison.

Example 4.2

Assume we have 3 frames and consider the reference string below.

Reference string: 5, 7, 6, 0, 7, 1, 7, 2, 0, 1, 7, 1, 0

Show the content of memory after each memory reference if the OPT page replacement
algorithm is used. Also find the number of page faults.

5 7 6 0 7 1 7 2 0 1 7 1 0
f1 5 5 5 0 0 0 0 0 0 0 0 0 0
f2 7 7 7 7 7 7 2 2 2 7 7 7
f3 6 6 6 1 1 1 1 1 1 1 1
pf 1 2 3 4 same 5 same 6 same same 7 same same

As the table shows, the OPT algorithm generates 7 page faults on this reference string.
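The OPT scheme above can be checked with a short simulation. The sketch below is our own: on a fault with full memory it evicts the resident page whose next use lies farthest in the future (never used again counts as infinitely far; ties are broken arbitrarily, which does not change the fault count here).

```python
def opt_faults(refs, n_frames):
    """Count page faults under OPT for a reference string and frame count."""
    frames, faults = [], 0
    for i, page in enumerate(refs):
        if page in frames:
            continue                      # hit: nothing to do
        faults += 1
        if len(frames) < n_frames:
            frames.append(page)           # free frame available
        else:
            def next_use(q):
                # distance to the next reference of q, or infinity if none
                future = refs[i + 1:]
                return future.index(q) if q in future else float('inf')
            victim = max(frames, key=next_use)
            frames[frames.index(victim)] = page
    return faults

refs = [5, 7, 6, 0, 7, 1, 7, 2, 0, 1, 7, 1, 0]
print(opt_faults(refs, 3))  # → 7, matching Example 4.2
```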

4.4.2 First-In-First-Out (FIFO)

This is a simple algorithm, and easy to implement. The idea is straightforward: choose the
oldest page as the victim.

Example 4.3

Assume there are 3 frames, and consider the reference string given in Example 4.2. Show
the content of memory after each memory reference if the FIFO page replacement algorithm
is used. Also find the number of page faults.

5 7 6 0 7 1 7 2 0 1 7 1 0
f1 5 5 5 0 0 0 0 2 2 2 7 7 7
f2 7 7 7 7 1 1 1 0 0 0 0 0
f3 6 6 6 6 7 7 7 1 1 1 1
pf 1 2 3 4 same 5 6 7 8 9 10 same same

FIFO causes 10 page faults.


Belady’s Anomaly

Normally, one would expect that with the total number of frames increasing, the number of
page faults decreases. However, for FIFO, there are cases where this generalization fails.
This is called Belady’s Anomaly.

As an exercise, consider the reference string below. Apply the FIFO method and find the
number of page faults for different numbers of frames. Then examine whether the
replacement suffers from Belady’s anomaly.

Reference string: 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5
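A minimal FIFO simulator (our own sketch, keeping pages in a queue ordered by load time) reproduces the 10 faults of Example 4.3 and also exhibits Belady's anomaly on the exercise string above:

```python
from collections import deque

def fifo_faults(refs, n_frames):
    """Count page faults under FIFO: the victim is always the oldest resident page."""
    queue, faults = deque(), 0
    for page in refs:
        if page not in queue:
            faults += 1
            if len(queue) == n_frames:
                queue.popleft()           # evict the page loaded earliest
            queue.append(page)
    return faults

# Example 4.3: 10 faults with 3 frames
print(fifo_faults([5, 7, 6, 0, 7, 1, 7, 2, 0, 1, 7, 1, 0], 3))  # → 10

# Belady's anomaly: adding a frame *increases* the fault count
belady = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(fifo_faults(belady, 3), fifo_faults(belady, 4))           # → 9 10
```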

4.4.3 Least Recently Used (LRU)

In this algorithm, the victim is the page that has not been used for the longest period. LRU
thus uses the recent past as an approximation of the near future, avoiding OPT's need for
future knowledge.

An OS using this method has to associate with each page the time it was last used, which
means some extra storage. In the simplest approximation, the OS sets the reference bit of a
page to "1" when it is referenced. This bit does not give the order of use; it simply tells
whether the corresponding frame has been referenced recently or not. The OS resets all
reference bits periodically.

Example 4.4

Assume there are 3 frames, and consider the reference string given in Example 4.2. Show
the content of memory after each memory reference if the LRU page replacement algorithm
is used. Also find the number of page faults.

5 7 6 0 7 1 7 2 0 1 7 1 0
f1 5 5 5 0 0 0 0 2 2 2 7 7 7
f2 7 7 7 7 7 7 7 7 1 1 1 1
f3 6 6 6 1 1 1 0 0 0 0 0
pf 1 2 3 4 same 5 same 6 7 8 9 same same

This algorithm resulted in 9 page faults.
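Exact LRU can be simulated by keeping the resident pages ordered by recency (a sketch of our own, not the reference-bit approximation described above):

```python
def lru_faults(refs, n_frames):
    """Count page faults under exact LRU: the victim is the resident page
    whose most recent use lies farthest in the past."""
    frames, faults = [], 0        # kept ordered from least to most recently used
    for page in refs:
        if page in frames:
            frames.remove(page)   # hit: refresh this page's recency
        else:
            faults += 1
            if len(frames) == n_frames:
                frames.pop(0)     # evict the least recently used page
        frames.append(page)       # page is now the most recently used
    return faults

print(lru_faults([5, 7, 6, 0, 7, 1, 7, 2, 0, 1, 7, 1, 0], 3))  # → 9
```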

4.5 Frame Allocation


In order to be able to decide on the page replacement scheme of a particular reference
string, we have to know the number of page frames available.

In page replacement, some frame allocation policies may be followed.

• Global Replacement: A process can replace any page in the memory.

• Local Replacement: Each process can replace only from its own reserved set of
allocated page frames.

In case of local replacement, the operating system should determine how many frames to
allocate to each process. The number of frames for each process may be adjusted in two
ways:


• Equal Allocation: If there are n frames and p processes, n/p frames are allocated to
each process.

• Proportional Allocation: Let the virtual memory size for process p be v(p). Let there
be m processes and n frames. Then the total virtual memory size is V = Σp v(p).
Allocate (v(p) / V) * n frames to process p.

Example 4.5

Consider a system having 64 frames and there are 4 processes with the following virtual
memory sizes: v(1) = 16, v(2) = 128, v(3) = 64 and v(4) = 48.

Equal Allocation: With n = 64 frames and p = 4 processes, n/p = 64 / 4 = 16 frames are
allocated to each process.

Proportional Allocation: V = 16 + 128 + 64 + 48 = 256. It allocates:

(16 / 256) * 64 = 4 frames to process 1,


(128 / 256) * 64 = 32 frames to process 2,
(64 / 256) * 64 = 16 frames to process 3,
(48 / 256) * 64 = 12 frames to process 4.
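The proportional allocation rule can be sketched as follows (our own illustrative code; note that integer truncation may leave a few frames unassigned in general, though the numbers of Example 4.5 divide evenly):

```python
def proportional_allocation(vm_sizes, n_frames):
    """Allocate frames to processes in proportion to their virtual memory sizes."""
    total = sum(vm_sizes)                          # V = sum of all v(p)
    return [v * n_frames // total for v in vm_sizes]

# Example 4.5: v = (16, 128, 64, 48), 64 frames
print(proportional_allocation([16, 128, 64, 48], 64))  # → [4, 32, 16, 12]
```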

4.6 Thrashing

A process is thrashing if it is spending more time paging in/out (due to frequent page
faults) than executing.

Thrashing causes considerable degradation in system performance. If a process does not
have enough frames allocated to it, it will issue a page fault, and a victim page must be
chosen. But if all pages are in active use, the victim will be needed again almost
immediately, so another page fault will be issued shortly after, and so on and so forth.

In case a process thrashes, the best thing to do is to suspend its execution and page out all
its pages in the memory to the backing store.

Local replacement algorithms can limit the effects of thrashing. If the degree of
multiprogramming is increased over a limit, processor utilization falls down considerably
because of thrashing.

[Figure: processor utilization versus degree of multiprogramming. Utilization first
increases with the degree of multiprogramming, then drops sharply once thrashing begins.]

To prevent thrashing, we must provide a process with as many frames as it needs. For this, a
model called the working set model has been developed, which depends on the locality model
of program execution; it is presented in the next section.


4.7 Working Set Model


To prevent thrashing, we must provide a process with as many frames as it needs. For this, we
shall use the working set model, which depends on the locality model of program execution,
discussed earlier.

We shall use a parameter, Δ, called the working set window size, and examine the last Δ
page references.

The set of pages referenced in the last Δ page references shall be called the working set of the process.

Example 4.6 : Assume Δ = 10, and consider the reference string given below, on which
the window is shown at different time instants (t1, t2 and t3 may be taken as falling just
after the 10th, 15th and 20th references, respectively).

2 2 1 5 7 7 7 7 5 1 3 4 4 4 3 4 3 4 4 4

t1 t2 t3

Working sets of this process at these time instants will be:

WS(t1) = {2,1,5,7}
WS(t2) = {7,5,1,3,4}
WS(t3) = {3,4}

Note that in calculating the working sets, we do not reduce consecutive references to the
same page to a single reference. The choice of Δ is crucial: if Δ is too small, it will not cover
the entire working set; if it is too large, several localities of a process may overlap. Madnick
and Donovan suggested Δ to be about 10,000 references.
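The working sets above can be computed mechanically (a sketch of our own; we assume t1, t2 and t3 fall just after the 10th, 15th and 20th references, which is consistent with the sets listed):

```python
def working_set(refs, t, delta):
    """Working set at time t: the distinct pages among the last `delta`
    references (refs[0] is the reference made at time 1)."""
    return set(refs[max(0, t - delta):t])

refs = [2, 2, 1, 5, 7, 7, 7, 7, 5, 1, 3, 4, 4, 4, 3, 4, 3, 4, 4, 4]
print(sorted(working_set(refs, 10, 10)))  # → [1, 2, 5, 7]   = WS(t1)
print(sorted(working_set(refs, 15, 10)))  # → [1, 3, 4, 5, 7] = WS(t2)
print(sorted(working_set(refs, 20, 10)))  # → [3, 4]          = WS(t3)
```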

Now, compute the working set size WSSi for each process, and find the total demand D of
the system at that instant in time as the sum of all the working set sizes:

D(tnow) = Σi=1..p WSSi(tnow)

If the number of frames is n, then

a. If D > n, the system is thrashing.

b. If D < n, the system is all right; the degree of multiprogramming can possibly be
increased.

In order to be able to use the working set model for virtual memory management, the OS
keeps track of the WS of each process. It allocates each process enough frames to provide
it with its WS.

If at one point D > n, OS selects a process to suspend. The frames that were used by the
selected process are reallocated to other processes.

We can also use the page fault frequency to decide on decreasing or increasing the number
of frames allocated to a process.


QUESTIONS
1. Consider a demand paging system with associative registers.

a. If the hit ratio for the system is 80%, register access time is 50 nanoseconds and the main
memory access time is 950 nanoseconds, calculate the effective memory access time
(emat). Assume that associative registers can be loaded in parallel with an access to the
main memory.

b. Let the probability of a page fault be 0.005 and calculate the effective access time if the
page fault service time is given as 5 msec. Use the emat calculated in part 'a'.

2. Consider a demand paging system. Assume the working set window size is 7, and the
following reference string is given for process P:

1, 2, 1, 4, 3, 4, 1, 2, 1, 4, 5, 2, 5, 3, 5, 2, 3
tnow

a. What is the current working set of this process?

b. What is the current working set size of this process?

c. What happens if the summation of working set sizes for all the currently executing
processes is larger than the total number of frames in the memory?

d. What would you suggest to solve the problem in 'c'?

3. The following reference string is given for a process executing in a demand paging system
which uses FIFO page replacement algorithm:

4, 3, 2, 1, 4, 3, 5, 4, 3, 2, 1, 5

a. If there are three memory frames allocated to this process, give a picture of all pages in
memory for the given reference string, after each page fault.

b. Repeat part 'a' for four memory frames.

c. Comment on the total number of page faults in part 'a' and part 'b'.

4. Consider a demand paging system with the following observed average device utilization
values:

processor utilization = 30%


disk utilization = 95%
I/O devices utilization = 10%

Discuss whether the following measures can improve the processor utilization: (considering
them one by one)

a. Replace the processor with a slower one.

b. Replace the processor with a faster one.


c. Replace the disk with a faster one.

d. Increase the degree of multiprogramming.

e. Decrease the degree of multiprogramming.

f. Replace the I/O devices with faster ones.

5. In a demand paging system, the page replacement policy is: examine each page in
memory regularly, and swap those pages, which have not been used since the last
examination, to disk. Discuss whether this policy would result in better performance than
LRU page replacement policy or not.

6. Consider a demand paging system. The effective memory access time is 8000
nanoseconds. It takes 9 milliseconds to service a page fault if a free frame is found in
memory, or the page to be replaced is not dirty, and takes 22 milliseconds to service a page
fault if there are no free frames and the victim page is dirty. In 7 out of every 10 page faults,
a victim page has to be chosen and in 5 of these 7 cases, the victim page is dirty. What is
the acceptable page fault rate for this system in order to achieve an effective access time of
no more than 2 milliseconds ?

7. In a virtual memory system, what determines the size of a logical address space? What
determines the size of a physical address space? What (if any) relationship is there between
them?

8. A working set window is the most recent k page references made by a process. Assume
the most recent page references of a process are :

..., 6, 5, 3, 3, 4, 3, 2, 1, 4, 2 and k is 7.

a. What is the current working set window of this process?

b. What is the current working set of this process?

9. a. Explain how the LRU page replacement algorithm works.

b. Why do we need extra hardware such as a dirty bit and a reference bit in order to
implement the LRU page replacement algorithm in a demand paging system?

c. A demand paging system with 4 page frames uses the LRU algorithm for page
replacement. Give a diagram of all pages in memory after each page fault, for the following
reference string:

2,4,5,4,5,9,2,7,4,8,9,1,6,2,2,5,3,8

d. In a demand paging system, why is it generally more desirable to replace an unmodified
page than a modified one?

e. Under what circumstances might it be more desirable to replace a modified page?

10. An n-bit counter is associated with each frame in a virtual memory system with f frames
to implement demand paging. Every time any page is referenced by a process , a '1' is
shifted into the most significant bit of the counter associated with the referenced page and


zeros are shifted into the counters of all other frames allocated to the referencing process.
This operation simply discards the least significant bit. The lowest numbered frame with the
lowest counter value is replaced in case of a page fault.

a. With this implementation, show the frame contents after each page reference for the
following reference string for n=2 (i.e. 2 bit counters) and f=4. Assume that initially all frames
are empty with counters=00. There is only one process.

Reference string: 4,3,4,7,2,3,5,8.

b. If this implementation is to be used in a multiprogramming environment with local page
replacement and equal allocation, give an expression for the maximum sensible allocation,
a, per process, in terms of n and f. Also give an expression for the minimum
multiprogramming degree p so that this implementation functions exactly as LRU. Explain
your reasoning.

c. How would your answer to part b. change if the underlined word lowest is changed to
highest in the explanation above ?

11. In each of the following cases, show the memory contents for the given page reference
string, initial memory contents and page replacement policy:

a. Assume that there is a single process in the system.

Page reference string: 1,6,1,4,1,2,5,7,3


Initial memory contents
Frame no: 1 2 3 4
Content: 4 8 3 5

Apply OPT policy.

b. Assume that there are two processes in the system, say A and B.

Page reference string: A4, A8, B2, B3, A3, B1, A5, A7, B3, B8, B7, B1, A4, A3
Initial memory contents
Frame no: 1 2 3 4
Content: A5 A2 B3 B4

Apply local OPT policy with equal frame allocation

c. Same as b. with global OPT policy

d. Same as b. with local LRU policy

e. For the cases b., c. and d., calculate the average access time if the total page swap-in
plus swap-out time is 40 msec, the memory access time is 10 μsec, and an average of 100
accesses are made to a page once it is brought in.

12. A working set demand paging algorithm with window size = 4, ( that is, the last 4 page
references of each process constitute the windows of those processes) is to be applied on
the reference string

time: 1 2 3 4 5 6 7 8 9 10 11 12 13 14
ref: A3 B1 A1 A2 B3 B4 A3 A4 B2 B4 A1 A3 A1 B2


Assuming that this working set size correctly reflects program locality characteristics,

a. write the working sets


i. for t=9
ii. for t=14

b. Under the conditions and at the time instants in part a. should the operating system
increase the multiprogramming degree, or decrease it to improve CPU utilization? Why?

c. Answer part b. if real memory is extended to 7 frames.

13. In a demand paging system with equal local frame allocation policy, a process is
allocated f frames. All these f frames were initially empty. The process has a page reference
string length = s. There are exactly n distinct page numbers in this string. Consider the
OPT, LRU, and FIFO page replacement algorithms.

a. If n<= f,

i. What is the minimum number of page faults for this process with this reference
string? With which page replacement policy(ies) can this minimum number of page
faults be achieved?

ii. What is the maximum number of page faults for this process with this reference
string? Which page replacement policy(ies) may cause this many page faults?

b. If n > f, repeat i. and ii. of part a.

14. Consider a demand paging system with 3 frames in memory. Page replacement policy is
LRU, and a global replacement is done. However, it is given that each process needs at
least one page in memory for execution. There are two processes, P1 and P2 executing in
the system. Assume the following reference string is given.

(1,4,r) (1,4,w) (1,2,r) (1,3,w) (2,1,r) (2,1,w) (2,2,r) (2,3,w) (1,4,r) (1,2,w) (1,3,r) (2,1,w)

where:

(i,j,r): the i'th process references page j for a read.
(i,j,w): the i'th process references page j for a write.

Find and indicate the page faults caused by this string, and after each page fault, show the
page tables for P1 and P2, indicating the values of valid/invalid and the dirty bits. Assume
both processes have 5 pages.

Initially assume that all the 3 frames are empty. Also assume that the processes have not
completed their execution at the end of the given reference string.

Is this system thrashing? If yes, what is the minimum number of frames this system should
have to execute this reference string of P1 and P2 efficiently?


15. Consider a computer system which uses demand paging. Assume the system also has
associative registers to implement the page table. Given:

Associative register access time=100 nanoseconds


Main memory access time = 300 nanoseconds
Hit ratio in associative registers =80%.
Page fault ratio = 0.1%
Page-fault service time= 200 000 nanoseconds

a. What is the total memory access time if there is a hit in associative registers and no page
fault?

b. What is the total memory access time if there is a miss in associative registers but no
page fault?

c. What is the effective memory access time, no page fault?

d. What is the effective access time when page faults are considered?

16. In a system having 4 frames, consider the following page references of processes A and B.

time: 1 2 3 4 5 6 7 8 9 10 11 12
ref: A1 A2 B1 B2 A1 A3 B3 B1 A1 A3 B1 B4

Show the content of memory, and indicate whenever there is a page fault if

a. FIFO algorithm is used with global replacement


b. FIFO algorithm is used with local replacement with equal allocation
c. repeat a. with LRU
d. repeat b. with LRU

17. A working set demand paging algorithm with window size=4 (that is, last 4 page
references of each process constitute the windows of those processes) is to be applied on
the reference string

time: 1 2 3 4 5 6 7 8 9 10 11 12
ref: A1 A2 B1 B2 A1 A3 B3 B1 A1 A3 B1 B4

a. write the working sets

i. for t=8, ii. for t=12.

b. if the system has 4 frames, decide whether the system is thrashing or not for

i. t=8, and ii. t=12


18. Consider a computer system having virtual memory with paging. Assume that the
system also has associative registers. Let:

associative register access time=50 nanosec


memory access time=350 nanosec
hit ratio=90%
swap_in= 100000 msec
swap_out= 200000 msec

a. calculate effective memory access time if no page fault occurs

b. calculate effective access time if page-fault rate=0.001

c. calculate the effective access time if a dirty bit is used, and the probability of a page being
dirty is 60%

19. In a system using working set model for frame allocation, explain what happens if the
window size is chosen

a. smaller than the proper size

b. larger than the proper size



EE 442 Operating Systems Ch. 5 Deadlocks

Chapter 5
Deadlocks

5.1 Definition
In a multiprogramming system, processes request resources. If a requested resource is being
used by another process, the requesting process enters a waiting state. If waiting processes
hold resources that other waiting processes need, so that none of them can ever proceed, we
have deadlock.

The formal definition of deadlock is as follows:

Definition: A set of processes is in a deadlock state if every process in the set is waiting for
an event (release) that can only be caused by some other process in the same set.

Example 5.1

Process-1 requests the printer, gets it.
Process-2 requests the tape unit, gets it.
Process-1 requests the tape unit, waits.
Process-2 requests the printer, waits.

Process-1 and Process-2 are deadlocked!

In this chapter, we shall analyze deadlocks with the following assumptions:

• A process must request a resource before using it, and must release the resource after
using it (request → use → release).

• A process cannot request more resources than the total number available in the
system.

For the resources of the system, a resource table shall be kept, which shows whether each
resource is free or, if occupied, by which process. For every resource, a queue shall be kept,
indicating the names of the processes waiting for that resource.

A deadlock occurs if and only if the following four conditions hold in a system
simultaneously:

1. Mutual Exclusion: At least one of the resources is non-sharable (that is; only a limited
number of processes can use it at a time and if it is requested by a process while it is
being used by another one, the requesting process has to wait until the resource is
released.).


2. Hold and Wait: There must be at least one process that is holding at least one
resource and waiting for other resources that are being held by other processes.

3. No Preemption: No resource can be preempted before the holding process
completes its task with that resource.

4. Circular Wait: There exists a set of processes: {P1, P2, ..., Pn} such that

P1 is waiting for a resource held by P2


P2 is waiting for a resource held by P3
...
Pn-1 is waiting for a resource held by Pn
Pn is waiting for a resource held by P1

Methods for handling deadlocks are:

• Deadlock prevention
• Deadlock avoidance
• Deadlock detection and recovery.

5.2 Resource Allocation Graphs

Resource allocation graphs are drawn in order to see the allocation relations of processes
and resources easily. In these graphs, processes are represented by circles and resources
are represented by boxes. Resource boxes have some number of dots inside indicating
available number of that resource, that is number of instances.

pi → rj : process pi is waiting for resource rj

rj → pi : resource rj is allocated to process pi

• If the resource allocation graph contains no cycles then there is no deadlock in the
system at that instance.

• If the resource allocation graph contains a cycle then a deadlock may exist.

• If there is a cycle, and the cycle involves only resources which have a single
instance, then a deadlock has occurred.


Example 5.2

[Resource allocation graph with processes p1, p2, p3 and single-instance resources r1, r2,
r3, containing three cycles.]

There are three cycles, so a deadlock may exist. In fact, p1, p2 and p3 are deadlocked.

Example 5.3

[Resource allocation graph with processes p1, p2, p3, p4 and resources r1, r2, containing a
cycle.]

There is a cycle, however there is no deadlock. If p4 releases r2, r2 may be allocated to p3,
which breaks the cycle.

5.3 Deadlock Prevention


To prevent the system from deadlocks, one of the four conditions discussed above, which
together create deadlock, must be prevented from holding. The methods for these conditions
are as follows:

Mutual Exclusion:

In general, we do not have systems with all resources being sharable. Some resources like
printers, processing units are non-sharable. So it is not possible to prevent deadlocks by
denying mutual exclusion.

Hold and Wait:

One protocol to ensure that hold-and-wait condition never occurs says each process must
request and get all of its resources before it begins execution.

Another protocol is “A process can request resources only when it does not occupy any
resources.”


The second protocol is better. However, both protocols cause low resource utilization and
starvation: many resources are allocated but most of them remain unused for long periods,
and a process that requests several commonly used resources may cause many others to
wait indefinitely.

No Preemption:

One protocol is “If a process that is holding some resources requests another resource and
that resource cannot be allocated to it, then it must release all resources that are currently
allocated to it.”

Another protocol is “When a process requests some resources, if they are available, allocate
them. If a resource it requested is not available, then we check whether it is being used or it
is allocated to some other process waiting for other resources. If that resource is not being
used, then the OS preempts it from the waiting process and allocate it to the requesting
process. If that resource is used, the requesting process must wait.” This protocol can be
applied to resources whose states can easily be saved and restored (registers, memory
space). It cannot be applied to resources like printers.

Circular Wait:

One protocol to ensure that the circular wait condition never holds is “Impose a linear
ordering of all resource types.” Then, each process can only request resources in an
increasing order of priority.

For example, set priorities for r1 = 1, r2 = 2, r3 = 3, and r4 = 4. With these priorities, if


process P wants to use r1 and r3, it should first request r1, then r3.

Another protocol is “Whenever a process requests a resource rj, it must have released all
resources rk with priority(rk) ≥ priority(rj).”

5.4 Deadlock avoidance


Given some additional information on how each process will request resources, it is possible
to construct an algorithm that will avoid deadlock states. The algorithm will dynamically
examine the resource allocation operations to ensure that there won't be a circular wait on
resources.

When a process requests a resource that is already available, the system must decide
whether that resource can immediately be allocated or not. The resource is immediately
allocated only if it leaves the system in a safe state.

A state is safe if the system can allocate resources to each process in some order avoiding
a deadlock. A deadlock state is an unsafe state.

Example 5.4

Consider a system with 12 tape drives. Assume there are three processes : p1, p2, p3.
Assume we know the maximum number of tape drives that each process may request:

p1 : 10, p2 : 4, p3 : 9

Suppose at time tnow, 9 tape drives are allocated as follows :


p1 : 5, p2 : 2, p3 : 2

So, we have three more tape drives which are free.

This system is in a safe state because if we sequence the processes as <p2, p1, p3>, then
p2 can get two more tape drives, finish its job, and return four tape drives to the system.
The system then has 5 free tape drives. Allocating all of them to p1, it gets 10 tape drives,
finishes its job, and returns all 10 drives to the system. Then p3 can get 7 more tape drives
and complete its job.

It is possible to go from a safe state to an unsafe state:

Example 5.5

Consider the above example. At time tnow+1, p3 requests one more tape drive and gets it.
Now, the system is in an unsafe state.

There are two free tape drives, so only p2 can be allocated all its tape drives. When it
finishes and returns all 4 tape drives, the system will have four free tape drives.

p1 is allocated 5, may request 5 more → has to wait
p3 is allocated 3, may request 6 more → has to wait

Allocating one more tape drive to p3 moved the system into an unsafe state, which may
lead to deadlock.

Banker's Algorithm (Dijkstra and Habermann)

It is a deadlock avoidance algorithm. The following data structures are used in the algorithm:

m = number of resources
n = number of processes

Available [m] : One dimensional array of size m. It indicates the number of available
resources of each type. For example, if Available [i] is k, there are k instances of resource ri.

Max [n,m] : Two dimensional array of size n*m. It defines the maximum demand of each
process from each resource type. For example, if Max [i,j] is k, process pi may request at
most k instances of resource type rj.

Allocation [n,m] : Two dimensional array of size n*m. It defines the number of resources of
each type currently allocated to each process.

Need [n,m] : Two dimensional array of size n*m. It indicates the remaining need of each
process, of each resource type. If Need [i,j] is k, process pi may need k more instances of
resource type rj. Note that Need [i,j] = Max [i,j] - Allocation [i,j].

Request [n,m] : Two dimensional array of size n*m. It indicates the pending requests of each
process, of each resource type.

Now, take each row vector in Allocation and Need as Allocation(i) and Need(i). (Allocation(i)
specifies the resources currently allocated to process pi. )


Define the following relations between two vectors X and Y of equal size n:

X ≤ Y ⟺ X[i] ≤ Y[i] for all i = 1, 2, ..., n
X ≰ Y ⟺ X[i] > Y[i] for some i

The algorithm is as follows:

1. Process pi makes requests for resources. Let Request(i) be the corresponding request
vector. So, if pi wants k instances of resource type rj, then Request(i)[j] = k.

2. If Request(i) ≰ Need(i), there is an error.

3. Otherwise, if Request(i) ≰ Available, then pi must wait.

4. Otherwise, Modify the data structures as follows :

Available = Available - Request(i)
Allocation(i) = Allocation(i) + Request(i)
Need(i) = Need(i) - Request(i)

5. Check whether the resulting state is safe. (Use the safety algorithm presented below.)

6. If the state is safe, do the allocation. Otherwise, pi must wait for Request(i).

Safety Algorithm to perform Step 5:

Let Work and Finish be vectors of length m and n, respectively.

1. Initialize Work = Available, Finish [j] = false, for all j.

2. Find an i such that Finish[i] = false and Need(i) ≤ Work

If no such i is found, go to step 4.

3. If an i is found, then for that i, do :

Work = Work + Allocation(i)
Finish[i] = true

Go to step 2.

4. If Finish [j] = true for all j, then the system is in a safe state.

Banker's algorithm is O(m · n²).


Example 5.6: (Banker's algorithm)

Given two processes and three resource types:

Available = [1 4 1]

Max =        | 1 3 1 |
             | 1 4 1 |

Allocation = | 0 0 0 |
             | 0 0 0 |

Need =       | 1 3 1 |
             | 1 4 1 |

Request =    | 1 2 0 |
             | 0 2 1 |

Request(1) is to be processed. If it were satisfied, the data would become:

Available = [0 2 1]

Allocation = | 1 2 0 |
             | 0 0 0 |

Need =       | 0 1 1 |
             | 1 4 1 |

Now, apply the safety algorithm:

Work = [0 2 1]

Finish = [false, false]

i = 1:

Need(1) = [0 1 1] ≤ Work ? Yes.
Work = Work + Allocation(1) = [1 4 1]
Finish(1) = true

i = 2:

Need(2) = [1 4 1] ≤ Work ? Yes.
Work = Work + Allocation(2) = [1 4 1]
Finish(2) = true


System is in a safe state, so do the allocation. If the algorithm is repeated for Request(2),
the system will end up in an unsafe state.
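The request handling and the safety algorithm can be sketched together (our own illustrative code; process and resource indices are 0-based here):

```python
def is_safe(available, allocation, need):
    """Safety algorithm: look for an order in which every process can obtain
    its remaining need, finish, and release everything it holds."""
    work = list(available)
    finish = [False] * len(allocation)
    progressed = True
    while progressed:
        progressed = False
        for i, done in enumerate(finish):
            if not done and all(n <= w for n, w in zip(need[i], work)):
                # process i can run to completion and release its resources
                work = [w + a for w, a in zip(work, allocation[i])]
                finish[i] = True
                progressed = True
    return all(finish)

def try_request(i, request, available, allocation, need):
    """Banker's algorithm: grant the request only if it leaves a safe state."""
    if any(r > n for r, n in zip(request, need[i])):
        raise ValueError("process exceeded its declared maximum")
    if any(r > a for r, a in zip(request, available)):
        return False                       # not enough resources: must wait
    # tentatively allocate, then check safety of the resulting state
    new_avail = [a - r for a, r in zip(available, request)]
    new_alloc = [row[:] for row in allocation]
    new_need = [row[:] for row in need]
    new_alloc[i] = [x + r for x, r in zip(new_alloc[i], request)]
    new_need[i] = [x - r for x, r in zip(new_need[i], request)]
    return is_safe(new_avail, new_alloc, new_need)

# Example 5.6: two processes, three resource types
available = [1, 4, 1]
allocation = [[0, 0, 0], [0, 0, 0]]
need = [[1, 3, 1], [1, 4, 1]]
print(try_request(0, [1, 2, 0], available, allocation, need))   # → True (safe)

# After granting it, Request(2) = [0, 2, 1] would leave an unsafe state:
available2 = [0, 2, 1]
allocation2 = [[1, 2, 0], [0, 0, 0]]
need2 = [[0, 1, 1], [1, 4, 1]]
print(try_request(1, [0, 2, 1], available2, allocation2, need2))  # → False (unsafe)
```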

5.5 Deadlock Detection

If a system has no deadlock prevention and no deadlock avoidance scheme, then it needs a
deadlock detection scheme with recovery from deadlock capability. For this, information
should be kept on the allocation of resources to processes, and on outstanding allocation
requests. Then, an algorithm is needed which will determine whether the system has
entered a deadlock state. This algorithm must be invoked periodically.

Deadlock Detection Algorithm (Shoshani and Coffman)

Data Structure is as:

Available [m]
Allocation [n,m] as in Banker's Algorithm
Request [n,m] indicates the current requests of each process.

Let Work and Finish be vectors of length m and n, as in the safety algorithm.

The algorithm is as follows:

1. Initialize Work = Available
For i = 1 to n do
If Allocation(i) = 0 then Finish[i] = true else Finish[i] = false

2. Search an i such that

Finish[i] = false and Request(i) ≤ Work

If no such i can be found, go to step 4.

3. For that i found in step 2 do:

Work = Work + Allocation(i)
Finish[i] = true

Go to step 2.

4. If Finish[i] = false for some i, then the system is in a deadlock state; otherwise there is
no deadlock.


Example 5.7

Examine whether the system whose resource allocation graph is given below is deadlocked
or not.

[Resource allocation graph with processes p1, p2, p3, p4 and resources r1, r2, r3,
corresponding to the matrices below.]

First, let’s form the required structures:

Available = [0 0 0]

Allocation = | 1 0 0 |
             | 0 1 0 |
             | 0 0 1 |
             | 0 0 1 |

Request =    | 0 1 1 |
             | 0 0 1 |
             | 1 0 0 |
             | 0 0 0 |

Finish = [false, false, false, false]

Work = [0 0 0]

Request(4) Work i = 4:

Work = Work + Allocation(4) = [0 0 0] + [0 0 1] = [0 0 1] ;


Finish[4] = True

Request(2) Work i = 2:

Lecture Notes by Uğur Halıcı 55


EE 442 Operating Systems Ch. 5 Deadlocks

Work = Work + Allocation(2) = [0 0 1] + [0 1 0] = [0 1 1] ;


Finish[2] = True

Request(1) ≤ Work, so i = 1:

Work = Work + Allocation(1) = [0 1 1] + [1 0 0] = [1 1 1] ; Finish[1] = True

Request(3) ≤ Work, so i = 3:

Work = Work + Allocation(3) = [1 1 1] + [0 0 1] = [1 1 2] ; Finish[3] = True

Since Finish[i] = true for all i, there is no deadlock in the system.

Recovery From Deadlock

If the system is in a deadlock state, some methods for recovering it from the deadlock state
must be applied. There are various ways for recovery:

• Allocate one resource to several processes, by violating mutual exclusion.


• Preempt some resources from some of the deadlocked processes.
• Abort one or more processes in order to break the deadlock.

If preemption is used:

1. Select a victim. (Which resource(s) is/are to be preempted from which process?)

2. Rollback: If we preempt a resource from a process, roll the process back to some safe
state and make it continue.

Here the OS will probably encounter the problem of starvation. How can we guarantee
that resources will not always be preempted from the same process?

In selecting a victim, important parameters are:

• Process priorities
• How long has the process held its resources?
• How much longer does the process need to finish its job?
• How many resources of what type has the process used?
• How many more resources does the process need to finish its job?
• How many processes will be rolled back? (More than one victim may be selected.)

For rollback, the simplest solution is a total rollback. A better solution is to roll the victim
process back only as far as it’s necessary to break the deadlock. However, the OS needs to
keep more information about process states to use the second solution.

To avoid starvation, ensure that a process can be picked as a victim for only a small number
of times. So, it is a wise idea to include the number of rollbacks as a parameter.

QUESTIONS

1. For each of the following resource allocation graphs, find out and explain whether there is
a deadlock or not

a. [Figure: resource allocation graph with processes p1, p2, p3 and resources r1, r2, r3]

b. [Figure: resource allocation graph with processes p1, p2, p3, p4 and resources r1, r2, r3, r4]

c. [Figure: resource allocation graph with processes p1, p2, p3, p4 and resources r1, r2, r3, r4]

d. [Figure: resource allocation graph with processes p1, p2, p3, p4 and resources r1, r2, r3]

2. A computer system has m resources of the same type and n processes share these
resources. Prove or disprove the following statement for the system:

This system is deadlock free if sum of all maximum needs of processes is less than m+n.

3. There are four processes which are going to share nine tape drives. Their current
allocations and maximum needs are as follows:

process current maximum


p1 3 6
p2 1 2
p3 4 9
p4 0 2

a. Is the system in a safe state? Why or why not?

b. Is the system deadlocked? Why or why not?

4. Given the following resource allocation diagram,

[Figure: resource allocation graph with processes p1, p2, p3, p4 and resources r1, r2, r3]

a. If another instance of resource r1 is made available, is the deadlock resolved? If yes,
specify the allocation sequence; if no, explain why.

b. & c. Repeat part a. for resources r2 and r3.

5. Given that all the resources are identical, they can be acquired and released strictly one
at a time, and no process ever needs more than the total resources on the system, state
whether deadlock can occur in each of the following systems. Explain why or how.

Number of Number of
processes resources

a. 1 1
b. 1 2
c. 2 1
d. 2 2
e. 2 3

6. a. What are the four conditions necessary for deadlock to appear?

b. Cinderella and the Prince are getting divorced. To divide their property, they have agreed
on the following algorithm. Every morning, each of them may send a letter to the other's
lawyer requesting one item of property. Since it takes a day for letters to be delivered, they
have agreed that if both discover that they have requested the same item on the same day,
the next day they will send a letter cancelling the request. Among their property is the glass
shoe, their dog Woofer, Woofer's doghouse, their canary Tweeter, Tweeter's cage and a
sword. The animals love their houses, so it has been agreed that any division of property
separating an animal from its house is invalid, requiring the lawyers to negotiate on which
items they already have should be returned back. Unfortunately the lawyers are stubborn
and never agree. Is deadlock or starvation possible in such a scheme? Explain.

c. What happens if it is agreed that, in the case of any division of property separating an
animal from its house, the whole division starts over from scratch instead of letting the
lawyers negotiate? Explain whether starvation or deadlock is possible now.

7. Given the following resource allocation diagram:

[Figure: resource allocation graph with processes p1–p5 and resources r1–r4]

a. Apply the deadlock detection algorithm and either indicate why the system is deadlocked,
or specify a safe allocation sequence.

b. If process P2 also requests 2 instances of resource r1, does the system enter a
deadlock? Why?

c. If a deadlock occurs in part a. and/or b., killing which process would resolve the
deadlock?

d. If the maximum declared needs are:

process r1 r2 r3 r4
P1 4 0 0 0
P2 1 3 1 1
P3 0 0 2 1
P4 1 1 1 0
P5 1 0 1 1

does the current allocation given in part a constitute a safe state? Why?

8. Explain whether the system in the figure is deadlocked. If not, give an execution order of
the processes which successfully terminates.

[Figure: resource allocation graph with processes p1–p5 and resources r1–r4]

9. a. Explain whether the following system is deadlocked or not.

b. For the following resource allocation graph, show the current contents of the
AVAILABLE, ALLOCATION, and REQUEST structures used for deadlock detection.

[Figure: resource allocation graph with processes p1–p5 and resources r1–r4]

EE442 Operating Systems Ch. 6 Interprocess Communication

Chapter 6
Interprocess Communication

6.1 Introduction
Processes frequently need to communicate with other processes. For example, in a shell
pipeline, the output of the first process must be passed to the second process, and so on
down the line. Thus, there is a need for communication between processes, preferably in a
well-structured way not using interrupts, since interrupts decrease system performance.
Communication between processes under the control of the OS is called Interprocess
Communication, or simply IPC.

In some operating systems, processes that are working together often share some common
storage area that each can read and write. To see how IPC works in practice, let us consider
a simple but common example, a print spooler. When a process wants to print a file, it enters
the file name in a special spooler directory. Another process, the printer daemon, periodically
checks to see if there are any files to be printed, and if there are, it sends them to printer and
removes their names from the directory.

Assume that our spooler directory has a large number of slots, numbered 0, 1, 2, …, each
one capable of holding a file name. Also suppose we have two shared variables, out, which
points to the next file to be printed, and in, which points to the next free slot in the directory.
These two variables might be kept in a two-word file available to all processes. At a certain
instant, slots 0 to 3 are empty (the files have already been printed) and slots 4 to 6 are full
(holding the names of files to be printed). More or less simultaneously, processes A and B
decide they want to queue a file for printing, as illustrated below.


Process A →   4  abc.p    ← out = 4
              5  prog.c
              6  prog.n
Process B →   7           ← in = 7

              Spooler Directory

The following might happen about the printing requests of two processes. Process A reads
in and stores the value, 7, in a local variable next_free_slot. Just then a clock interrupt
occurs and the processor decides that process A has run long enough, so it switches to
process B. Process B also reads in, and also gets a 7, so it stores the name of its file in slot
7 and updates in to be an 8. Then it goes off and does other things.

Eventually, process A runs again, starting from the place it left off. It looks at next_free_slot
in its local variable, finds a 7 there, and writes its file name in slot 7, erasing the name that
process B just put there. Then, it computes next_free_slot + 1, which is 8, and sets in to 8.

The spooler directory is now internally consistent, so the printer daemon will not notice
anything wrong, but process B will never get an output. Situations like this, where two or
more processes are reading or writing some shared data and the final result depends on
who runs precisely when, are called race conditions.

The key for preventing trouble here and in many other situations involving shared memory,
shared files, and shared everything else, is to find some way to prohibit more than one
process from reading and writing the shared data at the same time. Put in other words, what
we need is mutual exclusion: some way of making sure that if one process is using a shared
variable or file, the other processes will be excluded from doing the same thing. The difficulty
above occurred because process B started using one of the shared variables before process
A was finished with it.

6.2 Critical Section Problem


The problem of avoiding race conditions can also be formulated in an abstract way. Part of
the time, a process is busy doing internal computations and other things that do not lead to
race conditions. However, sometimes a process may be accessing shared memory or files,
or doing other critical things that can lead to races. That part of the program where the
shared memory is accessed is called the critical section (CS). If we could arrange matters
such that no two processes were ever in their critical sections at the same time, we could
avoid race conditions.

Although this requirement avoids race conditions, this is not sufficient for having parallel
processes cooperate correctly and efficiently using shared data. We need four conditions to
hold to have a good solution:

1. No two processes may be simultaneously inside their critical sections.


2. No assumptions may be made about speeds or the number of processors.
3. No process running outside its CS may block other processes.
4. No process should have to wait forever to enter its CS.

In this section, we will examine various proposals for achieving mutual exclusion, so that
while one process is busy updating the shared memory in its CS, no other process will enter
its own CS region and cause a problem. In our discussions we will consider two processes
Pi (i = 0 and i = 1) of the form:

{common variable declarations and initializations}

Pi:
{
while (TRUE) {
{CS entry code}
CS( ) ;
{CS exit code}
Non-CS( ) ;
}
}

6.2.1 Disabling Interrupts

The simplest solution is to have each process disable all interrupts just after entering its CS
and re-enable them just before leaving it. With interrupts disabled, the processor can not
switch to another process. Thus, once a process has disabled interrupts, it can examine and
update the shared memory without fear that any other process will intervene.

This approach is generally unattractive because it is unwise to give user processes the
power to turn off interrupts. Suppose one of them did it, and never turned them on again.
That can be the end of the system. Furthermore, if the system has more than one processor,
this method will fail again since the process can disable the interrupts of the processor it is
being executed by.

6.2.2 Lock Variables

As a second attempt, let us look for a software solution. Consider having a single, shared
lock variable initialized to 0. When a process wants to enter its CS, it first tests the lock. If the
lock is 0, the process sets it to 1 and enters the CS. If the lock is already 1, the process just
waits until it becomes 0.

#define FALSE 0
#define TRUE 1
int lock=FALSE

PO: P1:
{ {
while (TRUE) { while (TRUE) {
while (lock) { }; /* wait */ while (lock) { }; /* wait */
lock=TRUE; lock=TRUE;
CS( ) ; CS( ) ;
lock=FALSE; lock=FALSE;
Non-CS( ) ; Non-CS( ) ;
} }
} }

Unfortunately, this idea contains a fatal flaw. Suppose one process reads the lock and sees
that it is 0. Before it can set the lock to 1, another process is scheduled, runs, and sets the
lock to 1. When the first process runs again, it will also set the lock to 1, and two processes
will be in their CSs at the same time. This is the situation we saw in our printer spooler
example.

Now, it may be thought that we could get around this problem by first reading the lock value,
then checking it again just before storing into it. However, this solution really does not help.
The race now occurs if the second process modifies the lock just after the first process has
finished its second check.

int lock=FALSE

PO: P1:
{ {
while (TRUE) { while (TRUE) {
lock=TRUE; lock=TRUE;
while (lock) { }; /* wait */ while (lock) { }; /* wait */
CS( ) ; CS( ) ;
lock=FALSE; lock=FALSE;
Non-CS( ) ; Non-CS( ) ;
} }
} }

6.2.3 Strict Alternation

A third approach to the mutual exclusion problem is given below:

int turn=0

PO: P1:
{ {
while (TRUE) { while (TRUE) {
while (turn != 0) { } ; /* wait */ while (turn != 1) { } ; /* wait */
CS( ) ; CS( ) ;
turn = 1; turn = 0;
Non-CS( ) ; Non-CS( ) ;
} }
} }

Here, the integer variable turn, initially 0, keeps track of whose turn it is to enter the CS and
examine or update the shared memory. Initially, process 0 inspects turn, finds it to be 0, and
enters its CS. Process 1 also finds it to be 0, and therefore sits in a tight loop continually
testing turn to see when it becomes 1. Continuously testing a variable waiting for some value
to appear is called busy waiting. It should usually be avoided, since it wastes processor
time.

When process 0 leaves the CS, it sets turn to 1, to allow process 1 to enter its CS. Suppose
process 1 finishes its CS quickly, so both processes are in their non-CS sections, with turn
set to 0. Now process 0 executes its whole loop quickly, coming back to its non-CS with turn
set to 1. At this point, process 0 finishes its non-CS and goes back to the top of its loop.
Unfortunately, it is not permitted to enter its CS now, because turn is 1 and process 1 is busy
with its non-CS. This situation violates condition 3 set out before; process 0 is being blocked
by a process not in its CS. Therefore, taking turns is not a good idea when one of the
processes is much slower than the other.

6.2.4 Peterson’s Solution

By combining the idea of taking turns with the idea of lock variables and warning variables,
in 1965, a Dutch mathematician, T. Dekker, was the first to devise a software solution to
the mutual exclusion problem that does not require strict alternation. In 1981, G.L. Peterson
discovered a much simpler way to achieve mutual exclusion, thus rendering Dekker’s
algorithm obsolete.

Before entering its CS, each process calls CS_entry with its own process number, 0 or 1, as
parameter. This call will cause it to wait, if need be, until it is safe to enter. After it has
finished with the shared variables, the process calls CS_exit to indicate that it is done and to
allow the other process to enter, if it so desires.

Let us see how this solution works. Initially, neither process is in its CS. Now process 0 calls
CS_entry. It indicates its interest by setting its array element, and sets turn to 0. Since
process 1 is not interested, CS_entry returns immediately. If process 1 now calls CS_entry, it
will hang there until interested[0] goes to FALSE, an event that only happens when process
0 calls CS_exit.

Now consider the case that both processes call CS_entry almost simultaneously. Both
will store their process number in turn. Whichever store is done last is the one that counts;
the first one is lost. Suppose process 1 stores last, so turn is 1. When both processes come
to the while statement, process 0 executes it zero times and enters its CS. Process 1 loops
and does not enter its CS.

... /* required header files */


#define FALSE 0
#define TRUE 1
#define N 2 /* number of processes */

int turn; /* whose turn is it? */


int interested[N] /* all elements initialized to FALSE */

void CS_entry (int process) /* process: Who is entering (0 or 1)? */


{
int other; /* number of the other process */
other = 1 – process; /* the opposite of process */
interested[process] = TRUE; /* show that you are interested */
turn = process; /* set flag */
while ((turn == process) &&
(interested[other] == TRUE)) { }; /* wait */
}

void CS_exit (int process) /* process: Who is leaving (0 or 1) ? */


{
interested[process] = FALSE; /* indicate departure from CS */
}

This method is correct for mutual exclusion but it wastes the processor time. Furthermore, it
can have unexpected effects. Consider a system with two processes, H with high priority and
L, with low priority. The scheduling rules are such that H is run whenever it is in ready state.
At a certain moment, with L in CS, H becomes ready to run. H now begins busy waiting, but
since L is never scheduled while H is running, L never gets the chance to leave its CS, so H
loops forever. This situation is referred to as the priority inversion problem.

6.2.5 Special Hardware Instructions

The following are special hardware instructions that may be used for the solution of the CS
problem. These operations are assumed to be executed atomically (in one machine cycle).

Test-and-set

The instruction test-and-set(int lock) returns true if lock is true, otherwise it first sets the
value of lock to true and returns false. It can be used for the solution of the CS problem as
follows:

int lock=FALSE; /* global */

PO: P1:
{ {
while (TRUE) { while (TRUE) {
while (test-and-set(lock)) { } ; while (test-and-set(lock)) { };
CS CS
lock = FALSE; lock = FALSE;
non-CS; non-CS;
} }
} }

Swap

The instruction swap (int lock, int key) interchanges the content of its parameters. Again,
lock is used as a global boolean variable, initialized to false.

int lock=FALSE; /*global */

PO: P1:
{ int key { int key
while (TRUE) { while (TRUE) {
key=TRUE key=TRUE
while (key) swap (lock,key); while (key) swap (lock,key);
CS CS
lock = FALSE; lock = FALSE;
non-CS; non-CS;
} }
} }

6.2.6. Semaphores

E. W. Dijkstra suggested using an integer variable for IPC problems. In his proposal, a new
variable type, called a semaphore, was introduced. Dijkstra proposed having two atomic
operations, DOWN and UP (P and V in Dijkstra’s original paper). The DOWN operation on a
semaphore checks to see if the value is greater than 0. If so, it decrements the value and
just continues. If the value is 0, the process is put to sleep. Checking the value, changing it,
and possibly going to sleep is all done as a single, indivisible action (this is why these
operations are called atomic.). It is guaranteed that once a semaphore operation has started,
no other process can access the semaphore until the operation has completed or blocked.
This atomicity is absolutely essential to solving synchronization problems and avoiding race
conditions. The UP operation increments the value of the semaphore addressed.

typedef int semaphore; /* semaphores are a special kind of int */


semaphore s=0;

void down(semaphore &s)


{
while (s <= 0) { } ;
s = s – 1;
}

void up(semaphore &s)


{
s=s+1;
}

If two processes try to execute up(s) or down(s) simultaneously, these operations will be
executed sequentially in an arbitrary order.

Semaphores can be used for CS problem as follows

semaphore mutex = 1; /* controls access to CS */

process_i (void) /* n processes */


{ while (TRUE) {
down(mutex) ; /* CS_entry */
CS
up (mutex) ; /* CS_exit */
non-CS
}
}

6.3 Classical IPC Problems


Besides their use for the CS problem, semaphores can also be used for synchronization of
processes. For example, consider two concurrent processes p1 and p2. We require that
statement S2 of p2 be executed after statement S1 of p1. Let p1 and p2 share a common
semaphore 'synch', initialized to 0. Semaphores can be used for this synchronization
problem as follows:

semaphore synch=0;

void P1(void)                      void P2(void)
{  ...                             {  ...
   S1                                 down(synch)
   up(synch)                          S2
}                                  }

In this section we will examine some well-known IPC problems and their solutions using
semaphores.

6.3.1 The Bounded Buffer (Producer-Consumer) Problem

Assume there are n slots capable of holding one item. Process producer will produce items
to fill slots and process consumer will consume the items in these slots. There is no
information on the relative speeds of processes. Devise a protocol which will allow these
processes to run concurrently. A common buffer whose elements (slots) will be filled/emptied
by the producer/consumer is needed. The consumer should not try to consume items which
have not been produced yet (i.e. the consumer can not consume empty slots). The producer
should not try to put item into filled slots.

# define N 100 /* number of slots in the buffer */

typedef int semaphore; /* semaphores are a special kind of int */

semaphore mutex = 1; /* controls access to CS */


semaphore empty = N; /* counts empty buffer slots */
semaphore full = 0; /* counts full buffer slots */

void producer(void)
{
int item;
while (TRUE) /* infinite loop */
{
produce_item(item) /* generate something to put into buffer */
down(empty); /* decrement empty count */
down(mutex); /* enter CS */
enter_item(item); /* put new item in buffer */
up(mutex); /* leave CS */
up(full); /* increment count of full slots */
}
}

void consumer(void)
{
int item;
while (TRUE) /* infinite loop */
{
down(full); /* decrement full count */
down(mutex); /* enter CS */
remove_item(item); /* take item from buffer */
up(mutex); /* leave CS */
up(empty); /* increment count of empty slots */
consume_item(item); /* do something with the item */
}
}

6.3.2 The Readers and Writers Problem

Imagine a big database, such as an airline reservation system, with many competing
processes wishing to read and write. It is acceptable to have multiple processes reading the
database at the same time, but if one process is writing to the database, no other process
may have access to the database, not even readers. The following is a solution for this case.

typedef int semaphore;


semaphore mutex = 1; /* controls access to rc */
semaphore db = 1 ; /* controls access to db */
int rc = 0 ; /* number of processes currently reading */

void reader(void)
{
while (TRUE)
{
down(mutex); /* get exclusive access to rc */
rc = rc + 1; /* one reader more now */
if (rc == 1) down(db); /* whether this is the first reader */
up(mutex); /* release exclusive access to rc */
read_database(); /* access the data */
down(mutex); /* get exclusive access to rc */
rc = rc – 1; /* one reader fewer now */
if (rc == 0) up(db); /* whether this is the last reader */
up (mutex); /* release exclusive access to rc */
use_data_read(); /* non-CS */
}
}

void writer(void)
{
while (TRUE)
{
think_up_data(); /* non-CS */
down (db); /* get exclusive access */
write_database(); /* update the database */
up(db); /* release exclusive access */
}
}

It is seen in this solution that the readers have priority over writers. If a writer appears while
several readers are in the database, the writer must wait.

6.3.3 The Dining Philosophers Problem

There are N philosophers spending their lives thinking and eating in a room. In their round
table there is a plate of infinite rice and N chopsticks. From time to time, a philosopher gets
hungry. He tries to pick up the two chopsticks that are on his right and his left. A philosopher
that picks both chopsticks successfully (one at a time) starts eating. A philosopher may pick
one chopstick at a time. When a philosopher finishes eating, he puts down both of his
chopsticks to their original position and starts thinking again. The question is to write a
program which does not let any philosopher die of hunger (i.e. no deadlocks).

#define N 6 /* number of philosophers */


semaphore chopstick[N] /* a semaphore for each chopstick,
each to be initialized to 1 */
void philosopher(int i) /* which philosopher (0 to N-1) ? */
{
while (TRUE)
{
think(); /* philosopher is thinking */
down(chopstick[i]); /* take left chopstick */
down(chopstick [(i+1) % N]); /* take right chopstick */
eat(); /* yum-yum, rice */
up(chopstick[i]); /* put left chopstick */
up(chopstick [(i+1) % N]); /* put right chopstick */
}
}

Unfortunately, this program fails in the situation when all philosophers take their left
chopsticks simultaneously. None will be able to take their right chopsticks; there will be a
deadlock, and all of them will die of hunger.

We can modify the program so that after taking the left chopstick, the program checks
whether the right chopstick is available. If it is not, the philosopher puts down the left one,
waits for some time, and then repeats the whole process. This proposal too fails, although
for a different reason. With a little bit of bad luck, all the philosophers could start the
algorithm simultaneously, picking up their left chopsticks, seeing that their right chopsticks
are not available, putting down their left chopsticks, waiting, picking up their left chopsticks
again simultaneously, and so on till death. We have defined this situation in former chapters.
This is the situation in which all the programs continue to run indefinitely but fail to make any
progress, namely starvation.

Now, you may think, “If the philosophers would just wait a random time instead of the same
time after failing to acquire the right chopstick.” That is true, but in some application one
would prefer a solution that always works and cannot fail due to an unlikely series of random
numbers. (Think about safety control in a nuclear plant)

The following program uses an array state, to keep track of whether a philosopher is eating,
thinking, or hungry (trying to acquire chopsticks). A philosopher may move into eating state
only if neither neighbour is eating. The neighbors are defined by the macros LEFT and
RIGHT.

#define N 6 /* number of philosophers */


#define LEFT (i+N-1) % N /* number i’s left neighbor (avoids a negative remainder in C) */
#define RIGHT (i+1) % N /* number i’s right neighbor */
#define THINKING 0 /* mode of thinking */
#define HUNGRY 1 /* mode of hunger */
#define EATING 2 /* mode of eating */

typedef int semaphore; /* semaphores are a special kind of int */


int state[N]; /* array to keep track of states */
semaphore mutex = 1; /* mutual exclusion for CS */
semaphore s[N] ; /* one semaphore per philosopher, each initialized to 0 */

void philosopher(int i) /* i : Which philosopher (0 to N-1) ? */


{
while (TRUE) /* infinite loop */
{
think(); /* philosopher is thinking */
take_sticks(i); /* acquire two chopsticks or block */
eat(); /* yum-yum, rice */
put_sticks(i); /* put both chopsticks back */
}
}

void take_sticks(int i) /* i : Which philosopher (0 to N-1) ? */


{
down(mutex); /* enter CS */
state[i] = HUNGRY; /* record that the philosopher is hungry */
test(i); /* try to acquire 2 chopsticks */
up(mutex); /* leave CS */
down(s[i]) ; /* block if chopsticks were not acquired */
}

void put_sticks(int i) /* i : Which philosopher (0 to N-1) ? */


{
down(mutex); /* enter CS */
state[i] = THINKING; /* philosopher has finished eating */
test(LEFT); /* see if the left neighbor can eat now */
test(RIGHT); /* see if the right neighbor can eat now */
up(mutex); /* leave CS */
}

void test(int i) /* i : Which philosopher (0 to N-1) ? */


{
if (state[i] == HUNGRY && state[LEFT] != EATING && state[RIGHT] != EATING)
{
state[i] = EATING;
up(s[i]);
}
}

6.3.4 The Sleeping Barber Problem

The barber shop has one barber, one barber chair, and N chairs for waiting customers, if
any, to sit in. If there is no customer at present, the barber sits down in the barber chair and
falls asleep. When a customer arrives, he has to wake up the sleeping barber. If additional
customers arrive while the barber is cutting a customer’s hair, they either sit down (if there is
an empty chair) or leave the shop (if all chairs are full). The problem is to program the barber
and the customers without getting into race conditions.

#define CHAIRS 5 /* number of chairs for waiting customers */

typedef int semaphore;

semaphore customers = 0; /* number of waiting customers */


semaphore barbers = 0; /* number of barbers waiting for customers */
semaphore mutex = 1; /* for mutual exclusion */
int waiting = 0; /* customers are waiting not being haircut */

void Barber(void)
{
while (TRUE)
{
down(customers); /* go to sleep if number of customers is 0 */
down(mutex); /* acquire access to ‘waiting’ */
waiting = waiting – 1; /* decrement count of waiting customers */
up(barbers); /* one barber is now ready to cut hair */
up(mutex); /* release ‘waiting’ */
cut_hair(); /* cut hair, non-CS */
}
}

void customer(void)
{
down(mutex); /* enter CS */
if (waiting < CHAIRS)
{
waiting = waiting + 1; /* increment count of waiting customers */
up(customers); /* wake up barber if necessary */
up(mutex); /* release access to ‘waiting’ */
down(barbers); /* wait if no free barbers */
get_haircut(); /* non-CS */
}
else
{
up(mutex); /* shop is full, do not wait */
}
}

Our solution uses three semaphores: customers, which counts waiting customers (excluding
the one being served), barbers, the number of idle barbers (0 or 1), and mutex for mutual
exclusion. The variable waiting is essentially a copy of customers and it is required since
there is no way to read the current value of a semaphore.

QUESTIONS
1. A counting semaphore pair allows the down and up primitives to operate on two counting
semaphores simultaneously. It may be useful for getting and releasing two resources in one
atomic operation. The down primitive for a counting semaphore pair can be defined as
follows:

void down(semaphore s1, semaphore s2)


{
while ((s1 <= 0) || (s2 <= 0)) { } ; /* wait */
s1 = s1 - 1;
s2 = s2 - 1;
}

Show how a counting semaphore pair can be implemented using regular down(s) and up(s)
primitives.

2. Using up and down operations on semaphores,

a. Present an incorrect solution to the critical section problem that will cause a deadlock
involving only one process.

b. Repeat a. for case involving at least two processes.

3. Consider the following processes executing concurrently:

void P1(void) void P2(void) void P3(void)


{ while (TRUE) { { while (TRUE) { { while (TRUE) {
st_a st_e st_h
st_b st_f st_i
st_c st_g }
st_d } }
} }
}

Give a solution to synchronize P1, P2 and P3 such that the following order of execution
across the statements is satisfied:

statement a before statement f, statement e before statement h,


statement g before statement c, statement g before statement i.

4. A version of the readers/writers problem is defined as follows: 'A reader can enter its critical
section only if there is no writer process executing and no writer process waiting.'
Devise a concurrent solution for this second readers/writers problem. That is, define the
shared variables and semaphores needed for your solution, and write the general structure
for:

a. a reader process,
b. a writer process,
c. the initialization code.

Write a comment for each statement in your solution.

5. Assume that there are two processes P1 and P2 executing on processors Pi and Pj in a
distributed system. P1 and P2 are synchronized with a binary semaphore 'flag':

void P1(void)            void P2(void)
{                        {
  while (TRUE) {           while (TRUE) {
    st_a                     st_c
    up(flag)                 down(flag)
    st_b                     st_d
  }                        }
}                        }

a. What is the resulting order of execution among statements a, b, c, and d with the above
code?

Now assume that down(flag) is implemented as [wait until you receive a 'proceed on flag'
message from a process executing on any processor], and up(flag) is implemented as
[send a 'proceed on flag' message to all processors].

b. Is there any danger of violating the order of execution of part a. with this implementation?

c. If two or more processes use this implementation of the down and up primitives in accessing
shared data, is there any danger of violating the mutual exclusion condition?

d. Is there any danger of deadlocks in this implementation?

6. Consider a concurrent system with two processes p1 and p2, and two semaphores s1 and
s2. Explain why the following use of semaphores s1 and s2 may create a deadlock.

semaphore s1 = 1, s2 = 1;

void P1(void)            void P2(void)
{                        {
  down(s1);                down(s2);
  use_device1;             use_device2;
  down(s2);                down(s1);
  use_device2;             use_device1;
  up(s1);                  up(s1);
  up(s2);                  up(s2);
}                        }

7. Present a correct solution for the concurrent bounded buffer problem assuming there are two
buffers, buffer A and buffer B, each of size n. Assume you are given procedures that add an
item to a buffer and remove an item from a buffer, given the buffer name explicitly. Give the
structure of the consumer and producer processes, and the initialization code. Clearly explain
the need for every semaphore variable and shared variable you use in your solution.

8. Suppose we have a computer system with n processes sharing 3 identical printers. One
process may use only one printer at a time. In order to use a printer, a process should call
procedure "request_printer(p)". This procedure allocates any available printer to the calling
process and returns the allocated printer's id in variable p. If no printers are available, the
requesting process is blocked, and it waits until one becomes available. When a process is
finished with printer p, it calls procedure "release_printer(p)".

Write procedures request_printer(p) and release_printer(p) in C for this system, using
semaphores.

9. It is claimed that the following code for the producer and consumer processes is a correct
solution for the bounded buffer problem:

semaphore mutex = 1, empty = N, full = 0;

void producer(void)            void consumer(void)
{                              {
  int item;                      int item;
  while (TRUE)                   while (TRUE)
  {                              {
    produce_item(item);            down(mutex);
    down(mutex);                   down(full);
    down(empty);                   remove_item(item);
    enter_item(item);              up(mutex);
    up(mutex);                     up(empty);
    up(full);                      consume_item(item);
  }                              }
}                              }

Is this solution deadlock-free? If yes, prove your answer by using any deadlock detection
algorithm you wish. If no, modify it so that it becomes deadlock-free.

10. A street vendor prepares cheese potatoes according to the following rules:

i. Baked potatoes should be ready;
ii. Grated cheese should be ready;
iii. When there are n customers waiting (n = 0, 1, ...), up to n+1 cheese potatoes may be
prepared.

Write the following concurrent processes, using only semaphores for synchronization:

a. Potato_baker, which bakes 4 potatoes at a time and should run until out_of_potato;
b. Cheese_grater, which grates a single portion of cheese at a time and should run until
out_of_cheese;
c. Cheese_potato_vendor, which prepares one cheese potato at a time;
d. Customer_arrival.

Do not forget to indicate the initial values of the semaphores.

11. In the IsBank branch at METU, there is a single customer queue and four bank servers.

Write the following concurrent processes, using only semaphores for synchronization:

a. customer queue
b. bank servers

Do not forget to indicate the initial values of the semaphores.

12. Two processes P1 and P2 are being executed concurrently on a single processor
system. The code for P1 and P2 is given as:

{common variable declarations and initializations}

void P1(void)                  void P2(void)
{                              {
  while (TRUE) {                 while (TRUE) {
    {CS1 entry code}               {CS2 entry code}
    CS1();                         CS2();
    {CS1 exit code}                {CS2 exit code}
    Non-CS();                      Non-CS();
  }                              }
}                              }

a. Write the CS entry and exit codes for P1 and P2 using down and up operations on
semaphores to satisfy the following condition: 'CS1 will be executed exactly once after CS2
is executed exactly once' (i.e. init | CS2 | CS1 | CS2 | CS1 | CS2 | CS1 | ....). Show the initial
values for semaphores as well.

b. Repeat part a. for the following condition: 'CS1 will be executed exactly once, after CS2 is
executed exactly two times.' (i.e. init | CS2 | CS2 | CS1 | CS2 | CS2 | CS1 |...)

13. Five dining philosophers are at the graduation ball of the university. Again the menu is
rice, to be eaten with chopsticks. Here, however, they spend their time dancing in addition
to thinking and eating. Thinking and eating are activities performed while sitting. They sit
around a table with five plates and one chopstick between each pair of plates. From time to
time a philosopher gets hungry. He tries to pick up the two chopsticks that are on his right
and on his left. A philosopher who picks up both chopsticks successfully starts eating
without releasing his chopsticks. A philosopher may pick up only one chopstick at a time.
When a philosopher finishes eating, he puts both of his chopsticks back in their original
position and starts thinking again.

There are two ladies in the ballroom who are always willing to dance with the philosophers,
and they never let more than four philosophers sit at the same time. They may invite a
philosopher to dance only if he is thinking, not eating. A philosopher cannot decline a
lady's invitation if he is thinking.

Write concurrent processes for the philosophers and the ladies that use semaphores to
synchronize access to critical section code. Discuss deadlock and starvation issues, and
propose deadlock-free and starvation-free solutions if possible.

14. a. Find an execution order of the statements of the master and slave processes that
causes a deadlock:

semaphore mutex = 1, clean = 1, dirty = 0;

void master(void)              void slave(void)
{                              {
  m1: down(mutex);               s1: down(mutex);
  m2: down(clean);               s2: down(dirty);
  m3: drink();                   s3: wash();
  m4: up(dirty);                 s4: up(clean);
  m5: up(mutex);                 s5: up(mutex);
}                              }

b. How can you overcome this deadlock possibility without changing the initial values?

15. Our five philosophers in the coffee room have realized that it is not easy to wash dishes,
so they have decided to hire a servant. Furthermore, they have bought some more cups. Now
they have 3 cups, 1 pot and a servant. All the cups are clean initially and the pot is full of
coffee. The philosophers either drink coffee or chat. To drink coffee, a philosopher should
grab a clean cup and wait for the servant to pour some coffee into it. Then he drinks the
coffee and puts the empty cup on the table. The servant, on the other hand, is responsible
for washing any dirty cup and for pouring coffee whenever a philosopher has grabbed a cup.
The servant refills the pot whenever it runs out, so assume the supply of coffee is unlimited.
Write concurrent procedures for the philosophers and for the wash-cup and pour-coffee
duties of the servant, using semaphores, such that the solution is deadlock-free and no
process in its non-critical section causes any other process to wait unnecessarily.
