
Chapter 5: CPU Scheduling

Objectives

• Describe and apply various CPU scheduling algorithms

• Evaluate CPU scheduling algorithms based on scheduling criteria

• Understand the issues related to scheduling on multiprocessor and multicore hardware

• Introduce real-time scheduling and describe and practice some well-known algorithms

• Present the scheduling models implemented in the Windows and Linux kernels

• Describe methods and techniques for evaluating and comparing scheduling algorithms


Basic concepts

The CPU scheduler is a function of the OS that decides which process, from the ready queue, to schedule next.

During its lifetime, a process repeatedly alternates between CPU bursts and I/O bursts. This is the CPU–I/O Burst Cycle.

The distribution of CPU burst lengths has been studied extensively. In summary:

• Large number of short CPU bursts (I/O-bound processes)

• Small number of long CPU bursts (CPU-bound processes)
Preemptive and Nonpreemptive Scheduling

The CPU scheduler selects from among the processes in the ready queue and allocates a CPU core to one of them. CPU scheduling decisions may take place when a process:
1. Switches from running to waiting state, i.e. the running process requests I/O or waits for something
2. Switches from running to ready state, i.e. an interrupt occurs, a higher-priority process arrives…
3. Switches from waiting to ready, i.e. a requested I/O operation has completed
4. Terminates, i.e. the process exits or is stopped…

Nonpreemptive scheduling allows only cases 1 and 4. No process is halted for the sake of another process until it requests I/O or terminates on its own. This is usually a bad strategy, especially for interactive systems.

Preemptive scheduling also allows cases 2 and 3. It introduces some complexities:

• Shared data must be protected from race conditions. A process might be preempted while accessing shared data, and another process might modify it in between.
• Preemption might occur while a process is in kernel mode, so the kernel's data structures must also be protected from race conditions.
• Interrupts might occur during crucial OS activities. Such activities must "disable interrupts" for a brief time (a few instructions) and then re-enable them.
Scheduling Criteria

Many scheduling strategies are possible. OS designers select the one that achieves their goals. Different criteria have been suggested to evaluate and compare algorithms:

• CPU utilization. Keep the CPU as busy as possible (maximize).

• Throughput. Number of processes that complete per time unit (maximize).

• Turnaround time. Amount of time to execute a process (minimize).

• Waiting time. Amount of time a process spends waiting in the ready queue (minimize).

• Response time. Amount of time from when a request arrives until the first response is produced (minimize). This measure is particularly relevant for interactive systems: it is the time it takes to start responding, not to complete the whole activity.

We usually minimize or maximize the average of one or more of these criteria.

In some cases, the minimum or maximum matters more than the average. For example, a real-time system minimizes the maximum waiting time (the waiting time must be bounded for real-time applications).
FCFS Scheduling (First-Come, First-Served)

A simple, nonpreemptive algorithm. The ready queue is implemented as a FIFO (of PCB pointers). Main disadvantage: the average waiting time is too high and is sensitive to the order of arrival of jobs.

Example: single-core system, 3 processes arriving almost simultaneously (at t=0) with bursts:
P1: 24, P2: 3, P3: 3

a) Order of arrival P1, P2, P3. Gantt chart:

| P1 (0–24) | P2 (24–27) | P3 (27–30) |

Average waiting time: (0 + 24 + 27)/3 = 17    Average turnaround time: (24+27+30)/3 = 27

b) Order of arrival P2, P3, P1. Gantt chart:

| P2 (0–3) | P3 (3–6) | P1 (6–30) |

Average waiting time: (6 + 0 + 3)/3 = 3    Average turnaround time: (3+6+30)/3 = 13
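To check these numbers, here is a minimal C sketch (not from the slides) that computes FCFS waiting and turnaround times when all processes arrive at t=0:

#include <stdio.h>

/* Minimal FCFS sketch: all processes arrive at t=0, in queue order.
   Waiting time = start time; turnaround = completion time (arrival = 0). */
int main(void) {
    int burst[] = {24, 3, 3};                /* P1, P2, P3 in order of arrival */
    int n = 3, time = 0;
    double total_wait = 0, total_turnaround = 0;
    for (int i = 0; i < n; i++) {
        total_wait += time;                  /* process waited until CPU was free */
        time += burst[i];
        total_turnaround += time;            /* completion time */
    }
    printf("avg waiting = %.2f, avg turnaround = %.2f\n",
           total_wait / n, total_turnaround / n);   /* prints 17.00 and 27.00 */
    return 0;
}

Swapping the burst array to {3, 3, 24} reproduces case b): averages 3.00 and 13.00.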

The convoy effect: many I/O-bound processes have to wait for a CPU-bound process. They all have short CPU bursts, but they must all wait behind the CPU-bound process, which has a long CPU burst. The cycle repeats over and over because of the nonpreemptive scheduling. This is particularly bad for interactive systems.
SJF Scheduling (Shortest Job First)

SJF is optimal in the sense that it minimizes the average waiting time for a given set of processes.
Problem: burst times are not known in advance; they need to be estimated/predicted.
Also, this strategy is nonpreemptive.

Example: 4 processes arrive on a single-core system with the burst times:
P1: 6, P2: 8, P3: 7, P4: 3. Suppose that these burst times are known in advance.

Gantt chart:

| P4 (0–3) | P1 (3–9) | P3 (9–16) | P2 (16–24) |

Average waiting time: (0+3+9+16)/4 = 7

How do we estimate the length of the next CPU burst of a process? Based on its history: each time a process is scheduled, its burst time is measured and its expected burst is re-estimated.

The exponential moving average is a forecasting method that adjusts the "guess" τ after each new measurement t_n as follows:

τ_{n+1} = α·t_n + (1 − α)·τ_n,   where 0 ≤ α ≤ 1

Example with α = 0.5 and initial guess τ_0 = 10:
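A small C sketch of the update rule. Only α = 0.5 and the initial guess τ_0 = 10 come from the slide; the measured burst values below are illustrative:

#include <stdio.h>

/* Exponential moving average: tau_{n+1} = alpha*t_n + (1 - alpha)*tau_n */
int main(void) {
    double alpha = 0.5, tau = 10.0;           /* initial guess tau_0 = 10 */
    double t[] = {6, 4, 6, 4, 13, 13, 13};    /* measured CPU bursts (illustrative) */
    for (int i = 0; i < 7; i++) {
        printf("guess %4.1f, measured %4.1f\n", tau, t[i]);
        tau = alpha * t[i] + (1.0 - alpha) * tau;
    }
    printf("next guess: %.1f\n", tau);        /* prediction for the next burst */
    return 0;
}

Note how a long run of short bursts pulls the guess down, and long bursts pull it back up.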
Shortest-Remaining-Time-First

This is the preemptive version of SJF. When a new job arrives, if its burst time is smaller than the remaining burst time of the currently running process, the latter is preempted. It will be scheduled again when its remaining time is the smallest.
The prediction of the burst time is the same as in SJF.

Example: 4 processes with the following arrival times and burst times:

Process  Arrival Time  Burst Time
P1       0             8
P2       1             4
P3       2             9
P4       3             5

Gantt chart:

| P1 (0–1) | P2 (1–5) | P4 (5–10) | P1 (10–17) | P3 (17–26) |

Average waiting time = [(10-1) + (1-1) + (17-2) + (5-3)]/4 = 26/4 = 6.5 ms

Verify that with nonpreemptive SJF, the average waiting time would be 7.75 ms.

Average response time = [(0-0) + (1-1) + (17-2) + (5-3)]/4 = 17/4 = 4.25 ms
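A hedged C sketch of an SRTF simulation, advancing time in 1-ms steps with the example's arrival and burst times (known bursts assumed, zero switch cost):

#include <stdio.h>

/* SRTF sketch: each ms, run the arrived process with the smallest remaining time. */
int main(void) {
    int arrival[] = {0, 1, 2, 3}, burst[] = {8, 4, 9, 5};   /* P1..P4 from the example */
    int rem[] = {8, 4, 9, 5}, completion[4], n = 4, done = 0;
    for (int time = 0; done < n; time++) {
        int pick = -1;
        for (int i = 0; i < n; i++)               /* choose shortest remaining time */
            if (arrival[i] <= time && rem[i] > 0 &&
                (pick < 0 || rem[i] < rem[pick]))
                pick = i;
        if (pick < 0) continue;                   /* CPU idle this ms */
        if (--rem[pick] == 0) { completion[pick] = time + 1; done++; }
    }
    double wait = 0;
    for (int i = 0; i < n; i++)
        wait += completion[i] - arrival[i] - burst[i];   /* turnaround - burst */
    printf("avg waiting = %.2f ms\n", wait / n);         /* prints 6.50 */
    return 0;
}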


RR (Round-Robin) Scheduling

A preemptive method that shares the CPU "fairly" among the ready processes. Each ready process gets a small unit of CPU time called the time quantum q. If q elapses and the process has not yet blocked (requested I/O, waited for something), it is preempted: it is put at the tail of the ready queue, and the process at the head of the queue gets its turn.

The main advantage of RR (and preemptive methods in general) is better response time, i.e. the time an arriving process has to wait before it gets its first quantum.

Example: q=4, burst times P1: 24, P2: 3, P3: 3. Gantt chart:

| P1 (0–4) | P2 (4–7) | P3 (7–10) | P1 (10–30) |

Avg. waiting time = (6+4+7)/3 = 5.67    Avg. response time = (0+4+7)/3 = 3.67

• If q is too large, RR degenerates into FCFS (almost no preemption).

• If q is too small, there are too many context switches. q must remain large compared to the context-switch time (~10 μs), otherwise the overhead is too high (but q should not be too large either).

• As a general rule, choose q greater than the 80th percentile of CPU burst lengths. This way, only about 20% of the time will processes be preempted by the quantum timer (rather than blocking by themselves). A small simulator of the quantum mechanics follows this list.
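A minimal RR sketch in C for the example above (all arrivals at t=0 and zero context-switch cost assumed):

#include <stdio.h>

/* RR sketch. Cycling over indices matches RR queue order here, because every
   process arrives at t=0 and a preempted process rejoins at the tail. */
int main(void) {
    int burst[] = {24, 3, 3}, remaining[] = {24, 3, 3};   /* P1, P2, P3 */
    int n = 3, q = 4, time = 0, done = 0;
    int completion[3] = {0}, first_run[] = {-1, -1, -1};
    while (done < n) {
        for (int i = 0; i < n; i++) {
            if (remaining[i] == 0) continue;
            if (first_run[i] < 0) first_run[i] = time;    /* response time */
            int slice = remaining[i] < q ? remaining[i] : q;
            time += slice;
            remaining[i] -= slice;
            if (remaining[i] == 0) { completion[i] = time; done++; }
        }
    }
    double wait = 0, resp = 0;
    for (int i = 0; i < n; i++) {
        wait += completion[i] - burst[i];    /* arrival = 0 for everyone */
        resp += first_run[i];
    }
    printf("avg waiting = %.2f, avg response = %.2f\n", wait / n, resp / n);
    return 0;                                /* prints 5.67 and 3.67 */
}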
Priority Scheduling

In the computing world, some processes are more time-sensitive than others. Priority scheduling assigns each process a priority level, usually a number, e.g. 0 to 7 (with 0 as the highest priority). The CPU is allocated to the ready process with the highest priority.

Nonpreemptive priority: the running process is not preempted when a higher-priority process arrives. It continues running until a "normal" scheduling event occurs (it blocks or terminates).
Preemptive priority: the running process is preempted if a higher-priority process arrives. To avoid "starvation", either we use aging (increase priority with time) or we use RR per level: a separate FIFO queue for each priority level. This approach is called a "multilevel queue".

Example: nonpreemptive (Gantt chart omitted).

Example: preemptive with RR, q = 2 ms (Gantt chart omitted).


Priority Scheduling – Exercise (Textbook Q 5.5)

In this exercise:
- a higher number indicates a higher priority
- the time quantum is q = 10 ms

arrival    p1   p2   p3   p4   p5   p6
priority   40   30   30   30    5   10
burst      20   25   25   15   10   10

Nonpreemptive priority (time axis 0–115 ms, ticks every 5 ms):

p1 | p2 | p3 | p4 | p5 | p6

Preemptive priority with RR (p2, p3, and p4 share priority 30, so RR applies among them):

p1 | p2 | p3 | p2 | p3 | p4 | p2 | p3 | p5 | p6 | p5

Questions:
- What is the turnaround time for each process?
- What is the waiting time for each process?
- What is the CPU utilization rate?
Multilevel Queue

Priorities might be assigned externally by the user, or internally by the OS based on the process type. In the latter case, there are usually 4 levels (and their corresponding queues), as in the diagram. The distinction exists because each of these categories has different response-time requirements.

A process can move between the various queues; aging can be implemented this way: when a process waits too long, the system increases its priority, to avoid "starvation".

A multilevel-feedback-queue scheduler is defined by (a descriptive sketch follows the list):

• the number of queues
• the scheduling algorithm for each queue
• the method used to determine when to upgrade a process
• the method used to determine when to demote a process
• the method used to determine which queue a process will enter when it needs service
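As a purely descriptive sketch, these parameters could be captured in C as below; every name and number here is invented for illustration, not taken from the slides:

#include <stdio.h>

/* Illustrative multilevel-feedback-queue description (all values made up). */
struct mlfq_level {
    int quantum_ms;                    /* RR quantum used at this level */
};

struct mlfq {
    struct mlfq_level levels[3];       /* the number of queues */
    int promote_after_ms;              /* aging: upgrade after waiting this long */
    int demote_on_full_quantum;        /* demote a task that uses its whole quantum */
};

int main(void) {
    struct mlfq cfg = {
        .levels = {{8}, {16}, {32}},   /* growing quanta at lower-priority levels */
        .promote_after_ms = 1000,
        .demote_on_full_quantum = 1,
    };
    for (int i = 0; i < 3; i++)
        printf("level %d: q = %d ms\n", i, cfg.levels[i].quantum_ms);
    return 0;
}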
Scheduling on multiprocessor systems

The term multiprocessor applies to many different architectures: multicore CPUs, multithreaded cores, NUMA systems, and heterogeneous multiprocessing (mobile systems have some fast, power-hungry cores and other slow, power-efficient cores).

In asymmetric multiprocessing, one CPU is dedicated to OS activities. This is a bottleneck, and the approach is unpopular.

Most OSes (Windows, Linux, Android, iOS) use SMP (symmetric multiprocessing): every processor is involved in user and kernel activities. Therefore, the scheduler runs on each processor to select a thread from the ready queue. Two possible strategies:

• One common ready queue. Requires synchronization to avoid race conditions on the shared queue. This creates a bottleneck, because context switching occurs very frequently.

• Each processor has its own private ready queue. This is the preferred choice because:
  • there is no need to synchronize access to a shared queue
  • cache memory is used better, because each thread is re-scheduled on the same processor again (affinity, see later)
Hyperthreading on Multicore Processors

Modern architectures place many processor cores on the same physical chip.
In a multithreaded multicore system (aka hyperthreading), each core has several interleaved hardware threads: when one thread is in a memory stall (e.g. a cache miss), the core executes another thread.

Intel processors typically have 2 hardware threads per core; the Core i9-7900X has 10 cores, hence 20 hardware threads.

Hardware threads appear to the OS as logical processors. The interleaving of hardware threads is the responsibility of the hardware (the OS is responsible only for level 1 in the figure).

However, the OS tries to load-balance tasks among the available cores. For example, if only two tasks are running, scheduling them on two different cores is faster than scheduling them on the same core with two hardware threads.
Load Balancing and Processor Affinity

Load balancing: try to keep the workload evenly distributed across the processors, for example by keeping similar queue sizes on all processors. Two techniques are possible:

• Push migration: a periodic task checks the load on each processor and may push a task from an overloaded processor to an under-loaded one.
• Pull migration: an idle processor pulls a waiting task from another, busy processor.

When a thread runs on a processor, it "populates" its cache. Migrating the thread to another processor has a cost (repopulating the cache). We say that the thread has affinity for the processor it is running on (it hates migration).

• The one-queue-per-processor design is good for affinity.
• Affinity and load balancing are conflicting goals.
• Hard affinity: a task is bound to a processor, core, etc. Linux provides the syscall sched_setaffinity() (see the sketch after this list).
• Soft affinity: task migration is allowed (but should be done with caution…).
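For illustration, a minimal Linux-specific C sketch of hard affinity using sched_setaffinity(), pinning the calling process to logical CPU 0:

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

/* Hard affinity sketch: bind the calling process (pid 0 = self) to CPU 0. */
int main(void) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(0, &set);                        /* allow only logical CPU 0 */
    if (sched_setaffinity(0, sizeof(set), &set) != 0) {
        perror("sched_setaffinity");
        return 1;
    }
    printf("pinned to CPU 0\n");
    return 0;
}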

NUMA (Non-Uniform Memory Access) systems are multiprocessor systems where each processor has its own local memory but can also access the other processors' memory, albeit more slowly. Affinity is extremely important here.
Real-Time Systems

Real-time applications are event-driven. Each event requires an upper bound on latency.
Event latency is the time that elapses from when an event occurs to when it is serviced.

• Hard real-time: typical for embedded systems; timing constraints are strict.
• Soft real-time: typical for multimedia streaming; timing requirements are soft.

Preemptive priority scheduling is mandatory for all real-time systems.

• It is sufficient for soft real-time, but hard real-time must also guarantee that deadlines are met.

A common model for real-time processes is that they are periodic, with the following parameters:
• period p, or rate 1/p (the rate of events)
• deadline d ≤ p (by which event processing must finish)
• processing time t ≤ d

Admission control is a protocol by which a new process declares its timing requirements to the scheduler, which admits the process if its requirements can be met and rejects it otherwise.
Usually, the real-time app declares its "worst case" parameters: lowest p and d, highest t.
Rate-Monotonic Scheduling

Assigns each periodic task a static priority based on its rate: higher rate => higher priority.

It is optimal among static-priority schedulers: if any set of processes is not admissible by the rate-monotonic scheduler, it is not admissible by any other scheduler with static priorities.

Admission rule: CPU utilization < N(2^(1/N) − 1), where N is the number of processes.

Ex. Two periodic processes: P1: p1 = 50 = d1, t1 = 20; P2: p2 = 100 = d2, t2 = 35. P1 gets the higher priority.
CPU utilization = t1/p1 + t2/p2 = 75% < 2(2^(1/2) − 1) ≈ 83%

Note that if P2 were given the higher priority, P1 would miss its deadline. That is why the RM scheduler gives higher priority to the higher-rate process.

Ex. p1 = 50 = d1, t1 = 25; p2 = 80 = d2, t2 = 35.
CPU utilization = 94% > 83%. The scheduler cannot admit both processes, because it cannot guarantee P2's deadline. A sketch of the admission test follows.
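A C sketch of the admission test (the function name is illustrative; link with -lm). The two data sets reproduce the examples above:

#include <math.h>
#include <stdio.h>

/* Rate-monotonic admission test: admit if total utilization
   sum(t_i / p_i) is below the bound N * (2^(1/N) - 1). */
int rm_admissible(const double t[], const double p[], int n) {
    double u = 0.0;
    for (int i = 0; i < n; i++)
        u += t[i] / p[i];                             /* total CPU utilization */
    double bound = n * (pow(2.0, 1.0 / n) - 1.0);     /* utilization bound */
    printf("utilization %.0f%%, bound %.0f%%\n", u * 100, bound * 100);
    return u < bound;
}

int main(void) {
    double ta[] = {20, 35}, pa[] = {50, 100};   /* example 1: 75% < 83% -> admit  */
    double tb[] = {25, 35}, pb[] = {50, 80};    /* example 2: 94% > 83% -> reject */
    printf("set A: %s\n", rm_admissible(ta, pa, 2) ? "admit" : "reject");
    printf("set B: %s\n", rm_admissible(tb, pb, 2) ? "admit" : "reject");
    return 0;
}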
Earliest-Deadline-First (EDF) Scheduling

Assigns priorities dynamically, based on deadlines: earlier deadline => higher priority.
It can theoretically achieve 100% CPU utilization, but not in practice, because of the cost of context switching (and this scheduler does a lot of it).

Ex. p1 = 50 = d1, t1 = 25; p2 = 80 = d2, t2 = 35. CPU utilization = 94%.

The RM scheduler failed to admit these processes, but the EDF scheduler can handle them. For instance, at the point where P1's next deadline is at 150 and P2's is at 160, P1 overtakes P2 because its deadline is earlier.
Proportional-Share Scheduling

A scheduling policy in which each process is allocated a share of the CPU time:
• The system has a total of T shares.
• A new process requests N shares according to its needs. If the process is granted N shares, it gets N/T of the CPU time during its execution.
• Admission control is based on the availability of shares. For example, if the system has 100 shares and 85% of them are already allocated, a new application is admitted only if it requests 15 shares or fewer (see the sketch below).
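A minimal C sketch of share-based admission control (names invented for illustration):

#include <stdio.h>

/* Proportional-share admission sketch: T total shares, 'allocated' already
   granted; a request for n shares is admitted only if it still fits. */
int admit(int total, int *allocated, int request) {
    if (*allocated + request > total)
        return 0;                      /* reject: not enough shares left */
    *allocated += request;
    return 1;                          /* process gets request/total of the CPU */
}

int main(void) {
    int total = 100, allocated = 85;   /* the example from the slide */
    printf("request 15: %s\n", admit(total, &allocated, 15) ? "admitted" : "rejected");
    printf("request 10: %s\n", admit(total, &allocated, 10) ? "admitted" : "rejected");
    return 0;
}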
Algorithm Evaluation

To select a CPU-scheduling algorithm, we first have to define our criteria.

• Deterministic modeling: take a predetermined workload (e.g. a trace file from a real system) and compare the algorithms on that workload for a given criterion.

Ex. 5 processes P1–P5 arrive at t=0 with burst times 10, 29, 3, 7, 12 ms. Comparing the average waiting times for FCFS, SJF, and RR (q = 10 ms, which is the quantum that yields the 23 ms figure below):

FCFS: 28 ms    SJF: 13 ms    RR: 23 ms

• Queueing theory is an analytic, mathematical method that relates the arrival rates and service times of processes to their average waiting times. It is based on Little's formula: n = λW (n: number of processes in the queue, λ: arrival rate, W: average waiting time). For example, if processes arrive at λ = 7 per second and wait W = 2 seconds on average, then n = 14 processes are in the queue on average.

• Simulation can also be used to evaluate and compare the performance of scheduling algorithms. Task arrival and service times can be generated from a trace or from a random distribution profile.
Linux Scheduling

Linux schedules tasks; a task can be either a thread or a single-threaded process.
A Linux-based system has several scheduling classes with different priorities. The standard kernel comes with 2 classes: normal and real-time. Within each class, further priority levels are defined using a number in the range [0, 139] (0 is the highest priority).

Real-time tasks have static priority levels (range [0, 99]). Within each level, the scheduler uses either nonpreemptive scheduling (called FIFO in Linux) or preemptive RR.

For normal tasks (range [100, 139]), Linux uses CFS (Completely Fair Scheduling): each task has a variable called vruntime that tracks how long the task has used the CPU. The scheduler selects the task with the smallest vruntime. The goal is to achieve "fairness".

To take priority levels into account, CFS gives normal tasks with higher priority a higher proportion of CPU time. For this, a task's vruntime is scaled by a decay factor based on its nice value = priority − 120 (a number in the range [−20, 19]).

> nice -n 10 some_prog    (launches a program "nicely", with low priority; good for long batch jobs)
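Very roughly, CFS advances a task's vruntime by its actual runtime scaled by a nice-based weight. The C sketch below is an approximation, not kernel code: the weight formula merely mimics the kernel's weight table, where each nice step changes the weight by about 1.25× (link with -lm):

#include <math.h>
#include <stdio.h>

/* Simplified CFS accounting sketch (an approximation, not kernel code):
   vruntime advances more slowly for heavier (higher-priority) tasks,
   so the "pick smallest vruntime" rule runs them more often. */
double weight(int nice) { return 1024.0 / pow(1.25, nice); }   /* nice 0 -> 1024 */

int main(void) {
    int nices[] = {0, 5, -5};
    double delta_exec = 10.0;                  /* each task just ran 10 ms */
    for (int i = 0; i < 3; i++)
        printf("nice %3d: vruntime += %.1f\n", nices[i],
               delta_exec * 1024.0 / weight(nices[i]));
    return 0;   /* the nice -5 task's vruntime grows slowest: it runs most often */
}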

The quantum in Linux is not constant but dynamic: it is calculated so that each runnable task receives at least one quantum during a period called the target latency. More tasks => a longer target latency.
Linux supports load balancing but avoids migration across scheduling domains (sets of cores sharing the same NUMA memory or L2 cache). Migration is done only in extreme cases of load imbalance.
Windows Scheduling

The Windows scheduler, called the dispatcher, uses preemptive priority scheduling with 32 priority levels:

• Levels [16, 31] form the real-time class.

• Levels [1, 15] form the variable class.

"Variable" means the priority changes over time:

• When a thread uses up its entire quantum, its priority is lowered (for fairness).
• When a thread blocks, its priority is boosted.
• In both cases, the priority must stay within its class's range (the column of the priority table): the thread must remain in the category it was created in.

Notes:
• The priority classes (the first row of the table) can be modified in the Windows Task Manager, under the "Details" tab.

• In the Windows API, the class can also be set when a process is created, and modified using the system call SetPriorityClass(). A minimal sketch follows.
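A minimal Win32 sketch of that call, raising the current process to the high-priority class:

#include <windows.h>
#include <stdio.h>

/* Win32 sketch: move the current process into HIGH_PRIORITY_CLASS. */
int main(void) {
    if (!SetPriorityClass(GetCurrentProcess(), HIGH_PRIORITY_CLASS)) {
        printf("SetPriorityClass failed: error %lu\n", GetLastError());
        return 1;
    }
    printf("priority class now: 0x%lx\n",
           GetPriorityClass(GetCurrentProcess()));
    return 0;
}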
Textbook readings

From the textbook "Operating System Concepts, 10th edition":

• Reading list
5.1.1, 5.1.2, 5.1.3, 5.2, 5.3.1, 5.3.2, 5.3.3, 5.3.4,
5.5.1, 5.5.2, 5.5.3, 5.5.4, 5.6.2, 5.6.3, 5.6.4, 5.6.5,
5.7.1, 5.7.2, 5.8.1

• Recommended exercises
5.1, 5.2, 5.3 to 5.5, 5.7, 5.8, 5.11, 5.14, 5.15, 5.16,
5.17, 5.18, 5.22, 5.28, 5.35
