5_WindowsTraps
The Critical-Section Problem
n threads all competing to use a shared
resource (i.e., shared data)
Each thread has a code segment, called its
critical section, in which the shared data is
accessed
Problem:
Ensure that when one thread is executing in
its critical section, no other thread is allowed to
execute in its critical section
Three Requirements
1. Mutual Exclusion
Only one thread at a time is allowed into its critical section, among all
threads that have critical sections for the same resource or shared
data.
2. Progress
If no thread is in the critical section and some threads want to enter,
then only those threads not in the remainder section can participate
in the decision of which thread gets to enter next.
The selection process cannot be postponed indefinitely.
3. Bounded Waiting
There must be a bound on the number of times that other threads
are allowed to enter their critical sections after a thread has
requested to enter its critical section and before the request is
granted.
Initial Attempts to Solve Problem
First Attempt: Algorithm 1
Shared variables - initialization
int turn = 0;
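turn == i ⇒ Ti can enter its critical section
Thread Ti
do {
    while (turn != i) ;    /* busy-wait until it is Ti's turn */
    critical section
    turn = j;
    remainder section
} while (1);
Strict alternation between i and j
Satisfies mutual exclusion, but not progress: if Tj stays in its remainder
section, turn never returns to i and Ti cannot enter again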
Second Attempt: Algorithm 2
Shared variables - initialization
int flag[2]; flag[0] = flag[1] = 0;
flag[i] == 1 ⇒ Ti is ready to enter its critical section
Thread Ti
do {
    flag[i] = 1;
    while (flag[j] == 1) ;
    critical section
    flag[i] = 0;
    remainder section
} while (1);
Satisfies mutual exclusion, but not the progress requirement.
Very sensitive to the timing of the two threads: if both set their flags
before either tests the other's flag, both spin forever
Third Attempt: Algorithm 3
(Peterson’s Algorithm - 1981)
Shared variables of algorithms 1 and 2 - initialization:
int flag[2]; flag[0] = flag[1] = 0;
int turn = 0;
Thread Ti
do {
    flag[i] = 1;
    turn = j;
    while ((flag[j] == 1) && turn == j) ;
    critical section
    flag[i] = 0;
    remainder section
} while (1);
Solves the critical-section problem for two threads.
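The pseudocode above assumes that loads and stores happen in program order, which plain C variables do not guarantee on modern compilers and CPUs. A minimal sketch of the entry and exit sections using C11 sequentially consistent atomics follows; the function names peterson_lock and peterson_unlock are illustrative and not part of the slides.

```c
#include <stdatomic.h>

/* Shared state for two threads with ids 0 and 1 (names mirror the slide). */
static atomic_int flag[2];   /* flag[i] == 1: thread i wants to enter */
static atomic_int turn;      /* which thread is allowed to go first on a tie */

/* Entry section for thread i (i must be 0 or 1). Hypothetical helper name. */
void peterson_lock(int i)
{
    int j = 1 - i;
    atomic_store(&flag[i], 1);   /* announce intent to enter */
    atomic_store(&turn, j);      /* yield priority to the other thread */
    while (atomic_load(&flag[j]) == 1 && atomic_load(&turn) == j)
        ;                        /* busy-wait while the other thread has priority */
}

/* Exit section for thread i. */
void peterson_unlock(int i)
{
    atomic_store(&flag[i], 0);
}
```

Each thread would call peterson_lock(i) before its critical section and peterson_unlock(i) after it. The default atomic_store and atomic_load use sequentially consistent ordering, which is the property the algorithm relies on.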
Dekker’s Algorithm (1965)
Shared variables - initialization:
int flag[2]; flag[0] = flag[1] = 0;
int turn = 0;
Thread Ti
do {
    flag[i] = 1;
    while (flag[j]) {
        if (turn == j) {
            flag[i] = 0;
            while (turn == j) ;
            flag[i] = 1;
        }
    }
    critical section
    turn = j;
    flag[i] = 0;
    remainder section
} while (1);
Bakery Algorithm
(Lamport 1974)
A Solution to the Critical Section problem for n threads
do {
    choosing[i] = 1;
    number[i] = max(number[0], number[1], ..., number[n-1]) + 1;
    choosing[i] = 0;
    for (j = 0; j < n; j++) {
        while (choosing[j] == 1) ;
        /* (a, b) < (c, d) means a < c, or a == c and b < d */
        while ((number[j] != 0) &&
               ((number[j], j) < (number[i], i))) ;
    }
    critical section
    number[i] = 0;
    remainder section
} while (1);
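The ticket comparison in the loop above is a lexicographic order on (number, thread id) pairs. A minimal sketch in C11 atomics shows one way to spell it out; the thread count N and the names bakery_lock, bakery_unlock and ticket_before are assumptions made for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define N 4                      /* number of threads; illustrative value */

static atomic_int choosing[N];   /* 1 while thread i is picking a ticket */
static atomic_int number[N];     /* ticket of thread i, 0 = not competing */

/* Lexicographic order: (a, b) < (c, d) if a < c, or a == c and b < d. */
static bool ticket_before(int a, int b, int c, int d)
{
    return a < c || (a == c && b < d);
}

void bakery_lock(int i)
{
    atomic_store(&choosing[i], 1);
    int max = 0;
    for (int j = 0; j < N; j++) {            /* take a ticket above all others */
        int n = atomic_load(&number[j]);
        if (n > max)
            max = n;
    }
    atomic_store(&number[i], max + 1);
    atomic_store(&choosing[i], 0);

    for (int j = 0; j < N; j++) {
        while (atomic_load(&choosing[j]) == 1)
            ;                                /* wait until j has its ticket */
        for (;;) {                           /* wait while j is ahead of i */
            int nj = atomic_load(&number[j]);
            if (nj == 0 || !ticket_before(nj, j, atomic_load(&number[i]), i))
                break;
        }
    }
}

void bakery_unlock(int i)
{
    atomic_store(&number[i], 0);
}
```

Two threads can draw the same ticket number because the max computation is not atomic; the thread id in the second position of the pair breaks that tie.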
Mutual Exclusion - Hardware Support
Interrupt Disabling
Concurrent threads cannot overlap on a uniprocessor
A thread runs until it makes a system call or an interrupt
occurs
Special Atomic Machine Instructions
Test and Set Instruction - atomically read and write a memory location
Exchange Instruction - atomically swap a register and a memory location
Problems with Machine-Instruction Approach
Busy waiting
Starvation is possible
Deadlock is possible
Synchronization Hardware
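boolean TestAndSet(boolean &target) {    /* executed atomically */
    boolean rv = target;
    target = true;
    return rv;
}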
Mutual Exclusion with Test-and-Set
Shared data:
boolean lock = false;
Thread Ti
do {
    while (TestAndSet(lock)) ;
    critical section
    lock = false;
    remainder section
} while (1);
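In standard C the test-and-set primitive is available as atomic_flag_test_and_set from <stdatomic.h>. The following minimal sketch builds the spinlock above on it; the names tas_lock, spin_acquire, spin_try_acquire and spin_release are illustrative.

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_flag tas_lock = ATOMIC_FLAG_INIT;   /* clear = lock is free */

/* One test-and-set attempt: returns true if the caller now holds the lock. */
bool spin_try_acquire(void)
{
    /* atomic_flag_test_and_set sets the flag and returns its previous value */
    return !atomic_flag_test_and_set(&tas_lock);
}

/* Entry section: spin until the previous value was "clear". */
void spin_acquire(void)
{
    while (atomic_flag_test_and_set(&tas_lock))
        ;   /* busy-wait */
}

/* Exit section: corresponds to "lock = false" on the slide. */
void spin_release(void)
{
    atomic_flag_clear(&tas_lock);
}
```

As the slides note for all machine-instruction approaches, this lock busy-waits and gives no bound on how long a particular thread may have to spin.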
Mutual Exclusion with Swap
Shared data:
int lock = 0;
Thread Ti
int key;
do {
    key = 1;
    while (key == 1) Swap(lock, key);
    critical section
    lock = 0;
    remainder section
} while (1);
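The same lock can be sketched with the C11 atomic_exchange operation standing in for the Swap instruction. This is an illustration under that assumption, and the names xchg_lock, xchg_acquire and xchg_release are hypothetical.

```c
#include <stdatomic.h>

static atomic_int xchg_lock;   /* 0 = free, 1 = held (zero-initialized) */

/* Entry section: keep swapping 1 into the lock until the old value was 0. */
void xchg_acquire(void)
{
    int key = 1;
    while (key == 1)
        key = atomic_exchange(&xchg_lock, 1);   /* returns the previous value */
}

/* Exit section: corresponds to "lock = 0" on the slide. */
void xchg_release(void)
{
    atomic_store(&xchg_lock, 0);
}
```

Because key is always 1 when the exchange happens, writing 1 into the lock and reading the old value back into key has the same effect as the two-way Swap(lock, key) in the pseudocode.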
Semaphores
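Integer variable S accessed only through two atomic operations
wait (S):
    while (S <= 0) ;    /* busy-wait */
    S--;
signal (S):
    S++;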
Critical Section of n Threads
Shared data:
semaphore mutex; //initially mutex = 1
Thread Ti:
do {
    wait(mutex);
    critical section
    signal(mutex);
    remainder section
} while (1);
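A busy-waiting counting semaphore that supports this usage can be sketched with a C11 compare-and-swap loop. This is an illustration only, not how the slides or Windows implement semaphores; the spin_sem_ names are hypothetical and were chosen to avoid clashing with the POSIX wait, signal and sem_wait functions.

```c
#include <stdatomic.h>
#include <stdbool.h>

typedef struct { atomic_int value; } spin_sem_t;

void spin_sem_init(spin_sem_t *s, int initial)
{
    atomic_init(&s->value, initial);
}

/* One attempt at wait(S): decrement only if the value is positive. */
bool spin_sem_try_wait(spin_sem_t *s)
{
    int v = atomic_load(&s->value);
    while (v > 0) {
        /* on failure the CAS reloads v with the current value, then we retry */
        if (atomic_compare_exchange_weak(&s->value, &v, v - 1))
            return true;
    }
    return false;
}

/* wait(S): spin until a decrement succeeds. */
void spin_sem_wait(spin_sem_t *s)
{
    while (!spin_sem_try_wait(s))
        ;   /* busy-wait */
}

/* signal(S): increment the value. */
void spin_sem_signal(spin_sem_t *s)
{
    atomic_fetch_add(&s->value, 1);
}
```

Initialized to 1, the semaphore plays the role of mutex above: spin_sem_wait before the critical section and spin_sem_signal after it. The compare-and-swap keeps the test and the decrement from being separated by another thread's update.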
Semaphore Implementation
Implementation
Semaphore as a General
Synchronization Tool
Two Types of Semaphores
Counting semaphore
integer value can range over an unrestricted
domain.
Binary semaphore
integer value can range only between 0 and 1;
can be simpler to implement.
A counting semaphore S can be implemented
using binary semaphores.
Deadlock and Starvation
Getting Into Kernel Mode
Code is run in kernel mode for one of three reasons:
1. Requests from user mode
Via the system service dispatch mechanism
Kernel-mode code runs in the context of the requesting thread
2. Interrupts from external devices
Windows interrupt dispatcher invokes the interrupt service routine
ISR runs in the context of the interrupted thread
(so-called “arbitrary thread context”)
ISR often requests the execution of a “DPC routine,”
which also runs in kernel mode
Time not charged to interrupted thread
3. Dedicated kernel-mode system threads
Some threads in the system stay in kernel mode at all times
(mostly in the “System” process)
Scheduled, preempted, etc., like any other threads
Trap Dispatching
Interrupt Dispatching
An interrupt arrives while user- or kernel-mode code is running
Note, no thread or process context switch!
Interrupt dispatch routine (kernel mode)
Disable interrupts
Record machine state (trap frame) to allow resume
Mask equal- and lower-IRQL interrupts
Find and call appropriate ISR
Dismiss interrupt
Interrupt service routine
Tell the device to stop interrupting
Interrogate device state, start next operation on device, etc.
Request a DPC
Return to caller
Hardware interrupt processing
x86:
I/O interrupts arrive on one of the lines of the interrupt controller
The interrupt controller interrupts the processor (single line)
The processor queries for the interrupt vector and uses it as an index into the IDT (interrupt descriptor table)
Interrupt Precedence via IRQLs (x86)
IRQL = Interrupt Request Level
the “precedence” of the interrupt with respect to other interrupts
Different interrupt sources have different IRQLs
IRQL is also a state of the processor
Servicing an interrupt raises processor IRQL to that interrupt’s IRQL
this masks subsequent interrupts at equal and lower IRQLs
Predefined IRQLs
High
used when halting the system (via KeBugCheck())
Power fail
originated in the NT design document, but has never been
used
Inter-processor interrupt
used to request action from another processor (dispatching a
thread, updating a processor’s TLB (translation lookaside
buffer), system shutdown, system crash)
Clock
used to update the system’s clock and to allocate CPU time to
threads
Predefined IRQLs (contd.)
Profile
Used for kernel profiling (see Kernel Profiler, Kernprof.exe, in the
Resource Kit)
Device
Used to prioritize device interrupts
DPC/dispatch and APC
Software interrupts that kernel and device drivers generate
Passive
No interrupt level at all, normal thread execution
Restriction: code running at DPC/dispatch level or above must not wait
for an object, because the wait would require rescheduling the thread
IRQLs on 64-bit Systems
IRQL  x64                              IA64
15    High/Profile                     High/Profile/Power
14    Interprocessor Interrupt/Power   Interprocessor Interrupt
13    Clock                            Clock
12    Synch (Srv 2003)                 Synch (MP only)
      Device n                         Device n
      .                                .
4     .                                Device 1
3     Device 1                         Correctable Machine Check
2     Dispatch/DPC                     Dispatch/DPC & Synch (UP only)
1     APC                              APC
0     Passive/Low                      Passive/Low
Interrupt Prioritization & Delivery
IRQLs are determined as follows:
On x86, x64 & IA64 systems: IRQL = IDT vector number / 16
On MP systems, which processor is chosen to deliver an
interrupt?
By default, any processor can receive an interrupt from any
device
Can be configured with IntFilter utility in Resource Kit
On x86 and x64 systems, the IOAPIC (I/O advanced
programmable interrupt controller) is programmed to interrupt
the processor running at the lowest IRQL
On IA64 systems, the SAPIC (streamlined advanced
programmable interrupt controller) is configured to interrupt one
processor for each interrupt source
Processors are assigned round robin for each interrupt vector
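The vector-to-IRQL rule stated at the top of this slide (IRQL = IDT vector number / 16) amounts to taking the high four bits of an 8-bit vector. A small worked sketch, with a hypothetical function name and arbitrary example vectors:

```c
/* IRQL as the slide states it: IDT vector number / 16 (integer division).
 * For an 8-bit vector this is the same as vector >> 4, i.e. the high nibble. */
unsigned irql_from_vector(unsigned vector)
{
    return vector / 16;
}

/* Arbitrary example vectors (not taken from the slides):
 *   0x2F -> 2, 0x30 -> 3, 0xD1 -> 13, 0xFF -> 15 */
```

Sixteen consecutive vectors therefore share one IRQL, which is how 256 IDT entries map onto the 16 levels in the 64-bit IRQL table.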
Interrupt object
Flow of Interrupts
[Figure: flow of interrupts between the CPU and the Interrupt Dispatch Table]
Software interrupts
Deferred Procedure Calls (DPCs)
Used to defer processing from higher (device) interrupt level to a
lower (dispatch) level
Also used for quantum end and timer expiration
Driver (usually ISR) queues request
One queue per CPU. DPCs are normally queued to the current
processor, but can be targeted to other CPUs
Executes specified procedure at dispatch IRQL (or “dispatch level”,
also “DPC level”) when all higher-IRQL work (interrupts) has completed
Maximum times recommended: ISR: 25 usec, DPC: 100 usec
See http://msdn.microsoft.com/en-us/windows/hardware/gg487462.aspx
[Figure: DPC queue, a queue head followed by a chain of DPC objects]
Delivering a DPC
DPC routines can call kernel functions but can’t call system services, generate
page faults, or create or wait on objects
4. Dispatcher executes each DPC routine in the DPC queue
Asynchronous Procedure Calls
(APCs)
Execute code in context of a particular user thread
APC routines can acquire resources (objects), incur page faults,
call system services
APC queue is thread-specific
User mode & kernel mode APCs
Permission required for user mode APCs
Executive uses APCs to complete work in thread space
Wait for asynchronous I/O operation
Emulate delivery of POSIX signals
Make a thread suspend or terminate itself (environment subsystems)
User-mode APCs are delivered when the thread is in an alertable wait state
e.g., WaitForMultipleObjectsEx(), SleepEx()
Asynchronous Procedure Calls
(APCs)
Special kernel APCs
Run in kernel mode, at IRQL 1
Always deliverable unless thread is already at IRQL 1 or above
Used for I/O completion reporting from “arbitrary thread context”
Kernel-mode interface is linkable, but not documented
“Ordinary” kernel APCs
Always deliverable if at IRQL 0, unless explicitly disabled
(disable with KeEnterCriticalRegion)
User mode APCs
Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also
QueueUserAPC
Only deliverable when thread is in “alertable wait”
[Figure: thread object with APC objects queued under K (kernel mode) and U (user mode)]
IRQLs and CPU Time Accounting
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle
process time
Since interrupt activity is not charged to any thread or process,
Process Explorer shows these as separate processes (not really
processes)
The context-switch counts shown for these are really the numbers of interrupts and DPCs
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where
system has spent its time
CPU time accounting is driven by the programmable interval timer
Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock interrupts are NOT
accounted
E.g., one or more threads run and enter a wait state before the clock fires
Thus threads may run but never get charged
View context switch activity with Process Explorer
Add Context Switch Delta column
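The effect of tick-driven accounting can be illustrated with a toy model in which each clock tick charges a full interval to whichever thread is running when it fires. The structure, names and schedule below are invented for illustration and do not reflect kernel data structures.

```c
#include <stddef.h>

#define MAX_THREADS 8

/* One stretch of execution by a thread, covering (start_us, end_us]. */
struct run_segment { int thread; long start_us; long end_us; };

/* Charge one full tick to the thread that is running when the clock fires.
 * Threads that run only between ticks are never charged. */
void charge_by_tick(const struct run_segment *seg, size_t nseg,
                    long tick_us, long total_us, long charged_us[MAX_THREADS])
{
    for (int t = 0; t < MAX_THREADS; t++)
        charged_us[t] = 0;
    for (long now = tick_us; now <= total_us; now += tick_us) {
        for (size_t k = 0; k < nseg; k++) {
            if (seg[k].start_us < now && now <= seg[k].end_us) {
                charged_us[seg[k].thread] += tick_us;
                break;   /* at most one thread runs at a time */
            }
        }
    }
}
```

With a 10 msec tick and the schedule {0, 1000, 4000}, {1, 4000, 10000}, {0, 11000, 15000}, {1, 15000, 20000}, thread 0 actually runs for 7 msec but is charged nothing, while thread 1 runs for 11 msec and is charged the full 20 msec, because it is the one running at both ticks.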
Exception Dispatching
Exception Dispatching (contd.)
Structured exception handling
Accessible from the MS VC++ language: __try, __except, __finally
See Jeffrey Richter, “Advanced Windows”, MS Press
See Johnson M. Hart, “Win32 System Programming”, Addison-Wesley
[Figure: exception dispatching. Elements shown: trap handler (trap frame, exception record), exception dispatcher, ALPC to debugger port, debugger (first chance), frame-based handlers (unhandled exceptions are passed to the next handler), debugger (second chance)]
Internal Windows API exception
handler
Processes unhandled exceptions
At top of stack, declared in StartOfProcess()/StartOfThread()
System Service Dispatching
Triggered when executing an instruction assigned to system service
dispatching
Instruction depends on the type of system:
32-bit system
Address of the system service dispatch routine is stored in a machine-specific
register (MSR)
sysenter (syscall on AMD)
sysexit (sysret on AMD)
64-bit system
System call number is passed in the EAX register
syscall
sysret
Kernel-mode system service dispatching
Drivers use Zwxxx system calls
Build a fake interrupt stack, call KiSystemService directly
Set previous mode to kernel
System Service Dispatching
[Figure: a Windows application calls WriteFile(…); an application calls a USER or GDI service(…)]
View IDT
View the IRQL
Use Kernel Profiler to Profile Execution
Examine interrupt internals
Monitor Interrupt and DPC activity
Lab: DPC
Further Reading
Source Code References
Windows Research Kernel sources
\base\ntos\ke\i386 (similar files for amd64)
Trap.asm, Trapc.c – Trap dispatcher
Spinlock.asm – Spinlocks
Clockint.asm – Clock Interrupt Handler
Int.asm, Intobj.c, Intsup.asm – Interrupt Processing
\base\ntos\ke
eventobj.c - Event object
mutntobj.c – Mutex object
semphobj.c – Semaphore object
timerobj.c, timersup.c – Timers
wait.c, waitsup.c – Wait support
\base\ntos\inc\ke.h – structure/type definitions