Pistachio Whitepaper
System Architecture Group
University of Karlsruhe
May 1, 2003
1 L4Ka::Pistachio
L4Ka::Pistachio is the latest L4 microkernel developed by the System Architecture
Group at the University of Karlsruhe. It is the first available kernel implementation
of the L4 Version 4 kernel API, which is fully 32-bit and 64-bit clean and provides
multiprocessor support and super-fast local IPC. It continues the L4 tradition of
providing outstanding inter-process communication (IPC) performance and a highly
flexible kernel API.
The first release implements most of the core functionality of the API. The kernel
is written in C++ with a strong focus on performance, portability, and reusability.
L4Ka::Pistachio supports most existing mainstream hardware architectures, in
particular Intel’s IA32 [1] and IA64 [2], 32-bit PowerPC, Alpha 21164, and MIPS R4000
and higher, with multiprocessor support for the first three architectures. Support
for AMD64, Power4, ARM [3], and UltraSparc is planned for the near future.
L4Ka::Pistachio provides a solid basis for research and development of highly
specialized and general-purpose operating systems for a wide variety of systems,
ranging from tiny embedded devices to huge multiprocessor server systems. As an
example, L4Ka::Linux, a modified Linux 2.4 kernel running on top of L4Ka::Pistachio,
enables unmodified commodity applications to coexist with customized system
components. This allows a smooth migration path towards customized services and
supports secure extensions to monolithic operating systems, which is a core
requirement for upcoming consumer devices such as smartphones and for applications
such as secure payment and digital rights management.
2 L4 Version 4 API
L4Ka::Pistachio implements the L4 Version 4 API (currently still referred to as
eXperimental Version 2, or X2). The Version 4 API supersedes the seven-year-old
Version 2 API (V2), which was the basis for L4’s success story. All L4 APIs are
designed with the main focus on performance and flexibility. As described below,
Version 4 introduces a set of new kernel features that eliminate limitations
experienced with V2.
[1] Pentium and higher
[2] Itanium1 and SKI
[3] StrongARM and XScale
Version 4 introduces a kernel-provided page, the kernel interface page (KIP),
which contains function entry points for all system calls as well as frequently accessed
system information. Invoking a system call is now a simple function call to the
entry point in the KIP. When new architectural features are added to next-generation
processors, the kernel can exploit them directly by replacing the page; old code
then automatically uses the latest processor features.
This methodology also allows for system calls which do not necessarily enter the
kernel, e.g., the system clock or super-fast local IPC. The overhead of this flexible
scheme is negligible when using an optimized dynamic linker which links system
calls against the KIP’s entry points.
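The indirection through the KIP can be illustrated with a small model. The sketch
below is not the Version 4 KIP layout, which is fixed by the API specification; the
structure kip_model, the field names, and the two clock variants are hypothetical
and only show why replacing an entry point changes the behavior of already
compiled callers, and how a call may complete without entering the kernel.

```c
#include <stdint.h>

/* Simplified, hypothetical model of a kernel interface page (KIP).
 * The real Version 4 KIP layout is fixed by the API specification;
 * every name below is illustrative. */
typedef uint64_t (*clock_entry_fn)(void);

struct kip_model {
    uint32_t       api_version;   /* frequently accessed system information */
    clock_entry_fn system_clock;  /* entry point for a "system clock" call  */
};

uint64_t model_ticks    = 0;  /* stands in for a kernel-maintained counter */
int      kernel_entries = 0;  /* counts simulated privilege transitions    */

/* Variant A: models an entry point that traps into the kernel. */
uint64_t clock_via_kernel(void)
{
    kernel_entries++;
    return model_ticks;
}

/* Variant B: models an entry point that reads the counter at user level. */
uint64_t clock_at_user_level(void)
{
    return model_ticks;
}

/* User-side stub: the system call is a plain call through the KIP. */
uint64_t sys_clock(const struct kip_model *kip)
{
    return kip->system_clock();
}
```

In this model, a caller compiled against sys_clock() keeps working unchanged when
the system_clock entry is switched from variant A to variant B; only the number of
simulated kernel entries differs.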
[4] In the current release, L4Ka::Pistachio does not yet implement SuperFast IPC.
2.7 Interrupts
Similar to the Version 2 API, Version 4 abstracts system interrupts as kernel threads
and interrupt delivery as IPC. Version 4, however, completely abstracts the first-level
interrupt controller and only provides basic primitives for interrupt association. This
allows for a higher level of parallelism, more efficient synchronization, and caching
of interrupt controller state, thereby increasing overall performance. Abstracting
interrupt hardware was also a portability requirement, since some architectures allow
access to interrupt controllers only in privileged mode.
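The idea of interrupts as threads and delivery as IPC can be sketched with a toy
model. None of the names below (irq_associate, irq_raise, irq_reply, thread_id)
belong to the Version 4 API; they are hypothetical. The sketch also assumes that a
line stays masked until the handler replies to the interrupt thread, which is a
modeling choice made here and not something stated in the text above.

```c
#define MAX_IRQS  16
#define NO_THREAD (-1)

typedef int thread_id;   /* hypothetical thread identifier */

/* One modeled "interrupt thread" per interrupt line. */
struct irq_thread {
    thread_id handler;         /* user-level thread associated with the line */
    int       awaiting_reply;  /* 1 while the line is treated as masked      */
};

struct irq_thread irq_table[MAX_IRQS];

void irq_init(void)
{
    for (int i = 0; i < MAX_IRQS; i++) {
        irq_table[i].handler = NO_THREAD;
        irq_table[i].awaiting_reply = 0;
    }
}

/* Basic association primitive: bind a handler thread to a line. */
int irq_associate(unsigned irq, thread_id handler)
{
    if (irq >= MAX_IRQS)
        return -1;
    irq_table[irq].handler = handler;
    return 0;
}

/* Hardware raises a line: the model turns it into a message to the
 * handler and returns the receiving thread, or NO_THREAD if nothing
 * is delivered. */
thread_id irq_raise(unsigned irq)
{
    if (irq >= MAX_IRQS || irq_table[irq].handler == NO_THREAD)
        return NO_THREAD;
    if (irq_table[irq].awaiting_reply)
        return NO_THREAD;      /* line still masked in this model */
    irq_table[irq].awaiting_reply = 1;
    return irq_table[irq].handler;
}

/* The handler acknowledges by replying to the interrupt thread. */
int irq_reply(unsigned irq, thread_id from)
{
    if (irq >= MAX_IRQS || irq_table[irq].handler != from ||
        !irq_table[irq].awaiting_reply)
        return -1;
    irq_table[irq].awaiting_reply = 0;
    return 0;
}
```

The handler never touches interrupt controller registers in this model; all
controller state lives behind the three functions, which mirrors the portability
argument made above.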
3 Tools
Powerful development tools can significantly shorten the development cycle.
Together with L4Ka::Pistachio, we release L4Ka::IDL4, an IDL compiler which
generates RPC stub code that is highly optimized for the kernel API and ABI, the
hardware architecture, and the C/C++ compiler. L4Ka::IDL4’s generated code
quality is comparable to hand-optimized assembly stub code.
L4Ka::IDL4 supports not only the Version 4 API but also V2 and X.0, allowing
a smooth migration path for existing software. It supports the CORBA and DCE
syntax and has an integrated C++ parser which can parse C/C++ header files, e.g.,
to reuse type declarations.
The specification of Version 4’s IPC interface is significantly influenced by and
optimized for L4Ka::IDL4. Providing a large memory-based register file usually
has a significant impact on the cache footprint of message transfers. However,
with specifically optimized marshaling stubs, the additional costs can be completely
eliminated, resulting in better overall performance.
RPC takes place as a three-step process. In the first step, all parameters are
marshaled into a message buffer. This message buffer is then transferred using the
system’s communication primitives and finally unmarshaled at the destination. The
Version 2 API provides two message registers, X.0 up to three; messages exceeding
this limit have to be transferred using a memory copy with a significant startup
overhead. Version 4 provides up to 64 message registers, some of them backed by
memory. The IDL compiler can marshal the parameters directly into these message
registers, and the kernel can transfer them from one address space to another with
significantly less overhead. On the receiver’s side, the parameters can be used
directly from the message register store.
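The three steps can be sketched with a toy register file. This is not the code
L4Ka::IDL4 generates and not the kernel's transfer path; mr_file, mr_put,
marshal_add, transfer, serve_add, and the opcode OP_ADD are hypothetical names.
Only the limit of 64 message registers is taken from the text.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_MRS 64   /* Version 4 provides up to 64 message registers */
#define OP_ADD  1u   /* hypothetical opcode for an add(a, b) call     */

typedef uintptr_t word_t;

/* Model of one thread's message register store. */
struct mr_file {
    word_t mr[NUM_MRS];
    size_t used;
};

/* Append one word; fails once all registers are in use. */
int mr_put(struct mr_file *m, word_t value)
{
    if (m->used >= NUM_MRS)
        return -1;
    m->mr[m->used++] = value;
    return 0;
}

/* Step 1: the client stub marshals directly into message registers. */
int marshal_add(struct mr_file *m, word_t a, word_t b)
{
    m->used = 0;
    if (mr_put(m, OP_ADD) || mr_put(m, a) || mr_put(m, b))
        return -1;
    return 0;
}

/* Step 2: models the transfer between address spaces; only the
 * registers actually in use are copied. */
void transfer(const struct mr_file *src, struct mr_file *dst)
{
    for (size_t i = 0; i < src->used; i++)
        dst->mr[i] = src->mr[i];
    dst->used = src->used;
}

/* Step 3: the server uses the parameters straight from its registers. */
int serve_add(const struct mr_file *m, word_t *result)
{
    if (m->used != 3 || m->mr[0] != OP_ADD)
        return -1;
    *result = m->mr[1] + m->mr[2];
    return 0;
}
```

A call such as add(20, 22) occupies three registers in this model, so no separate
message buffer and no memory copy of a buffer are needed on either side.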