Demand Paging on Symbian
by
Jane Sales
Reviewed by
John Beattie
Dan Handley
Jenny Keates
Jason Parker
Jo Stichbury
Satu McNabb
Symbian Foundation
1 Boundary Row
London SE1 8HP
England
This work is licensed under the Creative Commons Attribution-Share Alike 2.0
UK: England & Wales License. To view a copy of this license, visit
creativecommons.org/licenses/by-sa/2.0/uk or send a letter to Creative Commons,
171 Second Street, Suite 300, San Francisco, California, 94105, USA.
ISBN: 978-1-907253-00-3
About the Author
Jane Sales joined Psion in 1995 to lead the team developing a new operating
system, EPOC32, for Psion’s as-yet-unreleased Series 5. A few months later,
EPOC32 first booted in Jane’s spare bedroom in Cookham, Berkshire. Jane
notes with pleasure that the Symbian-based devices descended from it are
rather more widespread today.
Jane left Symbian in 2003 to move to the south of France with her husband. She
wasn’t allowed to escape completely though – under the olive trees in her garden
she wrote her first book, Symbian OS Internals, which was published by Wiley in
2005. Soon afterwards Jane moved to Ukraine where she set up a mobile
strategy consultancy, working with Symbian’s Research and Strategy Groups,
among others.
Acknowledgements
Jane would like to thank Jo Stichbury and Satu McNabb for their calm, organized
management of this project – an approach that makes making books fun. Jane
would also like to thank John Beattie for his painstaking review which exposed
many of her silly mistakes. Nevertheless, any errors that remain in the book you
are reading are Jane’s alone.
Symbian would like to thank Jane for her professional and yet relaxed approach
to this project – this book was created, apparently effortlessly. We’d also like to
thank everyone involved in this project: the reviewers and technical experts, and
copy-editor Jenny Keates for her skilled edit.
1
Introduction
Readers of this book should have a basic understanding of the Symbian kernel
architecture, including the process model, threads and memory management.
If you are not familiar with this material, I suggest reading Chapters 1 (Introducing
EKA2), 3 (Threads, Processes and Libraries) and 7 (Memory Models) of my
book.1 To understand code paging in depth, you should also consider reading
Chapters 9 (The File Server) and 10 (The Loader). If your interest lies in
device drivers, then Chapter 12 (Device Drivers and Extensions) provides a good
grounding in Symbian’s device driver architecture.
1 Symbian OS Internals: Real Time Kernel Programming, Jane Sales et al., 2005, John Wiley &
Sons. The entire text of the book can also be found online at
developer.symbian.org/wiki/index.php/Symbian_OS_Internals.
Demand paging is a feature of virtual memory systems that makes it possible for
pages of RAM to be loaded from permanent storage when they are needed – that
is, on demand. When the contents are no longer required, the RAM used to store
them may be reused for other content. In this way, the total physical RAM re-
quired to store content is less than if all the content were permanently available,
and hence the bill of materials (BOM) cost of the device can be reduced.
The most important mechanism used in demand paging is the page fault. (Mem-
ory is managed in units known as pages, which are blocks of 4 KB on all current
architectures.) When an application attempts to access a location that is not
present in memory, a page fault occurs. The kernel handles this page fault, read-
ing the missing page from disk into RAM and then restarting the faulting applica-
tion. All memory that is managed by the kernel in this way is said to be pageable
memory, and the process is controlled by an entity known as the paging system.
A good source of more information on basic operating system concepts is Andrew
Tanenbaum’s Operating Systems: Design and Implementation.2 With an unbe-
coming lack of modesty, I’ll also suggest my own book, Symbian OS Internals,2 if
you’re more interested in the specifics of the Symbian kernel.
The Symbian implementation of demand paging has been a huge success. Not
only is there considerably more free RAM in the device, but also the ROM boots
more quickly, applications start up faster and the stability of the device has
increased. So successful was demand paging that it has been back-ported two
OS generations, to devices that have already been released into the market.
2 Operating Systems: Design and Implementation (Second Edition). Andrew Tanenbaum, 1997,
Prentice-Hall.
2
The Advantages and Disadvantages of
Demand Paging
Demand paging can provide a considerable reduction in the RAM usage of the
operating system. The actual size of the RAM saving depends considerably on
the way demand paging is configured, as you will see. However, it is safe to say
that on a feature-rich user-interface (UI) platform it is possible to make multi-
megabyte RAM savings without a significant reduction in performance. Internal
tests show an increase in free RAM at boot from 18.5 MB to 32 MB – a 73% in-
crease. Figures after launching a couple of applications (Web and GPS) are even
more impressive. Free RAM increases from 9 MB to 26.5 MB in the demand-
paged ROM – a 194% increase.
At least two other benefits may be observed. Note that these are highly depen-
dent on the paging configuration. See Section 7 for more details.
During Symbian tests, the typical boot time of a production ROM reduced from
35 seconds to 22 seconds – a 37% improvement.
This increased stability only applies when the entire device is out of memory
(OOM). Individual threads may still have OOM problems due to reaching their
own heap limits, and demand paging will not help in these situations.
2.2.1 Performance
One of the expected downsides to demand paging is a reduction in performance.
Demand paging can add an execution overhead to any process that takes a
paging fault. Also, paging faults block a thread for a significant and unpredictable
time while the fault is serviced, so any thread that can take a paging fault is not
suitable for truly real-time tasks. The delay will be of the order of 1 ms if the
backing store (the media used to store pages that are not in main memory) is not
being used for other file system activity. However, in busy systems, the delay for
a single fault could be hundreds of milliseconds or more, and should be treated
as unbounded for the purpose of any real-time analysis. Note that a paging fault
can occur for each separate 4 KB page of memory accessed, so an eight-byte
object straddling two pages could cause two paging faults.
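To make the page-straddling point concrete, here is a small helper (my own illustration, not a Symbian API) that counts how many 4 KB pages a given access touches, and therefore how many separate paging faults it could cause in the worst case:

```cpp
#include <cstddef>
#include <cstdint>

// Illustrative helper, not a Symbian API: count the 4 KB pages touched
// by an object of 'size' bytes at address 'addr'. Each distinct page
// touched can, in the worst case, cause its own paging fault.
constexpr std::uint32_t KPageShift = 12; // 4 KB pages

std::uint32_t PagesTouched(std::uintptr_t addr, std::size_t size)
{
    if (size == 0)
        return 0;
    std::uintptr_t firstPage = addr >> KPageShift;
    std::uintptr_t lastPage = (addr + size - 1) >> KPageShift;
    return static_cast<std::uint32_t>(lastPage - firstPage + 1);
}
```

For example, an eight-byte object at address 0x0FFC straddles pages 0 and 1, so touching it could cause two paging faults.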
2.2.2 Complexity
Demand paging makes more demands on system software, especially the ker-
nel. The resulting increase in complexity brings a concomitant risk of defects. For
example, if the kernel is servicing a paging fault and needs to execute a thread
which can itself take a paging fault, then deadlock results. (This problem also
occurs if the servicing thread needs to block on a service provided by another
thread which takes a paging fault, or to block on a mutex held by such a thread
– or even if the servicing thread needs to hold a mutex of a higher order than the
one held by the faulting thread.) Because this is complex, Symbian applies the
simple, safe rule that demand-paged memory must not be accessed when hold-
ing any kernel-side mutex. Symbian had to examine and re-write the kernel and
the media drivers to avoid this issue – but, fortunately, such code is rare. (And
for code paging, of which I shall say more in the next section, Symbian also had
to look at file systems, file-system plug-ins and all the other areas of the OS that
these use.)
That said, we should congratulate the Symbian kernel engineers. They have de-
livered an extraordinarily robust implementation of demand paging – such is the
quality of their delivery that it has needed very few modifications since it was first
released.
However, this does not take into account the considerable savings from demand
paging itself. You should expect a substantial net decrease in RAM usage when
using demand paging!
3
Understanding Demand Paging on Symbian
3.1 ROMs
The term ‘ROM’ traditionally refers to memory devices storing data that cannot
be modified. These devices also allow direct random access to their contents,
so that code can execute from them directly; such code is said to execute in
place (XIP). This has the advantage that programs and data in ROM are always
available and don’t require any action to load them into memory.
On Symbian, the term ROM has developed the looser meaning of ‘data stored
in such a way that it behaves like it is stored in read-only memory.’ The underly-
ing media may be physically writeable (RAM or flash memory) but the file system
presents a ROM-like interface to the rest of the OS, usually as drive Z.
The ROM situation is further complicated when the underlying media is not XIP.
This is the case for NAND flash, which is used in almost all devices on the market
today. Here, it is necessary to copy (or shadow) any code in NAND to RAM for
execution. The simplest way to achieve this is to copy the entire ROM contents
into RAM during system boot and use the MMU to mark this area of RAM with
read-only permissions. The data stored by this method is called the core ROM
image (or just core image) to distinguish it from other data stored in NAND. The
core image is an XIP ROM and is usually the only one. It is permanently resident
in RAM.
Figure 1, Layout A on page 9 shows how the NAND flash is structured in this
simple case. All the ROM contents are permanently resident in RAM and any
executables in the user data area (usually the C: or D: drive) are copied into RAM
as they are needed.
This approach is costly in terms of RAM usage, so Symbian introduced the com-
posite file system, a more efficient scheme.
This scheme (broadly speaking) splits the ROM contents into those parts re-
quired to boot the OS, and everything else. The former is placed in the core
image as before and the latter is placed into another area known as the read-only
file system (ROFS). The loader (part of the file server) copies code in the ROFS
into RAM as it is needed at run-time, at the granularity of an executable, in the
same way as for executables in the user data area.
There can be several ROFS images, for example, localization and/or operator-
specific images. Usually, the first one (called the primary ROFS) is combined with
the core image into a single ROM-like interface by what is known as the compos-
ite file system.
In the rest of this document, I will use the term ROM to mean the combined core
and primary ROFS images – that is, everything on drive Z. Where it is important
to refer to just the core ROM image (or an XIP ROM generally), I will specify this.
References to ROFS will mean the primary ROFS unless otherwise stated. For
clarity, I shall only consider systems with a single (primary) ROFS.
Figure 1: The data copied into RAM for two different NAND flash layouts. The same use
case and ROM contents are assumed for each layout.
Since an XIP ROM image on NAND has to be loaded into RAM in order to run, an
opportunity arises to demand page the contents of the XIP ROM. This means that
when executing the ROM, we read its data from NAND into RAM on demand.
An XIP ROM image is split into two parts, one containing unpaged data and one
containing data that are paged on demand. Unpaged data consists of:
1. Kernel-side code
2. All code that should not be paged for other reasons (for example, perfor-
mance, robustness or power management)
3. The static dependencies of (1) and (2).
The terms ‘locked down’ or ‘wired’ are also used to mean unpaged.
At boot time, the unpaged area at the start of the XIP ROM image is loaded into
RAM as normal, but the virtual address region normally occupied by the paged
area is left unmapped. No RAM is allocated for it.
When a thread accesses virtual memory in the paged area, it takes a page fault.
The page-fault handler in the kernel then allocates a page of physical RAM and
reads the contents for this from the XIP ROM image on NAND flash. The thread
then continues execution from the point where it took the page fault. This process
is called ‘paging in’ and I will describe it in more detail in Sections 3.10 and 4.1.7.
When the free RAM on the system reaches zero, the kernel can satisfy memory-
allocation requests by taking RAM from the paged-in XIP ROM region. As RAM
pages in the XIP ROM region are unloaded, they are said to be ‘paged out’ (I
discuss this in Section 3.11). Figure 2 on the next page shows the operations
described.
All content in the paged data area of an XIP ROM is subject to paging, not just
executable code, so accessing any file in this area may give rise to a page
fault. Remember that a page may contain data from one or more files, and page
boundaries do not necessarily coincide with file boundaries.
All non-executable files in an XIP ROM, even those in the unpaged section, will
be paged unless they are explicitly configured to be unpaged (see Section 7.3).
When XIP ROM paging is used in conjunction with the composite file system, the
non-demand-paged NAND flash layout strategy may no longer make sense. If a
small core image and a large ROFS image are used, then only a fraction of the
files in the overall ROM can benefit from XIP ROM paging. Instead, it is usual to
place most (or even all) files in the core ROM image. The ROFS is mainly used
for files that would be in the unpaged area of the core ROM anyway or have too
many unpaged dependencies.
Figure 3, layout C on page 12 shows a typical XIP ROM paging structure. Al-
though the unpaged area of the core image may be larger than the total core im-
age in Figure 1, layout B, only a fraction of the contents of the paged area needs
to be copied into RAM compared to the amount of loaded ROFS code in Figure
1, layout B.
Figure 3: The data copied into RAM for two additional NAND flash layouts (compare with
Figure 1). The same use case and ROM contents are assumed for each layout.
XIP ROM paging was officially first introduced in Symbian OS v9.3, but was so
successful that it was back-ported by some device manufacturers to their devices
based on Symbian OS v9.2.
Code paging extends XIP ROM paging to include other file systems, such as the
C drive, ROFS and media drives that cannot be removed (you wouldn’t want to
page from a memory card that the user might take out of the phone!). Executables
that aren’t in XIP ROM are stored as files in those other file systems. Before their
code can be executed, it must be copied to RAM. The particular location in RAM
cannot usually be determined ahead of time, so the loader, part of the file server,
must modify the file’s contents after copying to correct the memory pointers
contained within them: the modification is known as ‘relocation’ or ‘fix-up’. This
process makes code paging considerably more complex than XIP ROM paging.
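A minimal sketch of what relocation involves (the table format here is invented for illustration; the real loader processes the executable's own relocation data): every recorded word in the loaded image holds an address computed against the link-time base, and each is rebased by the difference between the actual load address and that base:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustration only: 'relocOffsets' lists the indices of 32-bit words in
// the image that contain addresses needing fix-up. The real Symbian
// loader reads this information from the executable's relocation data.
void RelocateImage(std::vector<std::uint32_t>& image,
                   const std::vector<std::size_t>& relocOffsets,
                   std::uint32_t linkBase, std::uint32_t loadBase)
{
    const std::uint32_t delta = loadBase - linkBase;
    for (std::size_t off : relocOffsets)
        image[off] += delta; // rebase each recorded pointer
}
```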
The additional overhead of code paging means that device manufacturers should
usually choose to use XIP ROM paging on their devices where possible. Layout
D in Figure 3 shows a typical NAND flash structure when both code paging and
XIP ROM paging are used. Only those parts of an executable currently in use are
copied into RAM. XIP ROM paging is still used for most data in ROM, and code
paging is used for any remaining paged executables in ROFS. It is expected that
the primary use for code paging will be for executables in the user data area,
such as third-party applications.
Pages that are used for code-paged executables never contain the end of one
executable and the start of another, as XIP ROM pages might. This means that
the last page of an executable is very likely to contain less than 4 KB of data.
It is also worth noting that most other operating systems don’t implement code
paging in this way. Instead, when they need to page out the contents of ex-
ecutables, they write the modified contents to the backing store so that they can
later be recovered. Symbian does not do this because of concerns about power
consumption and the wearing of the storage media used for the backing store.
Code paging support was originally planned for Symbian OS v9.4, but was such
a success that it was actually made available in Symbian OS v9.3.
In Symbian OS v9.4, the file server uses file system caching to speed file opera-
tions. The file-system cache is built upon the disconnected chunk, which allows
physical RAM to be committed and de-committed anywhere within its virtual ad-
dress space, with no requirement that RAM in the chunk is contiguous.
The file-system cache is closely linked to demand paging. When the file server
caches a new file, it commits and locks memory, reads the data, and then unlocks
the memory, placing the pages at the start of the live list to become the youngest
pages in the system. Then, if the file server reads the file again and there is a file-
cache hit, it locks the cache buffer, removing the pages from the live list. The file
server reads the data, and then unlocks the pages again, returning them to the
start of the live list.
You can see that the cached data for the file will stay in memory as long as it is
accessed often enough to stay in the live list. It will only be lost from the live list
if there is no free memory in the RAM allocator, and all the other memory used
for paging and caching has been more recently accessed than it has – that is, if
these cache pages have become the oldest pages in the system.
Writeable data paging extends demand paging from just paging code and
constant data to the paging of writeable data too – data such as user stacks,
heaps and other user data stored in chunks. The big difference here is that data is
mutable, whereas code is not, so if it is modified while it is paged in, it must be
written to a backing store when it is paged out. The backing store is analogous to
the ‘swap file’ used to implement the virtual memory scheme on PC systems.
Writeable data paging would offer the greatest RAM saving but would also have
the greatest impact on performance. Also, the additional write activity would in-
crease the power consumption of the device and wear the backing media.
A smart memory-caching scheme might be required to mitigate this.
Writeable data paging could also be used to implement a more traditional code-
paging scheme than the one described in Section 3.5.
At the time of writing, Symbian does not support writeable data paging.
All memory content that can be demand paged is said to be ‘paged memory’ and
the process is controlled by the ‘paging subsystem.’ A page is a 4 KB block of
RAM. Here are some other terms that are used:
• Live page – a page of paged memory the contents of which are currently
available
• Dead page – a page of paged memory the contents of which are not cur-
rently available
• Page in – the act of making a dead page into a live page
• Page out – the act of making a live page into a dead page. The RAM used
to store the contents of this page may then be reused for other purposes.
In the rest of this section, I’ll give an overview of how the paging subsystem
works and introduce some more vocabulary that is useful for understanding later
sections.
The set of live pages is known as the paging cache. The live page list is split into
two sub-lists, one containing young pages and the other old pages – the paging
subsystem attempts to keep the ratio of the lengths of the two lists at a value
called the ‘young/old ratio.’ The paging
subsystem uses the MMU to make all young pages accessible to programs but
all old pages inaccessible. However, the contents of old pages are preserved and
they still count as being live.
If a program accesses an old page, this causes a page fault. The paging subsys-
tem then turns that page into a young page (rejuvenates it), and at the same time
turns the last young page into an old page. See Section 4.1.4 for more detail and
illustration of this process.
When a program attempts to access a page that is paged out, the MMU gener-
ates a page fault and the executing thread is diverted to the Symbian exception
handler. This then performs the following tasks:
1. Obtains a page of RAM from the system’s pool of unused RAM (the ‘free
pool’) or, if this is empty, pages out the oldest live page and uses that
instead
2. Reads the content for this page from some media, such as NAND flash
3. Updates the paging cache’s live list as described in the previous section
4. Uses the MMU to make this RAM page accessible at the correct linear
(virtual) address
5. Resumes execution of the program’s instructions, starting with the one
that caused the initial page fault.
Note that these actions are executed in the context of the thread that tries to
access the paged memory. Paging in on a code-paged system is described in
Section 4.1.6.2.
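The five steps can be sketched with a toy model (all names are mine; the real implementation lives in the kernel's memory model and paging subsystem, and steps 2 and 3 are reduced to comments here):

```cpp
#include <cstdint>
#include <deque>
#include <map>

// Toy model of the page-in path: 'freePool' stands in for the system's
// unused RAM, 'liveList' for the paging cache's live page list (youngest
// at the front) and 'mappings' for the MMU page mappings.
struct PageInModel {
    std::deque<std::uint32_t> freePool;              // free physical pages
    std::deque<std::uint32_t> liveList;              // live paged RAM, youngest first
    std::map<std::uint32_t, std::uint32_t> mappings; // virtual page -> physical page

    std::uint32_t HandlePageFault(std::uint32_t virtPage) {
        std::uint32_t phys;
        if (!freePool.empty()) {
            // 1. Obtain a page of RAM from the free pool...
            phys = freePool.front();
            freePool.pop_front();
        } else {
            // ...or, if it is empty, page out the oldest live page.
            phys = liveList.back();
            liveList.pop_back();
            for (auto it = mappings.begin(); it != mappings.end(); ++it) {
                if (it->second == phys) { mappings.erase(it); break; } // unmap old use
            }
        }
        // 2./3. (not modelled) read the page's contents from media and
        // update the paging cache's bookkeeping.
        liveList.push_front(phys);   // the page becomes the youngest live page
        mappings[virtPage] = phys;   // 4. map it at the faulting virtual address
        return phys;                 // 5. the caller resumes the faulting thread
    }
};
```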
If the system needs more RAM and the free pool is empty, then RAM that is
being used to store paged memory is freed up for use. This is called ‘paging out’
and happens using the following steps:
The first two parameters are the most important and they are discussed in the fol-
lowing sections. The third has a less dramatic effect on the system and is usually
left unchanged at its default value of 3.
Other existing parameters also affect how demand paging performs. Optimizing
the configuration of all of these together is discussed in Chapter 7.
It is important that all areas of the operating system that are involved in servic-
ing a paging fault are protected from blocking on the thread that took the paging
fault (directly or indirectly). If they are not, a deadlock situation could occur. This
is partly achieved in Symbian by ensuring that all kernel-side components are
always unpaged. Section 5.2.1 looks at the problem of page faults in kernel-side
code in more detail.
If the system needs more RAM but the free RAM pool is empty, then pages are
removed from the paging cache in order to service the memory allocation. This
cannot continue indefinitely because a situation will arise where the same pages
are continually paged in and out of the paging cache. This is known as page
thrashing. Performance is dramatically reduced in this situation.
To avoid catastrophic performance loss, a minimum paging cache size can be de-
fined. If a system memory allocation would cause the paging cache to drop below
the minimum size, then the allocation fails.
As data is paged in, the paging cache grows, but any RAM used by the cache
above the minimum size does not contribute to the amount of used RAM reported
by the system. Although this RAM is really being used, it will be recycled when-
ever anything else in the system requires the RAM. So the effective RAM usage
of the paging cache is determined by its minimum size.
In theory, it is also possible to limit the maximum paging cache size. However,
this is not useful in production devices because it prevents the paging cache from
using all the otherwise unused RAM in the system. This may reduce performance
for no effective RAM saving.
The easiest way to visualize the RAM saving achieved by demand paging is to
compare the most simplistic configurations. Consider a non-demand-paged ROM
consisting of a core with no ROFS (as in Figure 1, layout A). Compare that with
a demand-paged ROM consisting of an XIP ROM paged core image, again with
no ROFS (similar to Figure 3, layout C, but without the ROFS). The total ROM
contents are the same in both cases. Figure 4 depicts the effective RAM saving.
Figure 4: The effective RAM saving when paging a simple XIP ROM.
The effective RAM saving is the size of all paged components minus the mini-
mum size of the paging cache. Note that when a ROFS section is introduced,
this calculation is much more complicated because the contents of the ROFS are
likely to be different between the non-demand-paged and demand-paged cases.
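The simple comparison reduces to one line of arithmetic, sketched here with invented figures (the function name and units are mine, not from any Symbian API):

```cpp
#include <cstdint>

// For the no-ROFS comparison of Figure 4: everything demand paged no
// longer needs permanent RAM, but the paging cache's minimum size must
// still be permanently funded from RAM.
std::int64_t EffectiveRamSavingKB(std::int64_t pagedContentKB,
                                  std::int64_t minPagingCacheKB)
{
    return pagedContentKB - minPagingCacheKB;
}
```

For example, 20 MB of paged content with a 2 MB minimum paging cache yields an 18 MB effective saving; a minimum cache larger than the paged content would produce a net RAM increase.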
You can increase the RAM saving by reducing the set of unpaged components
and/or reducing the minimum paging cache size, making the configuration more
‘stressed.’ You can improve the performance (up to a point) by increasing the set
of unpaged components and/or increasing the minimum paging cache size, mak-
ing the configuration more ‘relaxed.’ However, if the configuration is too relaxed,
it is possible to end up with a net RAM increase compared with a non-demand-
paged ROM.
The default ‘deflate’ compression algorithm used by Symbian only allows a
block of compressed data to be decompressed as a whole unit. This is fine
when decompressing a complete XIP ROM image or a whole executable, but it
is not acceptable for demand paging, where only a single page of an image or
executable needs to be decompressed during a page-in event.
Support for byte-pair compression was introduced in Symbian OS v9.2. For more
details, see Section 4.1.13.
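To show why byte-pair compression suits paging, here is a toy decompressor for the general byte-pair idea (this is the textbook algorithm, not Symbian's exact on-flash format): compression repeatedly replaces a frequent pair of bytes with an unused byte value and records the substitution in a table; decompression expands those substitute bytes recursively, and can do so independently for each compressed block:

```cpp
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Toy byte-pair expansion: 'table' maps a substitute byte back to the
// pair of bytes it replaced. Symbian's real format differs in detail,
// but the per-block independence that matters for paging is the same.
void ExpandByte(std::uint8_t b,
                const std::map<std::uint8_t, std::pair<std::uint8_t, std::uint8_t>>& table,
                std::vector<std::uint8_t>& out)
{
    auto it = table.find(b);
    if (it == table.end()) {
        out.push_back(b);  // not a substitute: literal byte
    } else {
        // substitute byte: expand both halves recursively
        ExpandByte(it->second.first, table, out);
        ExpandByte(it->second.second, table, out);
    }
}

std::vector<std::uint8_t> BytePairDecompress(
    const std::vector<std::uint8_t>& in,
    const std::map<std::uint8_t, std::pair<std::uint8_t, std::uint8_t>>& table)
{
    std::vector<std::uint8_t> out;
    for (std::uint8_t b : in)
        ExpandByte(b, table, out);
    return out;
}
```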
4
Under the Hood: The Implementation of
Demand Paging
The kernel manages the free memory in the system via the class MmuBase
(mmubase.h). This uses RamCacheBase to access the paging system and
provide RAM from both the free pool, and, if necessary, live pages. To do this,
RamCacheBase makes use of MmuBase’s RAM allocator object
(DRamAllocator, found in ramalloc.h).
1 Where possible, I have given the name of the file in which you can find more information about
the class in question, although some of these files may not initially be available under the Eclipse
Public License. Where filenames are not given, you may find further information about the class or
method given by using the code search available on developer.symbian.org.
2. Determine the storage media location where the page’s contents are
stored
To read demand-paged content from media, the kernel must first determine its
storage location. For the contents of ROM, it does this using an index stored in
the unpaged part of ROM. For non-XIP code, it uses the blockmap structure rep-
resenting the particular executable. This is stored in the kernel’s code segment
object for that executable. The kernel searches an address-sorted list to deter-
mine which code segment to use.
4. Use the MMU to make this RAM page accessible at the correct virtual
address
The kernel stores all the memory that is currently being used to store demand-
paged content in the ‘live list,’ discussed in Section 4.1.4.
5. Resume execution
The kernel resumes execution of the code that triggered the page-in event.
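The code-segment lookup in step 2 can be sketched as a binary search over an address-sorted list (the structure and names here are invented; the kernel's list holds code segment objects such as DEpocCodeSeg):

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Illustrative stand-in for a code segment record: the run of virtual
// addresses it occupies, plus an id standing in for the kernel object.
struct CodeSegEntry {
    std::uint32_t start;
    std::uint32_t end;  // one past the last address
    int id;
};

// Find which code segment (if any) contains 'addr', given a list sorted
// by start address: a binary search, which the address-sorted list allows.
const CodeSegEntry* FindCodeSeg(const std::vector<CodeSegEntry>& sorted,
                                std::uint32_t addr)
{
    auto it = std::upper_bound(sorted.begin(), sorted.end(), addr,
        [](std::uint32_t a, const CodeSegEntry& e) { return a < e.start; });
    if (it == sorted.begin())
        return nullptr;  // addr is below every segment
    --it;                // last segment starting at or before addr
    return (addr < it->end) ? &*it : nullptr;
}
```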
2 Symbian OS Internals: Real Time Kernel Programming, Jane Sales et al., 2005, John Wiley &
Sons. The entire text of the book can be found online at
developer.symbian.org/wiki/index.php/Symbian_OS_Internals.
There were two main effects of this work on device manufacturers. The first was
a modest increase of RAM usage by the system – around 0.58% of total RAM.
The second was an effect on people porting Symbian to new platforms. If you are
doing this, you will need to recompile the bootstrap.
Implementation Overview
In brief:
• The SPageInfo structure was simplified so that members only store one
datum.
• New members were added to SPageInfo to allow the implementation of
demand paging and file-system caching.
• SPageInfo is now stored in a sparse array, indexed by a page’s
physical address. The memory allocation for the array has been moved
into the generic bootstrap code (bootmain.s).
• All use of page numbers in APIs was removed, because information
pertaining to a RAM page can now be directly obtained using only the
physical address of that page.
• All updates to SPageInfo objects are now performed by functions that
assert that the system lock is held, ensuring that all updates have been
coded in a safe and atomic manner.
• Simulated OOM conditions have been added to low-level memory-
management functions. This improves coverage on existing and new
OOM testing.
Implementation Detail
SPageInfo is defined in mmubase.h and looks as follows:
struct SPageInfo
    {
    enum TType
        {
        ...
        };

    enum TState
        {
        EStateNormal = 0,       // no special state
        EStatePagedYoung = 1,   // demand paged and on the young list
        EStatePagedOld = 2,     // demand paged and on the old list
        EStatePagedDead = 3,    // demand paged, currently being modified
        EStatePagedLocked = 4   // demand paged but temporarily not paged
        };
    ...
private:
    TUint8 iType;       // enum TType
    TUint8 iState;      // enum TState
    TUint8 iSpare1;
    TUint8 iSpare2;
    TAny* iOwner;       // owning object
    TUint32 iOffset;    // page offset within owning object
    ...
The iType member indicates how the RAM page is currently being used. The
iOwner and iOffset members then typically specify a particular kernel object
(for example, DEpocCodeSeg) and location within it.
For demand paging, the key TType attributes for a page are:
The iState member was added to support demand paging and has one of
these values from enum TState:
For example, EStatePagedLocked indicates that the page is locked to
prevent it from being reclaimed for new usage, that is, it is ‘pinned.’ The
page’s contents remain valid and mapped by the MMU.
The iLink member is used by the paging system to link page-information struc-
tures together into the live page list. This member is also re-used to store other
data when the page isn’t in the live list – for example, when a page is in the
EStatePagedLocked state, iLink.iPrev is used to store a lock count.
The remaining iSpare members pad the structure size to a power of two and will
be used in the implementation of future OS features.
The actual functions for translating between an address and an SPageInfo are:
As the page info array is sparse, an invalid physical address may cause an
exception when indexing the array. So to make it possible to validate physical ad-
dresses, a bitmap is also made available at address KPageInfoMap. This con-
tains set bits for each 4 KB page of RAM in the array which is present in memory
and is used by SPageInfo::SafeFromPhysAddr(), which performs the same
task as SPageInfo::FromPhysAddr() but returns NULL if the physical ad-
dress is invalid rather than causing a data abort.
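The guard can be pictured with a simplified model (constants and layout invented for the sketch; the real definitions are in mmubase.h) in which one bit marks each 4 KB physical page that has valid page information present:

```cpp
#include <cstdint>
#include <vector>

// Sketch of the SPageInfo::SafeFromPhysAddr() guard: one set bit per
// 4 KB page whose page-info entry is actually present in the sparse
// array. Constants and data layout are illustrative, not Symbian's.
constexpr std::uint32_t KPageShift = 12;

bool PageInfoPresent(const std::vector<std::uint32_t>& pageInfoMap,
                     std::uint32_t physAddr)
{
    std::uint32_t page = physAddr >> KPageShift;  // which 4 KB page?
    std::uint32_t word = page >> 5;               // 32 bits per map word
    std::uint32_t bit = page & 31;
    if (word >= pageInfoMap.size())
        return false;                             // beyond the mapped range
    return (pageInfoMap[word] >> bit) & 1u;       // set => safe to index the array
}
```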
Initialization
The bootstrap code (in bootmain.s) creates both the array at
KPageInfoLinearBase and the bitmap at KPageInfoMap. Initially, all SPageInfo
structures have type EInvalid. During kernel boot, the MmuBase::Init2()
function sets the correct state for each page by scanning the RAM bank list provided
by the bootstrap, and scanning all MMU page tables for RAM already allocated.
software should flash (release/re-acquire) the system lock and then detect if an-
other thread has modified the page in question. This is done using the iModifier
member.
The kernel sets iModifier to zero whenever the SPageInfo object is changed
in any way. So if a thread sets this to a suitable unique value (for example, the
current thread pointer) then that thread may flash the system lock and then find
out whether another thread has modified the page by checking whether
iModifier has changed. The kernel provides functions for this – SetModifier()
and CheckModified(). An example of their use is as follows:
NKern::LockSystem();
thePageInfo->SetModifier(currentThread);
while(long_running_operation_needed)
	{
	do_part_of_long_running_operation(thePageInfo);
	if(NKern::FlashSystem() && thePageInfo->CheckModified(currentThread))
		{
		// someone else got the System Lock and modified our page...
		reset_long_running_operation(thePageInfo);
		thePageInfo->SetModifier(currentThread);
		}
	}
NKern::UnlockSystem();
All live pages are stored on the ‘live page list,’ which is a linked list of SPageInfo
objects, each of which refers to a specific page of physical RAM on the device.
The list is ordered chronologically by time of last access, to enable a least recent-
ly used (LRU) algorithm to be used when discarding paged content.
To keep the pages in chronological order, the kernel needs to detect when they
are accessed. It does this by splitting the live list into two sub-lists, one contain-
ing ‘young’ pages and the other ‘old’ pages. It uses the MMU to make all young
pages accessible, but old pages inaccessible. However, the contents of old
pages are still preserved, and these pages are still considered to be live. When
an old page is next accessed, there is a data abort. The fault handler can then
simply move the old page to the young list and make it accessible again so the
program can continue as normal.
The net effect is of a first-in, first-out list in front of an LRU list, which results in
less page churn than a plain LRU.
This is shown in more detail in Figure 5:
When a page is paged in, it is added to the front of the young list, making it the
youngest page in the system, as shown in Figure 6:
The paging system aims to keep the relative sizes of the two lists equal to a value
called the 'young/old ratio.' So, if this ratio is R, the number of young pages is Ny,
and the number of old pages is No, then whenever Ny > R × No a page is taken
from the back of the young list and placed at the front of the old list. This process
is called 'aging' and is shown in Figure 7:
30 Demand Paging on Symbian
If a program accesses an old page, it causes a page fault, because the MMU has
marked old pages as inaccessible. The fault handler then turns the old page into
a young page (‘rejuvenates’ it), and, at the same time, turns the last young page
into an old page. This is shown in Figure 8.
When the kernel needs more RAM (and the free pool is empty), it needs to re-
claim the RAM used by a live page. In this case, the oldest live page is selected
for paging out, turning it into a dead page, as Figure 9 shows.
If paging out leaves the system with too many young pages according to the
young/old ratio, then the kernel would age the last young page on the list (in
Figure 9, that would be Page D).
The net effect of this is that if a page is accessed at least once between
successive page faults, it will simply cycle around the young list. If it is accessed
less often, relative to the page-fault rate, it will appear in the old list – appearing
further and further back the less often it is accessed.
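The young/old mechanism can be modelled in ordinary C++ to make the aging and rejuvenation rules concrete. This is an illustrative simulation of the behaviour described above, not kernel code; the LiveList class and its members are invented for the sketch:

```cpp
#include <algorithm>
#include <cassert>
#include <list>

// Simulation of the split live list: young pages are accessible, old pages
// are protected so that touching one "faults" and rejuvenates it. The
// young/old ratio R is enforced by aging the last young page.
struct Page { int id; bool accessible; };

class LiveList {
public:
    explicit LiveList(int ratio) : iRatio(ratio) {}

    // Page-in: a new page becomes the youngest page in the system.
    void PageIn(int id) {
        iYoung.push_front({id, true});
        Balance();
    }

    // Simulated access: an old page faults and is rejuvenated.
    void Access(int id) {
        auto it = std::find_if(iOld.begin(), iOld.end(),
                               [&](const Page& p) { return p.id == id; });
        if (it != iOld.end()) {        // "data abort" on an old page
            Page p = *it;
            iOld.erase(it);
            p.accessible = true;       // restore the MMU mapping
            iYoung.push_front(p);      // now the youngest page
            Balance();
        }
    }

    // Reclaim the oldest live page (back of the old list).
    int PageOut() {
        if (iOld.empty()) Balance();
        int id = iOld.back().id;
        iOld.pop_back();
        Balance();
        return id;
    }

    bool IsOld(int id) const {
        return std::any_of(iOld.begin(), iOld.end(),
                           [&](const Page& p) { return p.id == id; });
    }

private:
    // Age young pages while Ny > R * No.
    void Balance() {
        while (!iYoung.empty() &&
               static_cast<int>(iYoung.size()) >
                   iRatio * static_cast<int>(iOld.size())) {
            Page p = iYoung.back();
            iYoung.pop_back();
            p.accessible = false;      // MMU makes old pages inaccessible
            iOld.push_front(p);
        }
    }

    int iRatio;
    std::list<Page> iYoung;  // accessible, most recently used
    std::list<Page> iOld;    // inaccessible but still live
};
```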
These two memory pools interact through the class RamCacheBase, which
provides:
class RamCacheBase
    {
public:
    // Initialisation called during MmuBase::Init2.
    virtual void Init2();
    // Initialisation called from M::DemandPagingInit.
    virtual TInt Init3()=0;
    // Remove RAM pages from cache and return them to the system.
    virtual TBool GetFreePages(TInt aNumPages)=0;
    // Attempt to free-up a contiguous region of pages
    // and return them to the system.
    virtual TBool GetFreeContiguousPages(TInt aNumPages, TInt aAlign)=0;
    // Give a RAM page to the cache system for managing.
    virtual void DonateRamCachePage(SPageInfo* aPageInfo)=0;
    // Attempt to reclaim a RAM page given to
    // the cache with DonateRamCachePage.
    virtual TBool ReclaimRamCachePage(SPageInfo* aPageInfo)=0;
    };
Memory Allocation
When the kernel wants to allocate memory, it uses
MmuBase::AllocRamPages(), which requests a number of pages from the
RAM allocator object. If there are insufficient pages to meet this request then this
method calls RamCacheBase::GetFreePages(). This is actually implemented
in DemandPaging::GetFreePages().
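The allocation fallback path can be sketched in a few lines. This is an illustrative model of the "try the free pool, then reclaim from the paging cache" flow described above; the RamAllocator and PagingCache types are stand-ins, not the kernel's classes:

```cpp
#include <cassert>

// Hypothetical sketch of the MmuBase::AllocRamPages() fallback: try the free
// pool first, and only if it cannot satisfy the request, ask the paging cache
// to give up pages (RamCacheBase::GetFreePages() in the real kernel).
struct RamAllocator {
    int freePages;
    bool Alloc(int n) {
        if (n > freePages) return false;
        freePages -= n;
        return true;
    }
};

struct PagingCache {
    int cachedPages;
    // Discard up to n cached (demand-paged) pages back to the free pool.
    bool GetFreePages(RamAllocator& ram, int n) {
        if (n > cachedPages) return false;
        cachedPages -= n;
        ram.freePages += n;
        return true;
    }
};

// Allocate n pages, reclaiming from the paging cache if the pool runs dry.
bool AllocRamPages(RamAllocator& ram, PagingCache& cache, int n) {
    if (ram.Alloc(n)) return true;
    int shortfall = n - ram.freePages;
    if (!cache.GetFreePages(ram, shortfall)) return false;
    return ram.Alloc(n);
}
```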
DCodeSeg
When an executable binary is loaded into the system, the kernel creates a
DCodeSeg (kern_priv.h) object to represent it. This mainly contains two broad
groups of information:
Paging In
When a demand-paged DCodeSeg is first loaded, the contents of the .text sec-
tion of the executable are not present in RAM. All that exists is a reserved region
of virtual address space in the code chunk. This means that when a program ac-
cesses the contents, the MMU will generate a data abort. The exception handler
calls MemModelDemandPaging::HandleFault(), which then has to obtain
RAM, copy the correct contents to it and map it at the correct virtual address.
This is known as ‘paging in.’
1. Check the MMU page table entry for the address that caused the abort. If
the entry is KPteNotPresentEntry then there is no memory mapped at
this address and it may need paging in.
2. Verify that the exception was caused by an access to the code chunk
memory region.
3. Find the DCodeSeg which is at this address by searching the sorted list
DCodeSeg::CodeSegsByAddress.Find(aFaultAddress).
4. Verify that the DCodeSeg is one that is being demand paged.
5. Call MemModelDemandPaging::PageIn(), which then performs the
following steps:
6. Obtain a DemandPaging::DPagingRequest (demand_paging.h)
object by using DemandPaging::AcquireRequestObject().
7. Obtain a physical page of RAM using
DemandPaging::AllocateNewPage().
8. Map this RAM at the temporary location
DemandPaging::DPagingRequest::iLoadAddr.
9. Read the correct contents into the RAM page by calling
DemandPaging::ReadCodePage().
10. Initialize the SPageInfo structure for the physical page of RAM, marking
it as type EPagedCode.
11. Map the page at the correct address in the current process.
12. Add the SPageInfo to the beginning of the live page list. This marks it as
the youngest (most recently used) page.
13. Return, and allow the program that caused the exception to continue to
execute.
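The sequence above can be condensed into an illustrative sketch. The Pager type, its page-table map and its field names are stand-ins invented for this example; only the control flow mirrors the numbered steps:

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Condensed, illustrative sketch of the HandleFault()/PageIn() sequence.
constexpr uint32_t KPteNotPresentEntry = 0;

struct Pager {
    std::map<uintptr_t, uint32_t> pageTable;   // page address -> PTE
    uintptr_t codeBase = 0x70000000, codeSize = 0x100000;
    int liveListYoungest = -1;                 // id of last paged-in page
    int nextPhysPage = 1;

    // Returns true if this was a demand-paging fault that was serviced.
    bool HandleFault(uintptr_t addr) {
        uintptr_t page = addr & ~uintptr_t(0xFFF);
        // 1. anything already mapped here? then it is not a paging fault
        auto it = pageTable.find(page);
        if (it != pageTable.end() && it->second != KPteNotPresentEntry)
            return false;
        // 2. inside the code chunk region?
        if (page < codeBase || page >= codeBase + codeSize)
            return false;
        return PageIn(page);
    }

    bool PageIn(uintptr_t page) {
        int phys = nextPhysPage++;      // 7. allocate a physical page
        ReadCodePage(phys, page);       // 9. read and decompress contents
        pageTable[page] = uint32_t(phys) << 12 | 1u;  // 11. map it, present
        liveListYoungest = phys;        // 12. youngest page in the live list
        return true;                    // 13. resume the faulting program
    }

    void ReadCodePage(int /*phys*/, uintptr_t /*page*/) {}
};
```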
In the multiple memory model, there are separate MMU mappings for each pro-
cess into which a DCodeSeg is loaded – but step 11 in the previous sequence
updates the mapping only for the process that caused the page in to occur. This
is a deliberate decision because:
To avoid having duplicate copies of the same DCodeSeg page, the mul-
tiple memory model keeps a list of pages which a DCodeSeg has paged
in (DMemModelCodeSegMemory::iPages). Then, before calling
MemModelDemandPaging::PageIn() in step 5 of the previous sequence, it
checks to see if there is already a page loaded and if so, simply maps that page
into the current process and ends.
Aging a Page
The pseudo-LRU algorithm used by demand paging means that as pages
work their way down the live list, they eventually reach a point where they
change from young to old. At this point, the kernel changes the MMU map-
pings for the page to make it inaccessible. It does this using the method
MemModelDemandPaging::SetOld() (xmmu.cpp).
In the moving memory model, SetOld() simply has to find the single page
table entry (MMU mapping) for the page in the user code chunk, and clear the
bits KPtePresentMask. In the multiple memory model, there can be many
page table entries that need updating. In this case, the kernel calls the method
DoSetCodeOld() (xmmu.cpp) to actually do the work. DoSetCodeOld()
examines the bit array DMemModelCodeSegMemory::iOsAsids to determine
the processes into which the DCodeSeg is loaded, and then updates each map-
ping in turn. Because the system lock must be held, this can affect the real-time
performance of the system, and so the technique described in Section 4.1.3.5
is used. If the page's status changes before DoSetCodeOld() has modified the
mappings in all processes, then DoSetCodeOld() simply ends. This is the right
thing to do, because the page’s status can change in one of two ways: either
by the page being rejuvenated (which I’ll discuss next) or by the page being
removed from the live list (ceasing to be demand paged). Both of these events
logically supersede any page-aging operation.
Rejuvenating a Page
When a program accesses an old page, it generates a data abort because the
MMU has marked these pages as inaccessible. The fault handler (MemModelDe
mandPaging::HandleFault) deals with this using the following sequence of
actions.
1. Get the MMU page table entry for the address that caused the abort. If
the bits KPtePresentMask are clear, then this is an old page that needs
rejuvenating. (If all bits are clear, then the page needs paging in instead.)
2. Find the SPageInfo for the page, using the physical address stored in
the page table entry.
3. If the state of the page is EStatePagedDead, then change the page table
entry to KPteNotPresentEntry and proceed to the paging-in operation
(described in Section 4.1.6.2) instead of rejuvenating it. This is because
a dead page is in the process of being removed from the live list and it
should be treated as though it were not present.
4. Otherwise update the page table entry to make the page accessible again.
5. Move the page’s SPageInfo to the beginning of the live page list. This
marks it as the youngest page in the system.
Similarly to paging in, we only update the page table entry for the current
process. The whole rejuvenation operation is performed with the system lock
held.
Freeing a Page
When a physical page of RAM that holds demand-paged code is needed for
other purposes, it must be freed by calling
MemModelDemandPaging::SetFree(). This mainly involves setting all page
table entries that refer to the page to KPteNotPresentEntry.
However, again the multiple memory model needs to update many page table
entries, so Symbian factored this implementation out into a separate method
called DoSetCodeFree(). Unlike the rejuvenation code, this method does not
need to pay attention to whether pages are changed while it is processing them.
This is because all pages that are being freed have their state set
to SPageInfo::EStatePagedDead first. This prevents other parts of the
demand-paging implementation from changing the page.
The free operation is performed with the system lock held. However, as is the
case for the aging code, DoSetCodeFree() flashes the system lock while
freeing, and this makes it possible for the code segment to be unloaded and its
DMemModelCodeSegMemory destroyed while these data structures are being
used. To prevent this situation, SetFree() must be called with the RAM-allocator
mutex held. As the destructor for DMemModelCodeSegMemory also acquires
this mutex during its operation, it cannot complete while one of its (former) RAM
pages is in the process of being freed.
• The virtual address and the size of the ROM are fixed and known at boot
time. So it is a trivial matter to determine whether a particular memory
access occurred in demand-paged memory or not.
• The ROM cannot be unloaded, so the kernel does not need to guard
against as many race conditions.
• The ROM is globally mapped by the MMU, so even in the multiple memory
model there is only a single MMU mapping that needs updating when the
demand-paging subsystem manipulates pages of RAM.
ROM Format
When the ROMBUILD tool generates ROM images for demand paging, it
divides the contents of the ROM into two sections. All unpaged content is
placed in the first section, with paged content following it. The size of the two
sections is stored in the ROM header – the unpaged part is stored at ROM
offset zero through TRomHeader::iPageableRomStart and the paged
part is stored at ROM offset TRomHeader::iPageableRomStart through
TRomHeader::iUncompressedSize.
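Given those two header fields, classifying a ROM offset is a pair of comparisons. The RomHeader struct below is a simplified stand-in for the real TRomHeader, invented for this sketch:

```cpp
#include <cassert>
#include <cstdint>

// Sketch of how the two-section layout partitions the image: an offset is
// unpaged if it lies before iPageableRomStart, and paged if it lies between
// there and iUncompressedSize.
struct RomHeader {
    uint32_t iPageableRomStart;   // end of the unpaged section
    uint32_t iUncompressedSize;   // total size of the image
};

enum class RomSection { Unpaged, Paged, OutOfRange };

RomSection Classify(const RomHeader& h, uint32_t romOffset) {
    if (romOffset < h.iPageableRomStart) return RomSection::Unpaged;
    if (romOffset < h.iUncompressedSize) return RomSection::Paged;
    return RomSection::OutOfRange;
}
```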
Initialization
The first, unpaged, part of ROM is loaded into RAM by the system’s boot/code
loader before any Symbian code is executed. Then, during kernel start-up,
MemModelDemandPaging::Init3()(xmmu.cpp) initializes ROM paging. This
method checks the ROM header information and allocates MMU page tables for
the virtual memory region to which the ROM will be mapped.
TUint32 iDataStart;
TUint16 iDataSize;
TUint8 iCompressionType;
TUint8 iPagingAttributes;
};
iDataStart gives the offset from the start of the ROM for the page, and
iDataSize gives the number of bytes actually used.
As the ROM image is always stored in its own partition, starting at storage block
zero, those attributes are all that is required to locate, read and decompress the
data for a ROM page.
This calculation is complicated by several factors. The data may or may not be
compressed, so its size is not necessarily the same as the size of the page. The
data can start at any offset into the file, because the file contains a small header
and any number of previous (possibly compressed) pages. And finally, the file
itself may be split over multiple, discontinuous sectors on the media.
What is needed is a representation of how the file is laid out on the media. We
call this abstraction a 'blockmap,' because it is structured in terms of the blocks
that the file occupies on the media.
There are two types of blockmap used in the demand paging system: the user-
side blockmap and the kernel blockmap.
User-Side Blockmap
The file system provides the user-side blockmap to the loader, which in turn
passes it to the kernel. The user-side blockmap is defined by the
SBlockMapInfoBase and TBlockMapEntryBase classes. It consists of a
single context structure SBlockMapInfoBase that contains information relating
to the blockmap as a whole, and a series of TBlockMapEntryBase structures
describing the file layout. These structures are defined in e32ldr.h.
// e32ldr.h
struct SBlockMapInfoBase
    {
    TUint iBlockGranularity;
    TUint iBlockStartOffset;
    TInt64 iStartBlockAddress;
    TInt iLocalDriveNumber;
    };

class TBlockMapEntryBase
    {
public:
    TUint iNumberOfBlocks;
    TUint iStartBlock;
    };
iStartBlockAddress gives the media address of the first block, measured from the
start of the partition. This is necessary because some file systems (for example,
FAT) have data before the first sector, and the size of this data may not be a mul-
tiple of the block size.
The blockmap does not have to start at offset zero in a file, and even if it does, the
first byte of the file may not lie on a block boundary. So iBlockStartOffset is
the offset of the first byte represented by the blockmap from the start of the first
block given.
class TBlockMap
    {
    struct SExtent
        {
        TInt iDataOffset;
        TUint iBlockNumber;
        };
    TInt iDataLength;
    TInt iExtentCount;
    SExtent* iExtents;
    // ...
    };
While the user-side blockmap is stored as a list of runs of blocks starting at a
particular block, the kernel blockmap is stored as a list of logical file offsets that
start at a particular block. The list is ordered, so the length of a run can be found
by calculating the difference between successive file offsets in the list.
The kernel creates the kernel blockmap from the user-side blockmap so that it
can look-up physical media locations efficiently, finding the block corresponding
to a particular file offset via a binary search.
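The lookup itself can be sketched with a standard binary search over extents ordered by file offset. The SExtent layout follows the TBlockMap::SExtent structure shown below; the FindBlock() helper is an illustration, not the kernel's code:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Each extent records the logical file offset at which a run of contiguous
// media blocks begins. Because the list is ordered by file offset, the block
// holding a given offset is found with a binary search.
struct SExtent {
    int iDataOffset;        // logical file offset where this extent starts
    unsigned iBlockNumber;  // media block at which the extent begins
};

// Find the media block containing fileOffset, given blockSize-byte blocks.
unsigned FindBlock(const std::vector<SExtent>& extents,
                   int fileOffset, int blockSize) {
    // First extent whose start is *after* the offset...
    auto it = std::upper_bound(
        extents.begin(), extents.end(), fileOffset,
        [](int off, const SExtent& e) { return off < e.iDataOffset; });
    --it;  // ...so the previous extent contains it
    int delta = fileOffset - it->iDataOffset;
    return it->iBlockNumber + static_cast<unsigned>(delta / blockSize);
}
```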
Under the Hood: The Implementation of Demand Paging 43
The kernel blockmap may have a different block size to its user-side equivalent,
because the kernel needs to communicate with media drivers in terms of read
units, which may be different to the sector size of the media. The read unit size is
usually 512 bytes.
The kernel blockmap provides a method that reads part of a file into memory. It
takes a logical file offset and a size describing the area of the file to read, a buffer,
and a function that performs the actual task of reading blocks from the media.
In this section, I will discuss the use of code segments for code-paged executa-
bles. All references to code segments will refer to non-global RAM-loaded code
segments.
When a new executable or DLL is loaded, a code segment object is created and
initialized. This is a three-stage process. First the loader calls the kernel to create
the code segment, then it loads the code into memory and fixes it up, and finally it
calls the kernel a second time to complete initialization and indicate that the code
can now be mapped into processes ready for execution.
For a demand-paged code segment, the procedure is similar, except that the
loader does not actually load the code itself. Instead, it provides the necessary
information to the kernel, which performs load and fix-up on demand. The three
stages are still necessary because the loader needs to access the code itself to
generate the code-relocation table. I describe these stages in detail in the next
section.
44 Demand Paging on Symbian
class TCodeSegCreateInfo
    {
    TUint32* iCodeRelocTable;
    TInt iCodeRelocTableSize;
    TUint32* iImportFixupTable;
    TInt iImportFixupTableSize;
    TUint32 iCodeDelta;
    TUint32 iDataDelta;
    TBool iUseCodePaging;
    TUint32 iCompressionType;
    TInt32* iCodePageOffsets;
    TInt iCodeStartInFile;
    TInt iCodeLengthInFile;
    SBlockMapInfoBase iCodeBlockMapCommon;
    TBlockMapEntryBase* iCodeBlockMapEntries;
    TInt iCodeBlockMapEntriesSize;
    RFileClamp iFileClamp;
    };
The location of the executable file on the media is given by a user-side blockmap,
which is made up of iCodeBlockMapCommon, iCodeBlockMapEntries and
iCodeBlockMapEntriesSize. The position of the text section in the execut-
able file is given by iCodeStartInFile and iCodeLengthInFile.
The loader passes iFileClamp to stop the file being deleted while it is being
used for paging, and sets the flag iUseCodePaging to tell the kernel to page
this code segment.
When the kernel creates a non-paged code segment, its main task is to allocate
RAM for the code and map it into the fileserver process (in which the loader
thread runs). This is done using conventional means – in the moving memory
model, the kernel commits memory to the global code chunk, and in the multiple
memory model, it allocates physical RAM pages then maps them into the fileserv-
er process’s code chunk.
For a demand-paged code segment, the kernel must allocate not physical RAM
but address space in the appropriate chunk. This is done using a new commit
type ‘ECommitVirtual.’ This marks part of the chunk’s address space as used,
so that nothing else may be committed there.
For non-paged code segments, the loader would load any static data directly
following the code in memory. This is not possible with code paging, because the
kernel loads code on demand but the loader loads the data. So the kernel allo-
cates memory, starting at the page after the end of the code, ready for the loader
to load the data into.
As well as committing address space for the code, the kernel must initialize
its internal data structures and read all the relevant parts of TCodeSegCre-
ateInfo. It copies the code page offsets table to the kernel heap, and creates
a kernel blockmap from the user-side version passed. Finally, the code segment
is marked as paged – from now on any read access will cause code to be loaded
from media and decompressed, although no relocation or fix up will be performed
yet.
For a paged code segment, the loader does not need to load the code, because
the kernel handles this later, when a fault is taken. This means that the kernel
must do the relocations and fix-ups, so the loader must now generate the data
the kernel will need to perform these operations.
The loader does load static data now, to the address allocated by the kernel,
and then loads dependencies in the usual way. The loader then calls back to the
kernel, passing the relocation and fix-up tables in the TCodeSegCreateInfo
structure.
For a non-paged code segment, all that the kernel has to do at this stage is the
necessary cache maintenance to ensure that the instruction cache is up-to-date
with the contents of RAM. It also unmaps the code segment from the fileserver
process.
For a paged code segment, the kernel reads the relocation and import tables and
stores them on the kernel heap, as it does the initial static data and the export
table (which is part of the text section). It then frees the memory allocated for the loader
to write the static data into. Any page that has already been loaded (because the
loader accessed it when compiling the relocation table) is fixed up. Cache main-
tenance happens as before, and the code segment is unmapped from the file
server.
class DDemandPagingLock : public DBase
    {
public:
    // Reserve memory so that this object can be
    // used for locking up to aSize bytes.
    IMPORT_C TInt Alloc(TInt aSize);
    // Perform Unlock(), then free the memory reserved by Alloc().
    IMPORT_C void Free();
    // Ensure all pages in the given region are present and lock
    // them so that they will not be paged out. If the region
    // contained no demand paged memory, then no action is performed.
    // This function may not be called again until the previous
    // memory has been unlocked.
    IMPORT_C TBool Lock(DThread* aThread, TLinAddr aStart, TInt aSize);
    // Unlock any memory region which was previously locked with Lock().
    inline void Unlock();
    IMPORT_C DDemandPagingLock();
    inline ~DDemandPagingLock() { Free(); }
    };
In many of the situations in which we wish to lock memory, the operation must
not fail due to out-of-memory conditions. Because of this, the DDemandPagingLock
object provides the Alloc() method, which reserves memory for later use.
Any code that needs to lock pages should create a DDemandPagingLock object
and allocate memory during its initialization phase. It can later lock this memory
without risk of failing.
To avoid wasting reserved memory when it is not being used, the kernel does the
reservation by increasing the minimum size of the live list
(iMinimumPageCount) by the number of reserved pages
(iReservePageCount). The kernel calls DemandPaging::ReserveAlloc() to do this. After
this, because the live list is now larger than it would otherwise have been, more
demand-paged and file-cache content can reside in RAM, so the memory re-
served for locking is being put to good use until it is needed.
Locking Memory
To lock memory ready for safe access, kernel-side code calls
DemandPaging::LockRegion(), specifying a region of virtual memory in
a given thread's process. This method first checks if the region can contain
demand-paged memory, and immediately returns false if it can’t. This provides
very fast execution for typical use cases. (The only memory that can be demand
paged is code and constant data in executable images, and it is very unlikely that
an application would ask a device driver to operate on this sort of data.)
If the memory region to be locked does reside in a pageable area – that is, it is a
ROM or code chunk – then the method goes on to repeatedly call
LockPage() for each page in the region.
LockPage() first pages in the memory (if it is not already present), then
examines the SPageInfo for the page to determine its
type. If the page is pageable and on the live list, then it is removed from the live
list and its state is changed to EStatePagedLocked with a lock count of one. If
the page was already locked, then the lock count is simply incremented.
Because locked pages are not on the live list, they will not be selected for paging
out.
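The lock-count behaviour can be modelled with a few lines of illustrative C++. The types below are stand-ins invented for the sketch, not the kernel's SPageInfo:

```cpp
#include <cassert>

// Model of the pinning behaviour described above: locking takes a live page
// off the live list and counts locks; repeated locks only bump the count.
// (The real kernel keeps the count in iLink.iPrev of SPageInfo.)
enum class State { PagedYoung, PagedLocked };

struct PageInfo {
    State state = State::PagedYoung;
    int lockCount = 0;
    bool onLiveList = true;
};

void LockPage(PageInfo& p) {
    if (p.state == State::PagedLocked) {
        ++p.lockCount;               // already pinned: just count the lock
        return;
    }
    p.onLiveList = false;            // off the live list: never paged out
    p.state = State::PagedLocked;
    p.lockCount = 1;
}

void UnlockPage(PageInfo& p) {
    if (--p.lockCount == 0) {        // last lock released
        p.state = State::PagedYoung; // back on the live list
        p.onLiveList = true;
    }
}
```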
const TInt KMaxFragmentSize = 0x8000;

// one time initialisation...
DDemandPagingLock* iPagingLock = new DDemandPagingLock;
iPagingLock->Alloc(KMaxFragmentSize);

// example function which locks memory in fragments...
void DoOperation(TUint8* buffer, TInt size)
    {
    while(size)
        {
        TUint8* fragmentStart = buffer;
        do
            {
            TInt lockSize = Min(size, KMaxFragmentSize);
            if(iPagingLock->Lock(buffer, lockSize))
                break; // lock used so now process fragment
            // expand fragment...
            buffer += lockSize;
            size -= lockSize;
            } while(size);
        // process fragment...
        TInt fragmentSize = buffer - fragmentStart;
        DoOperationOnFragment(fragmentStart, fragmentSize);
        iPagingLock->Unlock();
        }
    }
Note that although the previous code makes use of the DDemandPagingLock
API, as it should, the actual work is done by the DemandPaging object.
These new APIs are present in non-demand-paged systems, but they are never
called.
This API creates a chunk that is local (that is, private) to the process creating it.
The size of the reserved virtual address space is given by aMaxSize. Memory
may be committed at creation by setting appropriate values for
aInitialBottom and aInitialTop.
RChunk::Commit(TInt aOffset, TInt aSize);
Commit (allocate) aSize bytes of memory at position aOffset within the chunk.
RChunk::Decommit(TInt aOffset, TInt aSize);
Decommit (free) aSize bytes of memory at position aOffset within the chunk.
RChunk::Unlock(TInt aOffset, TInt aSize);
This method places aSize bytes of memory at position aOffset within the
chunk in the unlocked state. While the memory is in this state, the system is
free to reclaim it for other purposes, so its owner must no longer access it. Both
aSize and aOffset should be a multiple of the MMU page size.
RChunk::Lock(TInt aOffset, TInt aSize);
This method reverses the operation of Unlock() for aSize bytes of memory at
position aOffset. It returns the RAM to the fully committed state, which means
its owner can access it once more, and its contents are exactly as they were
when Unlock() was originally called. Both aSize and aOffset should be a
multiple of the MMU page size.
However, if in the interim the system has reclaimed any RAM in the region, then
the method fails with KErrNotFound and the region remains in the unlocked
state.
The existing Commit() function has been modified to operate on a region of the
chunk that has been unlocked. It performs the lock operation on all pages that
have not been reclaimed by the system and allocates new pages where the pre-
vious ones have been reclaimed. After this operation, the contents of the region
are undefined.
This change was made to optimize situations in which a virtual address region
is to be reused for new cache contents. It avoids the extra overheads involved
in separate Decommit() and Commit() operations, which could be significant
because any new RAM allocated to a chunk must be wiped clean first, to avoid
security issues with data leakage between processes.
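The unlock/lock/commit life cycle can be modelled as a small state machine. This is an illustrative model, not the real RChunk implementation; the Chunk type and PState enum are invented for the sketch:

```cpp
#include <cassert>
#include <vector>

// Model of the unlocked-page state machine described above: Unlock() lets
// the system reclaim pages, Lock() restores them unless something was
// reclaimed (KErrNotFound), and Commit() over an unlocked region re-locks
// survivors and replaces reclaimed pages with fresh RAM.
const int KErrNone = 0;
const int KErrNotFound = -1;

enum class PState { Free, Committed, Unlocked, Reclaimed };

struct Chunk {
    std::vector<PState> p;  // one entry per page in the chunk

    void Unlock(int first, int n) {
        for (int i = first; i < first + n; ++i) p[i] = PState::Unlocked;
    }
    // The system may reclaim any unlocked page at any time.
    void SystemReclaims(int i) { p[i] = PState::Reclaimed; }

    int Lock(int first, int n) {
        for (int i = first; i < first + n; ++i)
            if (p[i] == PState::Reclaimed) return KErrNotFound;
        for (int i = first; i < first + n; ++i) p[i] = PState::Committed;
        return KErrNone;
    }
    // Commit over an unlocked region: keep surviving pages, allocate fresh
    // RAM for reclaimed ones; contents are undefined afterwards.
    int Commit(int first, int n) {
        for (int i = first; i < first + n; ++i) p[i] = PState::Committed;
        return KErrNone;
    }
};
```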
RChunk::Decommit(TInt aOffset, TInt aSize);
The existing Decommit() function has been modified so that it also returns any
unlocked memory in the region back to the system.
At the end of this operation, the pages used for this cache are placed at the front
of the live list because they are now the youngest pages in the system. As de-
mand-paging or other file-system caching operations occur, the cache pages will
grow older and move down the live list.
At the end of this operation, the pages used for this cache are back on the live
list as the youngest pages in the system. The cached data for the file will stay
in RAM for as long as it is accessed sufficiently often to remain in the live list. It
will only be lost from the live list if there is no free memory in the RAM allocator
and all the other memory used for paging and caching has been more recently
accessed than it has (that is, when these cache pages have become the oldest
pages in the system).
File-system caching uses a 'backing off' algorithm: when cache contents
are lost, the size of the memory being used for caching is reduced, which lowers
the likelihood of future cache misses.
Where performance is more important than ROM size, it may be better to use
byte-pair compression instead of deflate compression, even outside the context
of demand paging.
Support for byte-pair compression has been part of the loader since Symbian
OS v9.2. It is only possible to demand-page data that has either been byte-pair
compressed or is uncompressed.
Algorithm
The input stream is compressed one 4 KB block at a time. This block size was
chosen to match the MMU page size, enabling the paging subsystem to decom-
press each page as needed. Fortuitously, this seems to be the optimum block
size for compression efficiency. In some cases, the block size may be less than 4
KB – for example, when the last block in a compression stream is ‘short.’
1. Find the least frequently occurring byte, X. This will be used as the escape
byte.
2. Replace all occurrences of X with the pair of bytes {X,X}.
3. Find the new least frequently occurring byte, B.
4. Find the most frequently occurring pair of consecutive bytes, {P,Q}.
5. Check for terminating condition.
When calculating frequencies in steps three and four, exclude those that involve
any bytes from the escaped form {X,?}.
When calculating frequencies in step four, do not count overlapping pairs that
occur in repeated byte sequences. So a sequence of five identical bytes
(P,P,P,P,P) contains two pairs and a singleton ({P,P},{P,P},P) – not four pairs.
The terminating condition in step five checks whether the substitution performed
in steps six and seven would result in any further compression. This depends on
the storage format for the compressed data (discussed in the next section) and is
calculated as follows:
1. Let f(B) be the frequency of byte B, and f(P,Q) the frequency of the pair
{P,Q}.
2. If the number of byte-pair tokens created so far is <32, then terminate if
f(P,Q) - f(B) <= 3.
3. If the number of byte-pair tokens created so far is >=32, then terminate if
f(P,Q) - f(B) <= 2.
Storage Format
The compressed data for each block is stored in one of three forms, depending
on the number of tokens created during compression:
0 Tokens
When the compression hasn’t performed any token substitutions, the ‘com-
pressed’ data for the block is stored in a format as shown in Table 1.
Table 1
Values Meaning
0x00 One byte token count value (zero)
... Bytes of uncompressed data
Under the Hood: The Implementation of Demand Paging 57
1 to 31 tokens
When the compression step produced between one and 31 token substitutions,
the compressed data for the block is stored in a format as shown in Table 2.
Table 2
Values Meaning
N One byte token count value N
X One byte with the value of the escape character X
{B0,P0,Q0}...
N times 3 bytes of token/pair values {B,P,Q}
{BN-1,PN-1,QN-1}
... Bytes of compressed data
32 to 255 tokens
When the compression step produced between 32 and 255 token substitutions,
the format is modified to store the token values in a bit vector, as shown in
Table 3 on the next page.
Table 3
Values Meaning
N One byte token count value N
X One byte with the value of the escape character X
When a breakpoint is removed, the kernel unlocks the page and demand pages it
as usual.
The code to create shadow pages now makes use of the new
M::LockRegion() and M::UnlockRegion() methods to force demand-
paged ROM to be loaded into memory before the page is shadowed.
Media drivers that support paging should register themselves with the paging
system by calling Kern::InstallPagingDevice() during system boot. If
demand paging is to operate for all user-side code, media drivers must be cre-
ated and installed via kernel extensions, rather than waiting until they are loaded
by EStart. To support this, Symbian has added two new APIs to enable device
driver creation from kernel-side code – Kern::InstallLogicalDevice() and
Kern::InstallPhysicalDevice().
// kernel.h
class DPagingDevice : public DBase
    {
public:
    enum TType  // The type of device this represents.
        {
        ERom = 1<<0,   /**< Paged ROM device type. */
        ECode = 1<<1   /**< Code paging device type. */
        };
    virtual TInt Read(TThreadMessage* aReq, TLinAddr aBuffer,
                      TUint aOffset, TUint aSize, TInt aDrvNumber) = 0;
public:
    TUint32 iType;
    TUint32 iDrivesSupported;
    const char* iName;
    TInt iReadUnitShift;
    TInt iDeviceId;
    };
The read method is called by the paging system to read data from the media rep-
resented by this device. This method should store aSize bytes of data at offset
aOffset in the buffer aBuffer.
iDrivesSupported tells the system which local drives are supported for code
paging. It is a bitmask containing one bit set for each local drive supported, where
the bit set is 1 << the local drive number. If this device does not support code
paging, iDrivesSupported should be zero.
iReadUnitShift is the Log2 of the read unit size. A read unit is the number of
bytes that the device can optimally read from the underlying media. For example,
for small block NAND, a read unit would be equal to the page size, 512 bytes,
and iReadUnitShift would be set to nine.
iDeviceId is the value, chosen by the kernel, that the device should use to
identify itself.
The kernel can’t rely on allocating a buffer every time a page fault occurs, be-
cause of performance issues and the possibility that there might not be enough
free RAM. Accordingly, these resources are packaged up into paging request
objects, and a fixed number of them are allocated when the system is initialized.
TLinAddr iLoadAddr;
TPte* iLoadPte;
};
TUint iPagingRequestCount;
DPagingRequest* iPagingRequests[KMaxPagingRequests];
SDblQue iFreeRequestPool;
TInt iNextPagingRequestCount;
...
};
Overview
When a thread takes a paging fault, it acquires a paging request object and
maps a new page of RAM at the temporary address (iLoadAddr). It then calls
the media driver to load the data into the request object’s buffer (iBuffer) and
copies or decompresses the data from it into the page at iLoadAddr. Finally, the
request object is released and the new page mapped in to the correct location in
the memory map.
Concurrency
Since request objects are only created at boot, the number of them limits the
number of paging faults that can be processed concurrently. Because there can
be many threads taking paging faults at the same time, some care is needed to
co-ordinate access to request objects. One problem that must be avoided is prior-
ity inversion, in which a high-priority thread waits to get its paging fault processed
because low-priority threads are holding all the request objects.
When a thread tries to acquire a request object, the kernel checks the free pool
first. If it is non-empty, an unused request object is taken from the pool. If not, the
kernel selects a request object at random. The kernel increments the request
object’s usage count and makes the thread wait on its mutex.
When a thread releases a request object, the kernel signals the object’s mutex
and decrements its usage count. Once the count reaches zero, the kernel places
the object back in the free pool. Both these operations occur with the system lock
held, to synchronize access to the relevant data structures. They are implement-
ed in DemandPaging::AcquireRequestObject() and
ReleaseRequestObject() in mmubase.cpp.
This scheme distributes free objects while they are available, and then makes
threads queue for a randomly selected object. Use of a mutex for this purpose
provides priority inheritance, avoiding the issue of priority inversion. Random
selection mitigates the possibility of pathological behavior.
A media driver may also need to queue multiple incoming requests, and so we
added a TThreadMessage member (iMessage) to DPagingRequest, allowing
these request objects to be placed on a message queue and used in inter-thread
communications.
Initialization
During initialization, the kernel creates a fixed number of paging request objects
per paging device in the system. This is set by the constant
KPagingRequestsPerDevice, which is (at the time of writing) two. This number
was chosen because media drivers cannot issue more than one request to hardware at a time,
and if we have two objects, then one thread can be waiting for data to be read
in and another thread can be decompressing data. Adding more request objects
would only result in more threads waiting for data to be read.
the purpose, basing the offset on the ID. It calls down to the memory-model-spe-
cific code to allocate the temporary virtual address (using
AllocLoadAddress()), passing the ID. It creates the mutex.
Only open, read-only, non-empty files may be clamped. For each clamp, a
handle is generated (encapsulated by the RFileClamp class) and returned to
the user. Files may be clamped multiple times. A new handle is generated in each
case.
To close (remove) a clamp, the specific handle must be passed back to the file
server (invalid handles lead to the error code KErrNotFound). It is only when all
the clamps for a file have been closed that the file can be considered
unclamped.
RFs
• Replace()
• Delete()
• NotifyDismount() (with arguments EFsDismountNotifyClients and
EFsDismountForceDismount)
• AllowDismount()
• DismountFileSystem()
• SwapFileSystem()
RFile
• Replace()
• SetSize()
• Open() (with argument EFileWrite)
The composite and ROM file systems provide ‘pseudo’ support for file clamping
– they always return a zero-value identifier for a file when clamping is requested.
Because files cannot be modified on these file systems, nor can the mount be
dismounted, clamping them is pointless.
Other file systems will show the default behavior. Requests for file clamping will
return the error code KErrNotSupported.
// e32ldr.h
class RFileClamp
    {
public:
    inline RFileClamp()
        {
        iCookie[0] = 0;
        iCookie[1] = 0;
        }
    IMPORT_C TInt Clamp(RFile& aFile);
    IMPORT_C TInt Close(RFs& aFs);
public:
    TInt64 iCookie[2];
    };
The class keeps two cookies. The first, iCookie[0], is generated by the file
system to which the file belongs and represents a unique identifier for the file.
The second, iCookie[1], is made up of two 32-bit values: the drive number
and a count value that is incremented by the file-server mount instance on cre-
ation of each new clamp.
EXPORT_C TInt RFileClamp::Clamp(RFile& aFile)
Clamps the file aFile, storing the generated handle in this RFileClamp object.
EXPORT_C TInt RFileClamp::Close(RFs& aFs)
Unclamps the file that was clamped with this RFileClamp object. It is safe to call
this function with a handle that was not successfully opened.
RArray<RFileClamp> iClampIdentifiers;
TInt32 iClampCount;
The parameter aName is the name of the file and aUniqueId is the value
returned by the file system.
MFileAccessor* fileAccessor = NULL;
GetInterface(CMountCB::EFileAccessor, (void*&) fileAccessor, NULL);
iBody = new(ELeave) CMountBody(this, fileAccessor);
The GetInterface() call to the file system determines whether it provides
support for the MFileAccessor interface. If not, the CMountBody is passed a NULL
parameter for fileAccessor, and requests for clamping functionality receive
the error code KErrNotSupported. Otherwise, the CMountBody is passed a
pointer to the file-system mount object – the object on which the
GetFileUniqueId() method is invoked.
TBool iDismountRequired;
TInt (*iCallBackFunc)(TAny*);
TDismountParams* iCallBackParams;
Each file on the media will consist of a number of groups of contiguous blocks.
Each group is represented by its starting address, the media block size, the ad-
dress of the first block on the media and the number of consecutive blocks in the
group. If the file is not fragmented, an RFile::BlockMap() call will retrieve the
blockmap for the whole file (or a specified part of it). If the file is fragmented, sub-
sequent calls to RFile::BlockMap() will return the blockmap information for
the next group of contiguous blocks in the file. RFile::BlockMap() will return
KErrCompletion when it has reached the end of the requested part of the file.
Implementation
Symbian has added another new class. This one represents a group of contigu-
ous blocks, and is called TBlockMapEntry. It contains the number of blocks in
the group and the number of the first block in that group:
class TBlockMapEntry
    {
public:
    TBlockMapEntry();
    void SetNumberOfBlocks( TUint aNumberOfBlocks );
    void SetStartBlock( TUint aStartBlock );
public:
    TUint iNumberOfBlocks;  // number of contiguous blocks in map
    TUint iStartBlock;      // number for first block in the map
    };
Parameters:
SBlockMapInfo& aInfo:
iBlockGranularity is the size of the block for a given device in bytes. This
field is only filled on the first call to RFile::BlockMap().
iBlockStartOffset is the offset into the first block of a file containing the start
of the file, or the start of the section of the file that has been requested. This field
is only filled on the first call to RFile::BlockMap().
iStartBlockAddress is the address of the first block of the file. This field is
only filled on the first call to RFile::BlockMap().
If you don’t need the blockmap for the whole file, then you can specify a start and
end position for a section of the file (aStartPos, aEndPos). This is useful for
demand paging, where only the executable section of a file will be needed. Both
of these parameters specify offsets from the start of the file in bytes, and if they
are not passed, the whole file is assumed.
aBlockMapUsage is the reason for the blockmap API use, which is set to
EBlockMapUsagePaging by default.
• Until the end of the file or the file section is reached, KErrNone will be
returned.
• If the end of the file is reached, KErrCompletion will be returned. In this
case the length of the iMap may be smaller than its maximum.
• An error code.
TBlockMapEntry* myBlockMapEntry =
(TBlockMapEntry*)map[c].iMap.Ptr();
// Then read the contents of iMap using
// the myBlockMapEntry pointer
}
map.Close();
if (KErrCompletion!=pos) … // deal with error
...
This API will not be available to any process other than the file server. It is used in
the loader (for demand paging), and in the loopback proxy extension. Appropriate
security vetting by SID will be required on the file server.
Finally, the FAT file system itself would return the blockmap, using the mount
object to walk the FAT table to obtain the requisite information.
When loading code into RAM, the loader now checks whether the code should
be paged. If so, it does not load, relocate or fix up the code, but instead collates
extra information that it passes to the kernel, allowing the latter to perform these
steps when each page is loaded. The loader also prevents the image file from be-
ing deleted or modified while it is being paged.
Implementation
The method E32Image::ShouldBeCodePaged() has been added to deter-
mine whether an image should be code paged. This decision is based on several
factors:
The file clamp API is used to prevent the image file from being deleted.
E32Image::BuildFixupTable(), E32Image::AllocateRelocationData
() and associated methods have been added to build the import fix-up table and
relocation tables to pass to the kernel. The functions that perform the relocation
and import fix-up are still called for code-paged images, however rather than do-
ing the fix up themselves, they populate the appropriate tables.
Summary
This chapter has covered a lot of ground! You should have the details necessary
to understand the implementation, and you can find out more by consulting
documentation and source available on developer.symbian.org.
5
Enabling Demand Paging on a New Platform
In this section, we’ll discuss the main factors that should influence the form of
demand paging you choose to implement on your system. For example, demand
paging can have a significant impact on device driver code; we’ll discuss the rea-
sons and some ways to manage the changes necessary. We’ll also discuss some
of the broader impacts of demand paging on the underlying system.
Here we examine the different paging scenarios and describe factors to consider
when choosing which type to implement.
You should also note that device driver and system server APIs may need chang-
ing to support the implementation changes you’ll have to make to enable demand
paging.
in which data compression and relocation information are based around
page-sized data blocks. With code paging, there is much scope for deadlock and poor
performance, because:
• File systems must provide a control path that doesn’t cause paging faults.
• File system meta-data caching is required for acceptable performance.
• Third-party plug-ins must not take paging faults or use any service that
might do so.
Data paging has the potential to offer the greatest RAM saving, but can only work
on devices with a suitable backing store. Power management would be a signifi-
cant problem because the backing store is likely to be heavily used. You would
need to implement a smart memory-caching system to mitigate this.
• The Symbian kernel services page faults at the priority of the NAND media
driver thread (KNandThreadPriority = 24). This means that any thread of
higher priority will effectively have its priority reduced to this level while its
page fault is serviced.
• In a system in which two or more device drivers share the same thread,
the page fault taken by one driver can reduce the performance of the other
drivers using that thread.
The only safe rule to apply is that demand-paged memory must not be accessed
while holding any kernel-side mutex.
In the following sections, I’ll look at these problem areas in more detail and dis-
cuss the actions you can take to mitigate them.
5.2.1 Page Faults and Device Drivers: Where Can Page Faults Occur?
Device drivers, like all kernel-side components, are wired in memory; once
loaded, their code and read-only data sections will never be paged out. The
ROMBUILD tool ensures this by placing drivers in the unpaged part of an XIP
ROM. If any device driver is not resident in ROM, the loader will always copy it (in
its entirety) into RAM, and wire it there. Correctly written kernel-side code should
only access user memory using special functions provided by the kernel, which I
discuss next.
// kernel.h
Kern::ThreadDesRead()
Kern::ThreadGetDesLength()
Kern::ThreadGetDesMaxLength()
Kern::ThreadGetDesInfo()
Kern::ThreadRawRead()
Note that, at the time of writing, Symbian only supports demand paging of
read-only data, so writes to pageable memory will not cause paging faults (they
cause the normal permissions-check exception instead). The following write
functions can therefore be considered safe for demand paging:
// kernel.h
Kern::ThreadDesWrite(DThread*, TAny*, const TDesC8&, TInt, TInt, DThread*)
Kern::ThreadRawWrite(DThread*, TAny*, const TAny*, TInt, DThread*)
Kern::ThreadDesWrite(DThread*, TAny*, const TDesC8&, TInt, DThread*)
// klib.h
kumemget(TAny* aKernAddr, const TAny* aAddr, TInt aLength)
kumemget32(TAny* aKernAddr, const TAny* aAddr, TInt aLength)
umemget(TAny* aKernAddr, const TAny* aUserAddr, TInt aLength)
// kernel.h
Kern::KUDesGet(TDes8& aDest, const TDesC8& aSrc)
Kern::KUDesInfo(const TDesC8& aSrc, TInt& aLength, TInt&
aMaxLength)
Kern::KUDesSetLength(TDes8& aDes, TInt aLength)
Similarly, you can also assume the following write functions to be demand-paging
safe:
// klib.h
kumemput(TAny* aAddr, const TAny* aKernAddr, TInt aLength)
kumemput32(TAny* aAddr, const TAny* aKernAddr, TInt aLength)
kumemset(TAny* aAddr, const TUint8 aValue, TInt aLength)
umemput(TAny* aUserAddr, const TAny* aKernAddr, TInt aLength)
umemput32(TAny* aUserAddr, const TAny* aKernAddr, TInt aLength)
umemset(TAny* aUserAddr, const TUint8 aValue, TInt aLength)
// kernel.h
Kern::KUDesPut(TDes8& aDest, const TDesC8& aSrc)
// kernel.h
Kern::KUSafeRead(const TAny* aSrc, TAny* aDest, TInt aSize)
Kern::KUSafeWrite(TAny* aDest, const TAny* aSrc, TInt aSize)
Kern::KUSafeInc(TInt& aValue)
Kern::KUSafeDec(TInt& aValue)
Kern::SafeRead(const TAny* aSrc, TAny* aDest, TInt aSize)
Kern::SafeWrite(TAny* aDest, const TAny* aSrc, TInt aSize)
// kernel.h
Kern::CodeSegGetMemoryInfo(DCodeSeg&, TModuleMemoryInfo&, DProcess*)
// platform.h
Epoc::RomProcessInfo(TProcessCreateInfo&, const TRomImageHeader&)
// platform.h
DebugSupport::CloseCodeModifier()
DebugSupport::ModifyCode(DThread*, TLinAddr, TInt, TUint, TUint)
DebugSupport::RestoreCode(DThread*, TLinAddr)
You should assume that all other parts of the ROM could be demand paged.
If a device driver writer could guarantee the ordering of all the DMutex objects
used by the kernel in every operation in which the driver may take a page fault,
then the mutex order would not be violated. In practice, it is very difficult for the
device driver writer to guarantee this in all the possible situations in which mutex
nesting could take place. Because of this, Symbian has decided that the only safe
rule to apply is that demand-paged memory must not be accessed while holding
any kernel-side mutex.
The following lists the most likely code in the base port or debugging modules to
hold system mutexes.
• DPowerController::DisableWakeupEvents()
• DPowerController::EnableWakeupEvents()
• DPowerController::PowerDown(TTimeK aWakeupTime)
• DPowerHandler::PowerUp()
• DPowerHandler::Wait()
• DPowerHandler::PowerDown(TPowerState aState)
• DKernelEventHandler implementations
The kernel calls these in many different places while it holds internal kernel
mutexes.
In the next section, I discuss how to avoid this mutex issue, and other issues
arising from demand paging.
Since EKA2, there has been support for multiple DFC queues and threads.
However, it is common for device drivers that execute in a context other than
their client’s thread to share the kernel’s DFC thread zero, which is the thread
associated with DFC queue zero. This approximates to the behavior of drivers on
EKA1, where only a single DFC queue is supported. Any driver that uses DFC
thread zero will execute in a shared thread context.
So Symbian now recommends that each driver uses its own thread and DFC
queue. We have already modified all of our own drivers in this way. To assist you
in this change, the kernel (from Symbian OS v9.3 onwards) provides a
new dynamic queue class, TDynamicDfcQue.
class TDynamicDfcQue : public TDfcQue
    {
public:
    TDynamicDfcQue();
    IMPORT_C void Destroy();
private:
    TDfc iKillDfc;
    };
The method returns KErrNone if successful, or one of the standard error codes.
For example, the following code could be added to a physical device driver (PDD)
entry point to create a DFC queue (error handling omitted for brevity):
const TInt KThreadPriority = 27;
_LIT8(KThreadBaseName, "DriverThread");
TDynamicDfcQue* pDfcQ;
TInt r = Kern::DynamicDfcQCreate(pDfcQ, KThreadPriority, KThreadBaseName);
pdd->SetDfcQ(pDfcQ);
Remember to delete the DFC queue from the PDD object’s destructor:
DPddObject::~DPddObject()
    {
    if (iDfcQ)
        iDfcQ->Destroy();
    }
Note that if you have several drivers making use of a common peripheral bus,
then you will need to ensure that the code managing the bus is thread safe. You
will do this by using mutexes to protect state, rather than relying on only one
driver being able to execute at once as you might have done before.
There are several types of device driver architecture. A driver may be dynamically
loaded or boot loaded (if it is a kernel extension). It might have a single channel
or multiple channels, or a PDD and an LDD, or an LDD only. Each of these dif-
ferent architectures needs you to create the dedicated DFC queue at a different
place in your code. I discuss this in detail, with code examples, in Section 5.3.
Following this rule effectively moves the impact of demand paging into the
user-side client, which can then choose to use demand-paged memory (or not),
knowing that other clients won’t affect this choice.
Unfortunately, apart from the methods mentioned above, there is no general tech-
nique you can use when reworking kernel-side code to avoid accessing pageable
memory while holding mutexes. You will need to find specific solutions for your
own particular situation, and may have to re-architect your code.
If it were feasible to have complete knowledge of all the software on the phone
and its interactions, then it might be possible to prove that a certain mutex usage
could never cause deadlock, and so was ‘safe’. This is almost impossible on a
complex phone and, even if safe mutex usage could be proven, this is likely to be
a fragile situation, susceptible to breaking when system code changes. I repeat
that the only safe assumption you can make is that any access to pageable
memory while holding a mutex has the potential to cause system deadlock.
Mutex use
If a device driver accesses paged data from a thread other than that of its client,
it should use one of the Kern::ThreadXxx() APIs listed in Section 5.2.2.1.
These APIs use the system lock, which automatically excludes the use of another
NFastMutex. You should ensure that they are never called while holding a
DMutex.
If a driver reads from its client’s user side memory space while executing in its cli-
ent’s thread context, it must use one of the following published APIs:
// klib.h
umemget()
umemget32()
kumemget()
kumemget32()
// kernel.h
Kern::KUDesGet()
Kern::KUDesInfo()
Kern::InfoCopy()
These APIs have a precondition that excludes their use with a DMutex, because
they cannot be called from a critical section. Again, you must ensure they are not
called while holding an NFastMutex.
Kernel ASSERTs
The functions listed in Section 5.2.3.1 contain asserts, active in UDEB builds,
which cause a system fault if called while (most kinds of) system mutex are held.
This can help you to identify code that needs modification for demand paging.
However, to ease the integration of demand paging, these asserts do permit mu-
texes with an order value of KMutexOrdGeneral0 through
KMutexOrdGeneral7. This should not be taken as indicating that these mutex-
es are safe for use with demand paging.
The assertions are not active unless there is demand-paged memory in the sys-
tem, so will not affect products that do not make use of demand paging.
I’ll also point out those architectures that are most likely to affect the performance
of other device drivers or other clients of those drivers.
Kernel extensions in this category may execute some of their operation in the
context of a kernel thread. Although HAL calls can be used to pass data struc-
tures (often configuration) to drivers in this category, their operation is usually
safe because, typically, these data structures are accessed in the context of the
calling client thread. But you must take care to validate this assumption, especial-
ly when HAL calls are used to reconfigure the driver and/or hardware. When
the HAL call needs to synchronize with driver operation, it will be done in a kernel
context, and paging issues could arise.
Media drivers
Media drivers are channel-based device drivers (PDDs) that are also kernel
extensions. They interface to user-side clients (file systems) via the local media
subsystem (an LDD and a kernel extension) which creates and manages the
channel objects. Typically, the extension part of the media subsystem will perform
early initialization – creating the LDD factory object – but the extension entry
point of media drivers servicing page-in requests will also create and install the
PDD factory object. The channel objects are created on the first access.
The recommended model for media drivers either uses a unique DFC queue and
associated kernel thread, or has the media driver executing wholly in the context
of its client (as the internal RAM media driver does). As we’ve seen, this is be-
cause operations on media can be long running, so sharing the execution thread
with another driver could result in unacceptable impact on the performance of
that driver. The parallelism of file and disk operations would also be impaired.
This means that there is no issue with page faults and shared thread contexts in
media drivers.
One issue arises with media drivers that service page-in requests: if their cli-
ent – a file system running on an associated file-server drive thread – passes
an address in memory that is paged out, and the driver needs to read this in the
context of its unique kernel thread, a deadlock could occur, with the media driver
thread taking a page fault and thus becoming unable to service the ensuing
page-in request. To mitigate this, the media subsystem now ensures that the data
is paged in (if necessary taking a page fault in the context of the file-server drive
thread) before passing it to the media driver thread. Please refer to Section 5.5
for more detail on media driver migration, and to Section 4.3 for more on demand
page locking.
Although the mechanism exists to allow unique channel contexts, it has been
common practice to use the shared DFC queue thread zero on drivers derived
from DLogicalChannel. Much of the following discussion will consider this type
of driver.
I shall next propose some simple solutions for migrating ‘problem’ drivers to a
demand-paged system. The overriding principle, which we have seen, is that
drivers accessing data structures in their client’s user-side address space do so
from either the client’s thread context or from the context of the driver’s unique
kernel thread – and if the driver can have more than one client, from a driver
kernel-thread context that is particular to the client.
        }
    }
    return r;
    }
One variation of this form has a global DFC queue and simply invokes
Kern::DfcQInit(..) from the entry point – this is the construct used by me-
dia drivers:
TDfcQue MyDriverDfcQ;
const TInt KMyDriverDfcQuePriority = XX;
_LIT(KMyDriveThreadName,MY_DRIVER_THREAD_NAME);
DECLARE_STANDARD_EXTENSION()
{
TInt r=KErrNoMemory;
DMyDriver* pH=new DMyDriver;
if (pH)
{
r = Kern::DfcQInit(&MyDriverDfcQ, KMyDriverDfcQuePriority,
&KMyDriveThreadName);
if(KErrNone==r)
{
// second phase construction of DMyDriver
}
}
return r;
}
1. If the driver enforces a single-channel policy then the DFC queue should
be associated with the PDD factory object (DPhysicalDevice-derived).
The DFC queue should be created as a result of loading the driver and
destroyed as a result of unloading the driver – so to enforce a single chan-
nel policy, the LDD factory DLogicalDevice-derived object’s Create()
function will typically include something like the following:
DMyLDDFactory::Create(DLogicalChannelBase*& aChannel)
    {
    if (iOpenChannels != 0)  // iOpenChannels is a member of DLogicalDevice
        return KErrInUse;
    ...  // now create the Logical Channel
    }
2. If the driver does not support more than one hardware unit, then the DFC
queue should again be associated with the PDD factory object
(DPhysicalDevice-derived) that is created when the driver is loaded and
destroyed when it is unloaded. The constructor of the LDD factory object of
a driver that does not support more than one unit will not set bit one of
DLogicalDevice::iParseMask (KDeviceAllowUnit).
3. If a driver supports more than one hardware unit, it might be that the units
are implemented by the same hardware block with a shared control inter-
face. In this case, it might be possible to bring the device to an inconsis-
tent state if the shared control interface is accessed from multiple threads.
Rather than implementing complex synchronization mechanisms, it may
be easier to have all channel operations of the shared interface executing
from the same kernel thread context. Again, the DFC queue is associated
with the PDD factory object, and the queue and kernel thread are created
when the driver is loaded, and destroyed when the driver is unloaded.
4. If a driver supports multiple hardware units that are independent from
each other and independently controlled, then the ownership of the DFC
queue should be given to the PDD object – the physical channel. The DFC
queue (and its associated thread) should be created whenever a channel
is opened, and destroyed when the channel is closed.
class TDynamicDfcQue : public TDfcQue
    {
public:
    TDynamicDfcQue();
    IMPORT_C void Destroy();
private:
    TDfc iKillDfc;
    };
The destruction of the DFC queue used by a device driver should be triggered by
the destruction of the object it is associated with. So, to destroy the DFC queue
and terminate the thread associated with it, the Destroy() method must be called.
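As a sketch (assuming, as in the later examples, that the owning object keeps the
queue in a TDynamicDfcQue* member named iDfcQ, and that DMyObject is a
hypothetical owning object):

```cpp
DMyObject::~DMyObject()
    {
    // cancel any DFCs still queued on this queue first
    if (iDfcQ)
        iDfcQ->Destroy(); // terminates the DFC thread and cleans up the queue
    }
```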
The DLogicalChannel class holds a pointer to the DFC queue used by each
logical channel object derived from it. This pointer is typically set up during the
second-phase construction of the LDD in the DoCreate() function, which
means the DFC queue must have been created by the time the LDD’s
DoCreate() is invoked. This is guaranteed when the DFC queue is owned by the LDD
or PDD factory objects, because these are created when first loading the logical
or physical device. It is also guaranteed in the case when the PDD object owns
the DFC queue, as the order of channel construction is as follows:
1. LDD constructor
2. PDD constructor
3. PDD DoCreate()
4. LDD DoCreate()
However, when the LDD owns and creates the DFC queue, it is down to you, the
developer, to guarantee that the correct pointer to the DFC queue is stored in
DLogicalChannel::iDfcQ as part of DoCreate().
Note that the LDD has access to the LDD factory object, the PDD factory object
and the PDD through the iDevice, iPhysicalDevice and iPdd pointers in
the DLogicalChannelBase base class.
In the next four sections, I’ll give code examples for these different situations.
TDfcQue* DMyLogicalDevice::DfcQ()
    {
    return iDfcQ;
    }

DECLARE_STANDARD_LDD()
    {
    DMyLogicalDevice* pD=new DMyLogicalDevice;
    if(pD)
        {
        TDynamicDfcQue* q;
        TInt r = Kern::DynamicDfcQCreate(q, KMyDriverThreadPriority,
                                         KMyDriverThread);
        if(KErrNone==r)
            {
            pD->Construct(q);
            return pD;
            }
        pD->AsyncClose();
        }
    return NULL;
    }

// Logical Channel
TInt DMyDriverLogicalChannel::DoCreate(TInt aUnit,
                                       const TDesC8* /*anInfo*/,
                                       const TVersion& aVer)
    {
    ...
    SetDfcQ(iDevice->DfcQ());
    ...
    }
In the previous code extract, we create a dynamic DFC queue on the kernel heap
and arrange for the logical device object to have a pointer to it.
When the logical device is loaded, the DLL entry point is invoked with
KModuleEntryReasonProcessAttach, which invokes the LDD-specific
initialization DECLARE_STANDARD_LDD(). The LDD-specific entry point creates
the LDD factory object and, if successful, creates the dynamic DFC queue (and
associated thread). The logical channel uses the pointer to the logical device to
obtain the DFC queue.
A possible variation to this scheme creates the dynamic DFC queue in the
DLogicalDevice-derived Install() function. This simplifies the entry point:
_LIT(KLddName,"MyDriver");
TInt DMyLogicalDevice::Install()
// Install the device driver.
{
TDynamicDfcQ* q;
TInt r = Kern::DynamicDfcQCreate(q, KMyDriverThreadPriority,
KMyDriverThread);
if(KErrNone==r)
{
Construct(q);
r=SetName(&KLddName);
}
return r;
}
DECLARE_STANDARD_LDD()
{
return new DMyLogicalDevice;
}
class DMyPhysicalDevice : public DPhysicalDevice
    {
public:
    DMyPhysicalDevice();
    ~DMyPhysicalDevice();
    void Construct(TDynamicDfcQue* aDfcQ);
    virtual TDfcQue* DfcQ();
    ...
public:
    ...
    TDynamicDfcQue* iDfcQ;
    };
DMyPhysicalDevice::DMyPhysicalDevice()
// Constructor
{
// sets iVersion and iUnitMask (if required)
}
DMyPhysicalDevice::~DMyPhysicalDevice()
// Destructor
{
...
// cancel any other DFCs owned by this device
if (iDfcQ)
iDfcQ->Destroy();
}
TDfcQue* DMyPhysicalDevice::DfcQ()
{
return iDfcQ;
}
DECLARE_STANDARD_PDD()
{
DMyPhysicalDevice* pD=new DMyPhysicalDevice;
if(pD)
{
TDynamicDfcQ* q;
TInt r = Kern::DynamicDfcQCreate(q, KMyDriverThreadPriority,
KMyDriverThread);
if(KErrNone==r)
{
pD->Construct(q);
return pD;
}
pD->AsyncClose();
}
return NULL;
}
// Logical Channel
DMyDriverLogicalChannel::DoCreate(TInt aUnit,
const TDesC8* /*anInfo*/,
const TVersion &aVer)
{
...
SetDfcQ(iPhysicalDevice->DfcQ());
...
}
The previous code extract uses the same principles as the one before it, the
main differences being that the DFC queue is owned by the DPhysicalDevice-
derived factory object and that the logical channel obtains it through
iPhysicalDevice->DfcQ(). Again, you can create the DFC queue in the
DPhysicalDevice-derived Install() function.
...
// sets iVersion and iParseMask with bit 1 (KDeviceAllowUnit)
// set and bit 2 (KDeviceAllowPhysicalDevice) unset
}
if(!aChannel)
return KErrNoMemory;
return KErrNone;
}
// Logical Channel
DMyDriverLogicalChannel::DMyDriverLogicalChannel()
// Constructor
{
// may set up pointer to owning client’s thread
// and increase its reference count
// iDfcQ=NULL;
}
DMyDriverLogicalChannel::~DMyDriverLogicalChannel()
// Destructor
{
// may also decrease the owning client’s thread reference count
...
// cancel any other DFCs owned by this channel
if (iDfcQ)
iDfcQ->Destroy();
}
TDynamicDfcQue* q;
r = Kern::DynamicDfcQCreate(q, KMyDriverThreadPriority,
                            KMyDriverThread);
if (KErrNone==r)
{
SetDfcQ(q);
iMsgQ.Receive();
return r;
}
In the example, the DFC queue is owned by the logical channel. The second-
phase constructor (DoCreate()) creates the queue. If successful, it sets the
DLogicalChannel pointer to DFC queue (iDfcQ).
When the channel is closed, the destructor of the logical channel is invoked and
this destroys the DFC queue.
// Logical Device
DMyLogicalDevice::DMyLogicalDevice()
// Constructor
    {
    // Sets iVersion and iParseMask with bit 1 (KDeviceAllowUnit)
    }
// Physical Device
TInt DMyPhysicalDevice::Create(DBase*& aChannel, TInt aUnit,
const TDesC8* aInfo, const TVersion& aVer)
{
DMyDriver* pD=new DMyDriver;
aChannel=pD;
TInt r=KErrNoMemory;
if (pD)
r=pD->DoCreate(aUnit,aInfo);
return r;
}
// PDD
DMyDriver::DMyDriver()
// Constructor
{
...
//iDfcQ=NULL;
}
DMyDriver::~DMyDriver()
// Destructor
{
...
// cancel any other DFCs owned by this channel
if (iDfcQ)
iDfcQ->Destroy();
}
    {
    ...
    iDfcQ=pDfcQ;
    return r; // if error, framework will delete LDD and PDD
    }
TDfcQue* DMyDriver::DfcQ(TInt aUnit)
    {
    TDfcQue* pDfcQ=NULL;
    if(aUnit==iUnit)
        pDfcQ=iDfcQ;
    return pDfcQ;
    }
// Logical Channel
TInt DMyDriverLogicalChannel::DoCreate(TInt aUnit, const TDesC8* /*anInfo*/,
                                       const TVersion& aVer)
    {
    ...
    SetDfcQ(iPdd->DfcQ(aUnit));
    ...
    }
USB driver
The USB driver originally used DfcQue0 for its iPowerUpDfc and
iPowerDownDfc members and set this in the platform-independent layer.
From e32/drivers/usbcc/ps_usbc.cpp:
DUsbClientController::DUsbClientController()
    {
    __KTRACE_OPT(KUSB,
        Kern::Printf("DUsbClientController::DUsbClientController()"));
#ifndef SEPARATE_USB_DFC_QUEUE
    iPowerUpDfc.SetDfcQ(Kern::DfcQue0());
    iPowerDownDfc.SetDfcQ(Kern::DfcQue0());
#endif
    }
The driver now requires initialization of iPowerUpDfc and iPowerDownDfc in
the platform-specific layer. In the Symbian-provided base-ports, a dedicated DFC
queue is also created.
From omap/shared/usb/pa_usbc.cpp:
#ifdef SEPARATE_USB_DFC_QUEUE
const TInt KUsbThreadPriority = 27;
_LIT8(KUsbThreadName,"UsbThread");

TInt TOmapUsbcc::CreateDfcQ()
    {
    TInt r=Kern::DfcQCreate(iDfcQ,KUsbThreadPriority,&KUsbThreadName);
    if (KErrNone != r)
        {
        __KTRACE_OPT(KHARDWARE, Kern::Printf("PSL: > Error initializing "
            "USB client support. Can't create DFC Que"));
        return r;
        }
    iPowerUpDfc.SetDfcQ(iDfcQ);
    iPowerDownDfc.SetDfcQ(iDfcQ);
    return KErrNone;
    }
#endif
Sound driver
A new virtual abstract method has been added to the class DSoundPDD:
virtual TDfcQue* DfcQ() = 0;
You must define this method in the base-port-derived object and ensure it returns
the DFC queue to use.
I recommend that you change the driver so that it creates its own DFC queue, as
described in Section 5.2.4.1. However, the minimum required change is to imple-
ment the function to return a pointer to DFC queue zero. For example, the follow-
ing implementation would suffice:
TDfcQue* DSoundPddDerived::DfcQ()
    {
    return Kern::DfcQue0();
    }
DDigitiser
The class DDigitiser no longer initializes the member variable iDfcQ (DFC
queue pointer). You must initialize this variable in the base-port derived object.
I recommend that you change the driver so that it creates its own DFC queue,
as described in Section 5.2.4.1. However, the minimum required change is to set
the variable to use DFC queue zero in the derived object. For example, you could
add the following code to the derived class constructor:
DDigitiserDerived::DDigitiserDerived()
    {
    // ...
    iDfcQ = Kern::DfcQue0();
    // ...
    }
In the first situation, when the state of a resource needs to be protected from
the effects of pre-emption for an appreciable period of time, the recommended
approach is to use mutual exclusion, protecting the resource with a DMutex. An
exception to this is where the only risk is of the same driver triggering the same
operation before the previous one completes – that is, when an operation is non-
blocking and occurs from different thread contexts. In that case, an NFastMutex
should suffice.
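As an illustrative sketch (the class and member names here are hypothetical;
NKern::FMWait() and NKern::FMSignal() are the kernel’s fast mutex primitives):

```cpp
// iLock is an NFastMutex member of the driver object (hypothetical)
void DMyDriver::TriggerOperation()
    {
    NKern::FMWait(&iLock);    // no blocking calls, and no page faults,
    StartHardware();          // may occur while the fast mutex is held
    NKern::FMSignal(&iLock);
    }
```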
As before, it is essential that the thread in which the driver runs does not itself
take a page fault, otherwise deadlock will occur.
A media driver is typically a PDD with a filename in the form ‘med*.pdd’. Like
other kernel-side components, it is always marked as unpaged, which means
that its code and read-only data sections will never be paged out. The only time
the media driver could theoretically take a page fault is when it accepts a write
request from a user-side client whose source data is paged out – this could be
data in the paged area of an XIP ROM or code that has been loaded into RAM
from code-paging-enabled media. To remedy this, Symbian has modified the lo-
cal media subsystem to ensure that the source data in a write request is paged
in before the write request is passed to the media driver thread. This may mean
taking a page fault in the context of the file-server drive thread before passing the
request on.
Large write requests of paged data are fragmented into a series of smaller ones
to avoid exhausting available RAM. Such fragmentation is quite rare but it might
happen, for example, when copying a large ROM data file into a temporary loca-
tion on the user data drive.
I explain the steps needed to enable a media driver to support XIP ROM and/or
code paging in the following sections. For the specific changes required to sup-
port paging from internal MMC/SD card, see Section 5.5.6.
1. The paging flags – whether code paging and/or XIP ROM paging is sup-
ported.
2. The paging fragment size. If a write request points to paged data, then the
request will be split up into separate fragments of this size. This value
needs to be chosen with care. If it is too small, writes may take an
unacceptably long time to complete. If it is too large, paging requests may
take an unacceptably long time to be satisfied.
3. The number of drives that support code paging. If code paging is not
supported (that is, only XIP ROM paging is supported), this should be
zero.
4. The list of local drives that support code paging (if code paging is
supported). This should be a subset of the overall drive list supported by
the media driver.
For example, here (in bold italics) are the changes made to support paging on
NAND on the H4 reference platform:
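The modified source is not reproduced here; in outline, the changes define a
set of macros in the variant’s media definition header. The following sketch
uses the macro names that appear in the RegisterPagingDevice() call below –
the values shown are illustrative, not the H4’s actual settings:

```cpp
// Hypothetical additions to the variant media definition header
#define PAGING_TYPE         (DPagingDevice::ERom | DPagingDevice::ECode)
#define SECTOR_SHIFT        9   // log2 of the media read unit (512 bytes)
#define NUM_PAGES           8
#define NAND_PAGEDRIVECOUNT 1   // number of drives supporting code paging
```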
The macros can then be picked up in the media driver source code and
passed to LocDrv::RegisterPagingDevice(). This function is similar to
LocDrv::RegisterMediaDevice() in that it takes a drive list as a parameter
but in this case it identifies the drive(s) to be used for code paging (if any).
Enabling Demand Paging on a New Platform 109
Some media drivers may have no kernel extension entry point defined (for exam-
ple, the MMC media driver). These will have a DECLARE_STANDARD_PDD macro
defined rather than DECLARE_EXTENSION_PDD. You will need to modify these to
have a DECLARE_EXTENSION_PDD / DECLARE_STANDARD_EXTENSION pair.
The kernel extension entry point must create a dedicated DFC queue (as dis-
cussed earlier) – otherwise a page fault in a drive thread cannot be satisfied. The
entry point must then create a DPrimaryMediaBase object and register it with
the local media subsystem. To support demand paging, you should modify the
entry point to register the paging device with the paging subsystem, and instanti-
ate and install the driver factory object.
{
TInt r=Kern::DfcQInit(&NandMediaDfcQ,
KNandThreadPriority,
&KNandMediaThreadName);
if (r!=KErrNone)
return r;
pM->iDfcQ=&NandMediaDfcQ;
r=LocDrv::RegisterMediaDevice(MEDIA_DEVICE_NAND,
NAND_DRIVECOUNT,
NandDriveNumbers,
pM,
NAND_NUMMEDIA,
KNandDriveName);
if (r != KErrNone)
return r;
r = LocDrv::RegisterPagingDevice(pM,
NandPagingDriveNumbers,
NAND_PAGEDRIVECOUNT,
PAGING_TYPE,
SECTOR_SHIFT,
NUM_PAGES);
if (r == KErrNone)
{
device = new DPhysicalDeviceMediaNand;
if (device == NULL)
return KErrNoMemory;
r = Kern::InstallPhysicalDevice(device);
}
// Ignore error if demand paging not supported by kernel
else if (r == KErrNotSupported)
r = KErrNone;
else
return r;
pM->iMsgQ.Receive();
return KErrNone;
}
{
TInt r=KErrNotSupported;
TInt id=aRequest.Id();
if (id == DLocalDrive::ECaps)
{
TLocDrv* drive = aRequest.Drive();
TLocalDriveCapsV4& c =
*(TLocalDriveCapsV4*)aRequest.RemoteDes();
r=Caps(*drive,c);
}
// etc
}
1. Ensure the local media subsystem LDD (elocd.ldd) has been built with
the __ALLOW_CONCURRENT_FRAGMENTATION__ macro undefined. This
ensures that the local media subsystem never issues more than one write
fragment at a time.
2. Change the paging media driver so that it keeps track of write request
chains and defers any read or format requests received after the first frag-
ment and before the last in a sequence. Note that write fragments should
never be deferred.
One way in which you could implement step two is for the media driver to main-
tain a bit mask, with each bit representing a ‘write fragment in progress’ flag for a
particular drive. For example:
114 Demand Paging on Symbian
iFragmenting |= (0x1<<iCurrentReq->Drive()->iDriveNumber);
class MDemandPagingInfo
    {
public:
    virtual TInt DemandPagingInfo(TDemandPagingInfo& aInfo) = 0;
    };
#define MMC_PAGING_TYPE      DPagingDevice::ERom | DPagingDevice::ECode
#define MMC_PAGEDRIVELIST    1 // code paging from user data
#define MMC_PAGEDRIVECOUNT   1
#define MMC_NUM_PAGES        8
aDemandPagingInfo.iPagingDriveList = pagingDriveNumbers;
aDemandPagingInfo.iDriveCount = MMC_PAGEDRIVECOUNT;
aDemandPagingInfo.iPagingType = MMC_PAGING_TYPE;
aDemandPagingInfo.iReadShift = 9;
aDemandPagingInfo.iNumPages = MMC_NUM_PAGES;
return KErrNone;
}
}
For example:
mmcloader z:\\core.img d:\\sys$rom.bin d:\\sys$rom.pag
Then, when the board is rebooted, the MMC/SD media driver reads the boot
sector and uses the stored pointer to determine the location of the image file, so
that it can begin to satisfy paging requests.
Modifying EStart
Now we need to prevent the paged and unpaged image files from being uninten-
tionally deleted from the internal MMC drive. To support this, Symbian has added
a new mechanism to EStart to allow it to permanently clamp the image files. The
variant part of EStart must now implement a new virtual function that returns the
image file names.
In addition, the file system implementations of the methods affected by file clamp-
ing must check for the existence of file clamps. An example of this in writeable file
systems is provided in the FAT code. ROFS provides an example for a read-only
file system. Here is the FAT DeleteL() method:
void CFatMountCB::DeleteL(const TDesC& aName)
    {
    __PRINT(_L("CFatMountCB::DeleteL"));
    CheckStateConsistentL();
    CheckWritableL();
    TFatDirEntry fileEntry;
    TEntryPos fileEntryPos(RootIndicator(),0);
    FindEntryStartL(aName,KEntryAttMaskSupported,fileEntry,
                    fileEntryPos);
    TEntryPos dosEntryPos=fileEntryPos;
    TFatDirEntry dosEntry=fileEntry;
    MoveToDosEntryL(dosEntryPos,dosEntry);
    if ((dosEntry.Attributes()&KEntryAttReadOnly) ||
        (dosEntry.Attributes()&KEntryAttDir))
        User::Leave(KErrAccessDenied);
    // Can not delete a file if it is clamped
    CMountCB* basePtr=(CMountCB*)this;
    TInt startCluster=StartCluster(dosEntry);
    if (basePtr->IsFileClamped(MAKE_TINT64(0,startCluster)) > 0)
        User::Leave(KErrInUse);
    EraseDirEntryL(fileEntryPos,fileEntry);
    FAT().FreeClusterListL(StartCluster(dosEntry));
    FAT().FlushL();
    }
Since these changes are compiled into the platform-specific bootstrap, you will
need to rebuild it.
BR1982: Kernel-side read from user memory must not occur while holding
a mutex
As discussed earlier, kernel-side code must not read from paged (user-side)
memory while the current thread holds any mutexes. The kernel functions that
access user memory have been changed in debug builds to assert some (but not
all) of the new restrictions.
It is possible (though unlikely) that existing kernel-side code may panic in debug
builds if it doesn’t conform to the new restrictions. In release builds, it may inter-
mittently hang.
The impact of the above is that any code that modifies RAM-loaded executables may
now unexpectedly fail. Since executables are stored in the \sys\bin\ directory,
and only components with the TCB capability can modify or delete files in this
directory, this compatibility issue should be limited to a very few components (for
example, software install, debuggers and possibly some Java code).
When an executable is deleted using the new function, the system keeps track of
any pages that are in use within the executable to be deleted, and only deletes
the executable when it is no longer used by the paging subsystem. As a result,
disk space may not be released until some time after the call completes.
You should modify any source that writes into code chunks or an XIP code area,
so that it makes its own copy of the code before writing back into it. For some
components, such as FOTA clients, this may not be possible. In these cases, you
will have to take application-specific measures to ensure the code you are writing
to is not paged.
• XIP ROM memory is visible at all times after the system has booted.
• RAM-loaded code segments are visible at all times.
Demand paging makes these code areas intermittently unavailable – if your tool
accesses them at such a time, it will be faulted by the MMU. This means that
existing stop-mode debuggers (or similar tools) will not work reliably on demand-
paged ROMs.
to be changed to support the new format. This also affects any tool that updates
the XIP ROM image (such as FOTA clients).
It is likely that most tools affected by this are contained within a software develop-
ment context and so the impact is likely to be limited.
Table 4 below lists a number of possible scenarios and their impact. Scenario 1 is
the default case if the device manufacturer does not alter the Symbian build tools.
Scenarios 2 to 4 look at what happens if the device manufacturer makes different
modifications to those. Please note that scenarios 5 to 7 require support in Sym-
bian that does not exist at the time of writing, and these scenarios are included
for completeness only.
Table 4
Scenario 6: ‘Fat’ SIS solution. Device manufacturer tools build both ‘deflate’
and ‘byte-pair’ binaries. The MAKESIS tool puts both in the SIS package, and
the Symbian software installer selects the appropriate binary at install time.
Impact: Not supported at the time of writing. Larger SIS files.
6
Component Evaluation for Demand Paging
This chapter describes how you might evaluate the impact of demand paging on
a component or a group of components, and how to mitigate any negative im-
pact. The processes that are described here are heavily influenced by the Sym-
bian system-wide evaluation that was carried out during the prototype phase of
demand paging.
You are encouraged to use these ideas in whole or in part for your own com-
ponents. At a minimum, the paging categories of Symbian-owned components
(Section 6.4) must be respected.
You should analyze several aspects of the component architecture before even
considering demand paging. We first consider static analysis techniques, before
moving on to dynamic analysis in the next section.
Also note dynamic dependency information – for example, whether the compo-
nent is a plug-in to a framework or is a framework itself.
This category could also include complex compound use cases such as ‘Receive
text message while playing mp3 audio.’
Other use cases may have a benchmarked performance that is important to the
component owner but would not be considered real-time or performance critical
from a system-wide perspective. For example: ‘time taken to notify user of text
message reception.’ In this case, it is unlikely that the user would notice any time
delay caused by demand paging because the text message is processed in the
background.
There will also be some use cases that fall into grey areas and their importance
may be somewhat subjective. In these cases, it’s probably appropriate for the
component owner to negotiate with the system-wide design authority.
The list should also include any cases of custom IPC architectures where
one thread reads from another thread’s address space, for instance by using
RMessage::Read().
If the pass rate is not as high as expected, then you need to either fix the code
or understand the reasons for the failure. If an immediate fix is not possible,
then the demand-paging configuration should be relaxed (either by making more
dependent components unpaged or increasing the minimum/maximum size of the
paging cache), to establish the point at which functional equivalence is achieved.
Note this is a temporary measure for evaluation: eventually all known defects
exposed by demand paging should be fixed.
You can use your existing benchmarking code for the component, or write new
code specifically for demand paging. If you don’t have any benchmarking then
re-running existing test code with time stamping enabled may provide you with
sufficient information.
You should run the tests using several demand-paging configurations with dif-
ferent maximum paging cache sizes (to simulate OOM behavior). A higher num-
ber of different configurations gives a more accurate picture of the performance
impact of demand paging.
When you make a graph of maximum paging cache size versus performance,
there is often a point at which performance drops off dramatically. This indicates
that page thrashing is occurring. Sometimes this is so dramatic that you will need
a logarithmic scale on one or both axes to determine the drop-off point. At other
times, the performance drop-off is more gradual, indicating that the use case is
less sensitive to page faults.
Figure 10 gives some example performance data. The performance profiles for
two use cases, A and B, are presented in two different ways. The top graph has
linear axes and appears to show that performance for use case A drops sharply
as the maximum paging cache goes below 96 pages. For use case B, perfor-
mance drops less sharply at around 128 pages. The bottom graph has a logarith-
mic Y-axis and shows that the drop-off point for use case A is actually nearer 160
pages. For use case B, no additional information is revealed.
Figure 10: The change in performance with maximum paging cache size for two
use cases. The same data is presented in both graphs, using a linear Y-axis (top)
and a logarithmic Y-axis (bottom).
page faults result in the audio buffers not filling in time and audio playback
stutters.
Major component redesign may be impractical in the short term. In this case, you
may temporarily have to set large parts of the component to be unpaged. This
greatly reduces RAM-saving potential.
Dynamic analysis may also reveal that some executables are paged in for much
of the time, despite not being involved in any real-time or performance-critical
use cases. In this case, since the executable is always in RAM, you might gain
by making the executable unpaged and reducing the minimum size of the paging
cache accordingly.
As well as identifying unpaged files within the component, your evaluation may
give you data on other dependent components, possibly requiring those com-
ponents to be unpaged to meet certain performance guarantees. For example,
an unpaged real-time server may require that all its third-party plug-ins are also
unpaged. It is important that these cross-component requirements are considered
from a system-wide perspective.
Once you have a candidate unpaged list, how to act upon it is a decision shared
between the component owner, the system architects of the platform and the
(software) customers of the platform (if any). A good unpaged list should
guarantee the robustness and functionality of the system irrespective of how
small the paging cache is.
It is not practical to find an optimum minimum cache size from static analysis.
Device manufacturers will determine it empirically, according to the final contents
of the ROM and the performance requirements of the device. One way they might
do this is to build a performance profile (as in Section 6.2.2) using a selection of
the most code-intensive use cases on the device.
The complete list of categories and their inclusion criteria is described in
Sections 6.4.1 to 6.4.6. Symbian now maintains a list of all the files in Symbian
and their demand-paging categories, and monitors it to ensure conformance.
Table 5
Criterion            Description
Device stability     When the file is paged, the device is unstable. The
                     instability should be fixed by normal coding methods
                     where possible.
General performance  When the file is paged, performance for all use cases is
                     degraded due to page thrashing.
Security             It is necessary to prevent the file from paging for
                     security reasons (for example, the file is located in a
                     special area or must be excluded from any integrity
                     checks made while paging in).
There is some overlap between these criteria. For instance, it is likely that any file
that is made mandatory unpaged due to ‘permanent presence’ will also satisfy
‘general performance’. The list of criteria is not exhaustive and may expand in the
future.
has the same definition as in Section 6.1.2. Note that paging can improve per-
formance in some use cases so there should be evidence that making a file (or
group of files) unpaged is better than making it (or them) paged.
6.4.6 Paged
This is the ‘catch-all’ category for files that don’t fit into any of the other catego-
ries. By default, all files will be paged unless otherwise stated.
7
Configuring Demand Paging on a Device
This chapter describes how to use and configure demand paging, assuming that
the platform already has support for it enabled. I will cover the most sensible
ways to switch on demand paging for XIP ROM images and executables.1
This section discusses the configuration changes necessary to produce a basic
demand-paged ROM.
ROM_IMAGE[0] {
    <Basic demand paging keywords>
}
In addition to the new keywords, the location of files in the ROM must be con-
sidered. I will describe this in Section 7.1.2, following it with an XIP ROM paging
example.
pagedrom keyword
The pagedrom keyword takes no arguments. It instructs ROMBUILD to sort the
contents of the core ROM image so that all the unpaged files appear at the start
of the image, followed by all the paged files. This is so that the kernel can copy
the unpaged part of the image into RAM during boot, while leaving the rest of the
image to be paged into RAM on demand (see Figure 2 on page 11).

1 For further information about the basic usage of demand-paging tools and keywords,
please consult the Symbian Developer Library documentation at developer.symbian.org/sfdl
If the keyword compress is also specified, then the paged part of the image
will be compressed using the byte-pair algorithm. The unpaged part of the im-
age remains uncompressed. Contrast this with the behavior of compress when
pagedrom is not specified. In that case, the entire image is compressed using
the deflate algorithm.
pagingoverride keyword
The pagingoverride keyword determines the pageability of the executables in the ROM/ROFS section it is defined in. It is processed by ROMBUILD/ROFSBUILD and takes a single argument, which can be one of those shown in Table 6:
Table 6
nopaging: Marks all executables unpaged, irrespective of whether they are already marked as paged or unpaged in their MMP file.
alwayspage: Marks all executables paged, irrespective of whether they are already marked as paged or unpaged in their MMP file. This can be useful for debugging or analysis.
defaultunpaged: All executables that are neither marked as paged nor unpaged in their MMP file are marked as unpaged.
defaultpaged: All executables that are neither marked as paged nor unpaged in their MMP file are marked as paged.
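Because pagingoverride applies to the section it appears in, the core image and a ROFS partition can be given different defaults. A minimal sketch (section contents elided):

ROM_IMAGE[0] {
pagingoverride defaultpaged
}
ROM_IMAGE[1] {
pagingoverride nopaging
}

Here, files in the core image default to paged, while every executable in the first ROFS partition is forced to be unpaged.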
demandpagingconfig keyword
The demandpagingconfig keyword takes the following arguments in the order
shown in Table 7:
Table 7
<MinLivePages>: The minimum number of RAM pages to reserve for the paging subsystem. The number must be at least equal to 2*(YoungOldPageRatio+1); if a smaller number is specified, a number equal to this formula is used instead. If zero is specified, or the demandpagingconfig keyword is missing, then a value of 256 is used.
<MaxLivePages>: The maximum number of RAM pages the paging subsystem may use. The number must be greater than or equal to MinLivePages. If zero is specified, or the demandpagingconfig keyword is missing, then the system uses the maximum possible value (1,048,575). On a production system, it should always be set to zero, so that as many pages as possible are used. Low values may be used to test paging under more stressed conditions.
<YoungOldPageRatio>: The ratio of young to old pages maintained by the paging subsystem. This is used to maintain the relative sizes of the two live lists. The default value is three.
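To make the arithmetic concrete, consider the configuration line used in the example later in this chapter. With a YoungOldPageRatio of three, the smallest permitted minimum is 2*(3+1) = 8 pages, and the line below reserves at least 512 pages for the paging cache (2MB, assuming the usual 4KB page size) while capping it at 32767 pages:

demandpagingconfig 512 32767 3

On a production device, you would instead specify zero for the maximum, so that the paging cache can grow to use all otherwise-free RAM.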
Each platform can have its own way of configuring the primary ROFS, so there is no generic way of removing it. On Symbian's reference platform, you would configure out the 'ROM_IMAGE[1] {' statement in \epoc32\rom\include\base.iby, which defines the start of the ROFS section.
// MyDPConfig.oby
#if !defined PAGED_ROM
#define PAGED_ROM
#endif
ROM_IMAGE[0] {
pagedrom
compress
//                  Min   Max   Young/Old
//                  Live  Live  Page
//                  Pages Pages Ratio
demandpagingconfig  512   32767 3
pagingoverride defaultpaged
}
If the OBY file that defines the start of the primary ROFS image (for example, base.iby) contains a section like this:
#if defined(_NAND) || defined(_NAND2)
#if !defined PAGED_ROM
REM Start of ROFS image
ROM_IMAGE[1] {
#endif
#endif
then the 'ROM_IMAGE[1] {' statement is omitted automatically when PAGED_ROM is defined.
To build a demand-paged ROM, simply include the new OBY file in the buildrom statement. For example, when using the Symbian Techview reference platform on H4:
buildrom –D_NAND2 –DPAGED_ROM h4hrp techview MyDPConfig
It is possible that you will still end up with a small ROFS image being produced if
one of the included OBY/IBY files explicitly places files in the primary ROFS sec-
tion. However, most or all of the ROM contents will be in the core image.
When you are building a code-paged ROM, you can choose whether to enable
XIP ROM paging or not. The pagedrom keyword only affects XIP ROM paging,
but the demandpagingconfig keyword applies to code paging as well because
both types of demand paging use the same underlying paging cache.
There are two further things to consider: the pagingpolicy keyword and the location of executables to be code paged (for example, in ROFS or in the user data area).
Note the difference between pagingpolicy, which operates on files at runtime, and pagingoverride, which operates at ROMBUILD/ROFSBUILD time.
Note also that pagingpolicy only has any meaning in the core ROM image, so
it should be directed to ROMBUILD, not ROFSBUILD. You can ensure that this is
the case by enclosing the keyword in a ‘ROM_IMAGE[0] {}’ block.
The easiest way to use code paging instead of XIP ROM paging is to ensure
paged executables are placed in a ROFS partition that supports code paging, in-
stead of in the core ROM image. This may involve reversing the decision to move
the contents of the ROFS to the core, as required for XIP ROM paging.
You need to do one of the following for an executable to be code paged from internal writeable media:
1. In the MMP file of the executable, explicitly specify the paged keyword. This will implicitly ensure the executable is byte-pair compressed (or uncompressed if compression is disabled).
2. Explicitly convert the executable to the byte-pair or uncompressed format. For example, use the elftran command as follows: elftran -compressionmethod bytepair \epoc32\release\armv5\urel\mylibrary.dll
3. Modify the build tools to compress executables in the byte-pair format by default.
4. Modify the build tools to uncompress executables by default.
All these options except option four have BC implications (see Section 5.6.2 for the BC impact on installed executables).
// MyDPConfig.oby
#if !defined PAGED_ROM
#define PAGED_ROM
#endif
#if !defined USE_CODE_PAGING
// Uncomment next line if code paging is wanted
#define USE_CODE_PAGING
#endif
#if !defined CODE_PAGING_FROM_ROFS
// Uncomment next line if code paging from primary rofs is wanted
#define CODE_PAGING_FROM_ROFS
#endif
ROM_IMAGE[0] {
pagedrom
compress
//                  Min   Max  Young/Old
//                  Live  Live Page
//                  Pages Pages Ratio
demandpagingconfig  256   512  3
pagingoverride defaultpaged
#if defined USE_CODE_PAGING
pagingpolicy defaultpaged
#endif
}
#if defined CODE_PAGING_FROM_ROFS
ROM_IMAGE[1] {
pagingoverride defaultpaged
}
#endif
You would also adjust the OBY file that defines the start of the primary ROFS partition (for example, base.iby) like this:
#if defined(_NAND) || defined(_NAND2)
#if !defined PAGED_ROM || defined CODE_PAGING_FROM_ROFS
REM Start of ROFS image
ROM_IMAGE[1] {
#endif
#endif
buildrom –D_NAND2 –DPAGED_ROM –DUSE_CODE_PAGING –DCODE_PAGING_FROM_ROFS h4hrp techview MyDPConfig
In the previous two sections, I explained how to create demand-paged ROMs and
configure the general paging behavior (that is, the default paging policy and size
of the paging cache). This section describes how to configure whether individual
files are paged or not.
If the build tools are configured to compress executables while building (the default behavior), then executables marked with the paged keyword will be byte-pair-compressed.
You should only use the paged and unpaged modifiers for user-side ROM objects (such as 'file=', 'dll=', 'data=' and 'secondary=' statements). Kernel-side ROM objects ('primary=', 'extension=', 'variant=' and 'device=' statements) are always unpaged, so any modifier will be ignored. Furthermore, the modifier will be ignored for 'data=' statements if the object is in a ROFS partition; the pageability of non-executable files is only relevant in an XIP ROM image (see Section 3.4).
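As a sketch of the syntax (the file names here are hypothetical), the modifier is appended to the individual statement:

file=ABI_DIR\BUILD_DIR\mylibrary.dll \sys\bin\mylibrary.dll paged
file=ABI_DIR\BUILD_DIR\myengine.dll \sys\bin\myengine.dll unpaged

The first statement marks mylibrary.dll as paged and the second forces myengine.dll to stay unpaged, regardless of what their MMP files say.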
The OBY file pageability modifier overrides any pageability defined in the execut-
able’s MMP file. However, it does not override the pagingoverride statement
in the case of nopaging or alwayspage.
The optional configpaging.pm tool runs during the OBY pre-processing phase, before ROMBUILD and ROFSBUILD are executed. You provide it with a centralized list of paged and unpaged files, and the tool uses this list to add paged/unpaged modifiers to individual statements in the intermediate OBY files. This centralized list overrides any individually specified paged/unpaged modifiers and any pageability defined in individual MMP files. However, the list will not override the pagingoverride statement in the case of nopaging or alwayspage.
Table 8
defaultpaged: Sets all unspecified files to be paged. Note this will effectively override any pagingoverride statement in the case of defaultpaged or defaultunpaged, because all ROM files will then contain a pageability modifier.
<file regex> unpaged (or unpaged: followed by one <file regex> per line): Sets the file(s) specified by the regular expression in <file regex> to unpaged.
<file regex> paged (or paged: followed by one <file regex> per line): Sets the file(s) specified by the regular expression in <file regex> to paged.
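Putting the Table 8 keywords together, a centralized list passed to configpaging might look like this sketch (the file names and regular expressions are hypothetical):

defaultpaged
unpaged:
mycriticalserver.exe
mydriver.*
paged:
myapp.exe

Everything not matched by an entry is paged; the two executables under unpaged: (and anything matching mydriver.*) stay unpaged, while myapp.exe is explicitly paged even if its MMP file says otherwise.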
The simplest core/ROFS split is to put everything in the core image with an empty
ROFS – this is the optimum split if you want to page as much code as possible.
However, it is suboptimal when there is a significant amount of unpaged code. In
fact, if the amount of unpaged code plus minimum paging cache size approaches
the size of code that needs to be loaded in a non-demand-paged ROM, there
may be a net RAM loss (see Section 3.14).
There are various strategies for dealing with this issue, and I discuss these in the
following sections.
7.4.1 efficient_rom_paging.pm
You can invoke this optional tool, which runs during the buildrom phase be-
tween configpaging and execution of rombuild/rofsbuild. It searches
for any paged files in the intermediate ROFS OBY file and moves these files to
the core ROM OBY file, together with their static dependencies (both paged and
unpaged). This ensures that as much paged code as possible is in the final core
ROM image, making best use of XIP ROM paging. However, the tool makes no
effort to limit the amount of unpaged code in the core ROM image.
The difficulty here is choosing the privileged set. If the set is too small, then many
paged executables will have to be in the primary ROFS, because they have non-
privileged, unpaged dependencies. If the set is too large, then the configuration
would be much the same as with efficient_rom_paging. An ideal set would
be one that contains the unpaged executables that are always loaded, plus those
that have a significant amount of paged code dependent upon them.
Since the introduction of demand paging, you can pass a ‘-geninc’ switch to
the buildrom command or to ROMBUILD to output additional ROM building
information. When this switch is used, the tools create a file called <rom_image_
name>.inc in the ROM building directory. The file has the following format:
You can use this information in your custom ROM building tools that wrap Sym-
bian’s ROM building tools.
1. The pagedrom and compress keywords are not defined, because these are already defined in the reference board IBY files (such as \epoc32\rom\include\base_h4hrp.iby for H4).
2. CODE_PAGING_FROM_ROFS is disabled.
3. configpaging.pm (see Section 7.3) is used to include the recommended
unpaged list of components. The mandatory unpaged components are
configured via their MMP files, so they do not need to be configured cen-
trally.
4. efficient_rom_paging.pm (see Section 7.4.1) is used.
5. Different demandpagingconfig parameters are used.
To build a reference ‘Techview’ ROM using the default configuration, simply add
the pagedrom parameter to any NAND Techview buildrom command line, like
this:
buildrom –D_NAND2 pagedrom h4hrp techview
It is important that pagedrom appears before techview so that the flags defined in pagedrom.oby are parsed before base.iby, which is included by techview.iby.
The default configuration provides a generous paging cache. The aim of this is to
provide a modest RAM saving compared with a non-demand-paged NAND Tech-
view ROM, while maintaining performance for all performance-critical use cases.
Performance for some use cases is actually improved.
You can build a ROM using the functional configuration in the same way as for the default configuration, but using pagedrom_functional in place of pagedrom. The purpose of the functional configuration is to provide a more aggressive paging environment, while maintaining functional equivalence with a non-demand-paged NAND Techview ROM. As a result, performance is worse for most use cases but there are significant RAM savings.
Here are some things to bear in mind when defining custom configurations:
• Symbian only warrants the functionality of the OS when all mandatory un-
paged components are unpaged. A configuration that overrides the man-
datory unpaged components, like the stressed configuration mentioned
above, is not warranted.
• Symbian only warrants the performance of Symbian components involved
in key use cases (see Section 6.1.2) when all mandatory and recommend-
ed unpaged components are unpaged.
• The size of the paging cache is dependent on the amount of code loaded
during the most extreme use case. Simple UI platforms (such as Tech-
view) need a smaller cache size, whereas larger ones (such as S60) need
a larger cache.
• On platforms that add significant additional code to Symbian, those ad-
ditional components should be evaluated (perhaps using the guidelines in
Section 6) to see if there are any further unpaged components that should
be added to the mandatory and recommended unpaged lists.
• The minimum paging cache size should be large enough that, when you set the maximum cache size to the same value as the minimum (in testing), the functional equivalence and robustness of the platform are maintained, and performance is at an acceptable minimum level. Lower values may cause stability problems when the device is low on free RAM.
• On production devices, the maximum paging cache size should be set to
the maximum possible value to minimize the number of page faults. The
configurations provided by Symbian place a somewhat low upper limit
on the paging cache size to induce additional page faults for testing
purposes.
8
Testing and Debugging in a Demand-Paged
Environment
If you include the BTRACE.EXE console application in a demand-paged Techview ROM (using an OBY file such as the MyDPBTrace.oby shown below), then you can control tracing from an eshell command prompt as follows:
btrace –f3,10 –m1 –b1024
RUNNYCHEESE.EXE
btrace d:\MyDPLog.txt
The second line executes RUNNYCHEESE.EXE and the third line dumps the BTrace log to d:\MyDPLog.txt.
// MyDPBTrace.oby
file=ABI_DIR\DEBUG_DIR\btrace.exe \sys\bin\btrace.exe
ROM_IMAGE[0] {
// Set the Btrace flag (EThreadIdentification = 3) + (EPaging = 10)
BTrace 1032
// Set the trace mode (enabled/not free running)
BTraceMode 1
// Set the buffer size
BTraceBuffer 1024000
}
Then you can dump the BTrace log from a command prompt in the same way as
before:
btrace d:\MyDPLog.txt
However, this approach does report additional events, especially during the boot
sequence. This may be of use if you are debugging a new hardware platform with
demand paging enabled.
The kernel trace flags for the paging subsystem and the paging media subsystem are bits 62 and 61 respectively. To enable both these flags, adjust the kerneltrace keyword in the relevant OBY file as follows:
kerneltrace 0x00000000 0x60000000
The flags should be OR’d with any other relevant kernel trace flags. You must
use a debug version of the kernel (kernel trace logging is not enabled in release
versions).
Essentially, the DPTest API allows the caller to retrieve information about which demand paging attributes are enabled, how many page faults and page-ins have taken place, the paging cache size parameters and the current cache size. The API also allows the executable it runs in to flush the cache and change the cache size, so long as that executable has the WriteDeviceData platform security capability.
For usage information, simply run dptestcons from an eshell instance with
no parameters.
2. A dialog will appear. Uncheck DABORT and PABORT from the ‘Set’ group
of checkboxes.
8.2 Testing
In this section, I discuss the potential issues that demand paging may cause or
expose during testing, and strategies you can use to expose such problems. The
unpredictable nature of demand paging makes it very difficult to anticipate problems in the system. However, there are some patterns to the kinds of problems that are likely to be observed. I describe some of these and their possible solutions in the next sections.
Functional differences
Demand-paged ROMs contain code paths that are not executed in non-demand-
paged ROMs – but is the converse true? Is there functionality in non-demand-
paged ROMs that is no longer required (and hence not executed) in a demand-
paged environment? If this were true, there would be no option but to test both
demand-paged and non-demand-paged ROMs, because neither of them is a
subset of the other. To answer that question, we need to look again at the NAND
flash layouts in Figures 1 and 3 in chapter 3.
Assuming a typical demand-paged ROM layout like 3C or 3D, it is safe to say that
Performance differences
In Section 2, I discussed the trade-off between RAM and performance in
demand-paged ROMs. We’ve seen that for some configurations it is difficult to
predict whether a particular use case will be quicker on a demand-paged or a
non-demand-paged ROM. However, we can choose a sufficiently aggressive
demand-paged configuration such that all use cases run more slowly than on a
non-demand-paged ROM.
Analysis of demand-paged defects found thus far shows that those exposed by
timing differences are only reproducible when the use case runs slower than on
a non-demand-paged ROM. Furthermore, some defects are only reproducible on
aggressive demand-paged configurations but not on less aggressive configura-
tions (or non-demand-paged ROMs). There have been no cases of defect
reproducibility increasing as the configuration is made less aggressive.
RAM differences
The primary purpose of demand paging is to save RAM, so any sensible
demand-paged configuration will result in more free RAM than an equivalent non-
demand-paged ROM. It is important that you continue to run any use cases and
tests related to out-of-memory conditions in demand-paged ROMs, where out-
of-memory conditions are harder to reproduce. You can reproduce the behavior
of the paging cache in out-of-memory conditions by limiting the maximum paging
cache, but this does not limit other system memory allocations.
Functional testing
We know that the more aggressive the demand-paged configuration, the greater the chances of reproducing problems such as those discussed in Section 8.2.1, and that Symbian only warrants configurations in which all the Symbian-mandatory unpaged components are unpaged. So, a good configuration to use for functional testing would be one that has only these components unpaged (plus any additional non-Symbian components that fit the same criteria), together with a small maximum paging cache size. However, you should not choose a maximum paging cache size so small that the time taken to execute the tests is unreasonably long.
The Symbian functional configuration (see Section 7.6.2) fulfils these require-
ments for the Techview reference environment.
Performance testing
The relatively aggressive configuration used for functional testing may not be
suitable for performance testing. Some tests may have to complete within a
certain time interval to pass. So, to test performance, you may need to mark ad-
ditional components as unpaged, and/or choose a larger paging cache. However,
it is still sensible to limit the maximum paging cache size to reproduce out-of-
memory behavior.
The Symbian default configuration (see Section 7.6.1) matches the above
requirements for the Techview reference environment.
User testing
The configurations that you use for functional or performance testing are not suit-
able for production devices. At some point, you will need to perform wider system
testing with a production demand-paged configuration. Making this change should be a simple matter of taking the configuration used for performance testing, changing the minimum paging cache size to the maximum size and changing the maximum size to the maximum possible value. This means that the paging cache can grow much larger, and so fewer defects will be reproducible. Testing with this configuration should be delayed until as late as possible in the project; otherwise, some problems in out-of-memory situations may be hidden.
9
In Conclusion
In this book, I’ve looked at demand paging at many levels - from a high-level
overview in Chapters 2 and 3, to an in-depth study of the implementation in
Chapter 4. In Chapter 5, I’ve given you a practical hands-on guide to using de-
mand paging yourself, whether you are working with device drivers in a demand-
paged system or enabling demand paging in a new device (in which case you’ll
also find Chapter 7, on configuring device parameters, very useful). If you’re
working at a higher level, then Chapter 6 gives you the nitty-gritty on getting your
component ready for demand paging. Finally, in Chapter 8, I look at testing and
debugging in a demand-paged environment.
Demand paging on Symbian has been a great success. Not only does demand
paging increase free RAM, it also speeds device boot and application start-up
times, and makes for a much more robust device under low-memory conditions.
So successful has demand paging been that it has been back-ported two operating-system generations, to devices that have already been released into the market. I wish you all the best in working with it.
Index
A
Aging a Page 36
Algorithm 55
Allocating Memory 33
B
Binary Compatibility 120
Boot-Loaded 90
BTrace 157
Byte-Pair Compression 54
C
Cache Support 52
Candidate Unpaged List 134
Chunk APIs 50
Clamping 63
Code-Paged ROM 143
Code Paging 13, 34
Code Segment Initialization 43
Component Architecture 134
Composite File System 8
Configpaging 148
Core/ROFS 141
Critical Code Paths 133
Custom Demand-Paging Configurations 155
D
Data Paging 76
Data Structures 40
DDigitiser 105
Debugger Breakpoint 58
Debugging 157
Default Demand-Paging Configuration 153
defaultpaged 149
defaultunpaged 149
demandpagingconfig 141
Dependency Information 127
Device APIs 58
Device Stability 137
Disconnected Chunk API 51
DPTest API 159
Dynamic Analysis 129
Dynamic RAM Cache 33
E
Effective RAM Saving 19
efficient_rom_paging.pm 151
EStart 117
F
File Server 63
File System Caching 14
Fine-Grained Configuration 147
Fragmented Write Requests 113
Freeing a Page 38
Functional Equivalence 129, 137
G
General performance 137
H
Hardware Debugging 160
I
Implementing File Clamping 118
Improved application start-up times 3
Internal MMC/SD Card 114
Internal Writeable Media 145
IPC Analysis 128
K
Kernel 136
Kernel Containers 82
Kernel Extension Entry Point 109
Kernel Implementation 21
Kernel RAM Usage 6
Kernel Trace Logging 159
Key Classes 21
L
Live Page List 28
Locking Memory 48
Logical Channel 99
M
MaxLivePages 141
Media Driver Migration 107
Media Drivers 58
Memory Accounting 34
Memory Allocation 32
Migrating 76
Migrating Device Drivers 87
Minimum Paging Cache Size 135
MinLivePages 141
Mitigation Techniques 133
MMP File Configuration 147
N
NAND Flash 7
NAND Flash Structure 12
O
OBY File 139
OBY File Configuration 147
Optimizing 150
P
Pageability Categories 136
Paged 138
pagedrom keyword 140
Page-Fault Handler 22
Page-Information Structures 22
Page Locking 47
Paging Cache Sizes 18
Paging In 16
Paging In a ROM Page 40
Paging Out 17
pagingoverride 140
pagingpolicy 143
Paging Requests 112
Paging Scenarios 75
PDD 101
Performance Data 130
Permanent Presence 137
Physical Device 97
Physical RAM 22
Power management 137
Problems 133
R
RAM Cache Interface 30
Rejuvenating a Page 36
RFile::BlockMap() API 67
ROFS 144
ROM 7
ROM paging structure 12
S
Security 137
Shared Chunks 85
Sound driver 105
Static Analysis 127
Stressed Demand-Paging Configuration 154
Symbian OS v9.3 74
Symbian Reference Configurations 152
T
Testing 159
The Kernel Blockmap 42
The Paging Algorithm 15
The Paging Configuration 17
TLocalDriveCaps 111
Tracing 156
U
Unique Threads 92
Unpaged 137
Unpaged Files 17
USB driver 104
Use-Case Analysis 128
User-Side Blockmap 41
V
Virtual Memory 50