This document provides an in-depth overview of demand paging on Symbian. It discusses the advantages and disadvantages of demand paging, and how it is implemented on Symbian. Key aspects covered include XIP ROM paging, code paging, file system caching, the paging algorithm, and configuration options. Guidance is provided on enabling demand paging on new platforms, migrating device drivers, and testing in a demand-paged environment. The document serves as a comprehensive reference for understanding and implementing demand paging on Symbian.

Demand Paging on Symbian

by

Jane Sales

Reviewed by

John Beattie

Dan Handley

Jenny Keates

Jason Parker

Jo Stichbury

Editor and Typesetter

Satu McNabb
Symbian Foundation
1 Boundary Row
London SE1 8HP
England

Visit our Home Page at developer.symbian.org


Our email address: docs@symbian.org

This work is licensed under the Creative Commons Attribution-Share Alike 2.0
UK: England & Wales License. To view a copy of this license, visit
creativecommons.org/licenses/by-sa/2.0/uk or send a letter to Creative Commons, 171
Second Street, Suite 300, San Francisco, California, 94105, USA.

Designations used by companies to distinguish their products are often claimed
as trademarks. All brand names and product names used in this book are trade
names, service marks, trademarks or registered trademarks of their respective
owners.

ISBN: 978-1-907253-00-3

Typeset in 11 pt Arial by Symbian Foundation


Table of Contents

About the Author....................................................................................................ix

Acknowledgements................................................................................................ix

Chapter 1: Introduction........................................................................................1

1.1 Introduction to Demand Paging........................................................................2

Chapter 2: The Advantages and Disadvantages of Demand Paging...............3

2.1 Benefits of Demand Paging..............................................................................3

2.2 Costs of Demand Paging..................................................................................4

Chapter 3: Understanding Demand Paging on Symbian..................................9

3.1 ROMs................................................................................................................9

3.2 NAND Flash......................................................................................................9

3.3 The Composite File System............................................................................10

3.4 XIP ROM Paging.............................................................................................11

3.5 Code Paging...................................................................................................15

3.6 File System Caching.......................................................................................16

3.7 Writeable Data Paging....................................................................................16

3.8 The Paging Algorithm.....................................................................................17


3.9 The Live Page List..........................................................................................17

3.10 XIP ROM Paging: Paging In.........................................................................18

3.11 XIP ROM Paging: Paging Out......................................................................18

3.12 The Paging Configuration.............................................................................19

3.13 Unpaged Files...............................................................................................19

3.14 Paging Cache Sizes......................................................................................20

3.15 Effective RAM Saving...................................................................................21

3.16 Byte-Pair Compression.................................................................................22

Chapter 4: Under the Hood: The Implementation of Demand Paging...........23

4.1 Kernel Implementation....................................................................................23

4.2 Media Driver Support......................................................................................60

4.3 File Server Changes.......................................................................................65

Chapter 5: Enabling Demand Paging on a New Platform...............................77

5.1 Choosing Which Type of Demand Paging to Implement................................77

5.2 Migrating Device Drivers to a Demand-Paged System..................................78

5.3 Guidelines for Migrating Device Drivers..........................................................89

5.4 Media Driver Migration..................................................................................109

5.5 Implementing File Clamping..........................................................................120

Chapter 6: Component Evaluation for Demand Paging................................129


6.2 Dynamic Analysis..........................................................................................129

6.3 Identifying Demand-Paging Problems and Mitigation Techniques...............135

6.4 Symbian’s Pageability Categories................................................................138

Chapter 7: Configuring Demand Paging on a Device...................................141

7.1 Building a Basic Demand-Paged XIP ROM.................................................141

7.2 Building a Basic Code-Paged ROM.............................................................145

7.3 Fine-Grained Configuration...........................................................................149

7.4 Optimizing the Configuration........................................................................152

7.5 Other Demand-Paged ROM Building Features............................................154

7.6 Using the Symbian Reference Configurations..............................................155

Chapter 8: Testing and Debugging in a Demand-Paged Environment........159

8.1 Tracing and Debugging with Demand Paging..............................................159

8.2 Testing...........................................................................................................162

Chapter 9: In Conclusion.................................................................................167
About the Author

Jane Sales joined Psion in 1995 to lead the team developing a new operating
system, EPOC32, for Psion’s as-yet-unreleased Series 5. A few months later,
EPOC32 first booted in Jane’s spare bedroom in Cookham, Berkshire. Jane
notes with pleasure that the distribution of Symbian-based devices based on its
descendant is rather more widespread today.

Jane left Symbian in 2003 to move to the south of France with her husband. She
wasn’t allowed to escape completely though – under the olive trees in her garden
she wrote her first book, Symbian OS Internals, which was published by Wiley in
2005. Soon afterwards Jane moved to Ukraine where she set up a mobile strat-
egy consultancy, working with Symbian’s Research and Strategy Groups, among
others.

In 2008, Jane co-founded a company named Ambient Industries, which is
developing an original way for people to discover the world around them on their
iPhones. (Jane very much hopes that she has still not escaped from Symbian
completely, and that her company will port its product to the Symbian platform in
the near future.) Two perspicacious British investors funded Ambient Industries in
2009, and Jane now divides her time between Cambridge (the first one) and San
Francisco (the famous one).

Acknowledgements

Jane would like to thank Jo Stichbury and Satu McNabb for their calm, organized
management of this project – an approach that makes making books fun. Jane
would also like to thank John Beattie for his painstaking review which exposed
many of her silly mistakes. Nevertheless, any errors that remain in the book you
are reading are Jane’s alone.

Symbian would like to thank Jane for her professional and yet relaxed approach
to this project – this book was created, apparently, effortlessly. We’d also like to thank
everyone involved in this project: the reviewers and technical experts, and copy-
editor Jenny Keates for her skilled edit.
1
Introduction

This text supplements my earlier book, Symbian OS Internals,1 and provides a
comprehensive and highly detailed insight into the workings of demand paging on
Symbian.

This text will be invaluable for people who are:

• Creating a new, demand-paged device. The text gives clear instructions on
how to implement demand paging for the first time and on the trade-offs
that affect performance and ROM size.
• Writing device drivers for a demand-paged environment (or porting them).
• Wanting to understand demand paging on Symbian. This detailed com-
mentary on the internals of demand paging will serve a range of readers:
students studying real-time operating systems; middleware programmers
understanding the behavior of underlying systems; and systems engineers
comparing Symbian with other similar operating systems.

Readers of this book should have a basic understanding of the Symbian kernel
architecture, including the process model, threads and memory management.
If you are not familiar with this material, I suggest reading Chapters 1 (Introduc-
ing EKA2), 3 (Threads, Processes and Libraries) and 7 (Memory Models) of my
book.1 To understand code-paging in depth, you should also consider reading
Chapters 9 (The File Server) and 10 (The Loader). If your interest lies in
device drivers, then Chapter 12 (Device Drivers and Extensions) provides a good
grounding in Symbian’s device driver architecture.

1 Symbian OS Internals: Real Time Kernel Programming, Jane Sales et al., 2005, John Wiley &
Sons. The entire text of the book can also be found online at
developer.symbian.org/wiki/index.php/Symbian_OS_Internals.

1.1 Introduction to Demand Paging


As Symbian-based devices become increasingly feature-rich, the demand on
system resources increases. One important commodity is physical RAM, which
contributes significantly to the bill of materials (BOM) of a device. Increasing
RAM from 64 MB to 128 MB is not only costly in financial terms (around $3.50 at
the time of writing) but also in terms of additional power consumption and reduced
battery life.

Demand paging is a feature of virtual memory systems that makes it possible for
pages of RAM to be loaded from permanent storage when they are needed – that
is, on demand. When the contents are no longer required, the RAM used to store
them may be reused for other content. In this way, the total physical RAM re-
quired to store content is less than if all the content were permanently available,
and hence the BOM of the device can be reduced.

The most important mechanism used in demand paging is the page fault. (Mem-
ory is managed in units known as pages, which are blocks of 4 KB on all current
architectures.) When an application attempts to access a location that is not
present in memory, a page fault occurs. The kernel handles this page fault, read-
ing the missing page from disk into RAM and then restarting the faulting applica-
tion. All memory that is managed by the kernel in this way is said to be pageable
memory, and the process is controlled by an entity known as the paging system.
A good source of more information on basic operating system concepts is Andrew
Tanenbaum’s Operating Systems: Design and Implementation.2 With an unbe-
coming lack of modesty, I’ll also suggest my own book, Symbian OS Internals,1 if
you’re more interested in the specifics of the Symbian kernel.
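The fault-and-load cycle just described can be modelled in a few lines. This is a portable sketch, not kernel code: the `Pager` class and all of its names are invented for illustration, with a byte vector standing in for the NAND image and a map standing in for the MMU's page mappings.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Toy model of demand paging: "RAM" holds only the 4 KB pages that have
// actually been touched; everything else stays in the backing store until
// a fault loads it. Illustrative only - not a Symbian kernel API.
constexpr std::size_t kPageSize = 4096;

class Pager {
public:
    explicit Pager(std::vector<uint8_t> backingStore)
        : iStore(std::move(backingStore)) {}

    // Read one byte of pageable memory, faulting the page in if needed.
    uint8_t Read(std::size_t addr) {
        std::size_t page = addr / kPageSize;
        auto it = iRam.find(page);
        if (it == iRam.end()) {                    // page fault
            ++iFaults;
            it = iRam.emplace(page, PageIn(page)).first;
        }
        return it->second[addr % kPageSize];       // resume the access
    }

    int Faults() const { return iFaults; }

private:
    // Copy one 4 KB page from the backing store into "RAM".
    std::vector<uint8_t> PageIn(std::size_t page) {
        auto begin = iStore.begin() + page * kPageSize;
        return std::vector<uint8_t>(begin, begin + kPageSize);
    }

    std::vector<uint8_t> iStore;                      // NAND image
    std::map<std::size_t, std::vector<uint8_t>> iRam; // paged-in pages
    int iFaults = 0;
};
```

Two reads within the same page cost only one fault; a read in a different page costs another, which is the behaviour the paging cache exploits.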

The Symbian implementation of demand paging has been a huge success. Not
only is there considerably more free RAM in the device, but also the ROM boots
more quickly, applications start up faster and the stability of the device has in-
creased. So successful was demand paging that it has been back-ported two OS
generations, to devices that have already been released into the market.

2 Operating Systems: Design and Implementation (Second Edition). Andrew Tanenbaum, 1997,
Prentice-Hall.
2
The Advantages and Disadvantages of
Demand Paging

2.1 Benefits of Demand Paging

Demand paging can provide a considerable reduction in the RAM usage of the
operating system. The actual size of the RAM saving depends considerably on
the way demand paging is configured, as you will see. However, it is safe to say
that on a feature-rich user-interface (UI) platform it is possible to make multi-
megabyte RAM savings without a significant reduction in performance. Internal
tests show an increase in free RAM at boot from 18.5 MB to 32 MB – a 73% in-
crease. Figures after launching a couple of applications (Web and GPS) are even
more impressive. Free RAM increases from 9 MB to 26.5 MB in the demand-
paged ROM – a 194% increase.

At least two other benefits may be observed. Note that these are highly depen-
dent on the paging configuration. See Chapter 7 for more details.

2.1.1 Improved Application Start-Up Times Due to Lazy Loading


Usually the cost of servicing a page fault means that paging has a negative im-
pact on performance. But sometimes, on composite file system ROMs (described
in Section 3.3), demand paging improves performance, especially when the use
case normally involves loading a large amount of code into RAM (for example,
when booting or starting large applications). In this case, the performance over-
head of paging can be outweighed by the performance gain of loading less code
into RAM. This is sometimes known as ‘lazy loading’ of code. For example, the
loading time of the Web application decreased from 2.35 seconds to 1.33 sec-
onds – a performance boost of 44%.
2.1.2 Improved System Boot Time


When the non-demand-paged case consists of a large core image, most or all of
the code involved in a use case will already be permanently loaded, and so there
will not be a reduction in application start-up times due to lazy loading. But there
is always a big win when booting, where the loading of the whole core image
from NAND flash into RAM in the non-demand-paged ROM is a major contributor
to the overall boot time.

During Symbian tests, typical boot time of a production ROM reduced from 35
seconds to 22 seconds – a performance boost of 37%.

2.1.3 Improved Stability When Out of Memory


A device is often at its least stable when it is out of memory (OOM). Poorly written
code may not cope well with exceptions caused by failed memory allocations. As
a minimum, an OOM situation will degrade the user experience.

If demand paging is enabled on a device, the increased RAM available to
applications makes it less likely that the device will run out of memory, thus avoid-
ing many potential stability issues. Furthermore, at any particular time, the RAM
saving achieved by demand paging is proportional to the amount of code loaded
in the non-demand-paged case. For instance, the RAM saving when five applica-
tions are running is greater than the saving immediately after boot. This makes it
even harder to induce an OOM situation.

This increased stability only applies when the entire device is OOM. Individual
threads may still have OOM problems due to reaching their own heap limits, and
demand paging will not help in these situations.

2.2 Costs of Demand Paging

2.2.1 Performance
One of the expected downsides to demand paging is a reduction in performance.
Demand paging can add an execution overhead to any process that takes a
paging fault. Also, paging faults block a thread for a significant and unpredictable
time while the fault is serviced, so any thread that can take a paging fault is not
suitable for truly real-time tasks. The delay will be on the order of 1 ms if the back-
ing store (media used to store pages that are not in main memory) is not
being used for other file system activity. However, in busy systems, the delay for
a single fault could be hundreds of milliseconds or more, and should be treated
as unbounded for the purpose of any real-time analysis. Note that a paging fault
can occur for each separate 4 KB page of memory accessed, so an eight-byte
object straddling two pages could cause two paging faults.
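The worst-case fault count for a single access is just the number of distinct pages the object spans. The helper below is illustrative (not a Symbian API); it assumes the 4 KB page size stated above.

```cpp
#include <cassert>
#include <cstddef>

// Number of distinct 4 KB pages occupied by an object of `size` bytes
// starting at byte offset `addr` - i.e. the worst-case number of paging
// faults that touching the whole object can trigger.
constexpr std::size_t kPageSize = 4096;

std::size_t PagesTouched(std::size_t addr, std::size_t size) {
    if (size == 0) return 0;
    std::size_t first = addr / kPageSize;          // page of the first byte
    std::size_t last  = (addr + size - 1) / kPageSize; // page of the last byte
    return last - first + 1;
}
```

For example, an eight-byte object at offset 4092 spans pages 0 and 1, so `PagesTouched(4092, 8)` is 2 — the straddling case described in the text.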

Threads of lower priority than a pageable thread cannot be guaranteed to be real-
time either, because they will never run while the higher-priority thread is ready to
run. In addition, the servicing of a paging fault has the thread priority of the NAND
media driver (KNandThreadPriority = 24) – so this means that a faulting
thread of higher priority than this will effectively have its priority reduced.

However, I would be giving an overly pessimistic impression if I did not mention
the fact that the paging system as a whole acts as a most recently used cache
– pages that have been accessed recently will be present in memory. Since
recently accessed pages are more likely to be accessed again, paging faults are
relatively rare in practice.

To support real-time code, Symbian provides a means to lock the memory in
a process (‘wire’ it) so that it is always present and not demand paged. Device
drivers must be modified to either access user memory in the context of the cli-
ent thread (preferable) or use a dedicated DFC thread (so a paging fault in one
driver doesn’t affect another). Servers that provide real-time services must have
their process memory wired, and be multi-threaded, so that real-time clients
are serviced in a separate thread, or they must be changed to reject clients that
aren’t wired. Of course, truly real-time applications must also have their process
memory wired.

2.2.2 Complexity
Demand paging makes more demands on system software, especially the ker-
nel. The resulting increase in complexity brings a concomitant risk of defects. For
example, if the kernel is servicing a paging fault and needs to execute a thread
which can itself take a paging fault then deadlock results. (This problem also
occurs if the servicing thread needs to block on a service provided by another
thread which takes a paging fault, or to block on a mutex held by such a thread
– or even if the servicing thread needs to hold a mutex of a higher order than the
one held by the faulting thread.) Because this is complex, Symbian applies the
simple, safe rule that demand-paged memory must not be accessed when hold-
ing any kernel-side mutex. Symbian had to examine and re-write the kernel and
the media drivers to avoid this issue – but, fortunately, such code is rare. (And
for code paging, of which I shall say more in the next section, Symbian also had
to look at file systems, file-system plug-ins and all the other areas of the OS that
these use.)

That said, we should congratulate the Symbian kernel engineers. They have de-
livered an extraordinarily robust implementation of demand paging – such is the
quality of their delivery that it has needed very few modifications since it was first
released.

2.2.3 Increase in ROM Size on Flash Media


A demand-paged ROM tends to be around 10% larger than the same ROM un-
paged. This is because the unpaged ROM can be compressed as a single entity.
On a demand-paged ROM, each 4 KB page must be able to be decompressed
and copied into RAM in isolation, so pages are compressed individually, which is
not as efficient.
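The cost of per-page compression can be demonstrated with any compressor whose state resets at page boundaries. The sketch below uses a deliberately naive run-length encoder purely to compare whole-image against page-by-page output sizes; the real ROM tools use byte-pair encoding (Section 3.16), and these names are invented for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Size of a toy run-length encoding: each run costs one literal byte plus
// a 32-bit length. Illustrative only - not the Symbian ROM compressor.
std::size_t RleSize(const std::vector<uint8_t>& data) {
    std::size_t size = 0;
    for (std::size_t i = 0; i < data.size();) {
        std::size_t run = 1;
        while (i + run < data.size() && data[i + run] == data[i]) ++run;
        size += 1 + 4;    // literal byte + run length
        i += run;
    }
    return size;
}

// Same encoder, but restarted at every page boundary so that each page
// can be decompressed in isolation - as a demand-paged ROM requires.
std::size_t RleSizePaged(const std::vector<uint8_t>& data,
                         std::size_t pageSize) {
    std::size_t size = 0;
    for (std::size_t off = 0; off < data.size(); off += pageSize) {
        std::size_t end = std::min(off + pageSize, data.size());
        size += RleSize(std::vector<uint8_t>(data.begin() + off,
                                             data.begin() + end));
    }
    return size;
}
```

Runs (or, for real compressors, dictionary context) that would span a page boundary must be cut, so the paged encoding is never smaller and is usually larger.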

2.2.4 Increase in Kernel RAM Usage


A demand-paged kernel itself is actually slightly less RAM efficient than its non-
demand-paged predecessor. Symbian has modified the structures used by the
kernel to manage RAM to make them more efficient and robust, as well as sim-
pler to use and understand. They have been extended to support the implemen-
tation of demand paging, file-system caching and future requirements: the result
is a RAM increase amounting to 0.58% of the total physical RAM available on a
device. (Note: the additional RAM usage will not show up in tools that monitor the
size of chunks in the system such as MEMTRACE.) There are also other mod-
est increases in RAM usage due to other supporting features, such as additional
DFC threads used by device drivers (amounting to 5 KB per thread) and byte-pair
compression support in the loader (amounting to around 20 KB).

However, this does not take into account the considerable savings from demand
paging itself. You should expect a considerable net decrease in RAM usage when
using demand paging!
3
Understanding Demand Paging on Symbian

3.1 ROMs

The term ‘ROM’ traditionally refers to memory devices storing data that cannot
be modified. These devices also allow direct random access to their contents,
so that code can execute from them directly, and is said to be execute in place
(XIP). This has the advantage that programs and data in ROM are always avail-
able and don’t require any action to load them into memory.

On Symbian, the term ROM has developed the looser meaning of ‘data stored
in such a way that it behaves like it is stored in read-only memory.’ The underly-
ing media may be physically writeable (RAM or flash memory) but the file system
presents a ROM-like interface to the rest of the OS, usually as drive Z.

3.2 NAND Flash

The ROM situation is further complicated when the underlying media is not XIP.
This is the case for NAND flash, which is used in almost all devices on the market
today. Here, it is necessary to copy (or shadow) any code in NAND to RAM for
execution. The simplest way to achieve this is to copy the entire ROM contents
into RAM during system boot and use the MMU to mark this area of RAM with
read-only permissions. The data stored by this method is called the core ROM
image (or just core image) to distinguish it from other data stored in NAND. The
core image is an XIP ROM and is usually the only one. It is permanently resident
in RAM.
Figure 1, Layout A shows how the NAND flash is structured in this
simple case. All the ROM contents are permanently resident in RAM and any
executables in the user data area (usually the C: or D: drive) are copied into RAM
as they are needed.

This approach is costly in terms of RAM usage, so Symbian introduced the com-
posite file system, a more efficient scheme.

3.3 The Composite File System

This scheme (broadly speaking) splits the ROM contents into those parts re-
quired to boot the OS, and everything else. The former is placed in the core
image as before and the latter is placed into another area known as the read-only
file system (ROFS). The loader (part of the file server) copies code in the ROFS
into RAM as it is needed at run-time, at the granularity of an executable, in the
same way as for executables in the user data area.

There can be several ROFS images, for example, localization and/or operator-
specific images. Usually, the first one (called the primary ROFS) is combined with
the core image into a single ROM-like interface by what is known as the compos-
ite file system.

In the rest of this document, I will use the term ROM to mean the combined core
and primary ROFS images – that is, everything on drive Z. Where it is important
to refer to just the core ROM image (or an XIP ROM generally), I will specify this.
References to ROFS will mean the primary ROFS unless otherwise stated. For
clarity, I shall only consider systems with a single (primary) ROFS.

Figure 1 shows snapshots of virtual memory for two demand-paging scenarios.


Layout B in Figure 1 shows an ordinary composite file system structure. When
comparing this to layout A, you can see that layout B is more RAM-efficient be-
cause some of the contents of ROFS are not copied into RAM at any given time.
The more unused files there are in ROFS, the greater the RAM saving.
Figure 1: The data copied into RAM for two different NAND flash layouts. The same use
case and ROM contents are assumed for each layout.

3.4 XIP ROM Paging

Since an XIP ROM image on NAND has to be loaded into RAM in order to run, an
opportunity arises to demand page the contents of the XIP ROM. This means that
when executing the ROM, we read its data from NAND into RAM on demand.
An XIP ROM image is split into two parts, one containing unpaged data and one
containing data that are paged on demand. Unpaged data consists of:

1. Kernel-side code
2. All code that should not be paged for other reasons (for example, perfor-
mance, robustness or power management)
3. The static dependencies of (1) and (2).

The terms ‘locked down’ or ‘wired’ are also used to mean unpaged.
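Rule 3 above makes the unpaged set a transitive closure over static linkage: every binary a wired binary links against must itself be wired. A sketch of that computation follows; the function, the binary names and the dependency map are all hypothetical, not part of any ROM-building tool.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Compute the full unpaged set: the seed binaries plus everything they
// statically depend on, transitively. Illustrative sketch only.
std::set<std::string> UnpagedClosure(
        const std::map<std::string, std::vector<std::string>>& deps,
        std::vector<std::string> seeds) {
    std::set<std::string> unpaged;
    while (!seeds.empty()) {
        std::string bin = seeds.back();
        seeds.pop_back();
        if (!unpaged.insert(bin).second) continue;   // already visited
        auto it = deps.find(bin);
        if (it != deps.end())
            for (const auto& d : it->second) seeds.push_back(d);
    }
    return unpaged;
}
```

This is why marking even one extra binary unpaged can pull a surprising amount of the ROM into the unpaged area: its whole static dependency tree comes with it.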

At boot time, the unpaged area at the start of the XIP ROM image is loaded into
RAM as normal, but the virtual address region normally occupied by the paged
area is left unmapped. No RAM is allocated for it.

When a thread accesses virtual memory in the paged area, it takes a page fault.
The page-fault handler in the kernel then allocates a page of physical RAM and
reads the contents for this from the XIP ROM image on NAND flash. The thread
then continues execution from the point where it took the page fault. This process
is called ‘paging in’ and I will describe it in more detail in Sections 3.10 and 4.1.7.
When the free RAM on the system reaches zero, the kernel can satisfy memory-
allocation requests by taking RAM from the paged-in XIP ROM region. As RAM
pages in the XIP ROM region are unloaded, they are said to be ‘paged out’ (I
discuss this in Section 3.11). Figure 2 shows the operations
described.

All content in the paged data area of an XIP ROM is subject to paging, not just
executable code, so accessing any file in this area may give rise to a page
fault. Remember that a page may contain data from one or more files, and page
boundaries do not necessarily coincide with file boundaries.

All non-executable files in an XIP ROM, even those in the unpaged section, will
be paged unless they are explicitly configured to be unpaged (see Section 7.3).

When XIP ROM paging is used in conjunction with the composite file system, the
non-demand-paged NAND flash layout strategy may no longer make sense. If a
small core image and a large ROFS image are used, then only a fraction of the
files in the overall ROM can benefit from XIP ROM paging. Instead, it is usual to
place most (or even all) files in the core ROM image. The ROFS is mainly used
for files that would be in the unpaged area of the core ROM anyway or have too
many unpaged dependencies.

Figure 2: The operations involved in XIP ROM paging

Figure 3, layout C shows a typical XIP ROM paging structure. Al-
though the unpaged area of the core image may be larger than the total core im-
age in Figure 1, layout B, only a fraction of the contents of the paged area needs
to be copied into RAM compared to the amount of loaded ROFS code in Figure
1, layout B.
Figure 3: The data copied into RAM for two additional NAND flash layouts (compare with
Figure 1). The same use case and ROM contents are assumed for each layout.

XIP ROM paging was officially first introduced in Symbian OS v9.3, but was so
successful that it was back-ported by some device manufacturers to their devices
based on Symbian OS v9.2.
3.5 Code Paging

Code paging extends XIP ROM paging, to include other file systems such as the
C drive, ROFS and media drives that cannot be removed (you wouldn’t want to
page to a memory card that the user might take out of the phone!). Executables
that aren’t in XIP ROM are stored as files in those other file systems. Before their
code can be executed, it must be copied to RAM. The particular location in RAM
cannot usually be determined ahead of time, so the loader, part of the file server,
must modify the file’s contents after copying to correct the memory pointers
contained within them: the modification is known as ‘relocation’ or ‘fix-up’. This
process makes code paging considerably more complex than XIP ROM paging.
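The fix-up step can be sketched as adding the difference between the link-time base address and the actual load address to every absolute pointer recorded in the image. The function and the relocation list below are invented for illustration; a real E32 image records which words need patching in its relocation section.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Minimal model of load-time relocation ('fix-up'): the executable was
// linked assuming base address `linkBase` but is loaded at `loadBase`,
// so every absolute address stored in the image must be shifted by the
// difference. `relocOffsets` lists the words holding such addresses.
void ApplyRelocations(std::vector<uint32_t>& image,
                      const std::vector<std::size_t>& relocOffsets,
                      uint32_t linkBase, uint32_t loadBase) {
    uint32_t delta = loadBase - linkBase;  // unsigned wrap handles moves down
    for (std::size_t off : relocOffsets) {
        image[off] += delta;               // patch the stored pointer
    }
}
```

Because the paged-in copy differs from the file on disk, a code-paging system must redo this patching every time a page is paged back in, which is part of the extra complexity mentioned above.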

The additional overhead of code paging means that device manufacturers should
usually choose to use XIP ROM paging on their devices where possible. Layout
D in Figure 3 shows a typical NAND flash structure when both code paging and
XIP ROM paging are used. Only those parts of an executable currently in use are
copied into RAM. XIP ROM paging is still used for most data in ROM, and code
paging is used for any remaining paged executables in ROFS. It is expected that
the primary use for code paging will be for executables in the user data area,
such as third-party applications.

Pages that are used for code-paged executables never contain the end of one
executable and the start of another, as XIP ROM pages might. This means that
the last page of an executable is very likely to contain less than 4 KB of data.

It is also worth noting that most other operating systems don’t implement code
paging in this way. Instead, when they need to page out the contents of ex-
ecutables, they write the modified contents to the backing store so that they can
later be recovered. Symbian does not do this because of concerns about power
consumption and the wearing of the storage media used for the backing store.
Code paging support was originally planned for Symbian OS v9.4, but was such
a success that it was actually made available in Symbian OS v9.3.
3.6 File System Caching

In Symbian OS v9.4, the file server uses file system caching to speed file opera-
tions. The file-system cache is built upon the disconnected chunk, which allows
physical RAM to be committed and de-committed anywhere within its virtual ad-
dress space, with no requirement that RAM in the chunk is contiguous.

In previous versions of Symbian, physical RAM could be in one of two states –
committed and owned by a thread or process, or uncommitted and owned by
the kernel. To support file caching, Symbian has introduced a new, intermedi-
ate, state – ‘unlocked.’ The kernel may de-commit unlocked memory whenever
it needs to, but before this happens a thread may lock the memory again, which
returns it to the committed state with its contents preserved.

The file-system cache is closely linked to demand paging. When the file server
caches a new file, it commits and locks memory, reads the data, and then unlocks
the memory, placing the pages at the start of the live list to become the youngest
pages in the system. Then, if the file server reads the file again and there is a file-
cache hit, it locks the cache buffer, removing the pages from the live list. The file
server reads the data, and then unlocks the pages again, returning them to the
start of the live list.

You can see that the cached data for the file will stay in memory as long as it is
accessed often enough to stay in the live list. It will only be lost from the live list
if there is no free memory in the RAM allocator, and all the other memory used
for paging and caching has been more recently accessed than it has – that is, if
these cache pages have become the oldest pages in the system.

3.7 Writeable Data Paging

Writeable data paging extends demand paging from just paging code and con-
stant data to the paging of writeable data too – data such as user stacks, heaps
and other user data stored in chunks. The big difference here is that data is
mutable, whereas code is not, so if it is modified while it is paged in, it must be
written to a backing store when it is paged out. The backing store is analogous to
the ‘swap file’ used to implement the virtual memory scheme on PC systems.

Writeable data paging would offer the greatest RAM saving but would also have
the greatest impact on performance. Also, the additional write activity would in-
crease the power consumption of the device and wear the backing media.
A smart memory-caching scheme might be required to mitigate this.

Writeable data paging could also be used to implement a more traditional code-
paging scheme than the one described in Section 3.5.

At the time of writing, Symbian does not support writeable data paging.

3.8 The Paging Algorithm

All memory content that can be demand paged is said to be ‘paged memory’ and
the process is controlled by the ‘paging subsystem.’ A page is a 4 KB block of
RAM. Here are some other terms that are used:

• Live page – a page of paged memory whose contents are currently
available
• Dead page – a page of paged memory whose contents are not currently
available
• Page in – the act of making a dead page into a live page
• Page out – the act of making a live page into a dead page. The RAM used
to store the contents of this page may then be reused for other purposes.

In the rest of this section, I’ll give an overview of how the paging subsystem
works and introduce some more vocabulary that is useful for understanding later
sections.

3.9 The Live Page List

Efficient performance of the paging subsystem depends on the algorithm that
selects which pages are live at any given time, or conversely, which live pages
should be made dead. The paging subsystem approximates a least recently used
(LRU) algorithm for determining which pages to page out.
All live pages are stored on the ‘live page list,’ which is an integral part of the
paging cache. The live page list is split into two sub-lists, one containing young
pages and the other old pages – the paging subsystem attempts to keep the ratio
of the lengths of the two lists at a value called the ‘young/old ratio.’ The paging
subsystem uses the MMU to make all young pages accessible to programs but
all old pages inaccessible. However, the contents of old pages are preserved and
they still count as being live.

If a program accesses an old page, this causes a page fault. The paging subsys-
tem then turns that page into a young page (rejuvenates it), and at the same time
turns the last young page into an old page. See Section 4.1.4 for more detail and
illustration of this process.

3.10 XIP ROM Paging: Paging In

When a program attempts to access a page that is paged out, the MMU gener-
ates a page fault and the executing thread is diverted to the Symbian exception
handler. This then performs the following tasks:

1. Obtains a page of RAM from the system’s pool of unused RAM (the ‘free
pool’) or, if this is empty, pages out the oldest live page and uses that
instead
2. Reads the content for this page from some media, such as NAND flash
3. Updates the paging cache’s live list as described in the previous section
4. Uses the MMU to make this RAM page accessible at the correct linear
(virtual) address
5. Resumes execution of the program’s instructions, starting with the one
that caused the initial page fault.

Note that these actions are executed in the context of the thread that tries to
access the paged memory. Paging in on a code-paged system is described in
Section 4.1.6.2.

3.11 XIP ROM Paging: Paging Out

If the system needs more RAM and the free pool is empty, then RAM that is
being used to store paged memory is freed up for use. This is called ‘paging out’
and happens using the following steps:

1. Remove the ‘oldest’ RAM page from the paging cache.
2. Use the MMU to mark the page as inaccessible.
3. Return the RAM page to the free pool.

Freeing a page on a code-paged system is described in Section 4.1.6.5.
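The two sequences above can be sketched as a toy model. The following is illustrative C++ only, not Symbian kernel code – the ToyPager class, its names and its use of standard containers are all invented for the sketch. It models a free pool, a live list ordered youngest-to-oldest, and page contents read from ‘media’ on demand:

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <map>
#include <string>

// Toy model of the paging-in and paging-out sequences above.
// Illustrative only -- not Symbian kernel code. Pages are identified by
// page number and "content" stands in for data read from NAND flash.
class ToyPager
    {
public:
    explicit ToyPager(std::size_t aFreePages) : iFreePages(aFreePages) {}

    // Page-in, triggered by a page fault on aPage.
    void PageIn(int aPage, const std::string& aContent)
        {
        if (iFreePages == 0)
            PageOutOldest();         // free pool empty: evict the oldest live page
        --iFreePages;                // take a page of RAM from the free pool
        iContents[aPage] = aContent; // "read" the content from media
        iLiveList.push_front(aPage); // becomes the youngest live page
        }

    // Page-out: remove the oldest live page, return its RAM to the free pool.
    void PageOutOldest()
        {
        if (iLiveList.empty())
            return;
        int oldest = iLiveList.back();
        iLiveList.pop_back();        // remove from the paging cache
        iContents.erase(oldest);     // contents discarded, never written back
        ++iFreePages;                // RAM page is back in the free pool
        }

    bool IsLive(int aPage) const { return iContents.count(aPage) != 0; }
    std::size_t FreePages() const { return iFreePages; }

private:
    std::size_t iFreePages;
    std::deque<int> iLiveList;       // front = youngest, back = oldest
    std::map<int, std::string> iContents;
    };
```

Because ROM contents are read-only, a page-out in this model simply discards the page; nothing is written to a backing store.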

3.12 The Paging Configuration

Demand paging introduces three new configurable parameters to the system.
These are:

1. The amount of code and data that is marked as unpaged
2. The minimum size of the paging cache
3. The ratio of young pages to old pages in the paging cache.

The first two parameters are the most important and they are discussed in the fol-
lowing sections. The third has a less dramatic effect on the system and is usually
left unchanged at its default value of 3.

There are existing parameters that also affect how demand paging performs.
Optimizing the configuration of all these together is discussed in Chapter 7.

3.13 Unpaged Files

It is important that all areas of the operating system that are involved in servic-
ing a paging fault are protected from blocking on the thread that took the paging
fault (directly or indirectly). If they are not, a deadlock situation could occur. This
is partly achieved in Symbian by ensuring that all kernel-side components are
always unpaged. Section 5.2.1 looks at the problem of page faults in kernel-side
code in more detail.

In addition to kernel-side components, there are likely to be a number of
components that you will want to explicitly set as unpaged, so as to meet the
functional and performance requirements of the device. The performance overhead
of servicing a page fault is unbounded and variable – although typically around
1 ms – so some critical code paths may need to be protected by making files
unpaged. You might have to make chains of files and their dependencies
unpaged to achieve this, but could possibly reduce the set of unpaged components
by breaking unnecessary dependencies and separating critical code paths from
non-critical ones.

If making a component paged or unpaged is a straightforward performance/RAM
trade-off, you can make it configurable, thus allowing the decision to be made
later based on the system requirements of the particular device.

3.14 Paging Cache Sizes

If the system needs more RAM but the free RAM pool is empty, then pages are
removed from the paging cache in order to service the memory allocation. This
cannot continue indefinitely because a situation will arise where the same pages
are continually paged in and out of the paging cache. This is known as page
thrashing. Performance is dramatically reduced in this situation.

To avoid catastrophic performance loss, a minimum paging cache size can be de-
fined. If a system memory allocation would cause the paging cache to drop below
the minimum size, then the allocation fails.

As data is paged in, the paging cache grows, but any RAM used by the cache
above the minimum size does not contribute to the amount of used RAM reported
by the system. Although this RAM is really being used, it will be recycled when-
ever anything else in the system requires the RAM. So the effective RAM usage
of the paging cache is determined by its minimum size.

In theory, it is also possible to limit the maximum paging cache size. However,
this is not useful in production devices because it prevents the paging cache from
using all the otherwise unused RAM in the system. This may reduce performance
for no effective RAM saving.

3.15 Effective RAM Saving

The easiest way to visualize the RAM saving achieved by demand paging is to
compare the most simplistic configurations. Consider a non-demand-paged ROM
consisting of a core with no ROFS (as in Figure 1, layout A). Compare that with
a demand-paged ROM consisting of an XIP ROM paged core image, again with
no ROFS (similar to Figure 3, layout C, but without the ROFS). The total ROM
contents are the same in both cases. Figure 4 depicts the effective RAM saving.

Figure 4: The effective RAM saving when paging a simple XIP ROM.

The effective RAM saving is the size of all paged components minus the mini-
mum size of the paging cache. Note that when a ROFS section is introduced,
this calculation is much more complicated because the contents of the ROFS are
likely to be different between the non-demand-paged and demand-paged cases.

You can increase the RAM saving by reducing the set of unpaged components
and/or reducing the minimum paging cache size, making the configuration more
‘stressed.’ You can improve the performance (up to a point) by increasing the set
of unpaged components and/or increasing the minimum paging cache size, mak-
ing the configuration more ‘relaxed.’ However, if the configuration is too relaxed,
it is possible to end up with a net RAM increase compared with a non-demand-
paged ROM.
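The calculation just described can be written down directly. The following is illustrative arithmetic only; the function name and the example sizes are invented for the sketch:

```cpp
#include <cassert>

// Effective RAM saving for a simple paged XIP ROM, as described above:
// the total size of the paged components minus the minimum size of the
// paging cache. Sizes are in KB; a negative result means the configuration
// is so relaxed that it causes a net RAM increase compared with a
// non-demand-paged ROM.
inline int EffectiveRamSavingKB(int aPagedContentKB, int aMinPagingCacheKB)
    {
    return aPagedContentKB - aMinPagingCacheKB;
    }
```

For example, 20 MB of paged components with a 2 MB minimum paging cache gives an effective saving of 18 MB.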

3.16 Byte-Pair Compression

The default ‘deflate’ compression algorithm used by Symbian only allows a par-
ticular size of compressed data to be decompressed as a whole unit. This is fine
when decompressing a complete XIP ROM image or a whole executable, but it is
not acceptable for demand paging where only a single page of an image/execut-
able needs to be decompressed during a page-in event.

Because of this, Symbian has introduced the ‘byte-pair’ compression algorithm,
which allows data to be compressed and decompressed in individually address-
able blocks. Each decompressed 4 KB block can then be mapped directly to a
page of RAM. (It is only possible to demand page data that is byte-pair-com-
pressed or uncompressed.)

Support for byte-pair compression was introduced in Symbian OS v9.2. For more
details, see Section 4.1.13.
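The key property can be demonstrated with a toy stand-in. The sketch below is not the real byte-pair algorithm – a trivial run-length encoding is used instead, and all names are invented – but it shows the structural idea: compress each 4 KB block independently and keep an index of block offsets, so that a page-in can decompress exactly one block:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Structural sketch of block-wise compression. This is NOT the real
// byte-pair algorithm -- a trivial run-length encoding stands in for it.
// Each 4 KB block is compressed independently and an index of block
// offsets is kept, so a page-in event can decompress a single block.
static const std::size_t KPageSize = 4096;

struct CompressedImage
    {
    std::vector<std::uint8_t> iData;   // concatenated compressed blocks
    std::vector<std::size_t> iOffsets; // start of each block, plus end sentinel
    };

CompressedImage Compress(const std::vector<std::uint8_t>& aImage)
    {
    CompressedImage out;
    for (std::size_t base = 0; base < aImage.size(); base += KPageSize)
        {
        out.iOffsets.push_back(out.iData.size());
        std::size_t end = std::min(aImage.size(), base + KPageSize);
        for (std::size_t i = base; i < end; )
            {
            std::uint8_t b = aImage[i];
            std::size_t run = 1;
            while (i + run < end && aImage[i + run] == b && run < 255)
                ++run;
            out.iData.push_back(static_cast<std::uint8_t>(run)); // run length
            out.iData.push_back(b);                              // byte value
            i += run;
            }
        }
    out.iOffsets.push_back(out.iData.size());
    return out;
    }

// Decompress exactly one block -- all that a page-in event needs.
std::vector<std::uint8_t> DecompressBlock(const CompressedImage& aImg,
                                          std::size_t aBlock)
    {
    std::vector<std::uint8_t> page;
    for (std::size_t i = aImg.iOffsets[aBlock]; i < aImg.iOffsets[aBlock + 1]; i += 2)
        page.insert(page.end(), static_cast<std::size_t>(aImg.iData[i]),
                    aImg.iData[i + 1]);
    return page;
    }
```

With the default ‘deflate’ scheme, by contrast, the whole stream must be decompressed to recover any one page, which is why byte-pair (or no compression) is required for demand-paged content.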

4
Under the Hood: The Implementation of
Demand Paging

4.1 Kernel Implementation

4.1.1 Key Classes in Demand Paging


The majority of the demand-paging implementation within the kernel lies in a
singleton object of class MemModelDemandPaging (xmmu.cpp1), which is
derived from DemandPaging (demand_paging.h), which in turn derives from
RamCacheBase (ramcache.h). This last provides an interface between the
MMU implementation and any form of dynamic use of the system’s free memory,
such as demand paging. This interface is mainly used for transferring ownership
of physical pages of RAM.

The kernel manages the free memory in the system via the class MmuBase
(mmubase.h). This uses RamCacheBase to access the paging system and
provide RAM from both the free pool, and, if necessary, live pages. To do this,
RamCacheBase makes use of MmuBase’s RAM allocator object
(DRamAllocator, found in ramalloc.h).

The singleton MemModelDemandPaging object is created at system boot (in the
MmuBase::Init() method) if demand paging is enabled – otherwise a
RamCache object (ramcache.h), which also derives from RamCacheBase, is
created as usual. The MMU page tables are initialized for the demand-paged
part of the ROM. The live list is also populated with RAM pages, up to its mini-
mum size.

1 Where possible, I have given the name of the file in which you can find more information about
the class in question, although some of these files may not initially be available under the Eclipse
Public License. Where filenames are not given, you may find further information about the class or
method given by using the code search available on developer.symbian.org.

4.1.2 Page-Fault Handler


When a thread accesses demand-paged memory for which the content isn’t cur-
rently loaded, an ARM data/prefetch abort is generated. The exception-handling
code calls the demand paging-fault handler, which then pages in the memory.
This involves the following steps, which are executed in the context of the thread
that is trying to access the pageable memory:

1. Obtain a page of physical RAM


This is done in the usual way – refer to Symbian OS Internals,2 for more details.

2. Determine the storage media location where the page’s contents are
stored
To read demand-paged content from media, the kernel must first determine its
storage location. For the contents of ROM, it does this using an index stored in
the unpaged part of ROM. For non-XIP code, it uses the blockmap structure rep-
resenting the particular executable. This is stored in the kernel’s code segment
object for that executable. The kernel searches an address-sorted list to deter-
mine which code segment to use.

3. Read (and decompress) the contents into the RAM page


The kernel uses the paging device APIs to perform the media read. I describe
these in Section 4.2.2. Demand-paged content may be stored in a compressed
format (byte-pair compression) and so may require decompression after reading.

4. Use the MMU to make this RAM page accessible at the correct virtual
address
The kernel stores all the memory that is currently being used to store demand-
paged content in the ‘live list,’ discussed in Section 4.1.4.

5. Resume execution
The kernel resumes execution of the code that triggered the page-in event.

4.1.3 Page-Information Structures


Each physical page of RAM managed by the kernel has an associated
page-information structure, SPageInfo, which maintains information about the
current state and usage of the page. As part of the demand-paging and file-system
caching implementations, Symbian has modified SPageInfo to make it more
efficient, more robust, easier to use and easier to understand.

2 Symbian OS Internals: Real Time Kernel Programming, Jane Sales et al., 2005, John Wiley &
Sons. The entire text of the book can be found online at developer.symbian.org/wiki/index.php/
Symbian_OS_Internals

There were two main effects of this work on device manufacturers. The first was
a modest increase of RAM usage by the system – around 0.58% of total RAM.
The second was an effect on people porting Symbian to new platforms. If you are
doing this, you will need to recompile the bootstrap.

Implementation Overview
In brief:

• The SPageInfo structure was simplified so that members only store one
datum.
• New members were added to SPageInfo to allow the implementation of
demand paging and file-system caching.
• SPageInfo is now stored in a sparse array, indexed by a page’s
physical address. The memory allocation for the array has been moved
into the generic bootstrap code (bootmain.s).
• All use of page numbers in APIs was removed, because information
pertaining to a RAM page can now be directly obtained using only the
physical address of that page.
• All updates to SPageInfo objects are now performed by functions that
assert that the system lock is held, ensuring that all updates have been
coded in a safe and atomic manner.
• Simulated OOM conditions have been added to low-level memory-
management functions. This improves coverage on existing and new
OOM testing.

Implementation Detail
SPageInfo is defined in mmubase.h and looks as follows:

struct SPageInfo
    {
    enum TType
        {
        EInvalid=0,       // No physical RAM exists for this page
        EFixed=1,         // RAM fixed at boot time
        EUnused=2,        // Page is unused
        EChunk=3,         // iOwner=DChunk*, iOffset=index into chunk
        ECodeSegMemory=4, // iOwner=DCodeSegMemory*,
                          // iOffset=index into CodeSeg memory
        // EHwChunk=5,    // Not used
        EPageTable=6,     // iOwner=0,
                          // iOffset=index into KPageTableBase
        EPageDir=7,       // iOwner=ASID,
                          // iOffset=index into Page Directory
        EPtInfo=8,        // iOwner=0, iOffset=index into iPtInfo
        EShadow=9,        // iOwner=phys ROM page,
                          // iOffset=index into ROM
        EPagedROM=10,     // iOwner=0, iOffset=index into ROM
        EPagedCode=11,    // iOwner=DCodeSegMemory*,
                          // iOffset=index into code chunk
        EPagedData=12,    // NOT YET SUPPORTED
        EPagedCache=13,   // iOwner=DChunk*, iOffset=index into chunk
        EPagedFree=14,    // In demand paging ’live list’
                          // but not used for any purpose
        };

    enum TState
        {
        EStateNormal = 0,     // no special state
        EStatePagedYoung = 1, // demand paged and on the young list
        EStatePagedOld = 2,   // demand paged and on the old list
        EStatePagedDead = 3,  // demand paged, currently being modified
        EStatePagedLocked = 4 // demand paged but temporarily not paged
        };

    ...

private:
    TUint8 iType;       // enum TType
    TUint8 iState;      // enum TState
    TUint8 iSpare1;
    TUint8 iSpare2;
    TAny* iOwner;       // owning object
    TUint32 iOffset;    // page offset within owning object
    TAny* iModifier;    // pointer to object currently manipulating page
    TUint32 iLockCount; // non-zero if page acquired
                        // by code outside of the kernel
    TUint32 iSpare3;
public:
    SDblQueLink iLink;  // used for placing page into linked lists
    };

The iType member indicates how the RAM page is currently being used. The
iOwner and iOffset members then typically specify a particular kernel object
(for example, DEpocCodeSeg) and location within it.

For demand paging, the key TType attributes for a page are:

• EPagedROM – Contains contents for demand-paged XIP ROM. The
iOffset member contains the page’s index into the ROM; this is
(address-of-page – address-of-ROM-header) / 4096.
• EPagedCode – Contains contents for demand-paged RAM-loaded code.
iOwner points to the DEpocCodeSegMemory object to which the page
belongs, and iOffset gives the page’s location within the code chunk
(not within the individual CodeSeg memory).
• EPagedFree – Page is in the demand-paging live list but not being used to
store anything. A page is placed in this state when its contents are dis-
carded but it is not possible to return it to the system’s free pool because of
locking or performance constraints.

The iState member was added to support demand paging and has one of
these values from enum TState:

• EStatePagedYoung – Page is in the demand-paging live list as a
‘young’ page.
• EStatePagedOld – Page is in the demand-paging live list as an
‘old’ page.
• EStatePagedDead – Page has been removed from the live list and is in
the process of being modified for a new use.
• EStatePagedLocked – Page has been removed from the live list to
prevent it from being reclaimed for new usage, that is, it is ‘pinned.’ The
page’s contents remain valid and mapped by the MMU.

The iLink member is used by the paging system to link page-information struc-
tures together into the live page list. This member is also re-used to store other
data when the page isn’t in the live list – for example, when a page is in the
EStatePagedLocked state, iLink.iPrev is used to store a lock count.

The iModifier member is described in Section 4.1.3.5.

The remaining iSpare members pad the structure size to a power of two and will
be used in the implementation of future OS features.

Storage Location for SPageInfo Structures


SPageInfo objects are stored in a sparse array. A sparse array is an array
where nearly all of the elements have the same value (usually zero), which
means that the array can be stored more efficiently in computer memory than by
using the standard methods. In this particular case, the array is sparse because
it represents the physical memory of the system, and this will not all be in use at
any one time.

The sparse array lives at a fixed virtual address, KPageInfoLinearBase,
which means that the virtual address of the SPageInfo for a particular physical
address can be calculated efficiently for performing page-table walks in software.
The reverse look-up is equally efficient – by knowing the number of SPageInfos
that fit on a page (remember that the size of an SPageInfo is always a power of
two), we can derive the physical address that a particular SPageInfo
represents.

The actual functions for translating between an address and an SPageInfo are:

// Return the SPageInfo for a given page of physical RAM.
inline SPageInfo* SPageInfo::FromPhysAddr(TPhysAddr aAddress)
    {
    return ((SPageInfo*)KPageInfoLinearBase)+(aAddress>>KPageShift);
    }

// Return physical address of the RAM page with which
// this SPageInfo object is associated.
inline TPhysAddr SPageInfo::PhysAddr()
    {
    return ((TPhysAddr)this)<<KPageInfosPerPageShift;
    }

As the page info array is sparse, an invalid physical address may cause an
exception when indexing the array. So to make it possible to validate physical ad-
dresses, a bitmap is also made available at address KPageInfoMap. This con-
tains set bits for each 4 KB page of RAM in the array which is present in memory
and is used by SPageInfo::SafeFromPhysAddr(), which performs the same
task as SPageInfo::FromPhysAddr() but returns NULL if the physical ad-
dress is invalid rather than causing a data abort.
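The arithmetic behind these two translations can be checked with plain integers. The constants below are assumptions made for the sketch (4 KB pages, a 32-byte SPageInfo – so 128 structures per page and a shift of 7 – and a 32-bit address space); KBase stands in for KPageInfoLinearBase, and is aligned so that shifting it left by 7 bits wraps to zero modulo 2^32, which is the property the reverse look-up relies on:

```cpp
#include <cassert>
#include <cstdint>

// Illustrative check of the SPageInfo address arithmetic; the values are
// assumptions, not the real Symbian constants.
static const std::uint32_t KPageShift = 12;        // 4 KB pages
static const std::uint32_t KInfoShift = 5;         // sizeof(SPageInfo) == 32
static const std::uint32_t KInfosPerPageShift = KPageShift - KInfoShift; // 7
static const std::uint32_t KBase = 0xF2000000u;    // multiple of 1 << 25

// Address of the page-information structure for a physical address.
inline std::uint32_t InfoFromPhys(std::uint32_t aPhys)
    {
    return KBase + ((aPhys >> KPageShift) << KInfoShift);
    }

// Reverse look-up: physical address from the structure's own address.
inline std::uint32_t PhysFromInfo(std::uint32_t aInfo)
    {
    return aInfo << KInfosPerPageShift; // KBase's contribution wraps to zero
    }
```

The round trip holds for any page-aligned physical address, which is why both look-ups cost only a shift and an add.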

Before Symbian implemented demand paging, page-information structures were
stored in a flat array, and the kernel provided a few non-trivial functions to translate between
a physical page address and an array index (that is, a page number). Because of
this, many internal APIs passed page numbers as additional arguments to avoid
the translation process. With the new storage method for SPageInfo objects, all
APIs can operate efficiently if they use only the physical addresses of pages, so
we have removed all page-number arguments and methods.

Initialization
The bootstrap code (in bootmain.s) creates both the array at KPageInfoLinearBase
and the bitmap at KPageInfoMap. Initially, all SPageInfo struc-
tures have type EInvalid. During kernel boot, the MmuBase::Init2() func-
tion sets the correct state for each page by scanning the RAM bank list provided
by the bootstrap, and scanning all MMU page tables for RAM already allocated.

Locking and Concurrency Handling


Modifications to any SPageInfo must be performed while holding the system
lock, which guarantees exclusive access to the object.

Unfortunately, some operations on pages are long running or unbounded – for
example, where the kernel must map demand-paged code into many processes.
In situations such as these, we cannot keep the system locked, because this
would affect the real-time performance of the system. In this situation, system
software should flash (release/re-acquire) the system lock and then detect if an-
other thread has modified the page in question. This is done using the iModifier
member.

The kernel sets iModifier to zero whenever the SPageInfo object is changed
in any way. So if a thread sets this to a suitable unique value (for example, the
current thread pointer) then that thread may flash the system lock and then find
out whether another thread has modified the page by checking whether
iModifier has changed. The kernel provides functions for this – SetModifier()
and CheckModified(). An example of their use is as follows:
NKern::LockSystem();

SPageInfo* thePageInfo = GetAPageInfo();
NThread* currentThread = NKern::CurrentThread();

// this thread is modifying the page’s usage
thePageInfo->SetModifier(currentThread);

while(long_running_operation_needed)
    {
    do_part_of_long_running_operation(thePageInfo);
    if(NKern::FlashSystem() &&
       thePageInfo->CheckModified(currentThread))
        {
        // someone else got the System Lock and modified our page...
        reset_long_running_operation(thePageInfo);
        thePageInfo->SetModifier(currentThread); // re-arm before retrying
        }
    }

NKern::UnlockSystem();

4.1.4 Live Page List


Efficient performance of the paging subsystem depends on choosing a good
algorithm for selecting the pages that are live at any given time and, conversely,
the live pages that should be made dead. The paging system approximates an
LRU algorithm for deciding which pages to page out.
Under the Hood: The Implementation of Demand Paging 29

All live pages are stored on the ‘live page list,’ which is a linked list of SPageInfo
objects, each of which refers to a specific page of physical RAM on the device.
The list is ordered chronologically by time of last access, to enable a least recent-
ly used (LRU) algorithm to be used when discarding paged content.

To keep the pages in chronological order, the kernel needs to detect when they
are accessed. It does this by splitting the live list into two sub-lists, one contain-
ing ‘young’ pages and the other ‘old’ pages. It uses the MMU to make all young
pages accessible, but old pages inaccessible. However, the contents of old
pages are still preserved, and these pages are still considered to be live. When
an old page is next accessed, there is a data abort. The fault handler can then
simply move the old page to the young list and make it accessible again so the
program can continue as normal.

The net effect is of a first-in, first-out list in front of an LRU list, which results in
less page churn than a plain LRU.
This is shown in more detail in Figure 5:

Figure 5: Live Page Lists – Young and Old Pages

When a page is paged in, it is added to the front of the young list, making it the
youngest page in the system, as shown in Figure 6:

Figure 6: Paging-In Page J

The paging system aims to keep the relative sizes of the two lists equal to a value
called the ‘young/old ratio.’ So, if this ratio is R, the number of young pages is Ny,
and the number of old pages is No, then if Ny > R × No, a page is taken from the
back of the young list and placed at the front of the old list. This process is called
‘aging’ and is shown in Figure 7:

Figure 7: Aging Page F

If a program accesses an old page, it causes a page fault, because the MMU has
marked old pages as inaccessible. The fault handler then turns the old page into
a young page (‘rejuvenates’ it), and, at the same time, turns the last young page
into an old page. This is shown in Figure 8.

Figure 8: Rejuvenating Page H (and Aging Page E)

When the kernel needs more RAM (and the free pool is empty), it needs to re-
claim the RAM used by a live page. In this case, the oldest live page is selected
for paging out, turning it into a dead page, as Figure 9 shows.

Figure 9: Paging Out Page I

If paging out leaves the system with too many young pages according to the
young/old ratio, then the kernel would age the last young page on the list (in
Figure 9, that would be Page D).

The net effect of this is that if a page is accessed at least once between
every page fault, it will just cycle around the young list. If it is accessed less often,
relative to the page-fault rate, it will appear in the old list – appearing further and
further back the less often it is accessed.
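The young/old mechanics described above can be modelled in a few lines. The following is an illustrative sketch, not kernel code; the class name and its container choices are invented for the example:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>

// Toy model of the young/old live list (illustrative only, not kernel code).
// Touching an old page "faults", rejuvenates it and may age the last young
// page; the young/old ratio R is maintained whenever the lists change.
class LiveList
    {
public:
    explicit LiveList(std::size_t aRatio) : iRatio(aRatio) {}

    void PageIn(int aPage)
        {
        iYoung.push_front(aPage); // youngest page in the system
        Balance();
        }

    // Access a page; returns true if this was a fault on an old page.
    bool Access(int aPage)
        {
        std::deque<int>::iterator it = std::find(iOld.begin(), iOld.end(), aPage);
        if (it == iOld.end())
            return false;         // young pages are accessed silently
        iOld.erase(it);           // rejuvenate the old page...
        iYoung.push_front(aPage);
        Balance();                // ...and age the last young page if needed
        return true;
        }

    std::size_t YoungCount() const { return iYoung.size(); }
    std::size_t OldCount() const { return iOld.size(); }

private:
    void Balance()
        {
        // Age young pages while Ny > R * No (keep at least one young page).
        while (iYoung.size() > 1 && iYoung.size() > iRatio * iOld.size())
            {
            iOld.push_front(iYoung.back());
            iYoung.pop_back();
            }
        }

    std::size_t iRatio;
    std::deque<int> iYoung; // front = youngest
    std::deque<int> iOld;   // back = oldest, next to be paged out
    };
```

Accessing a young page costs nothing here, just as in the real system the MMU lets such accesses through silently; only accesses to old pages incur the fault-and-rejuvenate work.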

4.1.5 RAM Cache Interface


The kernel manages the free memory in the system via the class MmuBase,
which in turn uses a DRamAllocatorBase object, iRamPageAllocator.
Formerly, this object managed all the unused RAM pages in the system, but with
the advent of demand paging, there may also be free memory available from the
live list.

These two memory pools interact through the class RamCacheBase, which
provides:

• An abstract interface allowing the MmuBase class to make requests of the
paging system. (The class DemandPaging derives from RamCacheBase
and implements this interface.)
• Implementation of functions for the paging system to make requests of the
MmuBase class.

The class definition for RamCacheBase is as follows:

class RamCacheBase
    {
public:
    // Initialisation called during MmuBase:Init2.
    virtual void Init2();
    // Initialisation called from M::DemandPagingInit.
    virtual TInt Init3()=0;
    // Remove RAM pages from cache and return them to the system.
    virtual TBool GetFreePages(TInt aNumPages)=0;
    // Attempt to free-up a contiguous region of pages
    // and return them to the system.
    virtual TBool GetFreeContiguousPages(TInt aNumPages, TInt aAlign)=0;
    // Give a RAM page to the cache system for managing.
    virtual void DonateRamCachePage(SPageInfo* aPageInfo)=0;
    // Attempt to reclaim a RAM page given to
    // the cache with DonateRamCachePage.
    virtual TBool ReclaimRamCachePage(SPageInfo* aPageInfo)=0;
    // Called by MMU class when a page is unmapped from a chunk.
    virtual TBool PageUnmapped(SPageInfo* aPageInfo)=0;
    // Return the maximum number of pages which could be
    // obtained with GetFreePages.
    inline TInt NumberOfFreePages() { return iNumberOfFreePages; }
    // Put a page back on the system’s free pool.
    void ReturnToSystem(SPageInfo* aPageInfo);
    // Get a RAM page from the system’s free pool.
    SPageInfo* GetPageFromSystem();
public:
    MmuBase* iMmu;                    // Copy of MmuBase::TheMmu
    TInt iNumberOfFreePages;          // Number of pages that could
                                      // be freed by GetFreePages()
    static RamCacheBase* TheRamCache; // Pointer to the single
                                      // RamCache object
    };

Memory Allocation
When the kernel wants to allocate memory, it uses
MmuBase::AllocRamPages(), which requests a number of pages from the
RAM allocator object. If there are insufficient pages to meet this request then this
method calls RamCacheBase::GetFreePages(). This is actually implemented
in DemandPaging::GetFreePages().

DemandPaging::GetFreePages() checks to see if the live list has the
required number of spare pages and, if so, selects the oldest pages to return to the
main RAM allocator using RamCacheBase::ReturnToSystem(). The RAM
allocator then completes the original memory-allocation request.

If a memory allocation requires physically contiguous memory, then the
corresponding methods are MmuBase::AllocContiguousRam() and
RamCacheBase::GetFreeContiguousPages().

Dynamic RAM Cache


The unused RAM in the system needs to be managed so that it is available for
allocating to programs as well as being used for demand paging and/or the file-
system cache. This is done using the singleton RamCacheBase object, which
allows the paging system to request memory from the RAM allocator, and vice
versa. This object is created at system boot.

The kernel’s existing memory-allocation functions have been modified so that upon allocation failure they attempt to reclaim RAM from the paging system and
the file-system cache via RamCacheBase. Similarly, the free RAM size APIs have
been updated to account for memory used for the paging system and file-system
cache.

When file-cache memory is unlocked, the kernel calls DonateRamCachePage() to place each selected page on the live list. These
pages are placed at the front of the list, becoming the youngest pages in the
system, and they are given the type EPagedCache. When the memory is locked
again, the kernel calls ReclaimRamCachePage() to remove the pages from the
live list, assuming they are still present. Both of these functions are implemented
in the class DemandPaging.

Whenever memory is decommitted from a chunk by the ArmMmu::UnmapPages() method, the kernel calls the RamCacheBase::PageUnmapped() method implemented in class
MemModelDemandPaging. This checks if the RAM page is currently on the live
list as type EPagedCache. If it is, the page is unmapped from the chunk and
placed on the end of the live list as type EPagedFree. It is now the oldest page
in the system and so will become the first page to be reclaimed when free RAM is
required. We use this mechanism rather than using ReturnToSystem()
because it is faster, and we wish to keep execution time to a minimum as the
system lock is held.
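A minimal sketch of that demotion, again using invented stand-in types rather than the real SPageInfo machinery:

```cpp
#include <list>
#include <algorithm>

enum TToyPageType { EPagedCache, EPagedFree };

struct ToyPage { int iId; TToyPageType iType; };

// front = youngest, back = oldest
std::list<ToyPage> gLiveList;

// Modelled on the PageUnmapped() behaviour described above: if the page
// being decommitted is live cache memory, keep its RAM on the live list
// but demote it to the oldest, free slot so it is reclaimed first.
bool PageUnmapped(int aId)
	{
	auto it = std::find_if(gLiveList.begin(), gLiveList.end(),
	                       [aId](const ToyPage& p){ return p.iId == aId; });
	if(it == gLiveList.end() || it->iType != EPagedCache)
		return false;           // not managed by the paging system
	ToyPage page = *it;
	gLiveList.erase(it);
	page.iType = EPagedFree;    // no longer backs any chunk
	gLiveList.push_back(page);  // oldest -> first candidate to reclaim
	return true;
	}
```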

Allocating Memory for Paging


Whenever the paging system needs RAM to store demand-paged content, it
makes a request to the RAM allocator via
RamCacheBase::GetPageFromSystem(). If there is no memory available, the
oldest page in the live list is reclaimed instead.

Free Memory Accounting


During system start-up, the bootstrap calls the RamCacheBase::Init2()
method, and the DemandPaging class initializes itself by populating the live list
with the configured minimum number of pages. The amount of memory used for
demand paging or file-system caching will never go below this level.

RamCacheBase’s member iNumberOfFreePages is the count of the excess number of in-use pages over this minimum – that is, it is the number of pages
that the kernel can safely reclaim to satisfy memory-allocation requests. These
excess pages are counted as free memory by the system, even though at a par-
ticular time they may be being used to store demand-paged or file-system cache
content.

iNumberOfFreePages is incremented by GetPageFromSystem() and DonateRamCachePage() to count pages added to the paging system, and decremented by ReturnToSystem() and ReclaimRamCachePage() as pages are removed.

4.1.6 Code Paging


The code paging of RAM-loaded executables described in this section was first
supported in Symbian OS v9.3 (although originally intended for Symbian OS
v9.4). XIP ROM paging, which is a subset of this functionality, was also first sup-
ported in Symbian OS v9.3, and is discussed in Section 4.1.7.

DCodeSeg
When an executable binary is loaded into the system, the kernel creates a
DCodeSeg (kern_priv.h) object to represent it. This mainly contains two broad
groups of information:

• The dependency graph (DCodeSeg objects representing binaries that this binary links against)
• The memory used to store the contents of the executable.

During implementation, we realized that complexity could be reduced by moving the second form of information into a separate object, DEpocCodeSegMemory
(plat_priv.h). This insulates the demand paging implementation from lifetime
and locking issues associated with the main DCodeSeg object.

Paging In
When a demand-paged DCodeSeg is first loaded, the contents of the .text sec-
tion of the executable are not present in RAM. All that exists is a reserved region
of virtual address space in the code chunk. This means that when a program ac-
cesses the contents, the MMU will generate a data abort. The exception handler
calls MemModelDemandPaging::HandleFault(), which then has to obtain
RAM, copy the correct contents to it and map it at the correct virtual address.
This is known as ‘paging in.’

Paging in consists of the following steps:

1. Check the MMU page table entry for the address that caused the abort. If
the entry is KPteNotPresentEntry then there is no memory mapped at
this address and it may need paging in.
2. Verify that the exception was caused by an access to the code chunk
memory region.
3. Find the DCodeSeg which is at this address by searching the sorted list
DCodeSeg::CodeSegsByAddress.Find(aFaultAddress).
4. Verify that the DCodeSeg is one that is being demand paged.
5. Call MemModelDemandPaging::PageIn(), which then performs the
following steps:
6. Obtain a DemandPaging::DPagingRequest (demand_paging.h)
object by using DemandPaging::AcquireRequestObject().
7. Obtain a physical page of RAM using
DemandPaging::AllocateNewPage().
8. Map this RAM at the temporary location
DemandPaging::DPagingRequest::iLoadAddr.
9. Read the correct contents into the RAM page by calling
DemandPaging::ReadCodePage().
10. Initialize the SPageInfo structure for the physical page of RAM, marking
it as type EPagedCode.
11. Map the page at the correct address in the current process.
12. Add the SPageInfo to the beginning of the live page list. This marks it as
the youngest (most recently used) page.
13. Return, and allow the program that caused the exception to continue to
execute.
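The early checks (steps 1 and 2) and the overall shape of the handler can be illustrated with this simplified sketch; the code chunk base address and all names are invented for the example:

```cpp
#include <map>
#include <cstdint>

// Toy address-space model with 4 KB page granularity.
const std::uint32_t KPageSize = 0x1000;
const std::uint32_t KCodeBase = 0x70000000;  // illustrative code chunk base
const std::uint32_t KCodeSize = 0x00100000;

std::map<std::uint32_t, bool> gMapped;  // page base -> present

// Sketch of the page-in path: decide whether a faulting address is a
// candidate for paging in, then "load" and map the page.
bool HandleFault(std::uint32_t aFaultAddress)
	{
	std::uint32_t page = aFaultAddress & ~(KPageSize - 1);
	if(gMapped.count(page))
		return false;                        // already mapped (step 1)
	if(page < KCodeBase || page >= KCodeBase + KCodeSize)
		return false;                        // not in the code chunk (step 2)
	// Steps 3-12 would locate the DCodeSeg, read and decompress the page
	// contents, and link the new page into the live list. Here we just
	// record the mapping.
	gMapped[page] = true;
	return true;                             // step 13: retry the access
	}
```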

In the multiple memory model, there are separate MMU mappings for each pro-
cess into which a DCodeSeg is loaded – but step 11 in the previous sequence
updates the mapping only for the process that caused the page in to occur. This
is a deliberate decision because:

• It avoids the overhead of updating a large number of MMU mappings, possibly unnecessarily.
• The pseudo-LRU algorithm used by the demand-paging implementation
is improved if accesses to a page by other processes generate their own
page faults.

To avoid having duplicate copies of the same DCodeSeg page, the mul-
tiple memory model keeps a list of pages which a DCodeSeg has paged
in (DMemModelCodeSegMemory::iPages). Then, before calling
MemModelDemandPaging::PageIn() in step 5 of the previous sequence, it
checks to see if there is already a page loaded and if so, simply maps that page
into the current process and ends.

Aging a Page
The pseudo-LRU algorithm used by demand paging means that as pages
work their way down the live list, they eventually reach a point where they
change from young to old. At this point, the kernel changes the MMU map-
pings for the page to make it inaccessible. It does this using the method
MemModelDemandPaging::SetOld() (xmmu.cpp).

In the moving memory model, SetOld() simply has to find the single page
table entry (MMU mapping) for the page in the user code chunk, and clear the
bits KPtePresentMask. In the multiple memory model, there can be many
page table entries that need updating. In this case, the kernel calls the method
DoSetCodeOld() (xmmu.cpp) to actually do the work. DoSetCodeOld()
examines the bit array DMemModelCodeSegMemory::iOsAsids to determine
the processes into which the DCodeSeg is loaded, and then updates each map-
ping in turn. Because the system lock must be held, this can affect the real-time
performance of the system, and so the technique described in Section 4.1.3.5
is used. If the page’s status changes before DoSetCodeOld() has modified the mappings in all processes, then DoSetCodeOld() simply ends. This is the right thing to do, because the page’s status can change in one of two ways: either by the page being rejuvenated (which I’ll discuss next) or by the page being
removed from the live list (ceasing to be demand paged). Both of these events
logically supersede any page-aging operation.
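The aging boundary can be modelled as a bounded 'young' queue feeding an 'old' queue. This sketch shows only the pseudo-LRU bookkeeping, with toy types; the real kernel also rewrites the page table entries when a page ages:

```cpp
#include <deque>

// Toy two-part live list: pages age from the young list into the old
// list, at which point their mappings are made inaccessible.
struct ToyLiveList
	{
	std::deque<int> iYoung;   // accessible pages, front = youngest
	std::deque<int> iOld;     // inaccessible (aged) pages
	unsigned iYoungMax;       // maximum young pages before aging starts

	void MakeYoungest(int aId)
		{
		iYoung.push_front(aId);
		while(iYoung.size() > iYoungMax)
			{
			// Corresponds to SetOld(): the page table entries for this
			// page are made inaccessible so the next access faults.
			iOld.push_front(iYoung.back());
			iYoung.pop_back();
			}
		}
	};
```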

Rejuvenating a Page
When a program accesses an old page, it generates a data abort because the
MMU has marked these pages as inaccessible. The fault handler (MemModelDe
mandPaging::HandleFault) deals with this using the following sequence of
actions.

1. Get the MMU page table entry for the address that caused the abort. If
the bits KPtePresentMask are clear, then this is an old page that needs
rejuvenating. (If all bits are clear, then the page needs paging in instead.)
2. Find the SPageInfo for the page, using the physical address stored in
the page table entry.
3. If the state of the page is EStatePagedDead, then change the page table
entry to KPteNotPresentEntry and proceed to the paging-in operation
(described in Section 4.1.6.2) instead of rejuvenating it. This is because
a dead page is in the process of being removed from the live list and it
should be treated as though it were not present.
4. Otherwise update the page table entry to make the page accessible again.
5. Move the page’s SPageInfo to the beginning of the live page list. This
marks it as the youngest page in the system.

Similarly to paging in, we only update the page table entry for the current
process. The whole rejuvenation operation is performed with the system lock
held.

Freeing a Page
When a physical page of RAM that holds demand-paged code is needed for
other purposes, it must be freed by calling
MemModelDemandPaging::SetFree(). This mainly involves setting all page
table entries that refer to the page to KPteNotPresentEntry.

However, again the multiple memory model needs to update many page table
entries, so Symbian factored this implementation out into a separate method
called DoSetCodeFree(). Unlike the rejuvenation code, this method does not
need to pay attention to whether pages are changed while it is processing them.
This is because all pages that are being freed have their state set
to SPageInfo::EStatePagedDead first. This prevents other parts of the
demand-paging implementation from changing the page.

The free operation is performed with the system lock held. However, as is the case for the aging code, DoSetCodeFree() flashes the system lock while freeing, and this makes it possible for the code segment to be unloaded and its DMemModelCodeSegMemory destroyed while these data structures are being used. To prevent this situation, SetFree() must be called with the RAM-allocator
mutex held. As the destructor for DMemModelCodeSegMemory also acquires
this mutex during its operation, it cannot complete while one of its (former) RAM
pages is in the process of being freed.

4.1.7 ROM Paging


The kernel demand pages the contents of ROM in a similar way to code pag-
ing, but ROM has many attributes that make the implementation much simpler,
namely:

• The virtual address and the size of the ROM is fixed and known at boot
time. So it is a trivial matter to determine whether a particular memory
access occurred in demand-paged memory or not.
• The ROM cannot be unloaded, so the kernel does not need to guard
against as many race conditions.
• The ROM is globally mapped by the MMU, so even in the multiple memory
model there is only a single MMU mapping that needs updating when the
demand-paging subsystem manipulates pages of RAM.

ROM paging was first supported in Symbian OS v9.3.

ROM Format
When the ROMBUILD tool generates ROM images for demand paging, it
divides the contents of the ROM into two sections. All unpaged content is
placed in the first section, with paged content following it. The size of the two
sections is stored in the ROM header – the unpaged part is stored at ROM
offset zero through TRomHeader::iPageableRomStart and the paged
part is stored at ROM offset TRomHeader::iPageableRomStart through
TRomHeader::iUncompressedSize.

For each 4 KB page of ROM, there is an entry (SRomPageInfo) in the array TRomHeader::iRomPageIndex giving information about the storage location
and compression used for that page’s contents. This index is stored in the
unpaged part of ROM.

If a ROM image is not to be demand paged, iPageableRomStart and iRomPageIndex are both zero.

Initialization
The first, unpaged, part of ROM is loaded into RAM by the system’s boot/core
loader before any Symbian code is executed. Then, during kernel start-up,
MemModelDemandPaging::Init3() (xmmu.cpp) initializes ROM paging. This
method checks the ROM header information and allocates MMU page tables for
the virtual memory region to which the ROM will be mapped.

Paging In a ROM Page


When paging in ROM, it is easy to find out which page is needed by per-
forming pointer arithmetic on the virtual address being accessed. The stor-
age information for this page can then be obtained from the array stored at
TRomHeader::iRomPageIndex. Each entry in this array has this structure:
// e32rom.h
struct SRomPageInfo
	{
	enum TAttributes
		{
		EPageable = 1<<0
		};
	enum TCompression
		{
		ENoCompression,
		EBytePair,
		};

	TUint32 iDataStart;
	TUint16 iDataSize;
	TUint8 iCompressionType;
	TUint8 iPagingAttributes;
	};

iDataStart gives the offset from the start of the ROM for the page, and
iDataSize gives the number of bytes actually used.

iCompressionType indicates how the data is compressed. This can be either ENoCompression, in which case iDataSize is always 0x1000 (4 KB), or it may
be EBytePair, in which case iDataSize indicates the size of the compressed
data.

iPagingAttributes either contains EPageable to indicate the page is demand paged, or zero if not. (There are entries in iRomPageIndex for the un-
paged part of the ROM, even though the paging system doesn’t access them.
This is to allow them to be used by the boot/core loader.)

As the ROM image is always stored in its own partition, starting at storage block
zero, those attributes are all that is required to locate, read and decompress the
data for a ROM page.
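Putting those attributes together, locating a ROM page's stored data might look like this sketch. The types here merely mirror the SRomPageInfo fields above; the index arithmetic and EPageable test follow the description in the text, but this is not the kernel implementation:

```cpp
#include <cstdint>
#include <vector>

// Mirror of the SRomPageInfo layout described above (sketch only).
struct TToyRomPageInfo
	{
	std::uint32_t iDataStart;       // offset of page data from start of ROM
	std::uint16_t iDataSize;        // bytes actually stored
	std::uint8_t  iCompressionType; // 0 = none, 1 = byte-pair
	std::uint8_t  iPagingAttributes;
	};

const std::uint32_t KToyPageSize = 0x1000;

// Given the faulting ROM offset, find where on the media the page's
// (possibly compressed) data lives. Returns false for unpaged content.
bool LocateRomPage(const std::vector<TToyRomPageInfo>& aIndex,
                   std::uint32_t aRomOffset,
                   std::uint32_t& aDataStart, std::uint32_t& aDataSize)
	{
	std::uint32_t page = aRomOffset / KToyPageSize;
	if(page >= aIndex.size() || !(aIndex[page].iPagingAttributes & 1))
		return false;               // EPageable not set: unpaged content
	aDataStart = aIndex[page].iDataStart;
	aDataSize = aIndex[page].iDataSize;
	return true;
	}
```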

4.1.8 Blockmap Data Structures


When a thread takes a fault in paged code, the kernel must load the data cor-
responding to that page from the relevant file on the media drive. Because the
kernel doesn’t use the file server for this, there needs to be a way for it to work
out which parts of the media to read.

This calculation is complicated by several factors. The data may or may not be
compressed, so its size is not necessarily the same as the size of the page. The
data can start at any offset into the file, because the file contains a small header
and any number of previous (possibly compressed) pages. And finally, the file
itself may be split over multiple, discontinuous sectors on the media.

What is needed is a representation of how the file is laid out on the media. We
call this abstraction a ‘blockmap,’ because it is structured in terms of blocks that
roughly correspond to sectors on the media. It defines a logical-to-physical mapping for some portion of a file.

There are two types of blockmap used in the demand paging system: the user-
side blockmap and the kernel blockmap.

User-Side Blockmap
The file system provides the user-side blockmap to the loader, which in turn
passes it to the kernel. The user-side blockmap is defined by the
SBlockMapInfoBase and TBlockMapEntryBase classes. It consists of a
single context structure SBlockMapInfoBase that contains information relating
to the blockmap as a whole, and a series of TBlockMapEntryBase structures
describing the file layout. These structures are defined in e32ldr.h.

// e32ldr.h
struct SBlockMapInfoBase
	{
	TUint iBlockGranularity;
	TUint iBlockStartOffset;
	TInt64 iStartBlockAddress;
	TInt iLocalDriveNumber;
	};

class TBlockMapEntryBase
	{
public:
	TUint iNumberOfBlocks;
	TUint iStartBlock;
	};

The user-side blockmap describes a series of contiguous runs of blocks by one or more TBlockMapEntryBase structures. Each one holds the number of
blocks in the run (iNumberOfBlocks) and the index of the first block in the run
(iStartBlock).

The SBlockMapInfoBase structure defines context that applies to the whole blockmap. The size of each block in bytes is given by iBlockGranularity.
This will typically be the sector size for the media. The address of the first block
on the partition is held in iStartBlockAddress – this is a byte offset from the
start of the partition. This is necessary because some file systems (for example,
FAT) have data before the first sector, and the size of this data may not be a mul-
tiple of the block size.

The blockmap does not have to start at offset zero in a file; even if it does, the
first byte of the file may not lie on a block boundary, so iBlockStartOffset is
the offset of the first byte represented by the blockmap from the first block given.

Finally, iLocalDriveNumber indicates the media the file is stored on. It is a local drive number (rather than the more widely used drive numbers that corre-
spond to letters of the alphabet). This is because media drivers only operate in
terms of the underlying local drive number.
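As a sketch, translating a logical file offset to a byte address on the partition walks the runs using the fields described above. The types and helper below are invented for illustration; the real loader-side structures carry more context:

```cpp
#include <cstdint>
#include <vector>

// Sketch of the user-side blockmap structures described above.
struct ToyBlockMapInfo
	{
	std::uint32_t iBlockGranularity;  // block size in bytes
	std::uint32_t iBlockStartOffset;  // first file byte's offset into first block
	std::int64_t  iStartBlockAddress; // byte offset of block 0 from partition start
	};

struct ToyBlockMapEntry
	{
	std::uint32_t iNumberOfBlocks;
	std::uint32_t iStartBlock;
	};

// Translate a logical file offset to a byte address on the partition by
// walking the runs of contiguous blocks. Returns -1 if out of range.
std::int64_t MediaAddress(const ToyBlockMapInfo& aInfo,
                          const std::vector<ToyBlockMapEntry>& aRuns,
                          std::uint32_t aFileOffset)
	{
	std::uint32_t pos = aInfo.iBlockStartOffset + aFileOffset;
	for(const ToyBlockMapEntry& run : aRuns)
		{
		std::uint32_t runBytes = run.iNumberOfBlocks * aInfo.iBlockGranularity;
		if(pos < runBytes)
			return aInfo.iStartBlockAddress
			       + (std::int64_t)run.iStartBlock * aInfo.iBlockGranularity
			       + pos;
		pos -= runBytes;
		}
	return -1;
	}
```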

The Kernel Blockmap


The kernel blockmap is defined by the TBlockMap class in kblockmap.h. The
data members are shown as follows:

class TBlockMap
	{
	struct SExtent
		{
		TInt iDataOffset;
		TUint iBlockNumber;
		};
	TInt iDataLength;
	TInt iExtentCount;
	SExtent* iExtents;
	// ...
	};
While the user-side blockmap is stored as a list of runs of blocks starting at a
particular block, the kernel blockmap is stored as a list of logical file offsets that
start at a particular block. The list is ordered, so the length of a run can be found
by calculating the difference between successive file offsets in the list.

The kernel creates the kernel blockmap from the user-side blockmap so that it
can look up physical media locations efficiently, finding the block corresponding
to a particular file offset via a binary search.
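That binary search can be sketched as follows, with illustrative types and the read-unit size passed in explicitly (the real TBlockMap keeps it as state):

```cpp
#include <vector>
#include <algorithm>
#include <cstdint>

// Sketch of the kernel-side extent list: each entry records the logical
// file offset at which a contiguous run of blocks begins.
struct ToySExtent
	{
	std::int32_t iDataOffset;   // logical file offset where this run starts
	std::uint32_t iBlockNumber; // first block (in read units) of the run
	};

// Find the block containing aFileOffset with a binary search over the
// ordered extent list, as the kernel blockmap does. Assumes a non-empty
// list whose first extent starts at or below aFileOffset.
std::uint32_t FindBlock(const std::vector<ToySExtent>& aExtents,
                        std::int32_t aFileOffset,
                        std::uint32_t aReadUnitSize)
	{
	// Last extent whose iDataOffset <= aFileOffset:
	auto it = std::upper_bound(aExtents.begin(), aExtents.end(), aFileOffset,
		[](std::int32_t off, const ToySExtent& e){ return off < e.iDataOffset; });
	--it;
	return it->iBlockNumber + (aFileOffset - it->iDataOffset) / aReadUnitSize;
	}
```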

The kernel blockmap may have a different block size to its user-side equivalent,
because the kernel needs to communicate with media drivers in terms of read
units, which may be different to the sector size of the media. The read unit size is
usually 512 bytes.

The kernel blockmap provides a method that reads part of a file into memory. It
takes a logical file offset and a size describing the area of the file to read, a buffer,
and a function that performs the actual task of reading blocks from the media.

4.1.9 Code Segment Initialization


A code segment is a kernel data structure representing a unit of executable code,
whether it is XIP or RAM loaded, or an executable, a DLL or a device driver.
Code segments can be mapped into individual processes or they can be mapped
globally, so that they appear in the address space of all processes. Code seg-
ments can be mapped into more than one process at a time, so that (in general)
when multiple processes load the same library, only one code segment will be
used.

In this section, I will discuss the use of code segments for code-paged executa-
bles. All references to code segments will refer to non-global RAM-loaded code
segments.

When a new executable or DLL is loaded, a code segment object is created and
initialized. This is a three-stage process. First the loader calls the kernel to create
the code segment, then it loads the code into memory and fixes it up, and finally it
calls the kernel a second time to complete initialization and indicate that the code
can now be mapped into processes ready for execution.

For a demand-paged code segment, the procedure is similar, except that the
loader does not actually load the code itself. Instead, it provides the necessary
information to the kernel, which performs load and fix-up on demand. The three
stages are still necessary because the loader needs to access the code itself to
generate the code-relocation table. I describe these stages in detail in the next
section.


44 Demand Paging on Symbian

Code Segment Creation


The loader asks the kernel to create the code segment by calling
E32Loader::CodeSegCreate(). This results in a call to
DCodeSeg::Create(), which in turn calls DEpocCodeSeg::DoCreate() and
ultimately DMemModelCodeSeg::DoCreateRAM().

E32Loader::CodeSegCreate() passes a TCodeSegCreateInfo structure, defined in e32ldr.h, to the kernel. This is also passed to
E32Loader::CodeSegLoaded(), which is called in the third stage. The data
members relevant to code paging are:

class TCodeSegCreateInfo
	{
	TUint32* iCodeRelocTable;
	TInt iCodeRelocTableSize;
	TUint32* iImportFixupTable;
	TInt iImportFixupTableSize;
	TUint32 iCodeDelta;
	TUint32 iDataDelta;
	TBool iUseCodePaging;
	TUint32 iCompressionType;
	TInt32* iCodePageOffsets;
	TInt iCodeStartInFile;
	TInt iCodeLengthInFile;
	SBlockMapInfoBase iCodeBlockMapCommon;
	TBlockMapEntryBase* iCodeBlockMapEntries;
	TInt iCodeBlockMapEntriesSize;
	RFileClamp iFileClamp;
	};
The location of the executable file on the media is given by a user-side blockmap,
which is made up of iCodeBlockMapCommon, iCodeBlockMapEntries and
iCodeBlockMapEntriesSize. The position of the text section in the execut-
able file is given by iCodeStartInFile and iCodeLengthInFile.

If the text section is compressed, iCompressionType is set to the unique identifier (UID) of the compression type. Otherwise, it is set to
KFormatNotCompressed. Symbian OS v9.4 supports only byte-pair compres-
sion (KUidCompressionBytePair). Individual pages of code are compressed
independently, and iCodePageOffsets is used to pass a look-up table of the offsets from the start of the text section to the compressed data for each page.
The length of this table is calculated from the size of the code.
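A sketch of how such an offsets table could be consulted: the compressed extent of page i is bounded by the offsets of page i and page i+1. The terminating end offset after the last page is an assumption of this example, not a documented detail of the real table:

```cpp
#include <vector>
#include <cstdint>
#include <cstddef>

// Illustrative helper: given a per-page offsets table (one entry per
// 4 KB code page, plus an assumed final end-of-data offset), compute
// where page aPage's compressed data starts and how long it is.
bool CompressedPageExtent(const std::vector<std::int32_t>& aOffsets,
                          std::size_t aPage,
                          std::int32_t& aStart, std::int32_t& aSize)
	{
	if(aPage + 1 >= aOffsets.size())
		return false;   // past the last page described by the table
	aStart = aOffsets[aPage];
	aSize = aOffsets[aPage + 1] - aOffsets[aPage];
	return true;
	}
```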

A table of code-relocation information is passed in iCodeRelocTable, its size indicated by iCodeRelocTableSize. Similarly, import fix-up
information is passed in iImportFixupTable and iImportFixupTableSize.
iCodeDelta and iDataDelta contain the offsets to be added to code and data
relocations. These are not valid during the initial call to CodeSegCreate, but are
filled in later when CodeSegLoaded is called.

The loader passes iFileClamp to stop the file being deleted while it is being
used for paging, and sets the flag iUseCodePaging to tell the kernel to page
this code segment.

When the kernel creates a non-paged code segment, its main task is to allocate
RAM for the code and map it into the fileserver process (in which the loader
thread runs). This is done using conventional means – in the moving memory
model, the kernel commits memory to the global code chunk, and in the multiple
memory model, it allocates physical RAM pages then maps them into the fileserv-
er process’s code chunk.

For a demand-paged code segment, the kernel must allocate not physical RAM
but address space in the appropriate chunk. This is done using a new commit
type ‘ECommitVirtual.’ This marks part of the chunk’s address space as used,
so that nothing else may be committed there.

For non-paged code segments, the loader would load any static data directly
following the code in memory. This is not possible with code paging, because the
kernel loads code on demand but the loader loads the data. So the kernel allo-
cates memory, starting at the page after the end of the code, ready for the loader
to load the data into.

As well as committing address space for the code, the kernel must initialize
its internal data structures and read all the relevant parts of TCodeSegCre-
ateInfo. It copies the code page offsets table to the kernel heap, and creates
a kernel blockmap from the user-side version passed. Finally, the code segment
is marked as paged – from now on any read access will cause code to be loaded
from media and decompressed, although no relocation or fix up will be performed
yet.

Loading of Code and Data


For a non-paged code segment, this is when the code itself is loaded. The loader
reads text and data sections and decompresses them into the RAM allocated by
the kernel in the previous stage. It relocates the code, loads dependencies and
fixes up imports.

For a paged code segment, the loader does not need to load the code, because
the kernel handles this later, when a fault is taken. This means that the kernel
must do the relocations and fix-ups, so the loader must now generate the data
the kernel will need to perform these operations.

The loader does load static data now, to the address allocated by the kernel,
and then loads dependencies in the usual way. The loader then calls back to the
kernel, passing the relocation and fix-up tables in the TCodeSegCreateInfo
structure.

Finalization of Loaded Code Segment


The loader calls E32Loader::CodeSegLoaded(), which results in a call to the
kernel method DMemModelCodeSeg::Loaded().

For a non-paged code segment, all that the kernel has to do at this stage is the
necessary cache maintenance to ensure that the instruction cache is up-to-date
with the contents of RAM. It also unmaps the code segment from the fileserver
process.

For a paged code segment, the kernel reads the relocation and import tables and
stores them on the kernel heap, as it does the initial static data and the export
table (part of the text section). It then frees the memory allocated for the loader
to write the static data into. Any page that has already been loaded (because the
loader accessed it when compiling the relocation table) is fixed up. Cache main-
tenance happens as before, and the code segment is unmapped from the file
server.

4.1.10 Page Locking


In Symbian OS v9.3, kernel-side code must, at times, be able to access memory
without the risk of generating paging faults – for example, media drivers used
by the paging system must not cause faults when they are used for normal file-
system access, otherwise deadlock would result. (The driver would be called to
read the demand-paged data it was trying to access itself!) For situations such as
this, the kernel provides the DDemandPagingLock class for temporarily locking
(pinning) demand-paged contents so they will not be paged out.

class DDemandPagingLock : public DBase
	{
public:
	// Reserve memory so that this object can be
	// used for locking up to aSize bytes.
	IMPORT_C TInt Alloc(TInt aSize);

	// Perform Unlock(), then free the memory reserved by Alloc().
	IMPORT_C void Free();

	// Ensure all pages in the given region are present and lock
	// them so that they will not be paged out. If the region
	// contained no demand paged memory, then no action is performed.
	// This function may not be called again until the previous
	// memory has been unlocked.
	IMPORT_C TBool Lock(DThread* aThread, TLinAddr aStart, TInt aSize);

	// Unlock any memory region which was previously locked with Lock().
	inline void Unlock();

	IMPORT_C DDemandPagingLock();
	inline ~DDemandPagingLock() { Free(); }
	};

Initialization (Reserving Memory)


When kernel-side code locks demand-paged memory, there must be sufficient
free RAM into which to load the contents of that memory. But, usually, in situations in which we wish to lock memory, the operation must not fail due to out-of-
memory conditions. Because of this, the DDemandPagingLock object provides
the Alloc() method which reserves memory for later use.

Any code that needs to lock pages should create a DDemandPagingLock object
and allocate memory during its initialization phase. It can later lock this memory
without risk of failing.

To avoid wasting reserved memory when it is not being used, the kernel does the reservation by increasing the minimum size of the live list (iMinimumPageCount) by the number of reserved pages (iReservePageCount). The kernel calls DemandPaging::ReserveAlloc() to do this. After this, because the live list is now larger than it would otherwise have been, more demand-paged and file-cache content can reside in RAM, so the memory reserved for locking is being put to good use until it is needed.

So, at any one time, a page is in one of the following states:



• young
• old
• transitioning (dead)
• locked

Locking Memory
To lock memory ready for safe access, kernel-side code calls
DemandPaging::LockRegion(), specifying a region of virtual memory in
a given thread’s process. This method first checks if the region can contain
demand-paged memory, and immediately returns false if it can’t. This provides
very fast execution for typical use cases. (The only memory that can be demand
paged is code and constant data in executable images, and it is very unlikely that
an application would ask a device driver to operate on this sort of data.)

If the memory region to be locked does reside in a pageable area – that is, it is a
ROM or code chunk – then the method goes on to repeatedly call
LockPage() for each page in the region.

LockPage() calls EnsurePagePresent() to page the memory in (if it is not already present), then examines the SPageInfo for the page to determine its
type. If the page is pageable and on the live list, then it is removed from the live
list and its state is changed to EStatePagedLocked with a lock count of one. If
the page was already locked, then the lock count is simply incremented.

Because locked pages are not on the live list, they will not be selected for paging
out.
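The lock-count bookkeeping can be sketched like this, with toy global state standing in for SPageInfo and the live list:

```cpp
#include <map>
#include <list>

enum TToyState { EStateLive, EStatePagedLocked };

struct ToyPageState { TToyState iState; int iLockCount; };

std::list<int> gToyLiveList;             // pages eligible for paging out
std::map<int, ToyPageState> gToyPages;

// Sketch of LockPage(): remove a live page from the live list and track
// a lock count so that nested locks are handled.
void LockPage(int aId)
	{
	ToyPageState& page = gToyPages[aId];
	if(page.iState == EStatePagedLocked)
		{
		++page.iLockCount;        // already locked: just bump the count
		return;
		}
	gToyLiveList.remove(aId);     // no longer a candidate for paging out
	page.iState = EStatePagedLocked;
	page.iLockCount = 1;
	}

void UnlockPage(int aId)
	{
	ToyPageState& page = gToyPages[aId];
	if(--page.iLockCount == 0)
		{
		page.iState = EStateLive;
		gToyLiveList.push_front(aId);  // back on the live list as youngest
		}
	}
```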

Locking Large Regions


When locking memory, the system must reserve it ahead of time. This means
the kernel-side code doing the locking must know the maximum amount of RAM
that it will need. But in practice, this size is often not known, is unbounded or is
deemed to be too large to reserve in advance. In situations such as these, op-
erations on memory must be broken down into smaller fragments, so that only a
small region of memory is locked at any one time.

Symbian provides a way to optimize this fragmentation:
DDemandPagingLock::Lock() returns a truth value indicating whether the
memory is in a pageable region. This means that in the typical case, where this
is false, no fragmentation is necessary. The following code example demon-
strates this, but omits error checking for brevity.

const TInt KMaxFragmentSize = 0x8000;

// one time initialisation...
DDemandPagingLock* iPagingLock = new DDemandPagingLock;
iPagingLock->Alloc(KMaxFragmentSize);

// example function which locks memory in fragments...
void DoOperation(TUint8* buffer, TInt size)
	{
	while(size)
		{
		TUint8* fragmentStart = buffer;
		do
			{
			TInt lockSize = Min(size, KMaxFragmentSize);
			TBool locked = iPagingLock->Lock(buffer, lockSize);
			// expand fragment...
			buffer += lockSize;
			size -= lockSize;
			if(locked)
				break; // lock held, so now process fragment
			} while(size);

		// process fragment...
		TInt fragmentSize = buffer - fragmentStart;
		DoOperationOnFragment(fragmentStart, fragmentSize);
		iPagingLock->Unlock();
		}
	}

Note that although the previous code makes use of the DDemandPagingLock
API, as it should, the actual work is done by the DemandPaging object.

4.1.11 Chunk APIs


The kernel makes memory available to applications via chunk objects (class
RChunk). These represent a region of virtual address space, reserved upon
creation, into which physical memory may be committed or de-committed. The
granularity of this allocation is governed by the MMU page size – this is 4 KB on
all current CPU architectures. Various changes have been made to the RChunk
API in support of demand paging.

Virtual Memory Commit


New RChunk APIs have been added to allow virtual memory to be committed and
de-committed. This reserves address space in the chunk, but does not allocate
physical RAM or change any memory mappings. The paging subsystem uses
these APIs to reserve address space for code that will be paged on use.

A new commit type ECommitVirtual has been added, and the
DMemModelChunk::Decommit() method has been expanded to take a
TDecommitType argument, which allows the caller to specify EDecommitVirtual.

These new APIs are present in non-demand-paged systems, but they are never
called.

Additions to Disconnected Chunk API


The existing API for disconnected chunks is briefly summarized as follows:

RChunk::CreateDisconnectedLocal(TInt aInitialBottom, TInt aInitialTop,
                                TInt aMaxSize, TOwnerType aType);

This API creates a chunk that is local (that is, private) to the process creating it.
The size of the reserved virtual address space is given by aMaxSize. Memory
may be committed at creation by setting appropriate values for
aInitialBottom and aInitialTop.

RChunk::Commit(TInt aOffset, TInt aSize);

Commit (allocate) aSize bytes of memory at position aOffset within the chunk.

RChunk::Decommit(TInt aOffset, TInt aSize);

Decommit (free) aSize bytes of memory at position aOffset within the chunk.

New Lock/Unlock APIs


In the past, physical RAM could be in one of two states – committed and owned
by a thread or process, or uncommitted and owned by the kernel. To support file
caching, Symbian has introduced a new, intermediate, state – ‘unlocked.’ The
kernel may de-commit this memory whenever it needs to, but a thread may re-
lock the memory, which returns it to the committed state with its contents pre-
served.

Two new methods have been added for disconnected chunks:

RChunk::Unlock(TInt aOffset, TInt aSize);

This method places aSize bytes of memory at position aOffset within the
chunk in the unlocked state. While the memory is in this state, the system is
free to reclaim it for other purposes, so its owner must no longer access it. Both
aSize and aOffset should be a multiple of the MMU page size.

RChunk::Lock(TInt aOffset, TInt aSize);

This method reverses the operation of Unlock() for aSize bytes of memory at
position aOffset. It returns the RAM to the fully committed state, which means
its owner can access it once more, and its contents are exactly as they were
when Unlock() was originally called. Both aSize and aOffset should be a
multiple of the MMU page size.

However, if in the interim the system has reclaimed any RAM in the region, then
the method fails with KErrNotFound and the region remains in the unlocked
state.

RChunk::Commit(TInt aOffset, TInt aSize);

The existing Commit() function has been modified to operate on a region of the
chunk that has been unlocked. It performs the lock operation on all pages that
have not been reclaimed by the system and allocates new pages where the pre-
vious ones have been reclaimed. After this operation, the contents of the region
are undefined.

This change was made to optimize situations in which a virtual address region
is to be reused for new cache contents. It avoids the extra overheads involved
in separate Decommit() and Commit() operations, which could be significant
because any new RAM allocated to a chunk must be wiped clean first, to avoid
security issues with data leakage between processes.

RChunk::Decommit(TInt aOffset, TInt aSize);

The existing Decommit() function has been modified so that it also returns any
unlocked memory in the region back to the system.

4.1.12 File-System Cache Support


Support for file-system caching first appears in Symbian OS v9.4, and is built
upon the disconnected chunk, which allows physical RAM to be committed and
de-committed anywhere within the virtual address space of the chunk, with no
requirement that it is contiguous. Most of the changes to the disconnected chunk
mentioned earlier were made in support of file-system caching.

File-System Cache in Use


Here is a brief overview of the interactions the file server makes with the chunk
API to implement caching:

Initial file read request

1. File read request received by file server.
2. File data determined not to be in cache.
3. Cache buffers allocated using RChunk::Commit().
4. File system reads data from media into cache.
5. Some data copied from cache to client and read request completed.
6. Cache buffer is unlocked with RChunk::Unlock().

At the end of this operation, the pages used for this cache are placed at the front
of the live list because they are now the youngest pages in the system. As de-
mand-paging or other file-system caching operations occur, the cache pages will
grow older and move down the live list.

Subsequent file read request (cache hit)

1. File read request received by file server.
2. File data determined to be in cache.
3. Cache buffer locked with RChunk::Lock(). This removes the buffer’s
pages from the live list.
4. Some data copied from cache to client and read request completed.
5. Cache buffer unlocked again with RChunk::Unlock().

At the end of this operation, the pages used for this cache are back on the live
list as the youngest pages in the system. The cached data for the file will stay
in RAM for as long as it is accessed sufficiently often to remain in the live list. It
will only be lost from the live list if there is no free memory in the RAM allocator
and all the other memory used for paging and caching has been more recently
accessed than it has (that is, when these cache pages have become the oldest
pages in the system).

Subsequent file read request (cache miss)


If the cache pages are removed from the live list, then the file-system cache will
behave like this:

1. File read request is received by file server.
2. File data determined to be in cache.
3. Cache buffer locked with RChunk::Lock(). This returns an error
because the cache pages have been reclaimed by the kernel.
4. Cache buffer re-allocated with RChunk::Commit().
5. File system reads data from media into cache.
6. Some data copied from cache to client and read request is completed.
7. Cache buffer unlocked with RChunk::Unlock().

File-system caching uses a ‘backing off’ algorithm, so that when cache contents
are lost, the size of the memory being used for caching is reduced, reducing the
likelihood of future cache misses.
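The three flows can be condensed into a sketch. This is a standalone model in standard C++ with invented names (TCacheModel, SystemReclaim() and so on); it mimics the semantics of RChunk::Commit(), Lock() and Unlock() described earlier rather than calling the real API.

```cpp
#include <cassert>

// Sketch of one cache page driven through the flows above.
// States mirror the chunk API: committed (owner may access),
// unlocked (contents preserved but reclaimable), decommitted.
enum class TCachePage { EDecommitted, ECommitted, EUnlocked };

struct TCacheModel
    {
    TCachePage iPage = TCachePage::EDecommitted;
    bool iHasData = false;          // does the page hold valid file data?

    void Commit()                   // RChunk::Commit(): (re)allocate the page
        { iPage = TCachePage::ECommitted; iHasData = false; }
    void Unlock()                   // RChunk::Unlock(): make page reclaimable
        { iPage = TCachePage::EUnlocked; }
    bool Lock()                     // RChunk::Lock(): fails if page reclaimed
        {
        if (iPage != TCachePage::EUnlocked)
            return false;           // models KErrNotFound
        iPage = TCachePage::ECommitted;
        return true;                // contents survived intact
        }
    void SystemReclaim()            // kernel takes the unlocked RAM back
        {
        if (iPage == TCachePage::EUnlocked)
            { iPage = TCachePage::EDecommitted; iHasData = false; }
        }

    // A file-server read against this one-page cache:
    bool Read()                     // returns true on a cache hit
        {
        bool hit = iHasData && Lock();
        if (!hit)
            {
            Commit();               // (re)allocate and read from media
            iHasData = true;
            }
        // ...copy data to client...
        Unlock();                   // page goes back on the live list
        return hit;
        }
    };
```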

4.1.13 Byte-Pair Compression


The default ‘deflate’ compression algorithm used in Symbian only allows com-
pressed data to be decompressed in one whole block. This is fine when decom-
pressing a complete XIP ROM image or a whole executable, but it is not accept-
able for demand paging where only a single page of an image/executable needs
to be decompressed during a page-in event.

To support demand paging, Symbian introduced the ‘byte-pair’ compression
algorithm in Symbian OS v9.3, which allows data to be compressed and decom-
pressed in individually addressable blocks. Each decompressed 4 KB block can
then be mapped directly to a page of RAM. (Symbian can only demand-page
data that is byte-pair-compressed or uncompressed.)

Another reason to use byte-pair compression is its fast decompression time.


Decompressing byte-pair-compressed data is approximately twice as fast as de-
compressing deflate-compressed data. Compression time is slower but this is not
a concern because compression usually occurs at build-time and performance
here is less critical than performance at run-time. (A possible exception to this is
flash over the air (FOTA) functionality, which may require run-time compression
of data.)

The compressed size of byte-pair-compressed data is around 10%-20% larger
than deflate-compressed data due to the additional administrative overhead re-
quired for each 4 KB block.

Where performance is more important than ROM size, it may be better to use
byte-pair compression instead of deflate compression, even outside the context
of demand paging.

Byte-pair compression can be applied while building executables or during ROM
building. This is implicitly applied as appropriate when using the demand paging
keywords (discussed further in Section 7.15). The Symbian Developer Library
documentation at developer.symbian.org explains how to apply this compres-
sion method explicitly, if you need it for other purposes than demand paging.

Support for byte-pair compression has been part of the loader since Symbian
OS v9.2. It is only possible to demand-page data that has either been byte-pair
compressed or is uncompressed.

Byte-pair decompression is implemented in the method BytePairDecom-
press(). The loader calls this method on byte-pair compressed, non-paged
executables, and the kernel calls it on similar demand-paged code.

Algorithm
The input stream is compressed one 4 KB block at a time. This block size was
chosen to match the MMU page size, enabling the paging subsystem to decom-
press each page as needed. Fortuitously, this seems to be the optimum block
size for compression efficiency. In some cases, the block size may be less than 4
KB – for example, when the last block in a compression stream is ‘short.’

The compression algorithm is as follows:

1. Find the least frequently occurring byte, X. This will be used as the escape
byte.
2. Replace all occurrences of X with the pair of bytes {X,X}.
3. Find the new least frequently occurring byte, B.
4. Find the most frequently occurring pair of consecutive bytes, {P,Q}.
5. Check for terminating condition.
6. Replace all occurrences of B with {X,B}.
7. Replace all occurrences of {P,Q} with token B.
8. Go to step three.

When calculating frequencies in steps three and four, exclude those that involve
any bytes from the escaped form {X,?}.

When calculating frequencies in step four, do not count overlapping pairs that
occur in a repeated byte sequence. So the sequence of five identical bytes
(P,P,P,P,P) contains two pairs and a singleton ({P,P},{P,P},P) – not four pairs.

The terminating condition in step five checks if the substitution performed in steps
six and seven will not result in any further compression. This depends on the
storage format for the compressed data (discussed in the next section) and is
calculated as follows:

1. Let f(B) be the frequency of byte B, and f(P,Q) the frequency of the pair
{P,Q}.
2. If the number of byte-pair tokens created so far is <32, then terminate if
f(P,Q) - f(B) <= 3.
3. If the number of byte-pair tokens created so far is >=32, then terminate if
f(P,Q) - f(B) <= 2.
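A toy version of the idea can be written in a few lines. This sketch is not Symbian’s implementation: it sidesteps the escape byte by requiring spare byte values, counts pairs without excluding overlaps, and ignores the storage format, but it shows the core pair-substitution loop and why decompression must undo substitutions in reverse order. All names are invented.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy byte-pair compressor: repeatedly replace the most frequent pair
// of bytes with a byte value unused in the data, recording each
// substitution so decompression can undo them in reverse order.
struct TSubstitution { uint8_t iToken, iP, iQ; };

std::vector<TSubstitution> Compress(std::string& aData)
    {
    std::vector<TSubstitution> subs;
    bool used[256] = {};
    for (unsigned char c : aData)
        used[c] = true;
    for (;;)
        {
        std::map<std::pair<uint8_t, uint8_t>, int> freq;
        for (size_t i = 0; i + 1 < aData.size(); ++i)
            ++freq[{ uint8_t(aData[i]), uint8_t(aData[i + 1]) }];
        if (freq.empty())
            break;
        auto best = freq.begin();
        for (auto it = freq.begin(); it != freq.end(); ++it)
            if (it->second > best->second)
                best = it;
        if (best->second < 2)
            break;                          // no further saving possible
        int token = -1;                     // find an unused byte value
        for (int b = 0; b < 256; ++b)
            if (!used[b]) { token = b; break; }
        if (token < 0)
            break;
        used[token] = true;
        std::string out;                    // substitute the chosen pair
        for (size_t i = 0; i < aData.size(); )
            {
            if (i + 1 < aData.size() && uint8_t(aData[i]) == best->first.first
                    && uint8_t(aData[i + 1]) == best->first.second)
                { out += char(token); i += 2; }
            else
                out += aData[i++];
            }
        subs.push_back({ uint8_t(token), best->first.first, best->first.second });
        aData = out;
        }
    return subs;
    }

void Decompress(std::string& aData, const std::vector<TSubstitution>& aSubs)
    {
    for (auto it = aSubs.rbegin(); it != aSubs.rend(); ++it)  // reverse order
        {
        std::string out;
        for (unsigned char c : aData)
            if (c == it->iToken)
                { out += char(it->iP); out += char(it->iQ); }
            else
                out += char(c);
        aData = out;
        }
    }
```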

Storage Format
The compressed data for each block is stored in one of three forms, depending
on the number of tokens created during compression:

0 Tokens
When the compression hasn’t performed any token substitutions, the ‘com-
pressed’ data for the block is stored in a format as shown in Table 1.

Table 1

Values Meaning
0x00 One byte token count value (zero)
... Bytes of uncompressed data

1 to 31 tokens
When the compression step produced between one and 31 token substitutions,
the compressed data for the block is stored in a format as shown in Table 2.

Table 2

Values Meaning
N One byte token count value N
X One byte with the value of the escape character X
{B0,P0,Q0}...{BN-1,PN-1,QN-1} N times 3 bytes of token/pair values {B,P,Q}
... Bytes of compressed data

32 to 255 tokens
When the compression step produced between 32 and 255 token substitutions,
the format is modified to store the token values in a bit vector, as shown in
Table 3 on the next page.

Table 3

Values Meaning
N One byte token count value N
X One byte with the value of the escape character X
<TokenBits> 32 bytes containing a vector of 256 bits. The least significant bit of the first byte is index zero. A set bit indicates that the corresponding token value B is used and has its substitution pair {P,Q} as follows.
{P0,Q0}...{PN-1,QN-1} N times two bytes of pair values {P,Q}. These are stored in ascending order of the token values that represent them.
... Bytes of compressed data
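The per-block overhead implied by the three formats can be captured in a small helper (invented for illustration):

```cpp
#include <cassert>

// Header bytes preceding the compressed data, per Tables 1-3:
//   0 tokens:      1 (count byte only)
//   1-31 tokens:   count + escape + N * {B,P,Q} triples
//   32-255 tokens: count + escape + 32-byte bit vector + N * {P,Q} pairs
int HeaderSize(int aTokenCount)
    {
    if (aTokenCount == 0)
        return 1;
    if (aTokenCount < 32)
        return 2 + 3 * aTokenCount;
    return 2 + 32 + 2 * aTokenCount;
    }
```

This also motivates the terminating condition given earlier: each new token costs three header bytes while fewer than 32 tokens exist, and two once the bit-vector format is in use.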



4.1.14 Debugger Breakpoint Support


The kernel provides the DebugSupport::ModifyCode() method to allow
debuggers to modify the contents of code to implement breakpoints, and this
method has been updated to work as expected when demand paging is in use.
Each breakpoint now makes use of the new DDemandPagingLock object to
force paged memory to be loaded and locked before inserting breakpoints. This
prevents the paging system from discarding the page in question, losing the
breakpoint.

When a breakpoint is removed, the kernel unlocks the page and demand pages it
as usual.

The code to create shadow pages now makes use of the new
M::LockRegion() and M::UnlockRegion() methods to force demand-
paged ROM to be loaded into memory before the page is shadowed.

4.2 Media Driver Support

4.2.1 Media Drivers


Since they form the interface to backing store for the paging system, media
drivers are of prime importance. I shall discuss the modifications you will need
to make to them if you are enabling demand paging for the first time in Section
5.4. Here, I will limit myself to a brief overview of the new classes provided by the
kernel in support of media drivers and demand paging.

4.2.2 Paging Device APIs


Media drivers must now implement and expose a new API, DPagingDevice,
which allows the kernel to access storage media in support of demand paging.
This class provides methods for reading data from the storage media. It also sup-
plies media metrics.

Media drivers that support paging should register themselves with the paging
system by calling Kern::InstallPagingDevice() during system boot. If
demand paging is to operate for all user-side code, media drivers must be cre-
ated and installed via kernel extensions, rather than waiting until they are loaded
by EStart. To support this, Symbian has added two new APIs to enable device
driver creation from kernel-side code – Kern::InstallLogicalDevice() and
Kern::InstallPhysicalDevice().

The DPagingDevice class is shown as follows:

// kernel.h
class DPagingDevice : public DBase
	{
public:
	enum TType // The type of device this represents.
		{
		ERom = 1<<0,  /**< Paged ROM device type. */
		ECode = 1<<1  /**< Code paging device type. */
		};
	virtual TInt Read(TThreadMessage* aReq, TLinAddr aBuffer,
	                  TUint aOffset, TUint aSize, TInt aDrvNumber) = 0;
public:
	TUint32 iType;
	TUint32 iDrivesSupported;
	const char* iName;
	TInt iReadUnitShift;
	TInt iDeviceId;
	};

The Read() method is called by the paging system to read data from the media
represented by this device. It should read aSize bytes of data from offset
aOffset on the media into the buffer aBuffer.

iType is the type of paging device: ERom or ECode.

iDrivesSupported tells the system which local drives are supported for code
paging. It is a bitmask containing one bit set for each local drive supported, where
the bit set is 1 << the local drive number. If this device does not support code
paging, iDrivesSupported should be zero.

iName is a zero-terminated string representing the name of the device. This is


only used for debug tracing purposes.

iReadUnitShift is the Log2 of the read unit size. A read unit is the number of
bytes that the device can optimally read from the underlying media. For example,
for small block NAND, a read unit would be equal to the page size, 512 bytes,
and iReadUnitShift would be set to nine.

iDeviceId is the value, chosen by the kernel, that the device should use to
identify itself.
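Two of these fields are simple to derive. The helpers below are invented for illustration (they are not Symbian APIs):

```cpp
#include <cassert>

// Build the iDrivesSupported bitmask: one bit set per supported
// local drive, where the bit set is 1 << the local drive number.
unsigned DrivesSupportedMask(const int* aDrives, int aCount)
    {
    unsigned mask = 0;
    for (int i = 0; i < aCount; ++i)
        mask |= 1u << aDrives[i];
    return mask;
    }

// Derive iReadUnitShift: the log2 of the read unit size in bytes.
int ReadUnitShift(unsigned aReadUnitSize)
    {
    int shift = 0;
    while ((1u << shift) < aReadUnitSize)
        ++shift;
    return shift;
    }
```

For a driver that supports code paging from local drives 1 and 2 with a 512-byte read unit, iDrivesSupported would be 0x6 and iReadUnitShift would be nine.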

Paging Request Objects


When loading a page from media, the kernel needs a buffer into which the media
driver will load the page’s data, and some virtual address space into which it can
temporarily map the new page, before copying (and possibly decompressing) the
data from the buffer.

The kernel can’t rely on allocating a buffer every time a page fault occurs, be-
cause of performance issues and the possibility that there might not be enough
free RAM. Accordingly, these resources are packaged up into paging request
objects, and a fixed number of them are allocated when the system is initialized.

The paging request object is implemented by the


DemandPaging::DPagingRequest class, declared in demand_paging.h.
The relevant parts of the class declaration are:
class DemandPaging : public RamCacheBase
class
{ DemandPaging : public RamCacheBase
{
...
...
// Resources needed to service a paging request.
//
classResources needed to
DPagingRequest : service a paging request.
public SDblQueLink
class
{ DPagingRequest : public SDblQueLink
{
public:
public:
~DPagingRequest();
~DPagingRequest();
public:
public:
TThreadMessage iMessage;
TThreadMessage iMessage;
DMutex* iMutex;
DMutex* iMutex;
TInt iUsageCount;
TInt iUsageCount;
TLinAddr iBuffer;
TLinAddr iBuffer;
Under the Hood: The Implementation of Demand Paging 61

TLinAddr iLoadAddr;
TPte* iLoadPte;
};

TUint iPagingRequestCount;
DPagingRequest* iPagingRequests[KMaxPagingRequests];
SDblQue iFreeRequestPool;
TInt iNextPagingRequestCount;
...
};

Overview
When a thread takes a paging fault, it acquires a paging request object and
maps a new page of RAM at the temporary address (iLoadAddr). It then calls
the media driver to load the data into the request object’s buffer (iBuffer) and
copies or decompresses the data from it into the page at iLoadAddr. Finally, the
request object is released and the new page mapped in to the correct location in
the memory map.

This process is implemented in DemandPaging::ReadRomPage() and
DemandPaging::ReadCodePage() in mmubase.cpp.

Concurrency
Since request objects are only created at boot, the number of them limits the
number of paging faults that can be processed concurrently. Because there can
be many threads taking paging faults at the same time, some care is needed to
co-ordinate access to request objects. One problem that must be avoided is prior-
ity inversion, in which a high-priority thread waits to get its paging fault processed
because low-priority threads are holding all the request objects.

To avoid this, Symbian added a mutex (iMutex) and a usage count
(iUsageCount) to each request object. The kernel maintains a pool of free re-
quest objects (iFreeRequestPool) as well as an array of all the request ob-
jects in the system (iPagingRequests).

When a thread tries to acquire a request object, the kernel checks the free pool
first. If it is non-empty, an unused request object is taken from the pool. If not, the
kernel selects a request object at random. The kernel increments the request
object’s usage count and makes the thread wait on its mutex.

When a thread releases a request object, the kernel signals the object’s mutex
and decrements its usage count. Once the count reaches zero, the kernel places
the object back in the free pool. Both these operations occur with the system lock
held, to synchronize access to the relevant data structures. They are implement-
ed in DemandPaging::AcquireRequestObject() and
ReleaseRequestObject() in mmubase.cpp.

This scheme distributes free objects while they are available, and then makes
threads queue for a randomly selected object. Use of a mutex for this purpose
provides priority inheritance, avoiding the issue of priority inversion. Random
selection mitigates the possibility of pathological behavior.
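The scheme can be sketched in standard C++. All names here are invented, and std::mutex stands in for the kernel’s DMutex (losing the priority inheritance the real implementation relies on):

```cpp
#include <cassert>
#include <cstdlib>
#include <deque>
#include <mutex>
#include <vector>

// Sketch of the request-object scheme: hand out free objects while any
// remain, otherwise make callers queue on a randomly chosen object's
// mutex.
struct TPagingRequest
    {
    std::mutex iMutex;
    int iUsageCount = 0;
    };

class TRequestPool
    {
    std::mutex iSystemLock;             // guards the pool data structures
    std::vector<TPagingRequest*> iAll;
    std::deque<TPagingRequest*> iFree;
public:
    explicit TRequestPool(int aCount)
        {
        for (int i = 0; i < aCount; ++i)
            {
            // objects deliberately leaked: in the kernel they live
            // for the lifetime of the system
            iAll.push_back(new TPagingRequest);
            iFree.push_back(iAll.back());
            }
        }
    TPagingRequest* Acquire()
        {
        TPagingRequest* r;
            {
            std::lock_guard<std::mutex> g(iSystemLock);
            if (!iFree.empty())
                { r = iFree.front(); iFree.pop_front(); }
            else
                r = iAll[std::rand() % iAll.size()];   // queue at random
            ++r->iUsageCount;
            }
        r->iMutex.lock();               // wait here if the object is busy
        return r;
        }
    void Release(TPagingRequest* aReq)
        {
        aReq->iMutex.unlock();
        std::lock_guard<std::mutex> g(iSystemLock);
        if (--aReq->iUsageCount == 0)
            iFree.push_back(aReq);      // back into the free pool
        }
    size_t FreeCount()
        {
        std::lock_guard<std::mutex> g(iSystemLock);
        return iFree.size();
        }
    };
```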

A media driver may also need to queue multiple incoming requests, and so we
added a TThreadMessage member (iMessage) to DPagingRequest, allowing
these request objects to be placed on a message queue and used in inter-thread
communications.

Initialization
During initialization, the kernel creates a fixed number of paging request objects
per paging device in the system. This is set by the constant
KPagingRequestsPerDevice, which is (at the time of writing) two. This number was chosen be-
cause media drivers cannot issue more than one request to hardware at a time,
and if we have two objects, then one thread can be waiting for data to be read
in and another thread can be decompressing data. Adding more request objects
would only result in more threads waiting for data to be read.

The request objects are created by DemandPaging::CreateRequestObject(),
defined in mmubase.cpp. This is called repeatedly from
DemandPaging::InstallPagingDevice() when a paging device is in-
stalled.

CreateRequestObject() creates a single request object. It allocates an ID
for the object (by atomically incrementing iNextPagingRequestCount with
NKern::LockedInc()). It allocates the object’s buffer from a chunk used for
the purpose, basing the offset on the ID. It calls down to the memory-model-
specific code to allocate the temporary virtual address (using
AllocLoadAddress()), passing the ID. It creates the mutex.

CreateRequestObject() also resizes the live list if necessary, to ensure that if
every request object is in use (and the corresponding number of pages removed
from the live list), then there are still enough pages on the live list to satisfy the
system’s constraints. The ResizeLiveList() method makes use of
iNextPagingRequestCount, which now holds the number of request objects
that will be present in the system if this call to CreateRequestObject()
succeeds.

When the object has been initialized, CreateRequestObject() adds it to
the free pool and request object array, and increments the request object count
(iPagingRequestCount). This is done with the system lock held to serialize
access to the data structures. No locking is necessary for the rest of the function
because the system will only use the new request object when
iPagingRequestCount is updated at the end.

4.3 File Server Changes

4.3.1 File Clamping


If a file system is to support code paging (first available in Symbian OS v9.3), it
must implement file clamping. This is because when the kernel is demand paging
read-only code, the content and location of the files it is paging from must remain
unchanged throughout the duration of this paging.

Symbian introduced file-clamping functionality to the file server in support of this.


While a file is clamped, calls to methods that would otherwise modify its content
are prevented from doing so and return the error code KErrInUse (-14). Simi-
larly, if an attempt is made to perform a synchronous dismount of a file system
mount while one of the files is clamped, the dismount will be prevented and
KErrInUse will be returned. Attempts to perform an asynchronous dismount will
be deferred until all clamps have been closed.

Only open, read-only3, non-empty files may be clamped. For each clamp, a
handle is generated (encapsulated by the RFileClamp class) and returned to
the user. Files may be clamped multiple times. A new handle is generated in each
case.

To close (remove) a clamp, the specific handle must be passed back to the file
server (invalid handles lead to the error code KErrNotFound). It is only when all
the clamps for a file have been closed that the file can be considered
unclamped.

Operations Affected by File Clamping


The following list shows the APIs that are affected by file clamping.

RFs
• Replace()
• Delete()
• NotifyDismount() (with arguments EFsDismountNotifyClients
and EFsDismountForceDismount)
• AllowDismount()
• DismountFileSystem()
• SwapFileSystem()

RFile
• Replace()
• SetSize()
• Open() (with argument EFileWrite)

File System Support for File Clamping


The FAT and ROFS file systems both support file clamping (they provide a unique
identifier for a file when clamping is requested).

The composite and ROM file systems provide ‘pseudo’ support for file clamping
– they always return a zero-value identifier for a file when clamping is requested.
Because files cannot be modified on these file systems, nor can the mount be
dismounted, clamping them is pointless.

3 To ensure that their contents are not changed.



Other file systems will show the default behavior. Requests for file clamping will
return the error code KErrNotSupported.

4.3.2 Implementation and Operation of File Clamping


When a file is clamped, the mount generates a handle. This handle is encapsu-
lated by the RFileClamp class:

// e32ldr.h
class RFileClamp
	{
public:
	inline RFileClamp()
		{
		iCookie[0] = 0;
		iCookie[1] = 0;
		}
	IMPORT_C TInt Clamp(RFile& aFile);
	IMPORT_C TInt Close(RFs& aFs);
public:
	TInt64 iCookie[2];
	};

The class keeps two cookies. The first, iCookie[0], is generated by the file
system to which the file belongs and represents a unique identifier for the file.

The second, iCookie[1], is made up of two 32-bit values: the drive number
and a count value that is incremented by the file-server mount instance on cre-
ation of each new clamp.

The RFileClamp class provides two APIs:

EXPORT_C TInt RFileClamp::Clamp(RFile& aFile)

Called on an uninitialized RFileClamp object. Clamps the supplied file, stores
the cookie in the RFileClamp object and returns an error code.

EXPORT_C TInt RFileClamp::Close(RFs& aFs)

Unclamps the file that was clamped with this RFileClamp object. It is safe to call
this function with a handle that was not successfully opened.

Classes and Basic Clamp Functionality


The bulk of the file clamping functionality is provided by the class ‘CMountBody’,
a composite member of the file-server class that represents a file-system inde-
pendent mount, CMountCB. (If CMountBody has not been initialized, calls to file
clamping functionality receive the return value KErrNotSupported.)

Two key member data of CMountBody are:

RArray<RFileClamp> iClampIdentifiers;
TInt32 iClampCount;

iClampIdentifiers holds RFileClamp instances, ordered to facilitate search-
ing, while iClampCount is a one-based count value that increments for each
new clamp.

When a user-side thread makes a request to clamp a file, CMountBody seeks a
file-system specific unique identifier for the file and stores it in iCookie[0] of
the RFileClamp instance. It then increments iClampCount and stores a 64-
bit composite of the drive number and the incremented count in iCookie[1].
Finally, it inserts the RFileClamp instance into its rightful place in iClampI-
dentifiers and sets a flag for the mount to indicate that (at least) one file is
clamped.

When a request is made to unclamp a file, CMountBody searches
iClampIdentifiers for an RFileClamp instance that matches both the value
in iCookie[0] and the count value in iCookie[1]. If no matching instance
is found, it returns KErrNotFound. If it does find a match, the RFileClamp
instance is removed from iClampIdentifiers. If iClampIdentifiers is now
empty, the flag for the mount is cleared and any pending dismount is instigated.
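The bookkeeping can be modelled in a few lines of standard C++. The names (TClampRegistry and so on) are invented; only the cookie scheme follows the description above:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

const int KErrNone = 0;
const int KErrNotFound = -1;

// Sketch of the CMountBody bookkeeping: each clamp stores the file
// system's unique file identifier in cookie[0] and a {drive number,
// incrementing count} composite in cookie[1].
struct TClamp { uint64_t iCookie[2]; };

class TClampRegistry
    {
    std::vector<TClamp> iClamps;        // kept ordered to ease searching
    uint32_t iClampCount = 0;           // one-based, incremented per clamp
public:
    TClamp Clamp(uint64_t aFileId, uint32_t aDrive)
        {
        TClamp c;
        c.iCookie[0] = aFileId;
        c.iCookie[1] = (uint64_t(aDrive) << 32) | ++iClampCount;
        auto pos = std::lower_bound(iClamps.begin(), iClamps.end(), c,
            [](const TClamp& x, const TClamp& y)
                { return x.iCookie[0] < y.iCookie[0]; });
        iClamps.insert(pos, c);
        return c;
        }
    int Unclamp(const TClamp& aHandle)
        {
        auto it = std::find_if(iClamps.begin(), iClamps.end(),
            [&](const TClamp& x)            // must match both cookies
                { return x.iCookie[0] == aHandle.iCookie[0]
                      && x.iCookie[1] == aHandle.iCookie[1]; });
        if (it == iClamps.end())
            return KErrNotFound;
        iClamps.erase(it);
        // if iClamps is now empty, a pending dismount could proceed here
        return KErrNone;
        }
    bool IsClamped() const { return !iClamps.empty(); }
    };
```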

File System Functionality


To support file clamping, the file system of the clamped file must implement the
CMountCB::MFileAccessor interface. This provides the method:
TInt GetFileUniqueId(const TDesC& aName, TInt64& aUniqueId)

The parameter aName is the name of the file and aUniqueId is the value returned
by the file system.

During initialization of the CMountCB object, its InitL() method is invoked,


which executes these lines of code:

MFileAccessor* fileAccessor = NULL;
GetInterface(CMountCB::EFileAccessor, (void*&) fileAccessor, NULL);
iBody = new(ELeave) CMountBody(this, fileAccessor);

The GetInterface() call to the file system determines whether it provides sup-
port for the MFileAccessor interface. If not, the CMountBody is passed a NULL
parameter for fileAccessor, and requests for clamping functionality receive
the error code KErrNotSupported. Otherwise, the CMountBody is passed a
pointer to the file-system mount object – the object on which the
GetFileUniqueId() method is invoked.

Clamping and Deferred Dismount


To support deferred dismount, the CMountBody class includes the following
member data:

TBool iDismountRequired;
TInt (*iCallBackFunc)(TAny*);
TDismountParams* iCallBackParams;

iDismountRequired indicates if there is a dismount pending.
iCallBackFunc is a function to call when dismount may proceed.
iCallBackParams is a list of parameters to pass to iCallBackFunc.

These members are initialized when an asynchronous request to dismount fails
because the mount has one or more clamps. The details of the request are
stored, and used to perform the dismount when the last clamp is removed.
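The clamp-count bookkeeping described above can be sketched in a few lines. This is a purely illustrative, standalone C++ model, not Symbian code: MountModel, Clamp(), Unclamp() and RequestDismount() are invented names, standing in for the CMountBody members iDismountRequired, iCallBackFunc and iCallBackParams.

```cpp
#include <functional>

// Illustrative model (not the Symbian API): a mount that defers dismount
// while clamps are outstanding, mirroring iDismountRequired/iCallBackFunc.
class MountModel {
public:
    void Clamp() { ++clampCount_; }

    // Request a dismount; runs immediately if no clamps, otherwise deferred.
    void RequestDismount(std::function<void()> onDismount) {
        if (clampCount_ == 0) {
            onDismount();
        } else {
            dismountRequired_ = true;          // remember the pending dismount
            callback_ = std::move(onDismount); // stored for later
        }
    }

    // Removing the last clamp triggers any pending dismount.
    void Unclamp() {
        if (--clampCount_ == 0 && dismountRequired_) {
            dismountRequired_ = false;
            callback_();
        }
    }

private:
    int clampCount_ = 0;
    bool dismountRequired_ = false;
    std::function<void()> callback_;
};
```

The key property is that the dismount callback fires exactly once, and only when the clamp count falls to zero.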

4.3.3 RFile::BlockMap() API


In Symbian OS v9.3, a new API was added to provide a mechanism for retrieving
a map of the logical sectors representing the file to be paged. The API enables
code to access the paged file by first obtaining its blockmap, and then accessing
the media directly.

Each file on the media will consist of a number of groups of contiguous blocks.
Each group is represented by its starting address, the media block size, the
address of the first block on the media and the number of consecutive blocks in
the group. If the file is not fragmented, an RFile::BlockMap() call will
retrieve the blockmap for the whole file (or a specified part of it). If the
file is fragmented, subsequent calls to RFile::BlockMap() will return the
blockmap information for the next group of contiguous blocks in the file.
RFile::BlockMap() will return KErrCompletion when it has reached the end of
the requested part of the file.
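The grouping of a file's blocks into contiguous runs can be sketched as follows. This is an illustrative, standalone C++ model under assumed names: BlockRun and BuildBlockMap are invented, not the Symbian types, though BlockRun deliberately mirrors the shape of TBlockMapEntry described below.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical model of a blockmap entry: a run of contiguous media blocks,
// mirroring TBlockMapEntry's iStartBlock and iNumberOfBlocks fields.
struct BlockRun {
    uint32_t startBlock;      // first block number in the run
    uint32_t numberOfBlocks;  // count of contiguous blocks
};

// Collapse an ordered list of a file's block numbers into contiguous runs.
// A fragmented file yields several runs; an unfragmented file yields one.
std::vector<BlockRun> BuildBlockMap(const std::vector<uint32_t>& blocks) {
    std::vector<BlockRun> map;
    for (uint32_t b : blocks) {
        if (!map.empty() &&
            b == map.back().startBlock + map.back().numberOfBlocks) {
            ++map.back().numberOfBlocks;  // extends the current run
        } else {
            map.push_back({b, 1});        // starts a new run
        }
    }
    return map;
}
```

A file occupying blocks 10–12, 40–41 and 99 would thus yield three runs, which is why a fragmented file needs repeated BlockMap() calls.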

Implementation
Symbian has added another new class. This one represents a group of contiguous
blocks, and is called TBlockMapEntry. It contains the number of blocks in
the group and the number of the first block in that group:

class TBlockMapEntry
    {
public:
    TBlockMapEntry();
    void SetNumberOfBlocks( TUint aNumberOfBlocks );
    void SetStartBlock( TUint aStartBlock );
public:
    TUint iNumberOfBlocks; // number of contiguous blocks in map
    TUint iStartBlock;     // number for first block in the map
    };

A container structure, SBlockMapInfo, describes a group of such blockmaps,
and also carries information such as the media block size in bytes, the offset
to the start of the file or the requested file position within a block, and the
address of the first block on the media. This container structure is passed as
a parameter to the following API to carry blockmap information:
TInt RFile::BlockMap(SBlockMapInfo& aInfo,
                     TInt64& aStartPos,
                     TInt64 aEndPos=-1,
                     TInt aBlockMapUsage=EBlockMapUsagePaging);

Parameters:

SBlockMapInfo& aInfo:

A structure describing a group of blockmaps:
const TUint KMaxMapsPerCall=8;
typedef TBuf8<KMaxMapsPerCall*sizeof(TBlockMapEntry)> TBlockArrayDes;

struct SBlockMapInfo
    {
    TUint iBlockGranularity;   // size of a block in bytes
    TUint iBlockStartOffset;   // offset to start of the file or
                               // requested file position within block
    TInt64 iStartBlockAddress; // address of the first block
                               // of the file
    TBlockArrayDes iMap;
    };

iBlockGranularity is the size of the block for a given device in bytes. This
field is only filled on the first call to RFile::BlockMap().

iBlockStartOffset is the offset into the first block of a file containing the start
of the file, or the start of the section of the file that has been requested. This field
is only filled on the first call to RFile::BlockMap().

iStartBlockAddress is the address of the first block of the file. This field is
only filled on the first call to RFile::BlockMap().

iMap is a descriptor holding an array of up to KMaxMapsPerCall
TBlockMapEntry entries.

If you don’t need the blockmap for the whole file, then you can specify a start
and end position for a section of the file (aStartPos, aEndPos). This is useful
for demand paging, where only the executable section of a file will be needed.
Both of these parameters specify offsets from the start of the file in bytes,
and if they are not passed, the whole file is assumed.

aBlockMapUsage is the reason for the blockmap API use, which is set to
EBlockMapUsagePaging by default.

The return value is one of the following:

• Until the end of the file or the file section is reached, KErrNone will be
returned.
• If the end of the file is reached, KErrCompletion will be returned. In this
case the length of the iMap may be smaller than its maximum.
• An error code.

RFile::BlockMap() was implemented by extending the interface of CFileCB
using the GetInterface() mechanism, and modifying the derivative classes of
CFileCB for file systems that support blockmaps
(CRofsFileCB and CFatFileCB).

A new API, BlockMapReadFromClusterListL(), has been added to
CFatFileCB. This is used to calculate physical addresses from FAT table
entries. There is no ROFS equivalent function, as ROFS organizes all its data
contiguously. The default implementation conveniently returns KErrNotSupported
for all other file systems. Example:
RArray<SBlockMapInfo> map;
SBlockMapInfo info;
TInt r;
do
    {
    r = myFile.BlockMap(info, aStartPos, aEndPos);
    map.Append(info);
    } while (r == KErrNone);

TInt granularity;
for (TInt c = 0; c < map.Count(); c++)
    {
    granularity = map[c].iMap.Size()/KMaxMapsPerCall;
    TBlockMapEntry* myBlockMapEntry =
        (TBlockMapEntry*)map[c].iMap.Ptr();
    // Then read the contents of iMap using
    // the myBlockMapEntry pointer
    }
map.Close();
if (KErrCompletion != r) … // deal with error
...

This API will not be available to any process other than the file server. It is used in
the loader (for demand paging), and in the loopback proxy extension. Appropriate
security vetting by SID will be required on the file server.

Changes to File Systems


The API obtains the information it needs by calling into the file object. To
ensure binary compatibility with earlier versions, the API uses the
GetInterface() method of the CFileCB-derived object. This means that file
systems that do not store files in a manner consistent with this API will not
require any changes, because their current implementations will automatically
return KErrNotSupported. If a file system does implement block-based storage,
you will have to add a new interface, and return it through the
GetInterface() method. For example, CFileCB gets a new inline:

inline TInt CFileCB::BlockMap(SBlockMapInfo& aInfo, TInt64& aStartPos,
                              TInt64 aEndPos = -1)
    {
    CBlockMapInterface* pM;
    TInt r = GetInterface(EBlockMapInterface, (TAny*&)pM, (TAny*)this);
    if (KErrNone != r)
        return r;
    return pM->BlockMap(aInfo, aStartPos, aEndPos);
    }

Supporting file systems such as FAT could then implement the interface like this:

class CFATBlockMapInterface : public CBlockMapInterface
    {
    ...
    // override pure virtual
    TInt BlockMap(SBlockMapInfo& aInfo, TInt64& aStartPos,
                  TInt64 aEndPos = -1);
    };

TInt CFatFileCB::GetInterface(TInt aInterfaceId,
                              TAny*& aInterface, TAny* aInput)
    {
    switch (aInterfaceId)
        {
        case EBlockMapInterface:
            aInterface = (TAny*)&iBlockMapInterface;
            return KErrNone;
        //...
        }
    return KErrNotSupported;
    }

The same applies for ROFS. You need a CRofsBlockMapInterface class
to inherit CBlockMapInterface and implement a BlockMap() function.
Additionally, CRofsFileCB::GetInterface() needs to be
EBlockMapInterface aware.

Finally, the FAT file system itself would return the blockmap, using the mount
object to walk the FAT table to obtain the requisite information.
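The idea of walking the FAT table to translate a file's cluster chain into media addresses can be modelled as below. This is an illustrative, standalone C++ sketch, not the CFatFileCB implementation: the FAT is modelled as a flat array where fat[c] names the cluster following c, and all names here are invented.

```cpp
#include <cstdint>
#include <vector>

// Illustrative FAT model (not the Symbian implementation): fat[c] holds the
// number of the cluster that follows c in the chain; kEndOfChain ends it.
constexpr uint32_t kEndOfChain = 0x0FFFFFFF;

// Walk the cluster chain of a file, translating each cluster number into a
// media byte address, as BlockMapReadFromClusterListL() conceptually does.
std::vector<uint64_t> ClusterChainAddresses(const std::vector<uint32_t>& fat,
                                            uint32_t firstCluster,
                                            uint64_t dataStart,  // media offset of cluster 2
                                            uint32_t clusterSize) {
    std::vector<uint64_t> addrs;
    for (uint32_t c = firstCluster; c != kEndOfChain; c = fat[c]) {
        // Clusters are conventionally numbered from 2 in FAT,
        // hence the (c - 2) term in the address calculation.
        addrs.push_back(dataStart + uint64_t(c - 2) * clusterSize);
    }
    return addrs;
}
```

The resulting address list could then be collapsed into contiguous runs to populate the TBlockMapEntry array.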

4.3.4 Loader Server Loading Code to be Paged


In Symbian OS v9.3 we have updated the loader so that it can load code-paged
binaries.

When loading code into RAM, the loader now checks whether the code should
be paged. If so, it does not load, relocate or fix up the code, but instead collates
extra information that it passes to the kernel, allowing the latter to perform these
steps when each page is loaded. The loader also prevents the image file from
being deleted or modified while it is being paged.




Implementation
The method E32Image::ShouldBeCodePaged() has been added to determine
whether an image should be code paged. This decision is based on several
factors:

1. Whether the kernel’s paging policy allows code paging
2. Whether the code is appropriately compressed (uncompressed or
   byte-pair compressed)
3. Whether the code itself is marked as paged or unpaged.

The file clamp API is used to prevent the image file from being deleted.

The method E32Image::BuildCodeBlockMap() has been added to retrieve
the blockmap data for the file being paged. The kernel uses the blockmap to
establish where the code is located on the media.

E32Image::LoadCompressionData() and associated methods have been
added to locate where each page of code resides in the image file, for the
supported compression types.

E32Image::BuildFixupTable(), E32Image::AllocateRelocationData()
and associated methods have been added to build the import fix-up table and
relocation tables to pass to the kernel. The functions that perform the
relocation and import fix-up are still called for code-paged images; however,
rather than doing the fix-up themselves, they populate the appropriate tables.
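The difference between eager and deferred relocation can be sketched as follows. This is an illustrative, standalone C++ model under assumed names (BuildRelocationTable, RelocatePage), not the actual loader code: the relocation offsets are grouped by page at load time, and a page's fix-ups are applied only when that page is demand-loaded.

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Illustrative model of deferred relocation (not the actual loader code).
constexpr uint32_t kPageSize = 4096;

// Map each page index to the list of in-page byte offsets needing fix-up.
using RelocTable = std::map<uint32_t, std::vector<uint32_t>>;

RelocTable BuildRelocationTable(const std::vector<uint32_t>& offsets) {
    RelocTable table;
    for (uint32_t off : offsets)
        table[off / kPageSize].push_back(off % kPageSize);
    return table;
}

// Apply a single page's fix-ups when that page is demand-loaded:
// each word-aligned relocation entry is adjusted by the load delta.
void RelocatePage(std::vector<uint32_t>& pageWords, uint32_t page,
                  const RelocTable& table, uint32_t delta) {
    auto it = table.find(page);
    if (it == table.end()) return;  // nothing to fix on this page
    for (uint32_t off : it->second)
        pageWords[off / sizeof(uint32_t)] += delta;
}
```

An eager loader would walk the whole offset list once at load time; the paged loader instead pays only for the pages that are actually faulted in.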

Summary
This chapter has covered a lot of ground! You should have the details necessary
to understand the implementation, and you can find out more by consulting
documentation and source available on developer.symbian.org.

5
Enabling Demand Paging on a New Platform

In this section, we’ll discuss the main factors that should influence the form of
demand paging you choose to implement on your system. For example, demand
paging can have a significant impact on device driver code; we’ll discuss the rea-
sons and some ways to manage the changes necessary. We’ll also discuss some
of the broader impacts of demand paging on the underlying system.

5.1 Choosing Which Type of Demand Paging to Implement

Here we examine the different paging scenarios and describe factors to consider
when choosing which type to implement.

5.1.1 All Paging Scenarios


If large parts of the system have to be wired for real-time/performance reasons,
then the RAM savings achieved by demand paging may not be worth the cost of
implementation.

You should also note that device driver and system server APIs may need
changing to support the implementation changes you’ll have to make to enable
demand paging.

5.1.2 ROM Paging (Symbian OS v9.3)


This is the simplest form of demand-paging to implement, but offers the least
RAM saving, especially if large parts of ROM have to be wired for real-time and
performance reasons. Also, over-the-air updates are more complex for XIP ROM
images.

5.1.3 Code Paging (Symbian OS v9.3)


This allows the paging of after-market applications, but is the scheme that takes
the longest to implement. Code paging requires a new executable image format,
in which data compression and relocation information are based around
page-sized data blocks. With code paging, there is much scope for deadlock and
poor performance, because:

• File systems must provide a control path that doesn’t cause paging faults.
• File system meta-data caching is required for acceptable performance.
• Third-party plug-ins must not take paging faults or use any service that
  might do so.

5.1.4 Data Paging (not yet implemented)


At the time of writing, data paging is not implemented on Symbian.

Data paging has the potential to offer the greatest RAM saving, but can only
work on devices with a suitable backing store. Power management would be a
significant problem because the backing store is likely to be heavily used. You
would need to implement a smart memory-caching system to mitigate this.

5.2 Migrating Device Drivers to a Demand-Paged System

Impact of Demand-Paging on Kernel-Side Code


I have already referred (in Section 2.2) to some of the disadvantages of a
demand-paged operating system. In this section, I discuss in more detail the
impacts of these on kernel-side code, such as device drivers, and show how
affected areas may be identified and modified.

Page faults lead to unpredictable delays


If a thread takes a paging fault, it is blocked for a significant and unpredictable
time while the kernel services the fault. As I have already explained in Section
2.2.1, the delay will be of the order of one millisecond (assuming that the media
you are paging from is not being used for other file system activity) and in busy
systems the delay for a single fault could be hundreds of milliseconds or more.
You should also note that a page fault could occur for each separate 4 KB page
of memory accessed, so an object straddling two pages may cause two page
faults.
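The page-straddling arithmetic above can be made concrete with a small helper. This is an illustrative, standalone C++ sketch (PagesTouched is an invented name): it gives the worst-case number of page faults for touching an object of a given size, since each distinct 4 KB page can fault once.

```cpp
#include <cstdint>

constexpr uint32_t kPageSize = 4096;  // assumed 4 KB paging granularity

// Worst-case number of page faults for touching `size` bytes at `addr`:
// one fault per distinct page the range overlaps.
uint32_t PagesTouched(uint32_t addr, uint32_t size) {
    if (size == 0) return 0;
    uint32_t firstPage = addr / kPageSize;
    uint32_t lastPage  = (addr + size - 1) / kPageSize;
    return lastPage - firstPage + 1;
}
```

For example, a 16-byte object starting 6 bytes before a page boundary overlaps two pages, so it can cost two faults even though it is tiny.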

A faulting driver can have a wider impact on system performance:

• The Symbian kernel services page faults at the priority of the NAND media
driver thread (KNandThreadPriority = 24). This means any thread of higher
priority effectively has its priority reduced while its page fault is serviced.
• In a system in which two or more device drivers share the same thread,
the page fault taken by one driver can reduce the performance of the other
drivers using that thread.

Mutex misuse causes system deadlock


When the kernel services a page fault, it must execute system code that uses
certain system mutexes. If the thread that caused the fault already holds one of
those mutexes, system deadlock will result.

The only safe rule to apply is that demand-paged memory must not be accessed
while holding any kernel-side mutex.

In the following sections, I’ll look at these problem areas in more detail and dis-
cuss the actions you can take to mitigate them.

5.2.1 Page Faults and Device Drivers: Where Can Page Faults Occur?
Device drivers, like all kernel-side components, are wired in memory; once
loaded, their code and read-only data sections will never be paged out. The
ROMBUILD tool ensures this by placing drivers in the unpaged part of an XIP
ROM. If any device driver is not resident in ROM, the loader will always copy it (in
its entirety) into RAM, and wire it there. Correctly written kernel-side code should
only access user memory using special functions provided by the kernel, which I
discuss next.

1. Functions for Accessing Memory in an Arbitrary Process


If a device driver needs to access data structures via pointers passed by its
paged clients, and these data structures live in the client’s read-only (constant)
data section, then passing a pointer to this paged data to the driver could result in
a page fault when the driver de-references the pointer. (This situation might arise,
for example, when accessing image data that is built into the ROM.) Typically,
if the client needs to give the driver access to an arbitrary amount of data in its
address space, it will pass a descriptor encapsulating the data, or a pointer to a
buffer containing the data. In this case, the driver needs to access the data from
a different address space from the one associated with the thread it executes in,
using the inter-thread kernel APIs listed as follows:

// kernel.h
Kern::ThreadDesRead()
Kern::ThreadGetDesLength()
Kern::ThreadGetDesMaxLength()
Kern::ThreadGetDesInfo()
Kern::ThreadRawRead()

Note that, at time of writing, Symbian only supports the demand paging of read-
only data, and so writes to pageable memory will not cause paging faults (but
will cause the normal permissions check exception). The write functions can be
considered safe for demand paging:

// kernel.h
Kern::ThreadDesWrite(DThread*, TAny*, const TDesC8&, TInt, TInt, DThread*)
Kern::ThreadRawWrite(DThread*, TAny*, const TAny*, TInt, DThread*)
Kern::ThreadDesWrite(DThread*, TAny*, const TDesC8&, TInt, DThread*)

2. Functions for Accessing Memory in the Current Process


Driver code may also need to access user memory directly using the following
functions. These functions cause an exception if the user memory address is
invalid, so they should not be used while the driver thread is in a critical
section. This means the driver thread should not hold any mutexes. If this
condition is met, then deadlock caused by demand paging is impossible, unless
the function is called under an XTRAP harness. In this case, the code could be
holding a mutex and you should check it for demand paging safety.

// klib.h
kumemget(TAny* aKernAddr, const TAny* aAddr, TInt aLength)
kumemget32(TAny* aKernAddr, const TAny* aAddr, TInt aLength)
umemget(TAny* aKernAddr, const TAny* aUserAddr, TInt aLength)
umemget32(TAny* aKernAddr, const TAny* aUserAddr, TInt aLength)

// kernel.h
Kern::KUDesGet(TDes8& aDest, const TDesC8& aSrc)
Kern::KUDesInfo(const TDesC8& aSrc, TInt& aLength, TInt& aMaxLength)
Kern::KUDesSetLength(TDes8& aDes, TInt aLength)

Similarly, you can also assume the following write functions to be demand-paging
safe:

// klib.h
kumemput(TAny* aAddr, const TAny* aKernAddr, TInt aLength)
kumemput32(TAny* aAddr, const TAny* aKernAddr, TInt aLength)
kumemset(TAny* aAddr, const TUint8 aValue, TInt aLength)
umemput(TAny* aUserAddr, const TAny* aKernAddr, TInt aLength)
umemput32(TAny* aUserAddr, const TAny* aKernAddr, TInt aLength)
umemset(TAny* aUserAddr, const TUint8 aValue, TInt aLength)

// kernel.h
Kern::KUDesPut(TDes8& aDest, const TDesC8& aSrc)

The following functions are Symbian’s internal technology and so shouldn’t be
used in partner code. Because they do not generate exceptions, you would need
to check them for demand-paging safety, irrespective of whether an XTRAP
harness is (unnecessarily) used.

// kernel.h
Kern::KUSafeRead(const TAny* aSrc, TAny* aDest, TInt aSize)
Kern::KUSafeWrite(TAny* aDest, const TAny* aSrc, TInt aSize)
Kern::KUSafeInc(TInt& aValue)
Kern::KUSafeDec(TInt& aValue)
Kern::SafeRead(const TAny* aSrc, TAny* aDest, TInt aSize)
Kern::SafeWrite(TAny* aDest, const TAny* aSrc, TInt aSize)

3. Access to ROM Headers for User Mode Executables


You should check both the following functions for demand paging safety. The first
is defined in kernel.h and the second in platform.h.

// kernel.h
Kern::CodeSegGetMemoryInfo(DCodeSeg&, TModuleMemoryInfo&, DProcess*)
// platform.h
Epoc::RomProcessInfo(TProcessCreateInfo&, const TRomImageHeader&)

4. Debugger Support APIs that Set Breakpoints


The kernel implements the functions below in a demand-paging-aware manner.
You do not need to modify code that uses them, unless such code isn’t tolerant of
the indefinite delay caused by a page fault.

// platform.h
DebugSupport::CloseCodeModifier()
DebugSupport::ModifyCode(DThread*, TLinAddr, TInt, TUint, TUint)
DebugSupport::RestoreCode(DThread*, TLinAddr)

5. Direct Access to XIP ROM


Any code that reads directly from the contents of an XIP ROM may cause a page
fault. The only parts of the ROM that you can safely assume not to be demand
paged are:

• The ROM header (TRomHeader)
• The contents of any kernel-mode executable
• The ROM C++ exception search table (addressed by
  TRomHeader::iRomExceptionSearchTable).

You should assume that all other parts of the ROM could be demand paged.

Mutex Problems with Page Faults in Drivers


Page faults in device drivers can cause a different problem, related to mutex
ordering. Page faults are handled in the context of the thread that took the fault.
The code that handles page faults makes use of kernel resources and requires
the use of synchronization objects such as NFastMutex and DMutex.
NFastMutex objects are not nestable, and the nesting of DMutex objects can
lead to deadlock if the mutex ordering is violated.
If the order of the DMutex used by the device driver is higher than any other
DMutex objects used by the kernel in all possible operations where the device
driver may take a page fault, then the mutex order will not be violated. In practice,
it is very difficult for the device driver writer to guarantee this in all the possible
situations that mutex nesting could take place. Because of this, Symbian has
decided that the only safe rule to apply is that demand-paged memory must not
be accessed while holding any kernel-side mutex.

The following lists the most likely code in the base port or debugging modules to
hold system mutexes.

1. Code that Creates its Own Mutex Using Kern::MutexCreate()


This is to be expected.

2. The Power Controller


Base port code implements classes deriving from DPowerController
(kpower.h). The kernel calls the following methods in those classes with the
power management feature lock mutex held:

• DPowerController::DisableWakeupEvents()
• DPowerController::EnableWakeupEvents()
• DPowerController::PowerDown(TTimeK aWakeupTime)

3. Device Driver Power Handlers


Base port code implements classes deriving from DPowerHandler
(kpower.h). The kernel calls the following methods in those classes with the
power-management feature lock mutex held:

• DPowerHandler::PowerUp()
• DPowerHandler::Wait()
• DPowerHandler::PowerDown(TPowerState aState)

4. DKernelEventHandler Implementations
The kernel calls these in many different places while it holds internal kernel
mutexes.

5. Code that Examines Kernel Containers


This means code that calls Kern::Containers() and then calls Wait() on
the container mutex. Note that these APIs are internal to Symbian and shouldn’t
be used in partner code. Changing this code so that it doesn’t wait on the mutex
is NOT a good solution, because a container cannot be used safely unless the
caller holds its mutex lock.

In the next section, I discuss how to avoid this mutex issue, and other issues
arising from demand paging.

5.2.2 Addressing Issues Arising from Demand Paging

Each device driver should have its own DFC thread


A good first step is to investigate whether any part of the driver executes in a
shared kernel thread context. Device drivers execute operations in a kernel
thread context by placing deferred function calls (DFCs) on a queue, from which
they are later executed sequentially by the corresponding DFC thread. DFCs are
typically (but not exclusively) queued in response to interrupts so that they can
perform operations that are not possible in an interrupt service routine (ISR).

Since EKA2, there has been support for multiple DFC queues and threads.
However, it is common for device drivers that execute in a context other than
their client’s thread to share the kernel’s DFC thread zero, which is the thread
associated with DFC queue zero. This approximates to the behavior of drivers on
EKA1, where only a single DFC queue is supported. Any driver that uses DFC
thread zero will execute in a shared thread context.

So Symbian now recommends that each driver uses its own thread and DFC
queue. We have already modified all of our own drivers in this way. To assist you
in this change, the kernel (from version Symbian OS v9.3 onwards) provides a
new dynamic queue class, TDynamicDfcQue.

// TDynamicDfcQue derives from TDfcQue and adds
// a method to destroy the queue easily:
class TDynamicDfcQue : public TDfcQue
    {
public:
    TDynamicDfcQue();
    IMPORT_C void Destroy();
private:
    TDfc iKillDfc;
    };

You create queues by calling a new method in the Kern class:

TInt Kern::DynamicDfcQCreate(TDynamicDfcQue*& aDfcQ, TInt aPriority,
                             const TDesC& aBaseName);

The arguments are used as follows:

• aDfcQ is set to the created queue if the operation is successful.
• aPriority is the priority of the thread created.
• aBaseName is used to name the thread; an eight-digit hex number is
  appended to it to make it unique.
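The naming scheme can be illustrated with a short sketch. This is a hypothetical, standalone C++ model (MakeUniqueThreadName is an invented name, and the kernel's actual suffix format may differ): the base name is extended with an eight-digit hex number to distinguish otherwise identical thread names.

```cpp
#include <cstdio>
#include <string>

// Illustrative model of making a DFC thread name unique by appending an
// eight-digit hexadecimal suffix to the supplied base name.
std::string MakeUniqueThreadName(const std::string& base, unsigned id) {
    char suffix[9];
    std::snprintf(suffix, sizeof(suffix), "%08x", id);  // zero-padded hex
    return base + suffix;
}
```

So two drivers both passing "DriverThread" as the base name would still end up with distinct kernel thread names.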

The method returns KErrNone if successful, or one of the standard error codes.
For example, the following code could be added to a physical device driver (PDD)
entry point to create a DFC queue (error handling omitted for brevity):

const TInt KThreadPriority = 27;
_LIT(KThreadBaseName, "DriverThread");
TDynamicDfcQue* pDfcQ;

TInt r = Kern::DynamicDfcQCreate(pDfcQ, KThreadPriority, KThreadBaseName);

pdd->SetDfcQ(pDfcQ);

Remember to delete the DFC queue from the PDD object’s destructor:

DPddObject::~DPddObject()
    {
    if (iDfcQ)
        iDfcQ->Destroy();
    }

Note that if you have several drivers making use of a common peripheral bus,
then you will need to ensure that the code managing the bus is thread safe. You
will do this by using mutexes to protect state, rather than relying on only one
driver being able to execute at once as you might have done before.

There are several types of device driver architecture. A driver may be
dynamically loaded or boot loaded (if it is a kernel extension). It might have
a single channel or multiple channels, or a PDD and an LDD, or an LDD only.
Each of these different architectures needs you to create the dedicated DFC
queue at a different place in your code. I discuss this in detail, with code
examples, in Section 5.3.

Code running in DFCQue1 must not access user memory


Symbian’s system timer code uses DFCQue1, so any paging fault taken in this
queue’s associated thread (DFC thread one) will have a serious negative impact
on the system and could result in deadlock.

Try to access user memory in the context of the client thread


If the driver takes a page fault while it executes in the client thread’s context, or in
a thread context exclusively used by the driver channel associated with the client,
then it will only affect the performance of the driver and its client. It will not delay
the running of other drivers as it might if the fault were taken in a kernel context.
This makes it a better design decision to derive your device drivers from
DLogicalChannelBase rather than DLogicalChannel. This ensures that
code will access user memory in the context of the client thread rather than the
kernel’s DFC thread, and means that only clients accessing demand-paged
memory will suffer the impacts of paging. This is especially important if it’s pos-
sible for the driver to have more than one client.

Following this rule effectively moves the impact of demand paging into the
user-side client, which can then choose to use demand-paged memory (or not),
knowing that other clients won’t affect this choice.

Avoiding page faults in your kernel-side driver code


Instead of the driver copying demand-paged client memory directly into kernel
memory that uses mutex protection, it should copy this data to a temporary
kernel-side buffer first, and then copy it safely to its final location. There
are a couple of techniques you can use for this:

Exchanging data using shared chunks


Your device driver can create a shared chunk and map it to a linear address
space in the kernel process, which is never paged out. Future accesses to the
data in this shared chunk will not cause page faults. But note that changing an
existing driver that exchanges data using buffers or descriptors to use shared
chunks is not a trivial task. I recommend that you only use this method in
exceptional circumstances.

Copying data to the kernel stack


Another useful technique is to copy the data to a kernel-side stack-based buffer
in the context of the client’s thread. This means that any accesses to paged data
will fault in the context of the client thread. This method is not efficient because
it involves a data copy. It should only be used when the amount of the data is
known to be small (less than about 512 bytes), and when the additional overhead
is considered acceptable.

Unfortunately, apart from the methods mentioned above, there is no general
technique you can use when reworking kernel-side code to avoid accessing pageable
memory while holding mutexes. You will need to find specific solutions for your
own particular situation, and may have to re-architect your code.

If it were feasible to have complete knowledge of all the software on the phone
and its interactions, then it might be possible to prove that a certain mutex usage
could never cause deadlock, and so was ‘safe’. This is almost impossible on a
complex phone and, even if safe mutex usage could be proven, this is likely to be
a fragile situation, susceptible to breaking when system code changes. I repeat
that the only safe assumption you can make is that any access to pageable
memory while holding a mutex has the potential to cause system deadlock.

Mutex use
If a device driver accesses paged data from a thread other than that of its client,
it should use one of the Kern::ThreadXxx() APIs listed in Section 5.2.2.1.
These APIs use the system lock, which automatically excludes the use of
another NFastMutex. You should ensure that they are never called while
holding a DMutex.

If a driver reads from its client’s user-side memory space while executing in
its client’s thread context, it must use one of the following published APIs:

// klib.h
umemget()
umemget32()
kumemget()
kumemget32()
// kernel.h
Kern::KUDesGet()
Kern::KUDesInfo()
Kern::InfoCopy()

These APIs have a precondition that excludes their use with a DMutex, because
they cannot be called from a critical section. Again, you must ensure they are not
called while holding an NFastMutex.

Kernel ASSERTs
The functions listed in Section 5.2.3.1 contain asserts, active in UDEB builds,
which cause a system fault if called while (most kinds of) system mutex are held.
This can help you to identify code that needs modification for demand paging.
However, to ease the integration of demand paging, these asserts do permit
mutexes with an order value of KMutexOrdGeneral0 through
KMutexOrdGeneral7. This should not be taken as indicating that these
mutexes are safe for use with demand paging.

The assertion statements in Kern::ThreadRawWrite() do not trigger if the
source address is in the kernel heap or a thread’s supervisor stack. This is
because this usage cannot cause paging faults and is explicitly allowed.
Enabling Demand Paging on a New Platform 87

The assertions are not active unless there is demand-paged memory in the sys-
tem, so will not affect products that do not make use of demand paging.

5.3 Guidelines for Migrating Device Drivers

5.3.1 Typical Device Driver Architectures in EKA2


In this section, I’ll take a look at typical device driver architectures in EKA2 and
point out those that are more susceptible to taking page faults. As we saw in Sec-
tion 5.2.2, if a device driver needs to access data structures via pointers passed
by its paged clients and these data structures live in the client’s read-only (con-
stant) data section, then passing a pointer to this paged data to the driver could
result in a page fault when the driver de-references the pointer.

I’ll also point out those architectures that are most likely to affect the performance
of other device drivers or other clients of those drivers.

Boot-loaded non-channel-based drivers


Boot-loaded device drivers are built as kernel extensions. Typically, this form is
used by simple device drivers with limited or no user-side client interface.

Kernel extensions are used to provide user-interface services, such as support
for hardware keypads, keyboards, LCD and touch-sensitive screens. The
interface to the user-side client (that is, the window server and associated com-
ponents such as the user-mode screen driver) is either event-based (when the
driver needs to pass data to the client) or uses the hardware abstraction layer
(HAL).

Kernel extensions in this category may execute some of their operation in the
context of a kernel thread. Although HAL calls can be used to pass data struc-
tures (often configuration) to drivers in this category, their operation is usually
safe because, typically, these data structures are accessed in the context of the
calling client thread. But you must take care to validate this assumption, especially
when HAL calls are used to reconfigure the driver and/or hardware. When
the HAL call needs to synchronize with driver operation, it will be done in a kernel
context, and paging issues could arise.

Another possible use of extensions is to provide services to other kernel-side
components, such as other device drivers. These extensions do not have a user-
side client interface and are typically used to provide access to services offered
by hardware such as direct memory access (DMA) controllers, I2C (inter inte-
grated circuit) buses and power resource controllers. They may execute in the
context of a kernel thread, which may not be the same one as their client device
driver’s thread. In this situation, you need to take care when the extension
accesses user-side data passed in by its client. The client driver may pass on a
pointer to a data structure received from its own user-side client, without verifying
the pointer or accessing the data. If that data is paged out, the extension could fault.

Media drivers
Media drivers are channel-based device drivers (PDDs) that are also kernel
extensions. They interface to user-side clients (file systems) via the local media
subsystem (an LDD and a kernel extension) which creates and manages the
channel objects. Typically, the extension part of the media subsystem will perform
early initialization – creating the LDD factory object – but the extension entry
point of media drivers servicing page-in requests will also create and install the
PDD factory object. The channel objects are created on the first access.

The recommended model for media drivers either uses a unique DFC queue and
associated kernel thread, or has the media driver executing wholly in the context
of its client (as the internal RAM media driver does). As we’ve seen, this is be-
cause operations on media can be long running, so sharing the execution thread
with another driver could result in unacceptable impact on the performance of
that driver. The parallelism of file and disk operations would also be impaired.
This means that there is no issue with page faults and shared thread contexts in
media drivers.

One issue arises with media drivers that service page-in requests: if their cli-
ent – a file system running on an associated file-server drive thread – passes
an address in memory that is paged out, and the driver needs to read this in the
context of its unique kernel thread, a deadlock could occur, with the media driver
thread taking a page fault and thus becoming unable to service the ensuing
page-in request. To mitigate this, the media subsystem now ensures that the data
is paged in (if necessary taking a page fault in the context of the file-server drive
thread) before passing it to the media driver thread. Please refer to Section 5.5
for more detail on media driver migration, and to Section 4.3 for more on demand
page locking.

Dynamically loaded, channel-based IO device drivers


Channel-based IO drivers may derive from DLogicalChannelBase or
DLogicalChannel. They may require a PDD to interface to the hardware, or the LDD
may interface to the hardware directly. They may enforce a single client policy or
allow multiple clients to open channels, or share an open channel handle. They
may allow multiple channels to be opened or enforce a single channel policy.
They may support one or many hardware units (typically there is a one-to-one
relationship between channels and units, but a driver may support multiple chan-
nels on the same unit). These different options affect the likelihood of the occur-
rence of the issues described in previous sections.

Drivers derived from DLogicalChannelBase usually execute in the context of
their client. In this case, there is no impact on other drivers if they take a page
fault. Multi-threaded drivers may also derive from DLogicalChannelBase, in
situations where accesses to hardware can be done concurrently. In this case,
the driver will typically create separate kernel threads rather than using shared
DFC queues. This architecture is demand-paging safe. If the driver uses a shared
DFC queue and single associated kernel thread, then the discussion in the next
paragraph about drivers derived from DLogicalChannel applies.

The DLogicalChannel framework requires a message queue, DFC queue and
associated kernel thread. The class has a pointer to a DFC queue so that it is
possible to have each channel execute on a separate kernel thread. Although not
enforced, it is expected that each channel will operate on a separate hardware
unit, if more than one channel per hardware unit is supported. Those channels
should then use the same DFC queue, to avoid complex synchronization mecha-
nisms. If the driver needs a PDD to interface to the hardware, it makes sense to
relegate the decision about which DFC queue and kernel thread to use to the
PDD, which creates the mapping between channels and hardware units.

Although the mechanism exists to allow unique channel contexts, it has been
common practice to use the shared DFC queue thread zero on drivers derived
from DLogicalChannel. Much of the following discussion will consider this type
of driver.

I shall next propose some simple solutions for migrating ‘problem’ drivers to a
demand-paged system. The overriding principle, which we have seen, is that
drivers accessing data structures in their client’s user-side address space do so
from either the client’s thread context or from the context of the driver’s unique
kernel thread – and if the driver can have more than client, in a driver kernel-
thread context that is particular to the client.

Device drivers execute operations in a kernel-thread context by queuing DFCs
on a DFC queue running on that thread. DFCs may be queued as a result of cli-
ent requests, interrupts, IDFCs, timer expiration and system power up/down. So
the requirement for a unique kernel thread translates into one for a unique DFC
queue.

5.3.2 Boot-Loaded Device Drivers


You will typically create a DFC queue and associated kernel thread in the kernel
extension entry point:
const TInt KMyDriverDfcQuePriority = XX;
_LIT(KMyDriveThreadName,MY_DRIVER_THREAD_NAME);

class DMyDriver : public DPowerHandler
    {
    ...
public:
    ...
    TDfcQue* iDfcQ;
    };

DECLARE_STANDARD_EXTENSION()
    {
    TInt r=KErrNoMemory;
    DMyDriver* pH=new DMyDriver;
    if (pH)
        {
        r = Kern::DfcQCreate(pH->iDfcQ, KMyDriverDfcQuePriority,
                             &KMyDriveThreadName);
        if(KErrNone==r)
            {
            // second phase construction of DMyDriver
            }
        }
    return r;
    }

Kern::DfcQCreate() creates a DFC queue in the kernel heap, sets iDfcQ to
point to it and then invokes Kern::DfcQInit(), which creates the kernel thread
associated with this DFC queue. Both these APIs must be called in a critical
section, which is the case for any extension entry point.

One variation of this form has a global DFC queue and simply invokes
Kern::DfcQInit(..) from the entry point – this is the construct used by me-
dia drivers:

TDfcQue MyDriverDfcQ;
const TInt KMyDriverDfcQuePriority = XX;
_LIT(KMyDriveThreadName,MY_DRIVER_THREAD_NAME);

DECLARE_STANDARD_EXTENSION()
{
TInt r=KErrNoMemory;
DMyDriver* pH=new DMyDriver;
if (pH)
{
r = Kern::DfcQInit(&MyDriverDfcQ, KMyDriverDfcQuePriority,
&KMyDriveThreadName);
if(KErrNone==r)
{
// second phase construction of DMyDriver
}
}
return r;
}

Kernel extensions are never unloaded, so there is no need to call destructors on
the DFC queue or the associated thread.

5.3.3 Use of Unique Threads

Ownership of DFC queues


If a channel-based device driver does not need a PDD, then the DFC queue
should be associated with either the LDD factory object (DLogicalDevice-derived)
or the logical channel object (DLogicalChannel or DLogicalChannelBase-derived).
The following discussion, on associating the DFC queue with the
LDD factory or the logical channel, and on the time of the queue’s creation and
destruction, applies to LDDs as well as to PDDs.

1. If the driver enforces a single-channel policy then the DFC queue should
be associated with the PDD factory object (DPhysicalDevice-derived).
The DFC queue should be created as a result of loading the driver and
destroyed as a result of unloading the driver – so to enforce a single chan-
nel policy, the LDD factory DLogicalDevice-derived object’s Create()
function will typically include something like the following:

DMyLDDFactory::Create(DLogicalChannelBase*& aChannel)
    {
    if(iOpenChannels!=0) // iOpenChannels is a member
                         // of DLogicalDevice
        return KErrInUse;
    ... // now create the Logical Channel
    }

2. If the driver does not support more than one hardware unit, then the DFC
queue should again be associated with the PDD factory object
(DPhysicalDevice-derived) that is created when the driver is loaded and
destroyed when it is unloaded. The constructor of the LDD factory object of
a driver that does not support more than one unit will not set bit one of
DLogicalDevice::iParseMask (KDeviceAllowUnit).
3. If a driver supports more than one hardware unit, it might be that the units
are implemented by the same hardware block with a shared control inter-
face. In this case, it might be possible to bring the device to an inconsis-
tent state if the shared control interface is accessed from multiple threads.
Rather than implementing complex synchronization mechanisms, it may
be easier to have all channel operations of the shared interface executing
from the same kernel thread context. Again, the DFC queue is associated
with the PDD factory object, and the queue and kernel thread are created
when the driver is loaded, and destroyed when the driver is unloaded.
4. If a driver supports multiple hardware units that are independent from
each other and independently controlled, then the ownership of the DFC
queue should be given to the PDD object – the physical channel. The DFC
queue (and its associated thread) should be created whenever a channel
is opened, and destroyed when the channel is closed.
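The four cases above reduce to a small decision rule, which can be encoded as a tiny standard C++ function. This is purely illustrative; the enum and function names are my own, not kernel APIs.

```cpp
// Where should the DFC queue live? Cases 1-3 above (single channel,
// single unit, or units sharing a control interface) keep one queue in
// the PDD factory, created at driver load; case 4 (independent units)
// gives each physical channel its own queue, created at channel open.
enum class QueueOwner { PddFactory, PddChannel };

QueueOwner ChooseDfcQueueOwner(bool aMultipleUnits, bool aUnitsIndependent)
{
    if (aMultipleUnits && aUnitsIndependent)
        return QueueOwner::PddChannel;   // case 4: per-channel queue
    return QueueOwner::PddFactory;       // cases 1-3: one queue per driver
}
```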

Creation and destruction of DFC queues


To support the use of DFC queues in dynamically loaded drivers, a new class
TDynamicDfcQue has been added to the kernel. It derives from TDfcQue and
adds a method to destroy the queue easily:

class TDynamicDfcQue : public TDfcQue
    {
public:
    TDynamicDfcQue();
    IMPORT_C void Destroy();
private:
    TDfc iKillDfc;
    };

You create queues by calling a new method in the Kern class:

TInt Kern::DynamicDfcQCreate(TDynamicDfcQue*& aDfcQ,
                             TInt aPriority,
                             const TDesC& aBaseName);

Where:

• aDfcQ is set to the created queue if the operation is successful
• aPriority is the priority of the created thread
• aBaseName is used to name the thread; an eight-digit hex number is
appended to it to make it unique.
The method returns KErrNone if successful, or one of the standard error codes.
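The thread-naming rule can be sketched as follows. This is plain C++ for illustration only: the function is my own, and the source of the numeric suffix is an assumption — the kernel appends an eight-digit hex number of its own choosing to make the name unique.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Derive a unique thread name from a base name by appending an
// eight-digit hexadecimal suffix, in the style of
// Kern::DynamicDfcQCreate(). aId stands in for whatever unique value
// the kernel would use.
std::string MakeUniqueThreadName(const std::string& aBaseName, uint32_t aId)
{
    char suffix[9];                                  // 8 hex digits + NUL
    std::snprintf(suffix, sizeof suffix, "%08x", aId);
    return aBaseName + suffix;
}
```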

The destruction of the DFC queue used by a device driver should be triggered by
the destruction of the object it is associated with. So, to destroy the DFC queue
and terminate the thread associated with it, the destroy method must be called:

• from the LDD factory destructor or PDD factory destructor, whichever
owns the DFC queue (steps 1 to 3 from Section 5.3.3.1)
• from the LDD destructor or PDD destructor, whichever owns the DFC
queue (point 4 in Section 5.3.3.1).

The DLogicalChannel class holds a pointer to the DFC queue used by each
logical channel object derived from it. This pointer is typically set up during the
second-phase construction of the LDD in the DoCreate() function, which
means the DFC queue must have been created by the time the LDD’s
DoCreate() is invoked. This is guaranteed when the DFC queue is owned by the LDD
or PDD factory objects, because these are created when first loading the logical
or physical device. It is also guaranteed in the case when the PDD object owns
the DFC queue, as the order of channel construction is as follows:

1. LDD constructor
2. PDD constructor
3. PDD DoCreate()
4. LDD DoCreate()

However, when the LDD owns and creates the DFC queue, it is down to you, the
developer, to guarantee that the correct pointer to the DFC queue is stored in
DLogicalChannel::iDfcQ as part of DoCreate().

Note that the LDD has access to the LDD factory object, the PDD factory object
and the PDD through the iDevice, iPhysicalDevice and iPdd pointers in
the DLogicalChannelBase base class.

In the next four sections, I’ll give code examples for these different situations.



5.3.4 DFC Queue in Logical Device

const TInt KMyDriverThreadPriority = 27;
_LIT(KMyDriverThread,"MyDriverThread");

class DMyLogicalDevice : public DLogicalDevice
    {
public:
    DMyLogicalDevice();
    ~DMyLogicalDevice();
    void Construct(TDynamicDfcQue* aDfcQ);
    virtual TDfcQue* DfcQ();
    ...
public:
    ...
    TDynamicDfcQue* iDfcQ;
    };

DMyLogicalDevice::DMyLogicalDevice()
// Constructor
    {
    // sets iVersion and iParseMask but leaves bit 2
    // (KDeviceAllowPhysicalDevice) unset
    }

DMyLogicalDevice::~DMyLogicalDevice()
// Destructor
    {
    ...
    // cancel any other DFCs owned by this device
    if (iDfcQ)
        iDfcQ->Destroy();
    }

void DMyLogicalDevice::Construct(TDynamicDfcQue* aDfcQ)
    {
    iDfcQ=aDfcQ;
    }

TDfcQue* DMyLogicalDevice::DfcQ()
    {
    return iDfcQ;
    }

DECLARE_STANDARD_LDD()
    {
    DMyLogicalDevice* pD=new DMyLogicalDevice;
    if(pD)
        {
        TDynamicDfcQue* q;
        TInt r = Kern::DynamicDfcQCreate(q, KMyDriverThreadPriority,
                                         KMyDriverThread);
        if(KErrNone==r)
            {
            pD->Construct(q);
            return pD;
            }
        pD->AsyncClose();
        }
    return NULL;
    }

// Logical Channel
DMyDriverLogicalChannel::DoCreate(TInt aUnit,
                                  const TDesC8* /*anInfo*/,
                                  const TVersion &aVer)
    {
    ...
    SetDfcQ(iDevice->DfcQ());
    ...
    }

In the previous code extract, we create a dynamic DFC queue on the kernel heap
and arrange for the logical device object to have a pointer to it.

When the logical device is loaded, the DLL entry point is invoked with
KModuleEntryReasonProcessAttach, which invokes the LDD-specific
initialization DECLARE_STANDARD_LDD(). The LDD-specific entry point creates
the LDD factory object and, if successful, creates the dynamic DFC queue (and
associated thread). The logical channel uses the pointer to the logical device to
obtain the DFC queue.

A possible variation to this scheme creates the dynamic DFC queue in the
DLogicalDevice-derived Install() function. This simplifies the entry point:

_LIT(KLddName,"MyDriver");
TInt DMyLogicalDevice::Install()
// Install the device driver.
    {
    TDynamicDfcQue* q;
TInt r = Kern::DynamicDfcQCreate(q, KMyDriverThreadPriority,
KMyDriverThread);
if(KErrNone==r)
{
Construct(q);
r=SetName(&KLddName);
}
return r;
}

DECLARE_STANDARD_LDD()
{
return new DMyLogicalDevice;
}

5.3.5 DFC Queue in Physical Device

class DMyPhysicalDevice : public DPhysicalDevice
    {
public:
    DMyPhysicalDevice();
    ~DMyPhysicalDevice();
    void Construct(TDynamicDfcQue* aDfcQ);
    virtual TDfcQue* DfcQ();
    ...
public:
    ...
    TDynamicDfcQue* iDfcQ;
    };

DMyPhysicalDevice::DMyPhysicalDevice()
// Constructor
{
// sets iVersion and iUnitMask (if required)
}

DMyPhysicalDevice::~DMyPhysicalDevice()
// Destructor
{
...
// cancel any other DFCs owned by this device
if (iDfcQ)
iDfcQ->Destroy();
}

void DMyPhysicalDevice::Construct(TDynamicDfcQue* aDfcQ)


{
iDfcQ=aDfcQ;
}

TDfcQue* DMyPhysicalDevice::DfcQ()
{
return iDfcQ;
}

DECLARE_STANDARD_PDD()
{
DMyPhysicalDevice* pD=new DMyPhysicalDevice;
if(pD)
{
TDynamicDfcQue* q;
TInt r = Kern::DynamicDfcQCreate(q, KMyDriverThreadPriority,
KMyDriverThread);
if(KErrNone==r)
{
pD->Construct(q);
return pD;
}
pD->AsyncClose();
}
return NULL;
}

// Logical Channel
DMyDriverLogicalChannel::DoCreate(TInt aUnit,
const TDesC8* /*anInfo*/,
const TVersion &aVer)
{
...
SetDfcQ(iPhysicalDevice->DfcQ());
...
}

The previous code extract uses the same principles outlined in the previous para-
graph, with the main differences being:

• The pointer to the DFC queue is owned by the physical device.


• The PDD entry point creates the physical device and the DFC queue.
• The logical channel uses the pointer to the physical device to obtain the
DFC queue.

Again, you can create the DFC queue in the DPhysicalDevice-derived
Install() function.

5.3.6 DFC Queue in Logical Channel


class DMyDriverLogicalChannel : public DLogicalChannel
    {
public:
    DMyDriverLogicalChannel();
    virtual ~DMyDriverLogicalChannel();
    virtual TInt DoCreate(TInt aUnit, const TDesC8* anInfo,
                          const TVersion& aVer);
    ...
public:
    ...
    // DLogicalChannel has a public pointer to a TDfcQue, iDfcQ
    };
// Logical Device
DMyLogicalDevice::DMyLogicalDevice()
// Constructor
    {
    ...
    // sets iVersion and iParseMask with bit 1 (KDeviceAllowUnit)
    // set and bit 2 (KDeviceAllowPhysicalDevice) unset
    }

TInt DMyLogicalDevice::Create(DLogicalChannelBase*& aChannel)


{
aChannel=new DMyDriverLogicalChannel;

if(!aChannel)
return KErrNoMemory;
return KErrNone;
}

// Logical Channel
DMyDriverLogicalChannel::DMyDriverLogicalChannel()
// Constructor
{
// may set up pointer to owning client’s thread
// and increase its reference count
// iDfcQ=NULL;
}

DMyDriverLogicalChannel::~DMyDriverLogicalChannel()
// Destructor
{
// may also decrease the owning client’s thread reference count
...
// cancel any other DFCs owned by this channel
if (iDfcQ)
iDfcQ->Destroy();
}

TInt DMyDriverLogicalChannel::DoCreate(TInt aUnit,
                                       const TDesC8* anInfo, const TVersion& aVer)
    {
    // check platform security capabilities
    ...
    TInt r= KErrNoMemory;
    TDynamicDfcQue* q;
    r = Kern::DynamicDfcQCreate(q, KMyDriverThreadPriority,
                                KMyDriverThread);
    if (KErrNone==r)
        {
        SetDfcQ(q);
        iMsgQ.Receive();
        return r;
        }

    return r; // if error, framework will delete Logical Channel
    }

In the example, the DFC queue is owned by the logical channel. The second-
phase constructor (DoCreate()) creates the queue. If successful, it sets the
DLogicalChannel pointer to DFC queue (iDfcQ).

When the channel is closed, the destructor of the logical channel is invoked and
this destroys the DFC queue.

5.3.7 DFC Queue in PDD


class DMyDriver : public DBase
    {
public:
    DMyDriver();
    ~DMyDriver();
    TInt DoCreate(TInt aUnit, const TDesC8* anInfo);
    virtual TDfcQue* DfcQ(TInt aUnit);
    ...
public:
    ...
    DLogicalChannel* iLdd;
    TInt iUnit;
    TDynamicDfcQue* iDfcQ;
    };

// Logical Device
DMyLogicalDevice::DMyLogicalDevice()
// Constructor
    {
    // Sets iVersion and iParseMask with bit 1 (KDeviceAllowUnit)
    // and bit 2 (KDeviceAllowPhysicalDevice) set
    ...
    }

// Physical Device
TInt DMyPhysicalDevice::Create(DBase*& aChannel, TInt aUnit,
const TDesC8* aInfo, const TVersion& aVer)
{
DMyDriver* pD=new DMyDriver;
aChannel=pD;
TInt r=KErrNoMemory;
if (pD)
r=pD->DoCreate(aUnit,aInfo);
return r;
}

// PDD
DMyDriver::DMyDriver()
// Constructor
{
...
//iDfcQ=NULL;
}

DMyDriver::~DMyDriver()
// Destructor
{
...
// cancel any other DFCs owned by this channel
if (iDfcQ)
iDfcQ->Destroy();
}

TInt DMyDriver::DoCreate(TInt aUnit, const TDesC8* /*anInfo*/)
    {
    iUnit=aUnit;
    TInt r=KErrNoMemory;
    TDynamicDfcQue* q;
    r = Kern::DynamicDfcQCreate(q, KMyDriverThreadPriority,
                                KMyDriverThread);
    if (KErrNone==r)
        {
        iDfcQ=q;
        }
    return r; // if error, framework will delete LDD and PDD
    }

TDfcQue* DMyDriver::DfcQ(TInt aUnit)
    {
    TDfcQue* pDfcQ=NULL;
    if(aUnit==iUnit)
        pDfcQ=iDfcQ;
    return pDfcQ;
    }

// Logical Channel
DMyDriverLogicalChannel::DoCreate(TInt aUnit, const TDesC8* /*anInfo*/,
                                  const TVersion &aVer)
    {
    ...
    SetDfcQ(iPdd->DfcQ(aUnit));
    ...
    }

Key points:

• The PDD owns the pointer to the DFC queue.
• The PDD requires a second-phase construction, which creates the DFC
queue.
• The DFC queue is associated with a hardware unit.
• The logical channel obtains the DFC queue through its pointer to the PDD.

5.3.8 Changes to Symbian Device Drivers


In Symbian driver code, we have moved the initialization of DFC queues from
platform-independent code to platform-specific code. This provides a mecha-
nism for platform-specific code to create its own DFC queue as recommended
in Section 5.2.4.1. To make use of this mechanism, some specific drivers have
been modified in Symbian OS v9.3, breaking binary compatibility in the process.
These cases are described next. Base ports that use these drivers will need to be
migrated. The base ports supplied by Symbian have already been migrated.

USB driver
The USB driver originally used DfcQue0 for its iPowerUpDfc and
iPowerDownDfc members and set this in the platform-independent layer.

From e32/drivers/usbcc/ps_usbc.cpp:

DUsbClientController::DUsbClientController()
    {
    __KTRACE_OPT(KUSB,
        Kern::Printf("DUsbClientController::DUsbClientController()"));
#ifndef SEPARATE_USB_DFC_QUEUE
    iPowerUpDfc.SetDfcQ(Kern::DfcQue0());
    iPowerDownDfc.SetDfcQ(Kern::DfcQue0());
#endif
    }

The driver now requires initialization of iPowerUpDfc and iPowerDownDfc in
the platform-specific layer. In the Symbian-provided base-ports, a dedicated DFC
queue is also created.

From omap/shared/usb/pa_usbc.cpp:

#ifdef SEPARATE_USB_DFC_QUEUE
const TInt KUsbThreadPriority = 27;
_LIT8(KUsbThreadName,"UsbThread");

TInt TOmapUsbcc::CreateDfcQ()
    {
    TInt r=Kern::DfcQCreate(iDfcQ,KUsbThreadPriority,&KUsbThreadName);
    if (KErrNone != r)
        {
        __KTRACE_OPT(KHARDWARE, Kern::Printf("PSL: > Error initializing USB client support. Can't create DFC Que"));
        return r;
        }
    iPowerUpDfc.SetDfcQ(iDfcQ);
    iPowerDownDfc.SetDfcQ(iDfcQ);
    return KErrNone;
    }
#endif

You can restore the original functionality by undefining the
SEPARATE_USB_DFC_QUEUE macro in e32/kernel/kern_ext.mmh.

Sound driver
A new virtual abstract method has been added to the class DSoundPDD:
virtual TDfcQue* DfcQ() = 0;

You must define this method in the base-port-derived object and ensure it returns
the DFC queue to use.

I recommend that you change the driver so that it creates its own DFC queue, as
described in Section 5.2.4.1. However, the minimum required change is to imple-
ment the function to return a pointer to DFC queue zero. For example, the follow-
ing implementation would suffice:

TDfcQue* DSoundPddDerived::DfcQ()
    {
    return Kern::DfcQue0();
    }

DDigitiser
The class DDigitiser no longer initializes the member variable iDfcQ (DFC
queue pointer). You must initialize this variable in the base-port derived object.

We recommend that you change the driver so that it creates its own DFC queue,
as described in Section 5.2.4.1. However, the minimum required change is to set
the variable to use DFC queue zero in the derived object. For example, you could
add the following code to the derived class constructor:
DDigitiserDerived::DDigitiserDerived()
    {
    // ...
    iDfcQ = Kern::DfcQue0();
    // ...
    }

5.3.9 Additional Impact of Migrating Device Drivers


Once you have implemented the recommendations in Section 5.2.4, you will
have a more multi-threaded base port. In general, this is a good thing because it
provides fairer scheduling of kernel-side code. However, there may be negative
impacts that need considering:

• A base-port component may provide a service to device drivers, hardware
or software, which, once initiated, should not be disturbed until it
completes – but now the service might be pre-empted by another kernel
thread.
• A hardware component may have a control interface that can be used by
a number of drivers. Operations on the control interface, although almost
instantaneous, may not be atomic and therefore should not be interrupted.

In the first situation, when the state of a resource needs to be protected from
the effects of pre-emption for an appreciable period of time, the recommended
approach is to use mutual exclusion, protecting the resource with a DMutex. An
exception to this is where the only risk is of the same driver triggering the same
operation before the previous one completes – that is, when an operation is non-
blocking and occurs from different thread contexts. In that case, an NFastMutex
should suffice.

An example of the second situation is a ‘set-clear’ control interface, with a pair
of registers, where one (A) contains bits to be set and the other (B) contains bits
to be cleared. You have to write to both registers to produce the desired state. If
the operation is pre-empted after A is set but before B is cleared, and a new set-
clear operation is initiated, the final state of the interface may be undetermined.
Pre-emption protection in this case may be achieved by simply locking the ker-
nel (using NKern::Lock()) before the operation starts and unlocking it (using
NKern::Unlock()) after it completes. If the interface is to be used from an
interrupt context, it is sufficient to disable all interrupts around it to protect against
thread concurrency.
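The set-clear hazard and its fix can be sketched in standard C++. This is not Symbian kernel code: the register pair is modelled as a plain variable, and a std::mutex stands in for the NKern::Lock()/NKern::Unlock() bracketing (or interrupt disabling) that a real driver would use.

```cpp
#include <cstdint>
#include <mutex>

// Model of a hypothetical 'set/clear' register pair: 1-bits written to
// the set register turn outputs on; 1-bits written to the clear
// register turn them off.
struct SetClearInterface
{
    uint32_t iState = 0;
    void WriteSet(uint32_t aMask)   { iState |=  aMask; }
    void WriteClear(uint32_t aMask) { iState &= ~aMask; }
};

// The two writes form one logical operation, so another thread must not
// interleave its own writes between them. The lock guarantees that.
class GuardedSetClear
{
public:
    void Update(uint32_t aSetMask, uint32_t aClearMask)
    {
        std::lock_guard<std::mutex> guard(iLock);  // stand-in for NKern::Lock()
        iHw.WriteSet(aSetMask);     // write register A: bits to set
        iHw.WriteClear(aClearMask); // write register B: bits to clear
    }
    uint32_t State() const { return iHw.iState; }
private:
    std::mutex iLock;
    SetClearInterface iHw;
};
```

If Update() were pre-empted between the two writes and another thread performed its own update, the final register state would depend on the interleaving; with the lock, each pair of writes completes as a unit.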

5.4 Media Driver Migration


The media driver is a key component in the operation of demand paging. As you
can see in Figure 2, the NAND driver on a NAND XIP ROM is responsible for
servicing page-in requests from the paging subsystem. This means that if you
are implementing demand paging on your platform, you will have to modify your
media drivers so that they support these additional requests.

As before, it is essential that the thread in which the driver runs does not itself
take a page fault, otherwise deadlock will occur.

A media driver is typically a PDD with a filename in the form ‘med*.pdd’. Like
other kernel-side components, it is always marked as unpaged, which means
that its code and read-only data sections will never be paged out. The only time
the media driver could theoretically take a page fault is when it accepts a write
request from a user-side client whose source data is paged out – this could be
data in the paged area of an XIP ROM or code that has been loaded into RAM
from code-paging-enabled media. To remedy this, Symbian has modified the
local media subsystem to ensure that the source data in a write request is paged
in before the write request is passed to the media driver thread. This may mean
taking a page fault in the context of the file-server drive thread before passing the
request on.

Large write requests of paged data are fragmented into a series of smaller ones
to avoid exhausting available RAM. Such fragmentation is quite rare but it might
happen, for example, when copying a large ROM data file into a temporary
location on the user data drive.
I explain the steps needed to enable a media driver to support XIP ROM and/or
code paging in the following sections. For the specific changes required to
support paging from an internal MMC/SD card, see Section 5.4.6.

5.4.1 Changes to variantmediadef.h


To support paging, you should define the following parameters using
appropriate macro names (the names are not important) in the variant’s
variantmediadef.h file:

1. The paging flags – whether code paging and/or XIP ROM paging is supported.

2. The paging fragment size. If a write request points to paged data, then the
request will be split up into separate fragments of this size. This value
needs to be chosen with care. If it is too small, writes may take an
unacceptably long time to complete. If it is too large, paging requests may
take an unacceptably long time to be satisfied.
3. The number of drives that support code paging. If code paging is not
supported (that is, only XIP ROM paging is supported), this should be
zero.
4. The list of local drives that support code paging (if code paging is
supported). This should be a subset of the overall drive list supported by
the media driver.

For example, here (in bold italics) are the changes made to support paging on
NAND on the H4 reference platform:

// Variant parameters for NAND flash media driver (mednand.pdd)
#define NAND_DRIVECOUNT 8
#define NAND_DRIVELIST 2,3,5,6,7,9,10,11
#define NAND_NUMMEDIA 1
#define NAND_DRIVENAME "Nand"
#define PAGING_TYPE DPagingDevice::ERom | DPagingDevice::ECode
// code paging from writeable FAT, Composite FAT and first ROFS
#define NAND_PAGEDRIVELIST 2,5,6
#define NAND_PAGEDRIVECOUNT 3
#define NUM_PAGES 8 // defines the size of a fragment

The macros can then be picked up in the media driver source code and
passed to LocDrv::RegisterPagingDevice(). This function is similar to
LocDrv::RegisterMediaDevice() in that it takes a drive list as a parameter
but in this case it identifies the drive(s) to be used for code paging (if any).



5.4.2 Changes to the Driver’s Kernel Extension Entry Point


There are two initial stages in a media driver’s lifetime that need to be considered:
1. The kernel extension entry point – normally identified by the
DECLARE_STANDARD_EXTENSION macro.
2. The PDD entry point – identified by the DECLARE_EXTENSION_PDD macro.
A media driver’s kernel extension entry point is called very early on in the boot
sequence. Sometime later, the file server loads all media drivers and calls their
PDD entry points. Each PDD exports a single function at ordinal one for creating
the PDD factory object. When the file server issues the first request to a drive
object associated with the media, the local media subsystem calls the factory
object’s Create() function to instantiate the media driver object. However, for
demand paging to start as soon as possible in the boot sequence, we need to
instantiate and install the PDD factory object earlier – in the kernel extension
entry point.

Some media drivers may have no kernel extension entry point defined (for
example, the MMC media driver). These will have a DECLARE_STANDARD_PDD
macro defined rather than DECLARE_EXTENSION_PDD. You will need to modify
these to have a DECLARE_EXTENSION_PDD / DECLARE_STANDARD_EXTENSION pair.

The kernel extension entry point must create a dedicated DFC queue (as
discussed earlier) – otherwise a page fault in a drive thread cannot be satisfied.
The entry point must then create a DPrimaryMediaBase object and register it
with the local media subsystem. To support demand paging, you should modify
the entry point to register the paging device with the paging subsystem, and
instantiate and install the driver factory object.

The following is an example of such a change (changes in bold italics):

static const TInt NandPagingDriveNumbers[NAND_PAGEDRIVECOUNT+1] =
    {NAND_PAGEDRIVELIST};

DECLARE_STANDARD_EXTENSION()
    {
    TInt r=Kern::DfcQInit(&NandMediaDfcQ,
                          KNandThreadPriority,
                          &KNandMediaThreadName);
    if (r!=KErrNone)
        return r;

    DPrimaryMediaBase* pM=new DPrimaryMediaBase;
    if (!pM)
        return KErrNoMemory;

    pM->iDfcQ=&NandMediaDfcQ;
    r=LocDrv::RegisterMediaDevice(MEDIA_DEVICE_NAND,
                                  NAND_DRIVECOUNT,
                                  NandDriveNumbers,
                                  pM,
                                  NAND_NUMMEDIA,
                                  KNandDriveName);
    if (r != KErrNone)
        return r;

    r = LocDrv::RegisterPagingDevice(pM,
                                     NandPagingDriveNumbers,
                                     NAND_PAGEDRIVECOUNT,
                                     PAGING_TYPE,
                                     SECTOR_SHIFT,
                                     NUM_PAGES);
    if (r == KErrNone)
        {
        device = new DPhysicalDeviceMediaNand;
        if (device == NULL)
            return KErrNoMemory;
        r = Kern::InstallPhysicalDevice(device);
        }
    // Ignore error if demand paging not supported by kernel
    else if (r == KErrNotSupported)
        r = KErrNone;
    else
        return r;

    pM->iMsgQ.Receive();
    return KErrNone;
    }

Note that:

• A hardware component may have a control interface that can be used by
a number of drivers. Operations on the control interface, although almost
instantaneous, may not be atomic and therefore should not be interrupted.
• The DECLARE_EXTENSION_PDD entry point will still be called some time
later when the file server tries to load all the media drivers in the system.
When this happens, the media driver will create a second factory object,
but this will be deleted by the kernel when it discovers that another factory
object bearing the same name is already in its internal list.
• The fifth parameter passed to LocDrv::RegisterPagingDevice() is
the log2 of the sector size for the given media – for example, nine
(corresponding to a sector size of 512) for most media.
• To prevent compilation errors when code paging is disabled
(NAND_PAGEDRIVECOUNT is zero), the drive number array passed to
LocDrv::RegisterPagingDevice() is one greater in length than the
drive count.
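The log2 relationship in the first of these notes can be sanity-checked with a small standalone helper. This is plain C++ for illustration, not part of the driver API; the function name is invented:

```cpp
#include <cstdint>

// Returns the log2 'read shift' for a power-of-two sector size, or -1 if
// the size is not a power of two. For the common 512-byte sector this
// yields 9, the value passed as the fifth parameter to
// LocDrv::RegisterPagingDevice() in the example above.
int SectorShift(std::uint32_t sectorSize) {
    if (sectorSize == 0 || (sectorSize & (sectorSize - 1)) != 0)
        return -1; // not a power of two
    int shift = 0;
    while ((1u << shift) != sectorSize)
        ++shift;
    return shift;
}
```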

5.4.3 Changes to TLocalDriveCaps


You should modify the TLocalDriveCaps structure so that:

• The KMediaAttPageable flag is set in iMediaAtt.
• The KDriveAttPageable flag is set if the particular drive has been
registered as a code-paging drive (determined by testing
TLocDrvRequest::Drive()->iPagingDrv). Here is an example
(changes in bold italics):

TInt DMediaDriverNand::Request(TLocDrvRequest& aRequest)
    {
    TInt r=KErrNotSupported;
    TInt id=aRequest.Id();
    if (id == DLocalDrive::ECaps)
        {
        TLocDrv* drive = aRequest.Drive();
        TLocalDriveCapsV4& c =
            *(TLocalDriveCapsV4*)aRequest.RemoteDes();
        r=Caps(*drive,c);
        }
    // etc
    }

TInt DMediaDriverNand::Caps(TLocDrv& aDrive, TLocalDriveCapsV4& caps)
    {
    // fill in rest of caps structure as usual…
    if(aDrive.iPrimaryMedia->iPagingMedia)
        caps.iMediaAtt|=KMediaAttPageable;
    if(aDrive.iPagingDrv)
        caps.iDriveAtt|=KDriveAttPageable;
    }

Additionally, the TLocalDriveCaps::iDriveAtt member must have the
KDriveAttLocal and KDriveAttInternal flags set, and the
KDriveAttRemovable flag cleared. Demand paging is only supported for
internal non-removable media.

5.4.4 Handling Paging Requests


You need to handle four new request types to support paging; the enumeration is
TPagingRequestId in the DMediaPagingDevice class.

• ERomPageInRequest – treat this as a normal read except that the
position stored in the request is the offset from the start of the XIP ROM
image, not the start of the media. This is because the local media
subsystem has no way of knowing the absolute position of a particular XIP
ROM page from the start of the media. Also, to write the data back to the
client, use TLocDrvRequest::WriteToPageHandler() instead of
TLocDrvRequest::WriteRemote().
• ECodePageInRequest – treat this as a normal read, but use
TLocDrvRequest::WriteToPageHandler() instead of
TLocDrvRequest::WriteRemote() to write data back to the client.
The position in the request is the offset from the start of the media, as for
a normal read.
• EWriteRequestFragment, EWriteRequestFragmentLast – these
requests mark the start, middle or end of a sequence of writes. Each
sequence is terminated by an EWriteRequestFragmentLast request (so
long as one of the previous requests does not complete with an error).

5.4.5 Coping with Fragmented Write Requests


In many respects, you can treat EWriteRequestFragment and
EWriteRequestFragmentLast as normal write requests. However, you should
note that any of these write requests may be interleaved with requests from other
file-server drive threads (assuming the media supports more than one partition)
– which could be seen as a functional break in behavior. If you need to maintain
backwards compatibility, and prevent write requests from being interleaved in this
way, it is up to the media driver itself to keep track of the ‘current’ write request
chain and defer requests from other drive threads while a write fragment chain is
in progress. To achieve this, two steps are necessary:

1. Ensure the local media subsystem LDD (elocd.ldd) has been built with
the __ALLOW_CONCURRENT_FRAGMENTATION__ macro undefined. This
ensures that the local media subsystem never issues more than one write
fragment at a time.
2. Change the paging media driver so that it keeps track of write request
chains and defers any read or format requests received after the first
fragment and before the last in a sequence. Note that write fragments
should never be deferred.

One way in which you could implement step two is for the media driver to
maintain a bit mask, with each bit representing a ‘write fragment in progress’ flag
for a particular drive. For example:

iFragmenting |= (0x1<<iCurrentReq->Drive()->iDriveNumber);

If a read or format request is received while any of the bits in
iFragmenting is set, then the request may be deferred.
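That bookkeeping can be sketched as a minimal standalone model. This is plain C++ for illustration, not media-driver code; the names are invented, and it checks the bit for the specific drive (a per-drive variant of the check described above):

```cpp
#include <cstdint>

// Tracks which drives have a write-fragment chain in progress, mirroring
// the iFragmenting bit mask described above. Reads and formats for a
// drive are deferred while its bit is set; write fragments never are.
class FragmentTracker {
public:
    void BeginFragmentChain(int driveNumber) {
        iFragmenting |= (1u << driveNumber);
    }
    void EndFragmentChain(int driveNumber) {
        iFragmenting &= ~(1u << driveNumber);
    }
    // True if a read or format request for this drive should be deferred.
    bool ShouldDefer(int driveNumber) const {
        return (iFragmenting & (1u << driveNumber)) != 0;
    }
private:
    std::uint32_t iFragmenting = 0;
};
```

The driver would call BeginFragmentChain() on the first EWriteRequestFragment for a drive, and EndFragmentChain() when the EWriteRequestFragmentLast completes or a fragment fails.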

5.4.6 Paging From an Internal MMC/SD Card

MMC PSL changes


You can enable ROM and code paging for an MMC card, provided the card is
non-removable. (If a page-in request was issued when the card was removed,
the kernel would fault.) Because the MMC media driver is entirely generic, we
need a way of returning the paging-related information contained in
variantmediadef.h to the generic part of the MMC stack. We do this by
modifying the PSL layer of the MMC stack to implement the (new)
DMMCStack::MDemandPagingInfo interface method, as shown in the following
code block (new code is in bold italics).
// mmc.h
class DMMCStack : public DBase
    {
public:
    ...
    // Demand paging support
    // see KInterfaceDemandPagingInfo
    class TDemandPagingInfo
        {
    public:
        const TInt* iPagingDriveList;
        TInt iDriveCount;
        TUint iPagingType;
        TInt iReadShift;
        TUint iNumPages;
        TBool iWriteProtected;
        TUint iSpare[3];
        };

    class MDemandPagingInfo
        {
    public:
        virtual TInt DemandPagingInfo(TDemandPagingInfo& aInfo) = 0;
        };
    ...
    };

Here is an example, taken from the H4 HRP:

variantmediadef.h changes (shown in bold italics):


// Variant parameters for the MMC Controller (EPBUSMMC.DLL)
#define MMC_DRIVECOUNT 1
#define MMC_DRIVELIST 1
#define MMC_NUMMEDIA 1
#define MMC_DRIVENAME "MultiMediaCard0"

#define MMC_PAGING_TYPE DPagingDevice::ERom | DPagingDevice::ECode
#define MMC_PAGEDRIVELIST 1 // code paging from user data
#define MMC_PAGEDRIVECOUNT 1
#define MMC_NUM_PAGES 8

H4 MMC stack class definition (changes shown in bold italics):


class DDemandPagingInfo : public DMMCStack::MDemandPagingInfo
    {
public:
    virtual TInt DemandPagingInfo(DMMCStack::TDemandPagingInfo& aInfo);
    };

class DOmapMMCStack : public DCardStack
    {
public:
    virtual void GetInterface(TInterfaceId aInterfaceId,
                              MInterface*& aInterfacePtr);
    ...
private:
    DDemandPagingInfo* iDemandPagingInfo;
    ...
    };

H4 MMC stack class implementation:


TInt DOmapMMCStack::Init()
    {
    if((iDemandPagingInfo = new DDemandPagingInfo()) == NULL)
        return KErrNoMemory;
    ...
    }

void DOmapMMCStack::GetInterface(TInterfaceId aInterfaceId,
                                 MInterface*& aInterfacePtr)
    {
    if (aInterfaceId==KInterfaceDemandPagingInfo)
        aInterfacePtr=(DMMCStack::MInterface*)iDemandPagingInfo;
    }

TInt DDemandPagingInfo::DemandPagingInfo(
        DMMCStack::TDemandPagingInfo& aDemandPagingInfo)
    {
    static const TInt pagingDriveNumbers[MMC_PAGEDRIVECOUNT+1] =
        {MMC_PAGEDRIVELIST};

    aDemandPagingInfo.iPagingDriveList = pagingDriveNumbers;
    aDemandPagingInfo.iDriveCount = MMC_PAGEDRIVECOUNT;
    aDemandPagingInfo.iPagingType = MMC_PAGING_TYPE;
    aDemandPagingInfo.iReadShift = 9;
    aDemandPagingInfo.iNumPages = MMC_NUM_PAGES;
    return KErrNone;
    }

Preparing an internal MMC card for ROM paging – MMCLoader


The MMCLoader utility can be found in e32utils/mmcloader. It is used to
write a ROM image to the internal MMC card, ready for paging. The syntax is as
follows:

mmcloader <RomSrcFileName> <UnPagedRomDstFileName> <PagedRomDstFileName>

For example:

mmcloader z:\core.img d:\sys$rom.bin d:\sys$rom.pag

MMCLoader performs the following steps:



1. Splits RomSrcFileName into non-paged and paged files.
2. Formats the MMC card.
3. Writes the paged part of the ROM to a standard FAT image file on the
MMC card.
4. Checks that the file’s sectors are contiguous (which should normally be
the case because the card has just been formatted).
5. Stores a pointer to the image file in the boot sector.

Then, when the board is rebooted, the MMC/SD media driver reads the boot
sector and uses the stored pointer to determine the location of the image file, so
that it can begin to satisfy paging requests.
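Step 4 of that sequence – verifying the image file occupies contiguous sectors – amounts to the following check. This is an illustrative sketch, not MMCLoader's actual code, which works on FAT cluster chains:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns true if the given sector numbers form one unbroken run, so the
// whole file can be addressed as (startSector, length) by the paging code.
bool SectorsAreContiguous(const std::vector<std::uint32_t>& sectors) {
    for (std::size_t i = 1; i < sectors.size(); ++i)
        if (sectors[i] != sectors[i - 1] + 1)
            return false;
    return true;
}
```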

Modifying EStart
Now we need to prevent the paged and unpaged image files from being
unintentionally deleted from the internal MMC drive. To support this, Symbian has
added a new mechanism to EStart to allow it to permanently clamp the image
files. The variant part of EStart must now implement a new virtual function that
returns the image file names:

TInt TFSStartup::SysFileNames(RArray<TPtrC>& aFileNames);


Here is an example taken from the H4 variant layer (in
\omap_hrp\h4\estart\estartmain.cpp):

// Return the filenames of any "System" files on
// a writeable drive (e.g. internal MMC).
// If the files are found, then they are clamped
// (and never unclamped) to prevent them
// from being overwritten.
TInt TH4FSStartup::SysFileNames(RArray<TPtrC>& aFileNames)
    {
    _LIT(KUnPagedRomFileName,"\\SYS$ROM.BIN");
    aFileNames.Append(KUnPagedRomFileName());
    _LIT(KPagedRomFileName,"\\SYS$ROM.PAG");
    aFileNames.Append(KPagedRomFileName());
    return KErrNone;
    }

5.5 Implementing File Clamping


To implement support for file clamping in other file systems, the file system mount
class (derived from CMountCB) must implement the MFileAccessor interface.
This requires that:

• A call to GetInterface() with CMountCB::EFileAccessor updates
the aInterface argument to point to the mount class.
• A call to GetFileUniqueId() updates the aUniqueId argument to a
valid, file-specific identifier.
Here are examples for the ROFS:

TInt CRofsMountCB::GetInterface(TInt aInterfaceId,
                                TAny*& aInterface, TAny* aInput)
    {
    TInt r= KErrNone;
    switch(aInterfaceId)
        {
        case (CMountCB::EFileAccessor):
            ((CMountCB::MFileAccessor*&) aInterface) = this;
            break;
        ...
        }
    return r;
    }

TInt CRofsMountCB::GetFileUniqueId(const TDesC& aName,
                                   TInt64& aUniqueId)
    {
    // Get unique identifier for the file
    const TRofsEntry* entry=NULL;
    TInt err;
    TRAP(err,iDirectoryCache->FindFileEntryL(aName, entry));
    if(err!=KErrNone)
        return err;
    aUniqueId = MAKE_TINT64(0,entry->iFileAddress);
    return KErrNone;
    }
If pseudo clamping is required (file content will not be modified, nor will dismount
be attempted, but the kernel is required to load executables from the file system),
then a ‘random’ value may be provided for aUniqueId – an example of this is
provided by the ROM and composite file systems.

In addition, the file system implementations of the methods affected by file
clamping must check for the existence of file clamps. An example of this in a
writeable file system is provided in the FAT code. ROFS provides an example for
a read-only file system. Here is the FAT DeleteL() method:

void CFatMountCB::DeleteL(const TDesC& aName)
    {
    __PRINT(_L("CFatMountCB::DeleteL"));
    CheckStateConsistentL();
    CheckWritableL();
    TFatDirEntry fileEntry;
    TEntryPos fileEntryPos(RootIndicator(),0);
    FindEntryStartL(aName,KEntryAttMaskSupported,fileEntry,
                    fileEntryPos);
    TEntryPos dosEntryPos=fileEntryPos;
    TFatDirEntry dosEntry=fileEntry;
    MoveToDosEntryL(dosEntryPos,dosEntry);
    if ((dosEntry.Attributes()&KEntryAttReadOnly) ||
        (dosEntry.Attributes()&KEntryAttDir))
        User::Leave(KErrAccessDenied);
    // Can not delete a file if it is clamped
    CMountCB* basePtr=(CMountCB*)this;
    TInt startCluster=StartCluster(dosEntry);
    if(basePtr->IsFileClamped(MAKE_TINT64(0,startCluster)) > 0)
        User::Leave(KErrInUse);
    EraseDirEntryL(fileEntryPos,fileEntry);
    FAT().FreeClusterListL(StartCluster(dosEntry));
    FAT().FlushL();
    }

5.6 System-Wide Impact of Demand Paging


5.6.1 Binary Compatibility Impact for All Systems
The demand paging functionality added to Symbian is largely transparent on
platforms that do not support it. However, there are some subtle binary-
compatibility (BC) breaks that do affect all platforms built on Symbian OS v9.3.
These are listed in the following sections, together with their Symbian break
request (BR) number.

BR1924: Bootstrap changes for demand paging


Symbian has modified the data structures used for kernel memory management
to support demand paging. This has necessitated changes to the platform-
independent part of the Symbian bootstrap to make use of these structures.
Also, Symbian now copies just the unpaged part of the core ROM image to RAM,
rather than the entire core image.

Since these changes are compiled into the platform-specific bootstrap, you will
need to rebuild it.

BR1982: Kernel-side read from user memory must not occur while holding
a mutex
As discussed earlier, kernel-side code must not read from paged (user-side)
memory while the current thread holds any mutexes. The kernel functions that
access user memory have been changed in debug builds to assert some (but not
all) of the new restrictions.

It is possible (though unlikely) that existing kernel-side code may panic in debug
builds if it doesn’t conform to the new restrictions. In release builds, it may
intermittently hang.

BR1988: Device driver deferred function call (DFC) queue migration


When Symbian migrated device drivers to use their own DFC queues, we added
a new pure virtual method to the sound driver, and the digitizer must now initialize
a new iDfcQ member variable.

The required changes are explained in Sections 5.3.8.2 and 5.3.8.3.



BR1991: USB DFC queue performance improvement


Now that Symbian device drivers use their own DFC queues, the same is true of
USB, which has its own separate DFC queue. (Originally, USB made use of
DfcQue0.) Also, the DFC queue initialization has moved from the platform-
independent layer to the platform-dependent layer.

The required changes are explained in Section 5.3.8.1.

5.6.2 Binary Compatibility Impact for Systems with Demand Paging Switched On

In addition to the BC breaks listed in the previous section, there are other BC
issues that are only relevant when demand paging is switched on. I discuss these
next.

Behavior of file modification operations on paged executables


If an executable on non-XIP media is unpaged, the loader loads the entire
executable from storage media into RAM when it is first accessed. The kernel
does not need to access the executable on the storage media again (unless the
executable is unloaded and loaded again). This means that any attempt to modify
the executable while it is loaded will succeed.

If an executable on non-XIP media is code paged, the loader loads individual
pages of the executable from media as they are accessed, one by one. It is
important that the partially loaded executable is not modified on media. To
prevent this, the executable is ‘clamped’ while any part of it is being used by the
paging subsystem. Any attempt to modify the ‘clamped’ executable will fail with
KErrInUse.

The impact of the above is that any code that modifies RAM-loaded executables may
now unexpectedly fail. Since executables are stored in the \sys\bin\ directory,
and only components with the TCB capability can modify or delete files in this
directory, this compatibility issue should be limited to a very few components (for
example, software install, debuggers and possibly some Java code).

Symbian has introduced a new API called RLoader::Delete() to support
the deletion of code-paged executables. This is used by the Symbian software
installer instead of RFs::Delete() when uninstalling components. This API
should also be used by other components affected by this issue. When using the
new function, the system keeps track of any pages that are in use within the
executable to be deleted, and only deletes the executable when it is no longer
used by the paging subsystem. As a result, disk space may not be released until
some time after the call completes.
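The deferred-deletion behavior just described can be sketched as a use count plus a pending-delete flag. This is a simplified standalone model with invented names, not the loader's actual implementation:

```cpp
// Simplified model of deferred deletion: the file is only removed once the
// paging subsystem no longer holds any of its pages in use.
class ClampedExecutable {
public:
    void PageIn()  { ++iUseCount; }
    void PageOut() {
        if (--iUseCount == 0 && iDeletePending)
            DoDelete();                 // disk space freed only now
    }
    // RLoader::Delete() analogue: delete now if unused, otherwise defer.
    void RequestDelete() {
        if (iUseCount == 0)
            DoDelete();
        else
            iDeletePending = true;
    }
    bool Deleted() const { return iDeleted; }
private:
    void DoDelete() { iDeleted = true; } // stands in for the file removal
    int  iUseCount = 0;
    bool iDeletePending = false;
    bool iDeleted = false;
};
```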

Modification of read-only code areas


This issue affects any code that writes into code chunks or an XIP code area. If
the target code area is paged, the page will eventually be evicted from the paging
cache, causing any code modifications to be forgotten. In practice this problem
is very rare, since the DebugSupport API is available to support kernel-side
components (such as debuggers) that need to modify code. This DebugSupport
API works transparently with demand paging. Potentially, any code that uses
Cache::IMB_Range() is affected but there may be other causes.

You should modify any source that writes into code chunks or an XIP code area,
so that it makes its own copy of the code before writing back into it. For some
components, such as FOTA clients, this may not be possible. In these cases, you
will have to take application-specific measures to ensure the code you are writing
to is not paged.

Visibility of paged data for tools


This issue affects any tools that make assumptions about the visibility of code
segments or an XIP ROM area. Existing stop-mode debuggers with Symbian
awareness assume that:

• XIP ROM memory is visible at all times after the system has booted.
• RAM-loaded code segments are visible at all times.

Demand paging makes these code areas intermittently unavailable – if your tool
accesses them at such a time, it will be faulted by the MMU. This means that
existing stop-mode debuggers (or similar tools) will not work reliably on
demand-paged ROMs.

Image format changes for tools


The XIP ROM image format is now different, since it needs to support a paged
area of XIP ROM, which can be byte-pair compressed, as can executable files.
Any tools that parse ROM images or dynamically analyze executables will need
to be changed to support the new format. This also affects any tool that updates
the XIP ROM image (such as FOTA clients).

It is likely that most tools affected by this are contained within a software
development context and so the impact is likely to be limited.

Compatibility of installed executables


With the introduction of code paging in Symbian OS v9.3, the possibility arises of
third parties developing software on an older SDK that is then installed on a
demand-paged device, or vice versa. Whether these executables run as
expected depends on a number of factors:

• Whether the device supports code paging (Symbian OS v9.3+).
• Whether the device supports the byte-pair compression format (Symbian
OS v9.2+).
• What the compression format of the executables is. This will probably be
the default compression format used by the SDK build tools, since
developers are unlikely to change this. The Symbian build tools compress
executables in the ‘deflate’ format by default.
• Whether the executables are marked as paged or unpaged. The Symbian
build tools automatically compress executables marked as paged in the
byte-pair format (unless they are explicitly made uncompressed).
• Whether the loader demand-pages executables that are neither marked
as paged nor as unpaged. The default Symbian configuration does
demand-page unmarked executables.
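These factors combine roughly as follows. This is a rough model inferred from the list above, not an exact statement of the loader's policy; the names are invented for illustration:

```cpp
enum class Compression { Uncompressed, Deflate, BytePair };
enum class PagedMark   { Unmarked, Paged, Unpaged };

// Rough model: an installed executable ends up demand paged only if the
// device supports code paging, it is not marked unpaged, unmarked
// executables are paged by configuration (the Symbian default), and its
// compression format can be paged (byte-pair or uncompressed; the
// 'deflate' format cannot be paged).
bool IsDemandPaged(bool deviceSupportsCodePaging,
                   PagedMark mark,
                   Compression format,
                   bool pageUnmarkedByDefault) {
    if (!deviceSupportsCodePaging || mark == PagedMark::Unpaged)
        return false;
    if (mark == PagedMark::Unmarked && !pageUnmarkedByDefault)
        return false;
    return format != Compression::Deflate;
}
```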

Table 4 below lists a number of possible scenarios and their impact. Scenario 1 is
the default case if the device manufacturer does not alter the Symbian build tools.
Scenarios 2 to 4 look at what happens if the device manufacturer makes different
modifications to those tools. Please note that scenarios 5 to 7 require support in
Symbian that does not exist at the time of writing; these scenarios are included
for completeness only.
124 Demand Paging on Symbian

Table 4

Scenario 1: Unmodified Symbian tools. Device manufacturer build tools
compress executables in the ‘deflate’ format by default.
Impact: Third-party executables are compatible with pre-Symbian OS v9.2
devices. Executables are not paged by default. The developer has to explicitly
mark executables as paged or change the compression format to enable paging.
Not all developers will choose to do this, so RAM savings for third-party code will
be reduced. The additional RAM usage may affect the performance of other
software in the device, including ROM-based software.

Scenario 2: Device manufacturer build tools compress executables in the
‘byte-pair’ format by default.
Impact: Most third-party software is automatically paged. The developer has to
make a conscious effort to disable demand paging. RAM savings are maximized.
Disk usage is slightly increased. Executables will not run on pre-Symbian OS v9.2
devices.

Scenario 3: Device manufacturer build tools leave executables uncompressed by
default.
Impact: Third-party executables are compatible with pre-Symbian OS v9.2
devices. Most third-party software is automatically paged. The default size of
executables on disk increases by ~50%. The developer can explicitly compress
executables if required. SIS files will be approximately the same size since they
are compressed.

Scenario 4: Hybrid solution. Device manufacturer build tools build ‘deflate’ format
executables by default, then switch to ‘byte-pair’ at some future point.
Impact: Allows compatibility with pre-Symbian OS v9.2 devices up to a certain
point in time. At that time, there will be more third-party software that doesn’t
support demand paging compared to an early switch.

Scenario 5: Device manufacturer build tools build ‘deflate’ executables by default
and the Symbian software installer converts these to ‘byte-pair’ at installation
time on Symbian OS v9.3+ devices.
Impact: Not supported at the time of writing. Would also have platform security
implications.

Scenario 6: ‘Fat’ SIS solution. Device manufacturer tools build both ‘deflate’ and
‘byte-pair’ binaries. The MAKESIS tool puts both in the SIS package, and the
Symbian software installer selects the appropriate binary at install time.
Impact: Not supported at the time of writing. Larger SIS files.

Scenario 7: Patch pre-Symbian OS v9.2 devices that don’t support the byte-pair
compression format.
Impact: Not supported at the time of writing.

6
Component Evaluation for Demand Paging

This chapter describes how you might evaluate the impact of demand paging on
a component or a group of components, and how to mitigate any negative
impact. The processes that are described here are heavily influenced by the
Symbian system-wide evaluation that was carried out during the prototype phase
of demand paging.

You are encouraged to use these ideas in whole or in part for your own
components. At a minimum, the paging categories of Symbian-owned
components (Section 6.4) must be respected.
6.1 Static Analysis

You should analyze several aspects of the component architecture before even
considering demand paging. We first consider static analysis techniques, before
moving on to dynamic analysis in the next section.

6.1.1 Dependency Information


First, list all the executables belonging to the component under evaluation,
together with their static dependencies. For each executable, note its size and
pageability (if known). Do the same for all key components that are statically
linked to executables in the component.

Also note dynamic dependency information – for example, whether the compo-
nent is a plug-in to a framework or is a framework itself.

This information is an important input into subsequent sections. It enables you to identify the key interactions between components and to identify the executables that are likely to be affected most by demand paging.

6.1.2 Use-Case Analysis


List all the real-time and performance-critical use cases for the component, including
which individual executables are used and to what extent. ‘Real-time’ here does
not mean hard real-time – it also includes use cases that lead to an acceptable
user perception of responsiveness. Here are some examples of real-time use
cases:

• Video playback. Dropped frames are not acceptable.
• VoIP phone call. Audio quality must be maintained.
• File download over USB. Unbounded packet-response times could cause a catastrophic performance drop.

The performance-critical category contains use cases that are benchmarked to measure the overall performance of the component (or the OS as a whole). For instance:

• Standard boot time
• Application start-up time
• Camera image-capture time.

This category could also include complex compound use cases such as ‘Receive
text message while playing mp3 audio.’

Other use cases may have a benchmarked performance that is important to the
component owner but would not be considered real-time or performance critical
from a system-wide perspective. For example: ‘time taken to notify user of text
message reception.’ In this case, it is unlikely that the user would notice any time
delay caused by demand paging because the text message is processed in the
background.

There will also be some use cases that fall into grey areas and their importance
may be somewhat subjective. In these cases, it’s probably appropriate for the
component owner to negotiate with the system-wide design authority.

6.1.3 IPC Analysis


Another important piece of information is whether paged clients interact with real-
time or performance-critical servers – reading paged data may have a negative
impact on the performance guarantees of the server. Note down any cases in
which a server reads paged data from a client’s address space. Paged data is
RAM-loaded code and any read-only XIP data structures – for example, XIP bit-
maps, constant descriptors, constant data arrays in code, XIP data files accessed
through a pointer or exported DLL data.

The list should also include any cases of custom IPC architectures where
one thread reads from another thread’s address space, for instance by using
RMessage::Read().

6.1.4 Analysis of Components Affected by BC Issues


Also list any components affected by the binary compatibility issues mentioned in
Section 5.6.2.

6.2 Dynamic Analysis

6.2.1 Functional Equivalence


The most important practical test is whether the test code of the component
under evaluation has the same pass rate on a demand-paging ROM as a non-
demand-paging ROM. You should choose the most stressed configuration pos-
sible – the more stressed the configuration, the greater the confidence in the
robustness of the component. You should, at the very least, ensure that all the
executables under evaluation are paged. For evaluation purposes, you should
limit the maximum paging cache size to simulate OOM behavior. In general, a
sensible maximum cache used during evaluation translates into a sensible mini-
mum cache size to use in production. Some experimentation may be needed with
various configurations.
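The keywords described in Chapter 7 can express such a stressed configuration directly in the OBY file. As an illustrative sketch (the page counts here are deliberately small values invented for stress testing, not recommendations):

ROM_IMAGE[0] {
pagedrom
compress
// Deliberately small paging cache: min 128 pages, max 256 pages
demandpagingconfig 128 256 3
// Force all executables to be paged for evaluation
pagingoverride alwayspage
}

Raising the maximum in steps then reveals the smallest cache size at which the test pass rate matches the non-paged baseline.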

If the pass rate is not as high as expected, then you need to either fix the code
or understand the reasons for the failure. If an immediate fix is not possible,
then the demand-paging configuration should be relaxed (either by making more
dependent components unpaged or increasing the minimum/maximum size of the
paging cache), to establish the point at which functional equivalence is achieved.

Note this is a temporary measure for evaluation: eventually all known defects
exposed by demand paging should be fixed.

Marking a component unpaged to make it more robust should only be done in extenuating circumstances, and increasing the minimum paging cache size for this reason is never a good idea.

6.2.2 Performance Data


Your aim here is to characterize the RAM footprint versus performance trade-off
for the component – that is, to find the minimum acceptable performance configu-
ration. This allows use-case-specific optimization and gives you an indication of
the performance profile as free RAM declines.

You can use your existing benchmarking code for the component, or write new
code specifically for demand paging. If you don’t have any benchmarking then
re-running existing test code with time stamping enabled may provide you with
sufficient information.

You should run the tests using several demand-paging configurations with dif-
ferent maximum paging cache sizes (to simulate OOM behavior). A higher num-
ber of different configurations gives a more accurate picture of the performance
impact of demand paging.

When you make a graph of maximum paging cache size versus performance,
there is often a point at which performance drops off dramatically. This indicates
that page thrashing is occurring. Sometimes this is so dramatic that you will need
a logarithmic scale on one or both axes to determine the drop-off point. At other
times, the performance drop-off is more gradual, indicating that the use case is
less sensitive to page faults.

Figure 10 gives some example performance data. The performance profiles for
two use cases, A and B, are presented in two different ways. The top graph has
linear axes and appears to show that performance for use case A drops sharply
as the maximum paging cache goes below 96 pages. For use case B, perfor-
mance drops less sharply at around 128 pages. The bottom graph has a logarith-
mic Y-axis and shows that the drop-off point for use case A is actually nearer 160
pages. For use case B, no additional information is revealed.

Figure 10: The change in performance with maximum paging cache size for two
use cases. The same data is presented in both graphs, using a linear Y-axis (top)
and a logarithmic Y-axis (bottom).

6.2.3 Demand-Paging Logs


The logging tools described in Chapter 8 enable the production of demand-pag-
ing logs while running component tests. Analysis of these logs may reveal page
thrashing behavior and dynamic dependency information. You could try making
a key executable unpaged and re-running the tests to give better results. For
example, the functional pass rate may be improved or the performance drop-off
may occur at a lower cache size and/or be less dramatic.

6.3 Identifying Demand-Paging Problems and Mitigation Techniques

Some demand-paging problems identified during evaluation, such as defects, may have a clear solution. Others may require more work and I’ll discuss those next.

6.3.1 Protecting Real-Time and Performance-Critical Code Paths


Ideally, the static analysis (Section 6.1) will make clear the separation between
the code paths involved in real-time or performance-critical use cases, and those
that are not – for example, the separation of data and control planes in a commu-
nications protocol. You must ensure that you protect the former from page faults
due to the unbounded and unpredictable cost of servicing the fault. There are two
possible strategies for doing this:

• Ensure the minimum paging cache size is large enough to accommodate the protected code path and any other paged data required at the same time, for all use cases.
• Make the protected code path unpaged.

In practice, option one is difficult to achieve because there is usually a compound use case that means more data enters the paging cache than you had budgeted for. For example, you might choose the minimum paging cache size so that it is large enough to accommodate all the code required for audio playback. Then, in testing, you might find that ordinary audio playback works fine, but problems occur when the user navigates the file system at the same time. This is because the code executed when navigating the file system ejects some of the code executed when playing audio from the paging cache, inducing additional page faults. These

page faults result in the audio buffers not filling in time and audio playback
stutters.

In general it is not wise to allow real-time or performance-critical code (such as audio playback) to compete for space in the paging cache with non-critical code (such as navigating the file system). We recommend option two, which has the further advantage that it may well cost less in terms of RAM than option one.

6.3.2 Improving the Component Architecture


Your analysis may also show up deficiencies in the component architecture. For
example, there may not be a clear separation between the data plane and con-
trol plane, or there may be a monolithic library that contains just a small amount
of performance-critical code. In these cases, the ideal solution is to redesign the
component appropriately or split the monolithic library into two parts: one paged,
one unpaged. Then you need only set the minimum amount of code to be unpaged. This
re-factoring may produce benefits unrelated to demand paging, such as breaking
unnecessary static dependencies and making the component easier to maintain.
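As an illustrative sketch of such a split (the target names here are invented), the two halves can be given different pageability with the MMP-file paged and unpaged keywords described in Section 7.3.1:

// mylib_rt.mmp - small, performance-critical part of the library
TARGET      mylib_rt.dll
TARGETTYPE  dll
unpaged

// mylib.mmp - the bulk of the library, safe to page on demand
TARGET      mylib.dll
TARGETTYPE  dll
paged

Only the small real-time part then needs to be held permanently in RAM.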

Conversely, demand paging may help you to disguise architectural problems, when it means lazily loading only the code currently in use. A monolithic library with many static dependencies may only need to be partially loaded and some dependencies may not need to be loaded at all. However, demand paging should not be used as a cure for these kinds of problems – there is no substitute for good architecture.

Major component redesign may be impractical in the short term. In this case, you
may temporarily have to set large parts of the component to be unpaged. This
greatly reduces RAM-saving potential.

6.3.3 Building a Candidate Unpaged List


One of the outputs from the static analysis (Section 6.1) should be an initial can-
didate list of unpaged files. Dynamic analysis (Section 6.2) enables you to build
on that by refining the unpaged list. For example, you may find that an execut-
able that you think is performance critical is rarely used in practice. Conversely, a
seemingly innocuous library may actually contain a utility function that is heavily
used by real-time code.

Dynamic analysis may also reveal that some executables are paged in for much
of the time, despite not being involved in any real-time or performance-critical
use cases. In this case, since the executable is always in RAM, you might gain
by making the executable unpaged and reducing the minimum size of the paging
cache accordingly.

As well as identifying unpaged files within the component, your evaluation may
give you data on other dependent components, possibly requiring those com-
ponents to be unpaged to meet certain performance guarantees. For example,
an unpaged real-time server may require that all its third-party plug-ins are also
unpaged. It is important that these cross-component requirements are considered
from a system-wide perspective.

Once you have a candidate unpaged list, how to act upon it is a decision shared
between the component owner, the system architects of the platform and the
(software) customers of the platform (if any). A good unpaged list should
guarantee the robustness and functionality of the system irrespective of how
small the paging cache is.

6.3.4 Choosing a Minimum Paging Cache Size


I have previously noted that production devices should not limit the maximum
size of the paging cache. You do need to choose a minimum page cache size,
however, to guarantee a sensible minimum level of performance. This minimum
is heavily dependent on the amount of paged code in the system. A value that
is suitable for a device with a simple UI may be far too small for a device with a
complex UI, where there is a lot more paged code.

It is not practical to find an optimum minimum cache size from static analysis.
Device manufacturers will determine it empirically, according to the final contents
of the ROM and the performance requirements of the device. One way they might
do this is to build a performance profile (as in Section 6.2.2) using a selection of
the most code-intensive use cases on the device.

6.4 Symbian’s Pageability Categories

In Section 6.1.2, I mentioned that Symbian makes a distinction between real-time and performance-critical use cases. This is to help to distinguish between those files that must always be unpaged in all configurations and those that ought to be unpaged in all configurations for the best performance, but don’t have to be. These groups are called the ‘mandatory unpaged’ and ‘recommended unpaged’ lists respectively. The existence of the latter provides some flexibility to customers who wish to save more RAM at the expense of performance.

The complete list of categories and their inclusion criteria are described in Sections 6.4.1 to 6.4.6. Symbian now maintains a list of all the files in Symbian and their demand-paging categories, and monitors it to ensure conformance.

6.4.1 Kernel Unpaged


This group defines all the kernel-side files that are implicitly unpaged in all de-
mand paging configurations. No action needs to be taken to make these files
unpaged but it is useful to separate them from other classifications for audit
purposes.

6.4.2 Mandatory Unpaged


This category consists of the files that are explicitly made unpaged in all demand
paging configurations. There are several criteria for being in this category, de-
scribed in Table 5 on the following page.

Table 5

Criterion                Description
Device stability         When the file is paged, the device is unstable. The
                         instability should be fixed by normal coding methods
                         where possible.
Functional equivalence   When the file is paged, there is a functional failure.
                         The failure should be fixed by normal coding methods
                         where possible. Being involved in a real-time use case
                         is a valid reason for being in this category.
Permanent presence       When the file is paged, its contents are largely
                         present in the paging cache for all use cases.
                         Therefore there is little or no benefit to paging it.
General performance      When the file is paged, performance for all use cases
                         is degraded due to page thrashing.
Security                 It is necessary to prevent the file from paging for
                         security reasons (for example, the file is located in a
                         special area or must be excluded from any integrity
                         checks made while paging in).
Power management         When the file is paged, battery life is reduced
                         unacceptably for all devices.

There is some overlap between these criteria. For instance, it is likely that any file
that is made mandatory unpaged due to ‘permanent presence’ will also satisfy
‘general performance’. The list of criteria is not exhaustive and may expand in the
future.

6.4.3 Recommended Unpaged


This category is for those files that should be unpaged to sustain or improve the existing level of performance for performance-critical use cases, where this term

has the same definition as in Section 6.1.2. Note that paging can improve per-
formance in some use cases so there should be evidence that making a file (or
group of files) unpaged is better than making it (or them) paged.

6.4.4 Test/Reference Unpaged


In this category, place any test files that you as the component owner feel should
be unpaged. This category is also for any reference code (such as a plug-in)
where the equivalent production code would be unpaged. You should use the
same criteria as in Sections 6.4.2 and 6.4.3, but the level of justification required
for being in this category is lower.

6.4.5 Optional Unpaged


This category contains the files that you could set as unpaged to sustain or
improve the existing level of performance for any other use cases not covered in
Sections 6.4.2 and 6.4.3. By default, this category is not treated any differently to
the ‘paged’ category but it is helpful to separate out the files that you might make
unpaged if you wanted to improve performance. Within this section, it is helpful
to group files by use case in case any customer has strict performance require-
ments for those use cases.

6.4.6 Paged
This is the ‘catch-all’ category for files that don’t fit into any of the other catego-
ries. By default, all files will be paged unless otherwise stated.

7
Configuring Demand Paging on a Device

This chapter describes how to use and configure demand paging, assuming that
the platform already has support for it enabled. I will cover the most sensible
ways to switch on demand paging for XIP ROM images and executables.1

7.1 Building a Basic Demand-Paged XIP ROM

This section discusses the configuration changes necessary to create a basic demand-paged XIP ROM.
7.1.1 OBY File


At least three OBY file keywords should be present to create a demand-paged
XIP ROM and I describe these in the following sections. Only the first is manda-
tory but the others are usually required in any meaningful setup. It is important
these keywords are applied to the core ROM image rather than any ROFS im-
ages. You can make sure of this by enclosing the keywords in a block as follows:

ROM_IMAGE[0] {
<Basic demand paging keywords>
}

In addition to the new keywords, the location of files in the ROM must be con-
sidered. I will describe this in Section 7.1.2, following it with an XIP ROM paging
example.

pagedrom keyword
The pagedrom keyword takes no arguments. It instructs ROMBUILD to sort the contents of the core ROM image so that all the unpaged files appear at the start of the image, followed by all the paged files. This is so that the kernel can copy the unpaged part of the image into RAM during boot, while leaving the rest of the image to be paged into RAM on demand (see Figure 2 on page 11).

1 For further information about the basic usage of demand-paging tools and keywords, please consult the Symbian Developer Library documentation at developer.symbian.org/sfdl

If the keyword compress is also specified, then the paged part of the image
will be compressed using the byte-pair algorithm. The unpaged part of the im-
age remains uncompressed. Contrast this with the behavior of compress when
pagedrom is not specified. In that case, the entire image is compressed using
the deflate algorithm.

pagingoverride keyword
The pagingoverride keyword determines the pageability of the executables in
the ROM/ROFS section it is defined in. It is operated on by ROMBUILD/ROFS-
BUILD and takes a single argument, which can be one of those shown in Table 6:

Table 6

Argument Effect
nopaging non- Marks all executables unpaged, irrespective of whether they are
aging already marked as paged or unpaged in their MMP file.
Marks all executables paged, irrespective of whether they are
alwayspage already marked as paged or unpaged in their MMP file. This can be
useful for debugging or analysis.
All executables that are neither marked as paged or unpaged in their
defaultunpaged
MMP file are marked as unpaged.
All executables that are neither marked as paged or unpaged in their
defaultpaged
MMP file are marked as paged.

The default value of the keyword is nopaging, so it is important to specify a different value if paging of executables is required. The most common value is defaultpaged, because it is usually best to page all executables except those that are marked otherwise.

Note this keyword has no effect on the pageability of non-executable files in an XIP ROM. These will always be paged unless explicitly configured to be unpaged (see Section 7.3).

demandpagingconfig keyword
The demandpagingconfig keyword takes the following arguments in the order
shown in Table 7:

Table 7

Argument              Effect
<MinLivePages>        The minimum number of RAM pages to reserve for the paging
                      subsystem. The number must be at least equal to
                      2*(YoungOldPageRatio+1). If a smaller number is
                      specified, a number equal to this formula is used
                      instead. If zero is specified or the demandpagingconfig
                      keyword is missing, then a value of 256 is used.
<MaxLivePages>        The maximum number of RAM pages the paging subsystem may
                      use. The number must be greater than or equal to
                      MinLivePages. If zero is specified or the
                      demandpagingconfig keyword is missing, then the system
                      uses the maximum possible value (1,048,575). On a
                      production system, it should always be set to zero, so
                      that as many pages as possible are used. Low values may
                      be used to test paging under more stressed conditions.
<YoungOldPageRatio>   The ratio of young to old pages maintained by the paging
                      subsystem. This is used to maintain the relative sizes of
                      the two live lists. The default value is three.

Some demandpagingconfig statements specify additional arguments to those specified in Table 7, but these are now obsolete and will be ignored by the paging subsystem.

7.1.2 Arranging the Core/ROFS for XIP ROM Paging


As we have seen in Section 3.3, a typical NAND ROM has a small core ROM im-
age and a large primary ROFS image. To make best use of XIP ROM paging, you
need to move any files that should be paged from the ROFS image to the core
image. In Section 7.4 I explain more sophisticated ways of doing this, but the
simplest method is to remove the ROFS altogether so that all files are in the core
ROM image.

Each platform can have its own way of configuring the primary ROFS, so there is no generic way of removing it. On Symbian’s reference platform, you would configure out the ‘ROM_IMAGE[1] {’ statement in \epoc32\rom\include\base.iby, which defines the start of the ROFS section.

Basic XIP ROM paging example


Using the instructions in Section 7.1.1 with typical keyword arguments, a simple
demand-paged OBY file might look like this:

// MyDPConfig.oby
#if !defined PAGED_ROM
#define PAGED_ROM
#endif

ROM_IMAGE[0] {
pagedrom
compress
//                 Min   Max   Young/Old
//                 Live  Live  Page
//                 Pages Pages Ratio
demandpagingconfig 512   32767 3
pagingoverride defaultpaged
}

If the OBY file that defines the start of the primary ROFS image (for example,
base.iby) contains a section like this:

#if defined(_NAND) || defined(_NAND2)
REM Start of ROFS image
ROM_IMAGE[1] {
#endif

then it should be adjusted as follows (changes shown in bold italics):


#if defined(_NAND) || defined(_NAND2)
#if !defined PAGED_ROM
REM Start of ROFS image
ROM_IMAGE[1] {
#endif
#endif

To build a demand-paged ROM, simply include the new OBY file in the buildrom statement. For example, when using the Symbian Techview reference platform on H4:

buildrom -D_NAND2 MyDPConfig h4hrp techview

It is important that MyDPConfig.oby appears before base.iby in the buildrom command to ensure that the PAGED_ROM flag is acted upon; base.iby is included by techview.oby in the example above. Alternatively, the flag can be defined via ‘-DPAGED_ROM’ in the buildrom command and then the ordering of OBY files is irrelevant.

It is possible that you will still end up with a small ROFS image being produced if
one of the included OBY/IBY files explicitly places files in the primary ROFS sec-
tion. However, most or all of the ROM contents will be in the core image.

7.2 Building a Basic Code-Paged ROM

When you are building a code-paged ROM, you can choose whether to enable
XIP ROM paging or not. The pagedrom keyword only affects XIP ROM paging,
but the demandpagingconfig keyword applies to code paging as well because
both types of demand paging use the same underlying paging cache.

There are two further things to consider, the pagingpolicy keyword and the
location of executables to be code paged (for example, in ROFS or in the user
data area).

7.2.1 pagingpolicy Keyword


The pagingpolicy keyword takes a single argument, which can be one of the
values specified in the pagingoverride table in Section 7.1.1.2. It sets a flag in
the core ROM that tells the loader, at runtime, what the code paging policy should
be. The default is nopaging so you need to change this for any code paging to
occur.

Note the difference between this keyword, which operates on files at runtime, and pagingoverride, which operates at rombuild/rofsbuild time.

Note also that pagingpolicy only has any meaning in the core ROM image, so it should be directed to ROMBUILD, not ROFSBUILD. You can ensure that this is the case by enclosing the keyword in a ‘ROM_IMAGE[0] {}’ block.

7.2.2 Code Paging from ROFS


You will usually want to favor XIP ROM paging over code paging since the RAM
and performance overhead of code paging is generally higher. However, there
may be circumstances in which it is necessary to code page executables in
ROFS. For example:

• For testing purposes. To compare the functional equivalence and performance of a code-paged system with an XIP-ROM-paged or non-demand-paged system.
• To reduce the size of the unpaged part of the core ROM image. If a paged executable has a number of unpaged static dependencies, there may be a RAM benefit to code paging the executable and placing its dependencies in ROFS. This would be the case if the dependencies are not statically linked to by other executables in the core image.

The easiest way to use code paging instead of XIP ROM paging is to ensure
paged executables are placed in a ROFS partition that supports code paging, in-
stead of in the core ROM image. This may involve reversing the decision to move
the contents of the ROFS to the core, as required for XIP ROM paging.

Furthermore, you need to add a pagingoverride keyword for each ROFS partition that you want to code page from. Use a ‘ROM_IMAGE[<partition>] {}’ block to distinguish each pagingoverride statement. As well as ensuring that executables are correctly marked as paged or unpaged (rather than making everything unpaged), this keyword implicitly ensures that executables are compressed in the byte-pair format (if compression is specified). This may involve decompressing executables from the default deflate format and recompressing them into the byte-pair format during the ROFSBUILD process.

7.2.3 Code Paging from Internal Writeable Media


If you only want to code page from internal writeable media, such as the user
data drive, then no change is needed to the location of executables in ROM/
ROFS. However, simply specifying a pagingpolicy is not enough. Even if the
value given is defaultpaged, the default behavior of the Symbian build tools
is to compress executables using the deflate algorithm. If these executables are
installed into the \sys\bin\ directory and are still deflate compressed, they will
not be code paged.

You need to do one of the following for an executable to be code paged from
internal writeable media:

1. In the MMP file of the executable, explicitly specify the paged keyword. This will implicitly ensure the executable is byte-pair compressed (or uncompressed if compression is disabled).
2. Explicitly convert the executable to the byte-pair or uncompressed format. For example, use the elftran command as follows:
   elftran -compressionmethod bytepair \epoc32\release\armv5\urel\mylibrary.dll
3. Modify the build tools to compress executables in the byte-pair format by default.
4. Modify the build tools to uncompress executables by default.

All these options except option four have BC implications (see Section 5.6.2 for
the BC impact on installed executables).

7.2.4 Basic Code-Paging Example


Using the instructions in Section 7.1.1, a typical OBY file that uses code paging
from ROFS might look as follows (changes from the XIP ROM paging example in
Section 7.1.2.1 are in bold italics):

// MyDPConfig.oby
#if !defined PAGED_ROM
#define PAGED_ROM
#endif

#if !defined USE_CODE_PAGING
// Uncomment next line if code paging is wanted
#define USE_CODE_PAGING
#endif

#if !defined CODE_PAGING_FROM_ROFS
// Uncomment next line if code paging from primary rofs is wanted
#define CODE_PAGING_FROM_ROFS
#endif

ROM_IMAGE[0] {
pagedrom
compress
//                 Min   Max   Young/Old
//                 Live  Live  Page
//                 Pages Pages Ratio
demandpagingconfig 256   512   3
pagingoverride defaultpaged

#if defined USE_CODE_PAGING
pagingpolicy defaultpaged
#endif
}

#if defined CODE_PAGING_FROM_ROFS
ROM_IMAGE[1] {
pagingoverride defaultpaged
}
#endif

You would also adjust the OBY file that determined the start of the primary ROFS
partition (for example, base.iby) like this:

#if defined(_NAND) || defined(_NAND2)
#if !defined PAGED_ROM || defined CODE_PAGING_FROM_ROFS
REM Start of ROFS image
ROM_IMAGE[1] {
#endif
#endif

No change to the buildrom command is required if MyDPConfig.oby appears before base.iby. If this is not the case, then USE_CODE_PAGING and CODE_PAGING_FROM_ROFS must be defined on the command line (as well as PAGED_ROM). For example, for Techview on H4:

buildrom -D_NAND2 -DPAGED_ROM -DUSE_CODE_PAGING -DCODE_PAGING_FROM_ROFS h4hrp techview MyDPConfig

7.3 Fine-Grained Configuration

In the previous two sections, I explained how to create demand-paged ROMs and
configure the general paging behavior (that is, the default paging policy and size
of the paging cache). This section describes how to configure whether individual
files are paged or not.

7.3.1 MMP File Configuration


Symbian has extended the MMP file syntax to add two new keywords, unpaged
and paged, which mark the executable as unpaged or paged respectively. Each
keyword should appear on a line of its own and takes no arguments.

If the build tools are configured to compress executables while building (the
default behavior), then executables marked with the paged keyword will be
byte-pair-compressed.

If a ROM/ROFS partition has defined a pagingoverride of defaultpaged or
defaultunpaged, then the pageability indicated in the MMP file will be respected
when the executable is placed in ROM/ROFS. If the pagingoverride is nopaging
or alwayspage, then that setting takes precedence: the executable will be
placed in ROM/ROFS as unpaged or paged respectively, and the pageability
indicated in the MMP file will be ignored.
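As an illustrative sketch (the target name, UID and source names here are invented for the example), an MMP file marking its executable as paged might look like this:

```
// MyLibrary.mmp -- illustrative fragment; names and UIDs are hypothetical
TARGET          MyLibrary.dll
TARGETTYPE      dll
UID             0x1000008d 0x10001234
SOURCEPATH      ..\src
SOURCE          mylibrary.cpp
LIBRARY         euser.lib

// Mark the whole executable as demand paged (and byte-pair-compressed,
// if executable compression is on); use 'unpaged' instead to opt out
paged
```

The keyword sits on a line of its own with no arguments, as described above.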

7.3.2 OBY File Configuration


Symbian has extended the OBY file syntax to include two new modifiers, unpaged
and paged, which mark an object as unpaged or paged during the rombuild/rofsbuild
process. The modifiers should appear at the end of an OBY statement like this:

file=ABI_DIR\DEBUG_DIR\MyLibrary.dll \sys\bin\MyLibrary.dll unpaged

You should only use these modifiers for user-side ROM objects (such as ‘file=’,
‘dll=’, ‘data=’ and ‘secondary=’ statements). Kernel-side ROM objects (‘primary=’,
‘extension=’, ‘variant=’ and ‘device=’ statements) are always unpaged so any
modifier will be ignored. Furthermore, the modifier will be ignored for ‘data=’
statements if the object is in a ROFS partition; the pageability of non-executable
files is only relevant in an XIP ROM image (see Section 3.4).

If ROM/ROFS compression is switched on, objects marked as paged will be
byte-pair-compressed automatically.

The OBY file pageability modifier overrides any pageability defined in the
executable's MMP file. However, it does not override the pagingoverride
statement in the case of nopaging or alwayspage.

7.3.3 Central Configuration Via the Configpaging Tool


You may find it difficult to manage the configuration of the pageability of files
using the methods discussed in Sections 7.3.1 and 7.3.2 if you need a different
pageability for the same file on different devices. If each MMP or OBY/IBY file
has two possible configurations and there are many such files in ROM, then
maintaining the overall configuration could become very expensive. When you
need a more ‘fluid’ configuration, it is better to use the configpaging tool as a
single place to manage the paging configuration of many files.

This optional tool runs during the OBY pre-processing phase, before ROMBUILD
and ROFSBUILD are executed. You provide it with a centralized list of paged
and unpaged files, and the tool uses this to add paged/unpaged modifiers to
individual statements in the intermediate OBY files. This centralized list overrides
any individually specified paged/unpaged modifiers and any pageability defined
in individual MMP files. However, the list will not override the pagingoverride
statement in the case of nopaging or alwayspage.

There are two approaches for invoking the configpaging tool:

1. By using the externaltool statement in an input OBY file:
   externaltool=configpaging[:<optionalconfigfile>]
2. By using the ‘-e’ buildrom command line option:
   buildrom -econfigpaging[:<optionalconfigfile>] <rest of command line>

where <optionalconfigfile> is a text file in the directory
\epoc32\rom\configpaging\. The default file used is configpaging.cfg.

The syntax of <optionalconfigfile> is as shown in Table 8.

Table 8

defaultpaged
    Sets all unspecified files to be paged. Note this will effectively override
    any pagingoverride statement in the case of defaultpaged or defaultunpaged,
    because all ROM files will then contain a pageability modifier.

defaultunpaged
    Sets all unspecified files to be unpaged. Note this will effectively override
    any pagingoverride statement in the case of defaultpaged or defaultunpaged,
    because all ROM files will then contain a pageability modifier.

<file regex> unpaged
or
unpaged:
<file regex>
<file regex>
    Sets the file(s) specified by the regular expression in <file regex> to
    unpaged.

<file regex> paged
or
paged:
<file regex>
<file regex>
    Sets the file(s) specified by the regular expression in <file regex> to
    paged.

include "<includefile>"
    Includes another file in \epoc32\rom\configpaging\ specified by
    <includefile>. Included files will be processed before the remaining lines
    of the parent file. Included files can themselves include other files.
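As an illustrative sketch of these keywords in use (the component names, regular expressions and included file name are all invented for the example), a configpaging configuration file might look like this:

```
include "mandatory_unpaged.cfg"

defaultpaged

unpaged:
mytelephony.*\.dll
myserver\.exe

mycodec\.dll paged
```

Here everything not otherwise mentioned is paged, the two expressions under unpaged: force matching files to be unpaged, and the included file (assumed to list the mandatory unpaged components) is processed before the remaining lines.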

7.4 Optimizing the Configuration

The primary purpose of demand paging is to save RAM, so optimization in this
context usually means ‘how to save the most RAM’. In a given ROM, there are a
number of configurable variables that may affect this:

• The amount of code and data that is marked as paged or unpaged
• The size of the paging cache
• The ratio of young pages to old pages in the paging cache
• Whether ROM or executable compression is switched on and which algorithm
  is used
• Whether XIP ROM paging and/or code paging is used
• Whether files are located in the core ROM image or the primary ROFS –
  also known as the core/ROFS split.

The first four variables are essentially RAM/performance/disk usage trade-off
decisions and were discussed in Section 3.11. Variable five is relatively easy to
decide on: in general, the performance and RAM saving of XIP ROM paging is
better than code paging, so you should use XIP ROM paging where possible.
However, code paging can give extra RAM savings, so you should enable this
too, if possible. Variable six is interesting, because if you get it right it can give
you greater RAM savings with little or no performance trade-off. So it is worth
investing some effort in this, and I discuss it further in the rest of this section.

The simplest core/ROFS split is to put everything in the core image with an empty
ROFS – this is the optimum split if you want to page as much code as possible.
However, it is suboptimal when there is a significant amount of unpaged code. In
fact, if the amount of unpaged code plus minimum paging cache size approaches
the size of code that needs to be loaded in a non-demand-paged ROM, there
may be a net RAM loss (see Section 3.14).

A good heuristic to follow is to try to arrange as much paged data as possible in
the core ROM image (where it can be XIP ROM paged), while leaving as much
unpaged code as possible in the primary ROFS (where it can be loaded as
required – because any unpaged code in the core ROM permanently occupies
RAM). This heuristic is complicated by the fact that all static dependencies of
code in the core ROM image must also be in the core, even if those dependencies
are rarely used. The RAM saved by having a paged executable in the core
ROM might be offset by the overhead of having the unpaged dependencies of
that executable in the core. You would usually put unpaged files in the ROFS to
avoid the overhead of them permanently occupying RAM by being in the core
ROM. You might also put some paged code in the ROFS if it means its unpaged
dependencies can also be in ROFS, rather than placing the paged code in the
core and being forced to place all the unpaged dependencies in the core. The
RAM saving of paging additional code in the core can be outweighed by the RAM
overhead of a bigger unpaged core.
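As a sketch of this heuristic (the file names here are invented for the example), the OBY statements might arrange things like this:

```
REM Core (XIP) ROM image: paged code that can be XIP ROM paged
file=ABI_DIR\DEBUG_DIR\mybrowser.dll \sys\bin\mybrowser.dll paged

ROM_IMAGE[1] {
REM Primary ROFS: unpaged code, loaded whole only when it is needed
file=ABI_DIR\DEBUG_DIR\mycamera.dll \sys\bin\mycamera.dll unpaged
}
```

Note that this arrangement only works if mybrowser.dll's static dependencies are also in the core image, as discussed above.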

There are various strategies for dealing with this issue, and I discuss these in the
following sections.

7.4.1 efficient_rom_paging.pm
You can invoke this optional tool, which runs during the buildrom phase be-
tween configpaging and execution of rombuild/rofsbuild. It searches
for any paged files in the intermediate ROFS OBY file and moves these files to
the core ROM OBY file, together with their static dependencies (both paged and
unpaged). This ensures that as much paged code as possible is in the final core
ROM image, making best use of XIP ROM paging. However, the tool makes no
effort to limit the amount of unpaged code in the core ROM image.

You invoke the tool in one of two ways:

1. By use of the ‘externaltool=’ OBY file syntax. The following statement
   should be used: externaltool=efficient_rom_paging
2. By using the ‘-e’ buildrom command line option. For instance:
   buildrom -eefficient_rom_paging <rest of command line>.

7.4.2 Limiting Unpaged Code in the Core ROM Image


You can limit unpaged code in the core ROM image – either using regexps in the
config file, or by writing a tool (perl script) similar to efficient_rom_paging,
which only allows a ‘privileged’ set of unpaged executables to exist in the core
image. Only paged executables with dependencies that are in this privileged set
(or paged) would be allowed in the core image. Other executables, both paged
and unpaged, would be placed in the primary ROFS.

The difficulty here is choosing the privileged set. If the set is too small, then many
paged executables will have to be in the primary ROFS, because they have
non-privileged, unpaged dependencies. If the set is too large, then the
configuration would be much the same as with efficient_rom_paging. An ideal set
would be one that contains the unpaged executables that are always loaded, plus
those that have a significant amount of paged code dependent upon them.

7.5 Other Demand-Paged ROM Building Features

Since the introduction of demand paging, you can pass a ‘-geninc’ switch to
the buildrom command or to ROMBUILD to output additional ROM building
information. When this switch is used, the tools create a file called
<rom_image_name>.inc in the ROM building directory. The file has the following
format:

#define SYMBIAN_ROM_UNPAGED_SIZE unpaged_size
#define SYMBIAN_ROM_PAGED_SIZE paged_size

where unpaged_size is the hexadecimal uncompressed size of the unpaged
part of the core image and also defines the offset to the start of the paged part of
the core image. In non-demand-paged ROMs, this is the uncompressed size of
the whole core image. paged_size is the hexadecimal uncompressed size of the
paged part of the core image. In non-demand-paged ROMs, this is zero.
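For example, a generated file for a demand-paged core image might contain (the values here are purely illustrative):

```
#define SYMBIAN_ROM_UNPAGED_SIZE 0xa1c000
#define SYMBIAN_ROM_PAGED_SIZE 0x1b40000
```

so the paged part of this hypothetical core image starts at offset 0xa1c000 and is 0x1b40000 bytes when uncompressed.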

You can use this information in your custom ROM building tools that wrap Sym-
bian’s ROM building tools.

7.6 Using the Symbian Reference Configurations

Symbian defines three standard demand-paging configurations that are suitable
for a reference environment. Two of the configurations are based on the
mandatory and recommended unpaged lists mentioned in Section 6.4. These can be
used as the basis for a customer demand-paging configuration, with the caveats
defined in Section 7.6.4.

7.6.1 The Default Demand-Paging Configuration


The first reference configuration is in \epoc32\rom\include\pagedrom.oby
– this is the default demand-paging configuration. It looks similar to the example
OBY file in Section 7.2.4 but with the following differences:

1. The pagedrom and compress keywords are not defined, because these
   are already defined in the reference board IBY files (such as
   \epoc32\rom\include\base_h4hrp.iby for H4).
2. CODE_PAGING_FROM_ROFS is disabled.
3. configpaging.pm (see Section 7.3) is used to include the recommended
   unpaged list of components. The mandatory unpaged components are
   configured via their MMP files, so they do not need to be configured
   centrally.
4. efficient_rom_paging.pm (see Section 7.4.1) is used.
5. Different demandpagingconfig parameters are used.

To build a reference ‘Techview’ ROM using the default configuration, simply add
the pagedrom parameter to any NAND Techview buildrom command line, like
this:

buildrom -D_NAND2 pagedrom h4hrp techview

It is important that pagedrom appears before techview so that the flags defined
in pagedrom.oby are parsed before base.iby, which is included by techview.iby.

The default configuration provides a generous paging cache. The aim of this is to
provide a modest RAM saving compared with a non-demand-paged NAND
Techview ROM, while maintaining performance for all performance-critical use
cases. Performance for some use cases is actually improved.

7.6.2 The Functional Demand-Paging Configuration


This configuration appears in \epoc32\rom\include\pagedrom_functional.oby.
It differs from the default configuration in the following ways:

1. configpaging.pm is not used. Only the mandatory unpaged components
   are unpaged, and these are configured via their MMP files. No central
   configuration is required.
2. A more restrictive demandpagingconfig is used.

You can build a ROM using this configuration in the same way as for the default
configuration, but using pagedrom_functional in place of pagedrom. The
purpose of the functional configuration is to provide a more aggressive paging
environment, while maintaining functional equivalence with a non-demand-paged
NAND Techview ROM. As a result, performance is worse for most use cases but
there are significant RAM savings.

7.6.3 The Stressed Demand-Paging Configuration


This configuration appears in \epoc32\rom\pagedrom_stressed.oby. It differs
from the default configuration in the following ways:

• configpaging.pm is used with an alternative configuration file that pages
  as many files in the system as possible, overriding any mandatory
  unpaged components.
• A more restrictive demandpagingconfig is used.

The purpose of this configuration is to provide an extremely aggressive paging
environment for stress testing. Although Techview ROMs using this configuration
are functional at a basic level, functional equivalence with a non-demand-paged
NAND Techview ROM is not guaranteed and performance is much worse. RAM
savings are maximized.

7.6.4 Defining Custom Demand-Paging Configurations


The configurations defined in Sections 7.6.1 to 7.6.3 are only suitable for
Techview ROMs. Developers using demand paging are free to define their own
configurations, either from scratch using the information in the earlier parts of this
document, or by basing them on one of the Symbian-provided configurations.

Here are some things to bear in mind when defining custom configurations:

• Symbian only warrants the functionality of the OS when all mandatory
  unpaged components are unpaged. A configuration that overrides the
  mandatory unpaged components, like the stressed configuration mentioned
  above, is not warranted.
• Symbian only warrants the performance of Symbian components involved
  in key use cases (see Section 6.1.2) when all mandatory and recommended
  unpaged components are unpaged.
• The size of the paging cache is dependent on the amount of code loaded
  during the most extreme use case. Simple UI platforms (such as Techview)
  need a smaller cache size, whereas larger ones (such as S60) need
  a larger cache.
• On platforms that add significant additional code to Symbian, those
  additional components should be evaluated (perhaps using the guidelines in
  Section 6) to see if there are any further unpaged components that should
  be added to the mandatory and recommended unpaged lists.
• The minimum paging cache size should be large enough so that when
  you set the maximum cache size to the same value as the minimum (in
  testing), the functional equivalence and robustness of the platform is
  maintained, and performance is at an acceptable minimum level. Lower
  values may cause stability problems when the device is low on free RAM.
• On production devices, the maximum paging cache size should be set to
  the maximum possible value to minimize the number of page faults. The
  configurations provided by Symbian place a somewhat low upper limit
  on the paging cache size to induce additional page faults for testing
  purposes.

8
Testing and Debugging in a Demand-Paged
Environment

This chapter discusses tracing in a demand-paged environment, using BTrace
and the DPTest API. It also describes stop-mode hardware debugging and the
potential issues that demand paging may cause or expose during testing,
including strategies you can use to expose such problems.

8.1 Tracing and Debugging with Demand Paging

8.1.1 Demand-Paged BTrace Logging


Symbian provides a binary trace-logging framework called BTrace. BTrace logs
are not human-readable but they are compact and in an ideal format for post-
processing by an analysis tool.

BTrace has sub-categories to allow tracing of different functional areas in the
system. The one for the kernel paging subsystem is EPaging, and for the media
paging subsystem, EPagingMedia. Tracing the former should provide enough
information for most purposes. The latter is probably only useful for detailed
analysis of the paging implementation itself. The EThreadIdentification
sub-category should also be enabled to receive useful context information for paging
events.

If you include the BTRACE.EXE console application in an OBY file called
MyDPBTrace.oby, then you can create a basic demand-paged Techview ROM with
tracing enabled using the following command:

buildrom -D_NAND2 -DBTRACE pagedrom h4hrp techview MyDPBTrace

To enable demand-paging BTrace logging in the kernel’s paging code while
running a test program RUNNYCHEESE.EXE, you should use the following
commands from a command prompt on the device:

btrace -f3,10 -m1 -b1024
RUNNYCHEESE.EXE
btrace d:\MyDPLog.txt

The first line does the following things:

• Enables the EThreadIdentification and EPaging sub-categories (3
  and 10 respectively)
• Sets the trace buffer to be enabled but not in ‘free running’ mode
• Sets the trace buffer size to 1024 KB.

The second line executes RUNNYCHEESE.EXE and the third line dumps the
BTrace log to d:\MyDPLog.txt.

Sometimes you may need to enable demand-paging tracing immediately after
boot. In this case, you’ll need to enable tracing in the OBY file. To do this using
the same BTrace parameters as the above example, MyDPBTrace.oby should
look as follows:

//
// MyDPBTrace.oby
file=ABI_DIR\DEBUG_DIR\btrace.exe \sys\bin\btrace.exe

ROM_IMAGE[0] {
// Set the Btrace flag (EThreadIdentification = 3) + (EPaging = 10)
BTrace 1032

// Set the trace mode (enabled/not free running)
BTraceMode 1

// Set the buffer size
BTraceBuffer 1024000
}
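As a quick sanity check on where the BTrace 1032 value comes from — assuming the flag word is simply a bitmask with one bit per sub-category number — this short Python snippet (not Symbian code) reproduces the arithmetic:

```python
# Derive the BTrace filter word from the enabled sub-category numbers.
# Assumption: each category contributes bit (1 << category_number);
# EThreadIdentification = 3 and EPaging = 10, as used in the OBY file above.

def btrace_mask(categories):
    """Return the BTrace filter word for the given sub-category numbers."""
    mask = 0
    for category in categories:
        mask |= 1 << category
    return mask

print(btrace_mask([3, 10]))  # 8 + 1024 = 1032
```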

Then you can dump the BTrace log from a command prompt in the same way as
before:

btrace d:\MyDPLog.txt

8.1.2 Demand-Paged Kernel Trace Logging


The older, less sophisticated kernel tracing method can also obtain the BTrace
logging information mentioned above. These log events are human readable, but
they are not as well defined as those in BTrace, so this approach may not be
suitable for the post-processing of events. Also, the trace output is very verbose
and so will have a significant performance impact on the system. You should only
use it if fast tracing hardware is available.

However, this approach does report additional events, especially during the boot
sequence. This may be of use if you are debugging a new hardware platform with
demand paging enabled.

The kernel trace flags for the paging subsystem and the paging media subsystem
are bits 62 and 61 respectively. To enable both of these flags, adjust the
kerneltrace keyword in the relevant OBY file as follows:
kerneltrace 0x00000000 0x60000000
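To see how 0x60000000 falls out of bits 61 and 62, here is a short Python check (not Symbian code), assuming the two kerneltrace words cover trace bits 0-31 and 32-63 respectively:

```python
# Compute the second kerneltrace word from global trace bit numbers.
# Bits 32-63 live in the second 32-bit word, so bit N maps to (1 << (N - 32)).

def kerneltrace_high_word(bits):
    """Mask for the second kerneltrace word from bit numbers in 32-63."""
    mask = 0
    for bit in bits:
        mask |= 1 << (bit - 32)
    return mask

print(hex(kerneltrace_high_word([61, 62])))  # 0x60000000
```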

The flags should be OR’d with any other relevant kernel trace flags. You must
use a debug version of the kernel (kernel trace logging is not enabled in release
versions).

8.1.3 DPTest API


The DPTest API is documented in the Symbian Developer Library documentation,
developer.symbian.org/sfdl, so I will not cover it in detail here. It consists
of a number of static functions declared in \epoc32\include\dptest.h and
is implemented in dptest.dll. Its API classification tag is @Test, meaning it
shouldn’t be used in any production device.

Essentially, this API allows the caller to retrieve information about which demand
paging attributes are enabled, how many page faults and page ins have taken
place, the paging cache size parameters and the current cache size. The API
also allows the executable it runs in to flush the cache and change the cache
size, so long as that executable has the WriteDeviceData platform security
capability.

Symbian also provides a console program called dptestcons.exe that exercises
the DPTest API. This can be easily included in a ROM by adding
dptestcons.oby to the buildrom command:

buildrom -D_NAND2 pagedrom h4hrp techview dptestcons

For usage information, simply run dptestcons from an eshell instance with
no parameters.

8.1.4 Stop-Mode Hardware Debugging


Stop-mode debugging of a demand-paged ROM (for example, using a
Lauterbach via the JTAG interface) is supported, just as it is for non-demand-paged
ROMs. However, you should disable the trapping of data and program aborts,
since that is how the paging subsystem operates. If you don’t, execution will
break on every page fault.

On Trace32, you can do this in either of these ways:

1. From the menu, select Break -> OnChip Trigger... A dialog will appear.
   Uncheck DABORT and PABORT from the ‘Set’ group of checkboxes.
2. Use the following Trace32 commands at the B:: prompt:
   TrOnChip.Set DABORT off
   TrOnChip.Set PABORT off

These commands can be added to a script.

8.2 Testing

In this section, I discuss the potential issues that demand paging may cause or
expose during testing, and strategies you can use to expose such problems. The
unpredictable nature of demand paging makes it very difficult to anticipate
problems in the system. However, there are some patterns to the kinds of
problems that are likely to be observed. I describe some of these and their possible
solutions in the next sections.

8.2.1 Testing Approach


When demand-paging support is added to a platform, you will have to test at
least one additional ROM configuration – a new demand-paged ROM as well
as the existing non-demand-paged ROM. If you are supporting several different
demand-paging configurations, then your testing burden will be increased
further – in fact, this may introduce an intolerable test overhead on a project. In this
section, I discuss strategies for maximizing the test benefit while minimizing the
overhead.

Demand-paged versus non-paged ROMs


In a platform that supports both demand-paged and non-demand-paged
configurations, the simplest test strategy is to duplicate all testing on both
configurations. This may not be feasible, so it is worth comparing the configurations to
see whether any duplication of effort can be removed.

Functional differences
Demand-paged ROMs contain code paths that are not executed in
non-demand-paged ROMs – but is the converse true? Is there functionality in
non-demand-paged ROMs that is no longer required (and hence not executed) in a
demand-paged environment? If this were true, there would be no option but to test both
demand-paged and non-demand-paged ROMs, because neither of them is a
subset of the other. To answer that question, we need to look again at the NAND
flash layouts in Figures 1 and 3 in Chapter 3.

Figure 1, layout B is a typical non-demand-paged NAND layout. When comparing
this with Figure 3, layout C or D, it can be seen that all the elements of 1B are
also present in 3C and 3D. There is a permanently RAM-shadowed area in 1B
(the core image), which corresponds to the unpaged part of the core image in 3C
and 3D. In all three layouts, there is a ROFS section, where whole executables
are loaded into RAM as required. There is also a user data area in all three
layouts.

Assuming a typical demand-paged ROM layout like 3C or 3D, it is safe to say that
comprehensive functional testing of a demand-paged ROM will also exercise all
code paths of an equivalent non-demand-paged ROM.

Performance differences
In Section 2, I discussed the trade-off between RAM and performance in
demand-paged ROMs. We’ve seen that for some configurations it is difficult to
predict whether a particular use case will be quicker on a demand-paged or a
non-demand-paged ROM. However, we can choose a sufficiently aggressive
demand-paged configuration such that all use cases run more slowly than on a
non-demand-paged ROM.

Analysis of demand-paged defects found thus far shows that those exposed by
timing differences are only reproducible when the use case runs slower than on
a non-demand-paged ROM. Furthermore, some defects are only reproducible on
aggressive demand-paged configurations but not on less aggressive
configurations (or non-demand-paged ROMs). There have been no cases of defect
reproducibility increasing as the configuration is made less aggressive.

So, assuming you make the demand-paged ROM configuration aggressive
enough, there is no need for you to test for timing-related problems on an
equivalent non-demand-paged ROM.

RAM differences
The primary purpose of demand paging is to save RAM, so any sensible
demand-paged configuration will result in more free RAM than an equivalent
non-demand-paged ROM. It is important that you continue to run any use cases and
tests related to out-of-memory conditions in demand-paged ROMs, where
out-of-memory conditions are harder to reproduce. You can reproduce the behavior
of the paging cache in out-of-memory conditions by limiting the maximum paging
cache, but this does not limit other system memory allocations.

In theory, some out-of-memory-related defects on a non-demand-paged ROM
may not be reproducible on a demand-paged ROM. This is one of the benefits of
paging, but it means that test code must be written well enough to exercise
out-of-memory conditions if only demand-paged ROMs are tested.

Functional testing
We know that the more aggressive the demand-paged configuration, the greater
the chances of reproducing problems such as those discussed in Section 8.2.1
and that Symbian only warrants configurations in which all the
Symbian-mandatory unpaged components are unpaged. So, a good configuration to use
for functional testing would be one that only has these components unpaged
(plus any additional non-Symbian components that fit the same criteria), together
with a small maximum paging cache size. However, you should not choose a
maximum paging cache size so small that the time taken to execute the tests is
unreasonably long.

The Symbian functional configuration (see Section 7.6.2) fulfils these
requirements for the Techview reference environment.

Performance testing
The relatively aggressive configuration used for functional testing may not be
suitable for performance testing. Some tests may have to complete within a
certain time interval to pass. So, to test performance, you may need to mark
additional components as unpaged, and/or choose a larger paging cache. However,
it is still sensible to limit the maximum paging cache size to reproduce
out-of-memory behavior.

Remember: Symbian only warrants the performance of performance-critical use
cases when the Symbian-recommended unpaged components are unpaged. We
also recommend that any non-Symbian components involved in
performance-critical use cases are unpaged for performance testing.

The Symbian default configuration (see Section 7.6.1) matches the above
requirements for the Techview reference environment.

User testing
The configurations that you use for functional or performance testing are not suit-
able for production devices. At some point, you will need to perform wider system
testing with a production demand-paged configuration. Making this should be a
simple matter of taking the configuration used for performance testing, changing
the minimum paging cache size to the maximum size and changing the maximum
size to the maximum possible value. This will means that the paging cache can
164 Demand Paging on Symbian

grow much larger, which means that fewer defects will be reproducible. Testing
with this configuration should be delayed until as late as possible in the project,
otherwise some problems in out-of-memory situations may be hidden.

9
In Conclusion

In this book, I’ve looked at demand paging at many levels - from a high-level
overview in Chapters 2 and 3, to an in-depth study of the implementation in
Chapter 4. In Chapter 5, I’ve given you a practical hands-on guide to using
demand paging yourself, whether you are working with device drivers in a
demand-paged system or enabling demand paging in a new device (in which case you’ll
also find Chapter 7, on configuring device parameters, very useful). If you’re
working at a higher level, then Chapter 6 gives you the nitty-gritty on getting your
component ready for demand paging. Finally, in Chapter 8, I look at testing and
debugging in a demand-paged environment.

Demand paging on Symbian has been a great success. Not only does demand
paging increase free RAM, it also speeds device boot and application start-up
times, and makes for a much more robust device under low-memory conditions.
So successful was demand paging that it has been backported two operating
system generations, to devices that have already been released into the market. I
wish you all the best in working with it.
Index

A
Aging a Page 36
Algorithm 55
Allocating Memory 33

B
Binary Compatibility 120
Boot-Loaded 90
BTrace 157
Byte-Pair Compression 54

C
Cache Support 52
Candidate Unpaged List 134
Chunk APIs 50
Clamping 63
Code-Paged ROM 143
Code Paging 13, 34
Code Segment Initialization 43
Component Architecture 134
Composite File System 8
Configpaging 148
Core/ROFS 141
Critical Code Paths 133
Custom Demand-Paging Configurations 155

D
Data Paging 76
Data Structures 40
DDigitiser 105
Debugger Breakpoint 58
Debugging 157
Default Demand-Paging Configuration 153
defaultpaged 149
defaultunpaged 149
demandpagingconfig 141
Dependency Information 127
Device APIs 58
Device Stability 137
Disconnected Chunk API 51
DPTest API 159
Dynamic Analysis 129
Dynamic RAM Cache 33

E
Effective RAM Saving 19
efficient_rom_paging.pm 151
EStart 117

F
File Server 63
File System Caching 14
Fine-Grained Configuration 147
Fragmented Write Requests 113
Freeing a Page 38
Functional Equivalence 129, 137

G
General performance 137

H
Hardware Debugging 160

I
Implementing File Clamping 118
Improved application start-up times 3
Internal MMC/SD Card 114
Internal Writeable Media 145
IPC Analysis 128

K
Kernel 136
Kernel Containers 82
Kernel Extension Entry Point 109
Kernel Implementation 21
Kernel RAM Usage 6
Kernel Trace Logging 159
Key Classes 21

L
Live Page List 28
Locking Memory 48
Logical Channel 99

M
MaxLivePages 141
Media Driver Migration 107
Media Drivers 58
Memory Accounting 34
Memory Allocation 32
Migrating 76
Migrating Device Drivers 87
Minimum Paging Cache Size 135
MinLivePages 141
Mitigation Techniques 133
MMP File Configuration 147

N
NAND Flash 7
NAND Flash Structure 12

O
OBY File 139
OBY File Configuration 147
Optimizing 150

P
Pageability Categories 136
Paged 138
pagedrom keyword 140
Page-Fault Handler 22
Page-Information Structures 22
Page Locking 47
Paging Cache Sizes 18
Paging In 16
Paging In a ROM Page 40
Paging Out 17
pagingoverride 140
pagingpolicy 143
Paging Requests 112
Paging Scenarios 75
PDD 101
Performance Data 130
Permanent Presence 137
Physical Device 97
Physical RAM 22
Power management 137
Problems 133

R
RAM Cache Interface 30
Rejuvenating a Page 36
RFile::BlockMap() API 67
ROFS 144
ROM 7
ROM paging structure 12

S
Security 137
Shared Chunks 85
Sound driver 105
Static Analysis 127
Stressed Demand-Paging Configuration 154
Symbian OS v9.3 74
Symbian Reference Configurations 152

T
Testing 159
The Kernel Blockmap 42
The Paging Algorithm 15
The Paging Configuration 17
TLocalDriveCaps 111
Tracing 156

U
Unique Threads 92
Unpaged 137
Unpaged Files 17
USB driver 104
Use-Case Analysis 128
User-Side Blockmap 41

V
Virtual Memory 50
