Implementing Power Management Features on the Nucleus RTOS
WHITE PAPER
EMBEDDED SOFTWARE
www.mentor.com
INTRODUCTION
Power management on embedded devices boils down to an amazingly simple principle: “turn off anything
you don’t use.” Though this sounds fairly simple, the actual implementation can be quite complex. The
Nucleus® real-time operating system attempts to minimize the complexity by taking care of as many of the
nitty-gritty details as possible, allowing the software developer to make high-level decisions, but also giving
the developer the option of full control if desired.
This paper covers a few of the power management techniques that define device static power consumption.
Dynamic CPU power savings utilizing CPU_Idle functionality and the automatic tick suppression feature
available in Nucleus will also be discussed. Lastly, the concept of Dynamic Voltage and Frequency Scaling
(DVFS) will be covered as it applies to both static and dynamic power savings. All of the discussed techniques
are analyzed when applied to the Atmel reference platform AT91SAM9M10-EKES.
1 The other big factor is the DVFS operating point also discussed in this paper.
It’s always a good rule of thumb to prioritize system states in order of their functionality and power
consumption. The higher the state, the higher the functionality and the higher the power consumption. This
type of organization allows applications to request minimum system power states required for their proper
operation.
For example, in our hypothetical MP3 player scenario, the task responsible for playback of music would call
NU_PM_Request_Min_System_State (PLAYBACK);
whenever audio files are being played. Conversely, when playback is stopped, the playback task would call
NU_PM_Release_Min_System_State(handle);
to inform the system that the requirement is no longer needed. Other tasks/applications can also make such
requests. For example, a GUI application may request the USER_ACTIVE state while the user is interacting with
the device, then relinquish that request once the user has gone idle². Once an application makes a request for
a minimum system power state, the OS will not allow the system power state to drop below the requested
state. When multiple requests are in effect, the highest requested state becomes the lowest possible system
power state. Such a system allows each task to request what it needs from the system without any awareness
of other tasks in the system.
For example, if the playback task requests a minimum system state of PLAYBACK and then the GUI task
requests USER_ACTIVE, the system will be in the USER_ACTIVE state until the GUI releases its request, at which
point the system state will drop to PLAYBACK. All this will happen without the playback task having any need
to know about the GUI application.
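The arbitration rule described above (the highest outstanding request becomes the floor) can be sketched in a few lines of C. This is a host-side model, not the Nucleus implementation: the state names, the request table, and the functions request_min_state(), release_min_state(), and min_allowed_state() are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state ordering for the MP3 player example: a higher value
 * means more functionality and more power. Names are illustrative only. */
enum sys_state { STANDBY = 0, PLAYBACK = 1, USER_ACTIVE = 2 };

#define MAX_REQUESTS 8

/* One slot per outstanding minimum-state request; -1 marks a free slot. */
static int requests[MAX_REQUESTS] = { -1, -1, -1, -1, -1, -1, -1, -1 };

/* Record a minimum-state request; returns a handle, or -1 if no slot is free. */
int request_min_state(enum sys_state s)
{
    for (int h = 0; h < MAX_REQUESTS; h++) {
        if (requests[h] < 0) {
            requests[h] = (int)s;
            return h;
        }
    }
    return -1;
}

/* Drop a previously granted request. */
void release_min_state(int handle)
{
    assert(handle >= 0 && handle < MAX_REQUESTS);
    requests[handle] = -1;
}

/* The floor is the highest state requested by any task; with no requests
 * outstanding the system may fall to the lowest state. */
enum sys_state min_allowed_state(void)
{
    int lowest_allowed = STANDBY;
    for (int h = 0; h < MAX_REQUESTS; h++) {
        if (requests[h] > lowest_allowed) {
            lowest_allowed = requests[h];
        }
    }
    return (enum sys_state)lowest_allowed;
}
```

In this model the playback task would keep the handle returned for PLAYBACK and the GUI task the handle returned for USER_ACTIVE; releasing the GUI handle lets min_allowed_state() fall back to PLAYBACK without either task knowing about the other.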
2 Nucleus OS power services include ‘Watchdog services’ to assist with the implementation of timeouts based on user inactivity.
See the Reference Power Controller provided with Nucleus for more details.
EXAMPLE IMPLEMENTATION
To illustrate the static power consumption savings, let’s take a look at an example implementation using the
AT91SAM9M10-EKES platform. In this example, we have defined four system power states and written a simple
application that transitions between the states every second. The resulting power consumption graph is
shown in Figure 2. The code for the test application is shown in Figure 3. The system power states
were defined using the NU_PM_Map_System_Power_State API (not shown here).
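[Figure 2: Power consumption while the test application transitions between the four system power states. Legend: Power Values (Maximum, Calculated Average, Minimum).]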
        pm_status = NU_PM_Set_System_State(eLCD_DIM_USBH_ON);   /* SS2 */
        NU_Sleep(1*NU_PLUS_TICKS_PER_SEC);
        pm_status = NU_PM_Set_System_State(eLCD_OFF_USBH_ON);   /* SS1 */
        NU_Sleep(1*NU_PLUS_TICKS_PER_SEC);
        pm_status = NU_PM_Set_System_State(eLCD_OFF_USBH_OFF);  /* SS0 */
        NU_Sleep(1*NU_PLUS_TICKS_PER_SEC);
    }
}
1. “OFF” – The serial port is not functional, all available circuitry that can be controlled (clocks,
transmitter, receiver, any RS232 level converters) is powered down and any interrupts that may
come from the serial port are disabled.
2. “SLEEP” – In this state the serial port is powered off as in the OFF state; however, it can
receive a RING signal interrupt and pass it up to the applications or power controller. The power
controller (or another application) may, at that point, raise the power state of the serial port directly
or by raising the system power state.
3. “ON” – In this state the serial port is expected to be fully functional, with circuitry powered
on an as-needed basis.
The OFF and SLEEP states are fairly straightforward; the ON state, however, needs to implement logic to
dynamically control the power consumption. Here are some considerations:
■ If the serial port is not opened by any application, even though its power state is ON, the hardware
should stay powered down and Rx interrupts disabled just as if it were in the SLEEP state, since there is
no user of the serial port. There is no requirement to burn power. A peripheral state is defined as the
limit, or maximum power to be consumed, not the minimum.
■ If there is separate control of the transmitter or transmit clock, it should be disabled if there is no
outgoing activity on the serial port for some predetermined amount of time.
Other drivers should be implementing a similarly aggressive power management strategy. All power-aware
drivers shipped as part of the Mentor® Embedded Nucleus® ReadyStart™ BSP have such algorithms
implemented (limited by the abilities of the particular hardware).
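The ON-state considerations above can be expressed as two small decision functions. The following is a minimal sketch under stated assumptions: struct serial_port, the tick-based inactivity timeout, and both function names are hypothetical and are not taken from any Nucleus driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Peripheral power states from the text; the value orders them by power. */
enum port_state { PORT_OFF = 0, PORT_SLEEP = 1, PORT_ON = 2 };

/* Hypothetical driver bookkeeping (not a Nucleus structure). */
struct serial_port {
    enum port_state state;   /* power ceiling set by the power framework */
    unsigned open_count;     /* number of applications holding it open   */
    uint32_t last_tx_tick;   /* tick of the most recent transmit         */
};

/* Assumed inactivity window, in ticks, before the transmitter is gated. */
#define TX_IDLE_TIMEOUT_TICKS 50u

/* The receiver and its interrupts are powered only if the port is ON
 * and at least one application has it open. */
bool rx_should_be_powered(const struct serial_port *p)
{
    assert(p != NULL);
    return p->state == PORT_ON && p->open_count > 0;
}

/* The transmitter is powered only while there was recent outgoing activity. */
bool tx_should_be_powered(const struct serial_port *p, uint32_t now_tick)
{
    assert(p != NULL);
    if (p->state != PORT_ON || p->open_count == 0) {
        return false;
    }
    /* Unsigned subtraction stays correct across tick counter wraparound. */
    return (uint32_t)(now_tick - p->last_tx_tick) < TX_IDLE_TIMEOUT_TICKS;
}
```

A driver built this way treats the ON state as a ceiling: an unopened port in the ON state draws no more than one in the SLEEP state, and the transmitter is gated again once the assumed timeout has passed.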
When no tasks are scheduled in the system, Nucleus allows for the processor and related hardware to enter
low power modes. To accomplish this, the CPU driver for a particular platform registers two functions:
2. VOID CPU_Wakeup(VOID)
The CPU_Idle function is called every time there is no task to be scheduled, while the CPU_Wakeup() function is
called whenever the CPU becomes active again, whether due to a pre-scheduled task or an external interrupt.
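The call sequence can be illustrated with a small host-side model. Everything here is an assumption made for illustration: the hook types, register_cpu_hooks(), and the mock scheduler path are hypothetical stand-ins, and the actual Nucleus registration mechanism is not shown in this paper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hook types: the idle hook receives the expected idle time
 * (here in ticks); the wakeup hook takes no arguments. */
typedef void (*idle_fn)(uint32_t expected_idle_ticks);
typedef void (*wakeup_fn)(void);

static idle_fn   idle_hook;
static wakeup_fn wakeup_hook;

/* Stand-in for the registration a CPU driver would perform. */
void register_cpu_hooks(idle_fn idle, wakeup_fn wakeup)
{
    assert(idle != NULL && wakeup != NULL);
    idle_hook = idle;
    wakeup_hook = wakeup;
}

/* Mock of the scheduler's "nothing to run" path: call the idle hook and,
 * once the CPU resumes (here: as soon as the hook returns), call wakeup. */
void scheduler_no_ready_task(uint32_t ticks_until_next_task)
{
    if (idle_hook != NULL) {
        idle_hook(ticks_until_next_task);
    }
    if (wakeup_hook != NULL) {
        wakeup_hook();
    }
}

/* Example hooks that only count calls so the flow can be observed. */
static unsigned idle_calls;
static unsigned wakeup_calls;
static uint32_t last_expected_idle;

void demo_cpu_idle(uint32_t expected_idle_ticks)
{
    idle_calls++;
    last_expected_idle = expected_idle_ticks;
}

void demo_cpu_wakeup(void)
{
    wakeup_calls++;
}
```

On real hardware the idle hook would not return until an interrupt arrives, and the wakeup hook would typically run on the interrupt path; the model collapses both into one function call for clarity.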
SIMPLEST CPU_IDLE/WAKEUP
The simplest CPU_Idle function just puts the CPU into its idle state. For our reference platform this code
is shown in Figure 4 below.
The CPU_Wakeup() function is empty for this implementation as the CPU automatically exits the “Wait For
Interrupt” state without any software setup requirements.
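As a rough sketch of such a pair of functions, assuming the reference platform's core is an ARM926EJ-S running in a privileged mode: the CP15 operation below is the "Wait For Interrupt" request on that core family. The function names, the idle_entries counter, and the TARGET_ARM926EJS build macro are hypothetical; on any other build the wait is compiled out so the code can be exercised on a host.

```c
#include <assert.h>
#include <stdint.h>

/* Counts idle entries so the function can be exercised on a host build. */
static unsigned idle_entries;

/* Enter the core's wait-for-interrupt state.
 * TARGET_ARM926EJS is a hypothetical build macro, defined only when
 * compiling for an ARM926EJ-S core (ARM mode, privileged). */
static inline void wait_for_interrupt(void)
{
#if defined(TARGET_ARM926EJS)
    /* CP15 c7, c0, 4: "Wait For Interrupt" on ARM9 cores. */
    __asm__ volatile ("mcr p15, 0, %0, c7, c0, 4" : : "r" (0) : "memory");
#endif
}

/* Simplest idle hook: ignore the expected idle time and halt the core
 * until the next interrupt arrives. */
void simple_cpu_idle(uint32_t expected_idle_ticks)
{
    (void)expected_idle_ticks;
    idle_entries++;
    wait_for_interrupt();
}

/* Nothing to undo: the core leaves wait-for-interrupt on its own. */
void simple_cpu_wakeup(void)
{
}
```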
To illustrate the power savings gained we have implemented a simple test program that keeps the CPU busy
for 1 second, then sleeps for 1 second, repeating the pattern indefinitely. The results without and with our
simple CPU_Idle function are shown in Figure 5 below.
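[Figure 5: Power consumption of the busy/sleep test program without and with the simple CPU_Idle function. Legend: Power Values (Maximum, Calculated Average, Minimum).]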
As we can see in Figure 5, the simple CPU_Idle saves ~190mA of current whenever the CPU is idle. Without
a CPU_Idle implementation, the current draw would stay at ~460mA regardless of the CPU utilization³.
ADVANCED CPU_IDLE
While the CPU Idle implemented in the previous section shows demonstrable results, Idle power consumption
can be further reduced to take advantage of specific hardware features. Our reference platform, for example,
offers the capability to put SDRAM into self-refresh mode, at the cost of the power and time needed to enter and exit
such a low-power mode. Our CPU_Idle function can be modified to allow the SDRAM to enter self-refresh mode
if the expected idle time (passed to the CPU_Idle function as a parameter) is greater than a given threshold.
This threshold is determined experimentally by measuring the costs and benefits. Figure 6
below illustrates the principle behind this determination⁴.
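The cost/benefit trade-off can be made concrete with a break-even calculation: the deeper mode pays off once the current saved over the idle period exceeds the extra charge spent entering and leaving the mode. This is a simplified sketch, not the paper's measurement procedure; the function names and the choice of units (mA and ms) are assumptions.

```c
#include <assert.h>

/* Break-even idle time for entering a deeper low-power mode.
 * overhead_ma_ms: extra charge spent entering and exiting the mode (mA*ms)
 * saving_ma:      current saved while the mode is active (mA)
 * Returns the idle duration in ms above which the mode pays off. */
double break_even_idle_ms(double overhead_ma_ms, double saving_ma)
{
    assert(overhead_ma_ms >= 0.0 && saving_ma > 0.0);
    return overhead_ma_ms / saving_ma;
}

/* Decide whether an expected idle period justifies the deeper mode. */
int worth_entering(double expected_idle_ms, double overhead_ma_ms,
                   double saving_ma)
{
    return expected_idle_ms > break_even_idle_ms(overhead_ma_ms, saving_ma);
}
```

For instance, with a purely hypothetical transition overhead of 140 mA·ms and a saving of 70 mA, the break-even point is 2 ms; shorter idle periods would cost more than they save.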
The new CPU_Idle for our reference platform now looks like this:
3 In some cases the current can even increase when there is nothing to be scheduled: if CPU_Idle is not
implemented, the scheduler enters an infinite loop while waiting for the next interrupt,
and that loop can cause the CPU to consume even more power than during typical application operations.
4 In this specific reference platform the threshold is enforced by an automatic hardware timer measuring “idle DRAM time”;
therefore, it is not necessary to make this determination in the CPU_Idle function.
Please note that this time the CPU_Wakeup function is no longer empty. In this function, we disable the
ability for SDRAM to enter the self-refresh low-power mode so that it doesn’t affect performance while the CPU is active.
Power savings realized by going to self-refresh mode on our reference platform are on the order of 70mA in
addition to the initial idle mode savings, for a grand total of ~260mA savings at the highest CPU operating
point. The resulting power consumption using the test code from the previous section is shown in Figure 8.
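The idle/wakeup pairing described here can be modeled as follows. This is a sketch only: the boolean stands in for whatever memory-controller control the platform provides, and the threshold value and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical threshold, in ticks; in practice it comes from measurement. */
#define SELF_REFRESH_MIN_IDLE_TICKS 5u

/* Stand-in for the memory controller's self-refresh enable control. */
static bool sdram_self_refresh_allowed;

/* Idle hook: permit SDRAM self-refresh only for long enough idle periods. */
void advanced_cpu_idle(uint32_t expected_idle_ticks)
{
    if (expected_idle_ticks > SELF_REFRESH_MIN_IDLE_TICKS) {
        sdram_self_refresh_allowed = true;
    }
    /* A real port would now put the core into wait-for-interrupt. */
}

/* Wakeup hook: forbid self-refresh again so that running code sees
 * full memory performance. */
void advanced_cpu_wakeup(void)
{
    sdram_self_refresh_allowed = false;
}
```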
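[Figure 8: Power consumption of the busy/sleep test program with SDRAM self-refresh enabled in CPU_Idle. Legend: Power Values (Maximum, Calculated Average, Minimum).]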
Other, more advanced techniques can further reduce dynamic CPU idle power consumption when
tick suppression (see next section) is used to allow longer idle times. In situations where idle
times are very long (and the hardware provides timers that can run for such long durations), it may even be
worth transitioning to a lower operating point while idle and restoring the original one in CPU_Wakeup.
Please see the section on Dynamic Voltage and Frequency Scaling (DVFS) for more details.
Please note that basing your CPU idle on the expected idle time pays off in systems that do not have a large
number of asynchronous external interrupts. Whenever the system receives such an external interrupt while it
is idle, the expected idle time is not met, therefore any calculated power savings will not be fully utilized.
TICK SUPPRESSION
The Nucleus OS task scheduler has a regular tick timer interrupt occurring at a fixed interval in order to
perform task scheduling. This interval is typically 10ms. In order to allow the CPU idle time to be longer than
this interval, tick suppression must be enabled. Tick suppression in Nucleus is fully automatic and requires
only that the software developer enable it. A single API call enables tick suppression, and a matching call
disables it again:
NU_PM_Start_Tick_Suppress ();
NU_PM_Stop_Tick_Suppress ();
Tick suppression can generally be enabled and left enabled without any drawbacks. The way tick suppression
works is that whenever the scheduler sees an expected idle time (no tasks to be scheduled) longer than the
time remaining to the next scheduled timer interrupt, it reschedules the next tick/interval timer interrupt to a
much later time (either when the next known tasks are scheduled to run or to a maximum possible tick timer
value). Upon coming out of the idle state, whether the exit was caused by the scheduled interval timer
interrupt or an asynchronous external interrupt, the system time and other state information is properly
adjusted so that any running tasks are not affected; they see “the world” the same way regardless of whether tick
suppression happened or not.
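The two halves of this mechanism (reprogramming the timer before idling, then crediting the elapsed time on wakeup) can be sketched as below. This is a simplified model, not the Nucleus scheduler: MAX_TIMER_TICKS, system_ticks, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical limit of the hardware timer, expressed in OS ticks. */
#define MAX_TIMER_TICKS 1000u

/* OS time in ticks, as seen by tasks. */
static uint32_t system_ticks;

/* Number of ticks to program into the timer before idling: the time until
 * the next scheduled task, capped by what the timer hardware can hold. */
uint32_t ticks_to_suppress(uint32_t ticks_until_next_task)
{
    if (ticks_until_next_task <= 1u) {
        return 1u;                      /* nothing to gain: normal tick */
    }
    return ticks_until_next_task < MAX_TIMER_TICKS ? ticks_until_next_task
                                                   : MAX_TIMER_TICKS;
}

/* On wakeup, credit the ticks that actually elapsed (read back from the
 * timer), whether the timer expired or another interrupt ended idle early. */
void account_elapsed_ticks(uint32_t programmed, uint32_t elapsed)
{
    assert(elapsed <= programmed);
    system_ticks += elapsed;
}
```

Because system time is advanced by the elapsed count rather than by one tick per interrupt, tasks observe the same time base whether or not ticks were suppressed.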
To illustrate how tick suppression works, Figure 9 shows the power consumption of a system running a test
pattern of busy and idle times with tick suppression disabled.
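[Figure 9: Power consumption of the busy/idle test pattern with tick suppression disabled. Legend: Power Values (Maximum, Calculated Average, Minimum).]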
In Figure 10 below, we show the very same test pattern but with tick suppression enabled. As the graph
shows, a large number of power spikes are gone because the tick processing never actually happened.
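[Figure 10: Power consumption of the same test pattern with tick suppression enabled. Legend: Power Values (Maximum, Calculated Average, Minimum).]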
One other point worth mentioning is that in order to capture power spikes due to ticks on a power graph, the measured
current needs to be sampled at a sufficiently high rate to capture such short-duration “blips.”
Nucleus abstracts DVFS in terms of operating points. The number of supported operating points can be
obtained from the Nucleus Power Services by calling the NU_PM_Get_OP_Count () API. Information about
each operating point, including frequency (or frequencies if there are multiple clocks in the system) and
voltage can be obtained by calling APIs such as NU_PM_Get_Freq_Info(), NU_PM_Get_Voltage_Info() and
NU_PM_Get_OP_Specific_Info(). For more information on this topic please see the Nucleus power reference
manual⁶.
Determining the optimal operating point for a particular functionality is best done using measurements.
Ideally, power can be measured at different operating points on the final hardware design while performing
the desired function (e.g. playing music). The optimal operating point is the one that fulfills the performance
requirement and consumes the least power to complete the job.
The problem of finding the optimal operating point can alternatively be approached by a combination of
some simpler measurements and calculations. Since the total power consumed by the device can be divided
between static and dynamic utilization, we can measure each component independently. Both the static and
dynamic power consumption components are typically a function of an operating point. Let’s start by
re-measuring the static power consumption measured in the “System Power States” section earlier in this
paper for each operating point of interest. For our purposes, we’ll pick the three operating points shown
in the table below (Figure 11). Note that this reference platform does not support voltage scaling, and therefore
only the frequency varies between operating points.
5 In Nucleus, an operating point can also define additional independent clocks, such as the DRAM clock, and other parameters,
such as clock divider settings. Use the NU_PM_Get_OP_Specific_Info() API to retrieve the additional information.
6 A Nucleus® Power Management Services Reference Manual is available with Nucleus OS.
Since we want to measure the static power consumption we made sure that the test application is idle
(sleeping) after setting the system power state and the operating point. A simple application shown in
Figure 12 below accomplishes this task.
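/* Operating point identifiers used by the test */
#define eMHZ_133 1
#define eMHZ_200 2
#define eMHZ_400 3

    /* Fix the system state so that only the operating point varies */
    pm_status = NU_PM_Set_System_State(eLCD_OFF_USBH_OFF);
    /* Suppress ticks so the sleeps below are uninterrupted idle time */
    pm_status = NU_PM_Start_Tick_Suppress();
    while (1)
    {
        /* Hold each operating point for one second while idle */
        pm_status = NU_PM_Set_Current_OP(eMHZ_400);
        NU_Sleep(1*NU_PLUS_TICKS_PER_SEC);
        pm_status = NU_PM_Set_Current_OP(eMHZ_200);
        NU_Sleep(1*NU_PLUS_TICKS_PER_SEC);
        pm_status = NU_PM_Set_Current_OP(eMHZ_133);
        NU_Sleep(1*NU_PLUS_TICKS_PER_SEC);
    }
}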
Figure 12: Operating point test application.
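[Figure: Idle power consumption while the test application cycles through the three operating points. Legend: Power Values (Maximum, Calculated Average, Minimum).]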
Next, we repeated these measurements for each system state; the resulting power consumption is
shown in the table below (Figure 14).
Notice that for states 3 and 4, OP1 and OP2 do not show results. The PM_REQUEST_BELOW_MIN code is returned
by the NU_PM_Set_Current_OP() API, indicating that some entity has requested that the operating point not
transition below OP3. In our case it’s the LCD driver, which cannot operate properly at the lower operating
points and therefore issues a minimum operating point request that prevents the system from transitioning
any lower. As soon as the LCD driver is turned off, the operating point transitions are allowed again.
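The refusal behavior can be modeled on a host as shown below. This is a mock, not the Nucleus API: the mock_ functions, the status values, and the single-requester simplification are all hypothetical; only the idea of a refusal code for transitions below the requested minimum comes from the text.

```c
#include <assert.h>

/* Hypothetical status codes for this mock; the refusal code mirrors the
 * name used in the text, but its numeric value is made up. */
#define MOCK_SUCCESS            0
#define MOCK_REQUEST_BELOW_MIN  (-1)

static int current_op = 3;   /* operating points numbered 1..3 (OP1..OP3) */
static int min_op     = 1;   /* lowest operating point currently allowed  */

/* A driver (for example the LCD driver) raises the floor while active. */
void mock_request_min_op(int op)
{
    assert(op >= 1 && op <= 3);
    if (op > min_op) {
        min_op = op;
    }
}

/* The driver drops its requirement when its device is turned off.
 * Simplification: this mock tracks a single requester only. */
void mock_release_min_op(void)
{
    min_op = 1;
}

/* Refuse any transition below the floor; otherwise switch. */
int mock_set_current_op(int op)
{
    assert(op >= 1 && op <= 3);
    if (op < min_op) {
        return MOCK_REQUEST_BELOW_MIN;
    }
    current_op = op;
    return MOCK_SUCCESS;
}
```

An application that cycles through operating points should therefore check the returned status rather than assume each transition succeeded.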
Once we have the idle power consumption measured, we can modify our application to loop for a specified
period of time instead of sleeping. This will allow us to measure the 100% CPU utilization power consumption.
The results for our reference platform are shown in the table below (Figure 15).
Now that we have the complete measurement results, we can proceed to calculating the estimated power
consumption given a system state, the operating point, and an estimated average CPU utilization. CPU
utilization should be measured at the lowest operating point that keeps it under 100%. In other words,
pick the minimum operating point that satisfies the performance requirements, and a measurement
duration long enough that the average CPU utilization stays under 100%.
For example, if you have a function to be performed that takes 100% CPU utilization for 1.2 seconds at
100MHz, every 2 seconds, then the average CPU utilization is 60% at 100MHz. For remaining operating points
the measured utilization can be scaled linearly with the frequency (e.g. 60% utilization at 100MHz will scale to
20% utilization at 300MHz).
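The linear scaling rule can be written as a one-line helper. This sketch assumes the workload is CPU-bound, so that run time is inversely proportional to the clock frequency; the function name is hypothetical.

```c
#include <assert.h>

/* Scale a utilization measured at ref_mhz to another frequency, assuming
 * run time is inversely proportional to the clock. Utilization is
 * normalized: 1.0 means 100%. A result above 1.0 means the target
 * frequency is too slow for the workload. */
double scale_utilization(double util_at_ref, double ref_mhz, double target_mhz)
{
    assert(util_at_ref >= 0.0 && ref_mhz > 0.0 && target_mhz > 0.0);
    return util_at_ref * ref_mhz / target_mhz;
}
```

With the numbers from the example, 1.2 s of work every 2 s gives 0.60 at 100MHz, and scale_utilization(0.60, 100.0, 300.0) gives approximately 0.20.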
The estimated power consumption can be calculated using the following equation:
P_Est = (P_Busy × CPU_UTIL) + (P_Idle × (1 − CPU_UTIL))
where P_Busy and P_Idle come from Figure 15 above and CPU_UTIL is the estimated average CPU utilization
normalized to 100% = 1.00. Please note that it’s extremely important to scale the CPU utilization properly in
order to get an accurate estimate. Given the results, one can pick the optimal operating point
to which to attempt a switch.
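The estimate and the selection step can be sketched together. The code assumes the estimate is a utilization-weighted blend of the busy and idle measurements, P_Busy × CPU_UTIL + P_Idle × (1 − CPU_UTIL), and that utilization scales linearly with frequency; struct op_power and both function names are hypothetical, and any table passed in would have to come from real measurements.

```c
#include <assert.h>
#include <stddef.h>

/* Measurements for one operating point (same unit for both readings). */
struct op_power {
    double mhz;      /* CPU frequency of this operating point */
    double p_idle;   /* consumption when idle                 */
    double p_busy;   /* consumption at 100% CPU utilization   */
};

/* Utilization-weighted blend of busy and idle consumption; 1.0 = 100%. */
double estimate_power(const struct op_power *op, double cpu_util)
{
    assert(op != NULL && cpu_util >= 0.0 && cpu_util <= 1.0);
    return op->p_busy * cpu_util + op->p_idle * (1.0 - cpu_util);
}

/* Return the index of the operating point with the lowest estimate among
 * those able to carry the load (scaled utilization <= 1.0), or -1 if none. */
int pick_operating_point(const struct op_power *ops, size_t count,
                         double util_at_ref, double ref_mhz)
{
    int best = -1;
    double best_power = 0.0;

    assert(ops != NULL && ref_mhz > 0.0);
    for (size_t i = 0; i < count; i++) {
        double util = util_at_ref * ref_mhz / ops[i].mhz;
        if (util > 1.0) {
            continue;                   /* too slow for the workload */
        }
        double p = estimate_power(&ops[i], util);
        if (best < 0 || p < best_power) {
            best = (int)i;
            best_power = p;
        }
    }
    return best;
}
```

A monitoring task could call pick_operating_point() with its current average utilization and the frequency at which it was measured, then attempt a transition to the returned operating point.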
So why go through all of these calculations if hardware is available to have power measured for any particular
function at any operating point? Well, when many data points are needed, it may be easier to
calculate them than to measure them, especially if every desired point is not
known at the start and hardware is not available all the time. This is where this estimation method can help.
That said, the most exciting application for these calculations is the ability to automatically determine the
optimal operating point. Suppose that one has measured the data in Figure 15 and there exists a task that monitors
the current CPU utilization. Such a task can notice that CPU utilization is holding a steady average over some
time while the power state does not change, and attempt a transition to the calculated optimal operating
point.
APPLICATION PROGRAMMING
When programming applications for a power-optimized system running the Nucleus RTOS, it’s important
to keep the following in mind:
1. Yield the processor (sleep, suspend on event or mutex, etc.) whenever possible. This will allow the
operating system to transition the CPU to the optimal idle mode automatically.
2. When a specific system state is needed, request it only via the NU_PM_Request_Min_System_State() API,
then release that request promptly using the NU_PM_Release_Min_System_State() API when it’s no
longer needed. This will allow the operating system to automatically fall back to the minimum required
state as requested by all applications in the system.
3. Avoid polled algorithms. Event-, interrupt-, and/or message-driven systems should be implemented to allow
the CPU to stay idle as long as possible. However, as events become more and more frequent, there exists a point
at which the implementation should switch to polled operation in order to reduce the number of
interrupts and therefore context switches. This reduces the average CPU utilization (and therefore
power consumption) for the same load.
CONCLUSION
As embedded device deployments are growing exponentially, an ever increasing number of those
devices rely on batteries or other limited capacity power sources. Modern chipsets offer more and more
power saving features, but most of them require software complexity to fully utilize. A competitive
embedded OS today must address power management in order to simplify the design process, speed up
time to market, and increase reusability of code between different platforms by abstracting the power
saving features from the designer.
MGC 11 - 11 TECH10420-w