Embedded Systems 9.

Low Power Design

Lothar Thiele

Swiss Federal Institute of Technology

Computer Engineering and Networks Laboratory

Contents of Course
1. Embedded Systems Introduction
2. Software Introduction
3. Real-Time Models
4. Periodic/Aperiodic Tasks
5. Resource Sharing
6. Real-Time OS
7. System Components
8. Communication
9. Low Power Design
10. Models
11. Architecture Synthesis
12. Model Based Design

[Figure: the chapters are grouped under "Software and Programming", "Processing and Communication", and "Hardware".]

Topics
General Remarks
Power and Energy
Basic Techniques
    Parallelism
    VLIW (parallelism and reduced overhead)
    Dynamic Voltage Scaling
    Dynamic Power Management

Power and Energy Consumption


Need for efficiency (power and energy):

"Power is considered as the most important constraint in embedded systems." [in: L. Eggermont (ed): Embedded Systems Roadmap 2002, STW]

"Power demands are increasing rapidly, yet battery capacity cannot keep up." [in Diztel et al.: Power-Aware Architecting for data-dominated applications, 2007, Springer]

Implementation Alternatives
[Figure: implementation alternatives arranged between two opposing axes, "Flexibility" and "Performance / Power Efficiency".]

General-purpose processors
Application-specific instruction set processors (ASIPs)
    Microcontroller
    DSPs (digital signal processors)
Programmable hardware
    FPGA (field-programmable gate arrays)
Application-specific integrated circuits (ASICs)

The Power/Flexibility Conflict


[Figure: operations per watt (MOPS/mW, logarithmic axis from 0.01 to 10) plotted against technology generation (1.0, 0.5, 0.25, 0.13, 0.07). Labels in the plot include "DSP-ASIPs", "µPs" and "poor design techniques".]

Necessary to optimize HW and SW. Use heterogeneous architectures. Apply specialization techniques.

[H. de Man, Keynote, DATE02; T. Claasen, ISSCC99]

Energy Efficiency

[Figure: Hugo De Man, IMEC, Philips, 2007]


Power and Energy are Related


[Figure: power P plotted over time t; the energy E is the area under the curve.]

$E = \int P(t)\,dt$

In many cases, faster execution also means less energy, but the opposite may be true if power has to be increased to allow faster execution.

Low Power vs. Low Energy


Minimizing the power consumption is important for
    the design of the power supply
    the design of voltage regulators
    the dimensioning of interconnect
    cooling (short term cooling)
        high cost (estimated to be rising at $1 to $3 per Watt for heat dissipation [Skadron et al. ISCA 2003])
        limited space

Minimizing the energy consumption is important due to
    restricted availability of energy (mobile systems)
    limited battery capacities (only slowly improving)
    very high costs of energy (solar panels, in space)
    long lifetimes, low temperatures

Power Consumption of a CMOS Gate


[Figure: CMOS gate with its three current components.]
    Ileak: leakage current (subthreshold and gate-oxide leakage)
    Iint: short circuit current
    Isw: switching current

Power Consumption of CMOS Processors


Main sources:
    Dynamic power consumption
        charging and discharging capacitors
    Short circuit power consumption
        short circuit path between supply rails during switching
    Leakage
        leaking diodes and transistors
        becomes one of the major factors due to shrinking feature sizes in semiconductor technology

Dynamic Voltage Scaling (DVS)


Power consumption of CMOS circuits (ignoring leakage):

$P \sim \alpha \, C_L \, V_{dd}^2 \, f$

    $V_{dd}$: supply voltage
    $\alpha$: switching activity
    $C_L$: load capacity
    $f$: clock frequency

Delay for CMOS circuits:

$\tau = k \, C_L \, \dfrac{V_{dd}}{(V_{dd} - V_T)^2}$

    $V_{dd}$: supply voltage
    $V_T$: threshold voltage ($V_T$ substantially smaller than $V_{dd}$)
    $k$: technology-dependent constant

Decreasing Vdd reduces P quadratically (f constant). The gate delay increases only reciprocally. Maximal frequency fmax decreases linearly.
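The scaling behaviour stated on this slide can be checked numerically. The following is a minimal sketch, assuming the common first-order models P ~ alpha * C_L * Vdd^2 * f for dynamic power and tau ~ C_L * Vdd / (Vdd - V_T)^2 for gate delay. The function names, the constant k and all numeric values are illustrative assumptions, not data from the lecture.

```python
def dynamic_power(alpha, c_load, v_dd, f):
    """First-order dynamic power model (leakage ignored): alpha * C_L * Vdd^2 * f."""
    return alpha * c_load * v_dd ** 2 * f


def gate_delay(c_load, v_dd, v_t, k=1.0):
    """First-order gate delay model: k * C_L * Vdd / (Vdd - V_T)^2."""
    return k * c_load * v_dd / (v_dd - v_t) ** 2


# Illustrative values only (assumed, not taken from the slides).
ALPHA, C_LOAD, FREQ = 0.5, 1e-9, 100e6

# Power at Vdd = 1 V relative to Vdd = 2 V, clock frequency held constant.
power_ratio = dynamic_power(ALPHA, C_LOAD, 1.0, FREQ) / dynamic_power(ALPHA, C_LOAD, 2.0, FREQ)

# Delay at Vdd = 1 V relative to Vdd = 2 V. With V_T = 0 the delay is
# exactly proportional to 1 / Vdd.
delay_ratio = gate_delay(C_LOAD, 1.0, 0.0) / gate_delay(C_LOAD, 2.0, 0.0)

print(power_ratio)  # halving Vdd quarters the power
print(delay_ratio)  # halving Vdd doubles the gate delay when V_T is negligible
```

With a non-zero threshold voltage the delay grows faster than 1/Vdd as Vdd approaches V_T, which is why the reciprocal relation is only an approximation.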

Potential for Energy Optimization: DVS

Saving energy for a given task:

$E \sim \alpha \, C_L \, V_{dd}^2 \, f \, t = \alpha \, C_L \, V_{dd}^2 \cdot \#\mathrm{cycles}$

    Reduce the supply voltage Vdd
    Reduce switching activity α
    Reduce the load capacitance CL
    Reduce the number of cycles #cycles

Example: Voltage Scaling

[Figure: voltage scaling example, plotted over Vdd. Courtesy, Yasuura, 2000]

Power Supply Gating


Power gating is one of the most effective ways of minimizing static power consumption (leakage)
    Cut off power supply to inactive units/components
    Reduces leakage


Use of Parallelism
[Figure: a single unit operating at Vdd and fmax is replaced by two parallel units, each operating at Vdd/2 and fmax/2.]
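The benefit of the parallel arrangement can be estimated with a small calculation. This is a minimal sketch, assuming the dynamic power model P ~ alpha * C_L * Vdd^2 * f and ignoring leakage as well as any overhead for distributing inputs and merging outputs. The function names and numeric values are illustrative assumptions.

```python
def dynamic_power(alpha, c_load, v_dd, f):
    """First-order dynamic power model (leakage ignored)."""
    return alpha * c_load * v_dd ** 2 * f


def two_unit_power(alpha, c_load, v_dd, f_max):
    """Total power of two parallel units, each running at Vdd/2 and fmax/2."""
    return 2 * dynamic_power(alpha, c_load, v_dd / 2, f_max / 2)


# Illustrative values only (assumed, not taken from the slides).
single = dynamic_power(0.5, 1e-9, 2.0, 200e6)
parallel = two_unit_power(0.5, 1e-9, 2.0, 200e6)
ratio = parallel / single

print(ratio)  # fraction of the original power at the same throughput
```

Both configurations complete the same number of operations per second, since two units at fmax/2 match one unit at fmax. Under these idealized assumptions the parallel version needs a quarter of the power.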

Use of Pipelining
[Figure: a single unit labeled Vdd and fmax next to a two-stage pipeline whose stages are each labeled Vdd/2 and fmax/2.]


New ideas help


[Figure: a Pentium and a Crusoe processor compared while running the same multimedia application. As published by Transmeta [www.transmeta.com]]

VLIW Architectures
Large degree of parallelism
    many computational units, (deeply) pipelined

Simple hardware architecture
    explicit parallelism (parallel instruction set)
    parallelization is done offline (compiler)

Transmeta is a typical VLIW Architecture


Spatial vs. Dynamic Voltage Management


Spatial voltage management: not all components require the same performance.
    Slow module: 1.3 V, 50 MHz
    Standard modules: 1.8 V, 100 MHz
    Busy module: 3.3 V, 200 MHz

Dynamic voltage management: required performance may change over time.
    Normal mode: 1.3 V, 50 MHz
    Busy mode: 3.3 V, 200 MHz


Example: INTEL Xscale


[Figure from Intel's web site]

OS should schedule distribution of the energy budget.


DVS Example: a) Complete task ASAP

Task that needs to execute $10^9$ cycles within 25 seconds.

$E_a = 10^9 \times 40 \times 10^{-9}\,\mathrm{J} = 40\,\mathrm{J}$

DVS Example: b) Two voltages

$E_b = 750 \times 10^6 \times 40 \times 10^{-9}\,\mathrm{J} + 250 \times 10^6 \times 10 \times 10^{-9}\,\mathrm{J} = 32.5\,\mathrm{J}$

DVS Example: c) Optimal Voltage

$E_c = 10^9 \times 25 \times 10^{-9}\,\mathrm{J} = 25\,\mathrm{J}$
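The three energy figures can be reproduced with a few lines of code. This is a minimal sketch that takes the cycle counts and the per-cycle energies (40 nJ, 10 nJ and 25 nJ) from the three equations of the example. The helper name `energy_joules` is an assumption made for illustration.

```python
def energy_joules(segments):
    """Total energy for a list of (cycles, nanojoules_per_cycle) segments."""
    total_nanojoules = sum(cycles * nj_per_cycle for cycles, nj_per_cycle in segments)
    return total_nanojoules / 1e9


# a) all 10^9 cycles at 40 nJ per cycle (finish as soon as possible)
e_a = energy_joules([(10**9, 40)])

# b) 750 * 10^6 cycles at 40 nJ, then 250 * 10^6 cycles at 10 nJ
e_b = energy_joules([(750 * 10**6, 40), (250 * 10**6, 10)])

# c) all 10^9 cycles at 25 nJ per cycle (single intermediate voltage)
e_c = energy_joules([(10**9, 25)])

print(e_a, e_b, e_c)
```

The single intermediate voltage in case c) gives the lowest energy of the three options, which is the pattern the following slides on the optimal strategy explain.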

DVS: Optimal Strategy


[Figure: supply voltage Vdd over time t, using voltage x until time a·T and voltage y until T, compared with the constant voltage z over the whole interval; the corresponding power levels are P(x), P(y) and P(z).]

Execute task in fixed time T with variable voltage Vdd(t):
    gate delay: $\tau \sim 1 / V_{dd}$
    execution rate: $f(t) \sim V_{dd}(t)$
    invariant: $\int_0^T V_{dd}(t)\,dt = \mathrm{const.}$

    case A: execute at voltage x for a·T time units and at voltage y for (1-a)·T time units; energy consumption T (a P(x) + (1-a) P(y))
    case B: execute at voltage z = a x + (1-a) y for T time units; energy consumption T P(z)

DVS: Optimal Strategy


Dynamic power is a convex function of Vdd:

$P(z) \le a \, P(x) + (1-a) \, P(y)$ for $z = a x + (1-a) y$

[Figure: convex curve of P over Vdd; the value P(z) lies below the chord value a P(x) + (1-a) P(y) between P(x) and P(y).]

If possible, running at a constant frequency (voltage) minimizes the energy consumption for dynamic voltage scaling:
    case A is always worse if the power consumption is a convex function of the supply voltage
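The comparison between case A and case B can be tried out numerically. This is a minimal sketch, assuming the convex power model P ~ Vdd^3 (dynamic power ~ Vdd^2 * f with f ~ Vdd). The function names and the chosen voltages are illustrative assumptions.

```python
def power(v_dd):
    """Assumed convex power model: P ~ Vdd^2 * f with f ~ Vdd, hence P ~ Vdd^3."""
    return v_dd ** 3


def energy_two_voltages(x, y, a, total_time):
    """Case A: voltage x for a*T time units, voltage y for (1-a)*T time units."""
    return total_time * (a * power(x) + (1 - a) * power(y))


def energy_constant_voltage(x, y, a, total_time):
    """Case B: constant voltage z = a*x + (1-a)*y for T time units."""
    z = a * x + (1 - a) * y
    return total_time * power(z)


# Both cases perform the same amount of work, because the time integral
# of the voltage (and hence of the execution rate) is identical.
e_case_a = energy_two_voltages(1.0, 3.0, 0.5, 10.0)
e_case_b = energy_constant_voltage(1.0, 3.0, 0.5, 10.0)

print(e_case_a, e_case_b)
```

Any convex power function gives the same ordering; the cubic model is only one convenient choice.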

DVS: Offline Scheduling on One Processor


Let us model a set of independent tasks as follows: We suppose that a task $v_i \in V$
    requires $c_i$ computation time at normalized processor frequency 1
    arrives at time $a_i$
    has (absolute) deadline constraint $d_i$

How do we schedule these tasks such that all these tasks can be finished no later than their deadlines and the energy consumption is minimized?
    YDS Algorithm from "A Scheduling Model for Reduced CPU Energy", Frances Yao, Alan Demers, and Scott Shenker, FOCS 1995.
    If possible, running at a constant frequency (voltage) minimizes the energy consumption for dynamic voltage scaling.

YDS Algorithm for Offline Scheduling


Example task set, each task given as $a_i, d_i, c_i$:

task   a_i   d_i   c_i
v1     3     6     5
v2     2     6     3
v3     0     8     2
v4     6     14    6
v5     10    14    6
v6     11    17    2
v7     12    17    2

[Figure: the seven tasks drawn on a time axis from 0 to 17.]

Define intensity G([z, z']) in some time interval [z, z']: average accumulated execution time of all tasks that have arrival and deadline in [z, z'] relative to the length of the interval z'-z

$G([z, z']) = \dfrac{\sum_{v_i \in V:\; z \le a_i,\; d_i \le z'} c_i}{z' - z}$

YDS Algorithm for Offline Scheduling


Step 1: Execute jobs in the interval with the highest intensity by using the earliest-deadline first schedule and running at the intensity as the frequency.

Tasks ($a_i, d_i, c_i$): 3,6,5  2,6,3  0,8,2  6,14,6  10,14,6  11,17,2  12,17,2

G([0,6]) = (5+3)/6 = 8/6, G([0,8]) = (5+3+2)/(8-0) = 10/8, G([0,14]) = (5+3+2+6+6)/14 = 11/7, G([0,17]) = (5+3+2+6+6+2+2)/17 = 26/17
G([2,6]) = (5+3)/(6-2) = 2, G([2,14]) = (5+3+6+6)/(14-2) = 5/3, G([2,17]) = (5+3+6+6+2+2)/15 = 24/15
G([3,6]) = 5/3, G([3,14]) = (5+6+6)/(14-3) = 17/11, G([3,17]) = (5+6+6+2+2)/14 = 21/14
G([6,14]) = 12/(14-6) = 12/8, G([6,17]) = (6+6+2+2)/(17-6) = 16/11
G([10,14]) = 6/4, G([10,17]) = 10/7, G([11,17]) = 4/6, G([12,17]) = 2/5

The highest intensity is G([2,6]) = 2.
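The intensity values listed for Step 1 can be recomputed mechanically. This is a minimal sketch using exact fractions; the names `TASKS`, `intensity` and `critical_interval` are assumptions made for illustration, and the brute-force search over all arrival/deadline pairs is chosen for clarity rather than efficiency.

```python
from fractions import Fraction

# (arrival a_i, deadline d_i, computation c_i) for v1..v7 of the example
TASKS = {
    "v1": (3, 6, 5),
    "v2": (2, 6, 3),
    "v3": (0, 8, 2),
    "v4": (6, 14, 6),
    "v5": (10, 14, 6),
    "v6": (11, 17, 2),
    "v7": (12, 17, 2),
}


def intensity(tasks, z, z_end):
    """G([z, z_end]): work of all tasks with z <= a_i and d_i <= z_end, per unit time."""
    work = sum(c for a, d, c in tasks.values() if z <= a and d <= z_end)
    return Fraction(work, z_end - z)


def critical_interval(tasks):
    """Return (intensity, z, z_end) for the interval with the highest intensity."""
    starts = sorted({a for a, _, _ in tasks.values()})
    ends = sorted({d for _, d, _ in tasks.values()})
    best = None
    for z in starts:
        for z_end in ends:
            if z_end <= z:
                continue
            g = intensity(tasks, z, z_end)
            if best is None or g > best[0]:
                best = (g, z, z_end)
    return best


print(intensity(TASKS, 2, 17))
print(critical_interval(TASKS))
```

Only interval boundaries that coincide with an arrival time or a deadline need to be examined, since moving a boundary to the next such point never lowers the intensity.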

YDS Algorithm for Offline Scheduling


Step 1: Execute jobs in the interval with the highest intensity by using the earliest-deadline first schedule and running at the intensity as the frequency.

[Figure: tasks v2 and v1 are scheduled in the critical interval [2,6] at frequency 2. The tasks 0,8,2  6,14,6  10,14,6  11,17,2  12,17,2 ($a_i, d_i, c_i$) remain to be scheduled.]

YDS Algorithm for Offline Scheduling


Step 2: Adjust the arrival times and deadlines by excluding the possibility to execute at the previous critical intervals.

task   original a_i,d_i,c_i   revised a_i,d_i,c_i
v3     0,8,2                  0,4,2
v4     6,14,6                 2,10,6
v5     10,14,6                6,10,6
v6     11,17,2                7,13,2
v7     12,17,2                8,13,2

[Figure: time axis with the critical interval [2,6] cut out and the remaining tasks v3 to v7 shifted accordingly.]

YDS Algorithm for Offline Scheduling


Step 3: Run the algorithm for the revised input again

Revised tasks ($a_i, d_i, c_i$): 0,4,2  2,10,6  6,10,6  7,13,2  8,13,2

G([0,4]) = 2/4, G([0,10]) = 14/10, G([0,13]) = 18/13
G([2,10]) = 12/8, G([2,13]) = 16/11, G([6,10]) = 6/4
G([6,13]) = 10/7, G([7,13]) = 4/6, G([8,13]) = 2/5

The highest intensity is 12/8 = 6/4 = 1.5, so v4 and v5 are executed at frequency 1.5.

[Figure: tasks v4 and v5 scheduled in the new critical interval.]

YDS Algorithm for Offline Scheduling


Step 3: Run the algorithm for the revised input again
Step 4: Put pieces together

[Figure: three frequency-over-time plots showing the successive rounds. After v4 and v5 are scheduled, the remaining tasks 0,4,2  7,13,2  8,13,2 are revised to 0,2,2  2,5,2  2,5,2 ($a_i, d_i, c_i$). After v6 and v7 are scheduled, only 0,2,2 remains. The last plot shows the complete schedule on the original time axis from 0 to 17.]

Resulting frequencies:

task        v1   v2   v3   v4    v5    v6    v7
frequency   2    2    1    1.5   1.5   4/3   4/3
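The four steps can be combined into one loop. The following is a minimal sketch of the procedure as described on these slides, not a reference implementation of the published algorithm: it repeatedly finds the interval of highest intensity, assigns that intensity as the frequency of the tasks inside it, and then cuts the interval out of the time axis. The names `yds` and `shrink` and the brute-force interval search are assumptions made for illustration. The sketch assigns frequencies only and does not construct the EDF order inside each interval.

```python
from fractions import Fraction

# (arrival a_i, deadline d_i, computation c_i) for v1..v7 of the example
TASKS = {
    "v1": (3, 6, 5), "v2": (2, 6, 3), "v3": (0, 8, 2), "v4": (6, 14, 6),
    "v5": (10, 14, 6), "v6": (11, 17, 2), "v7": (12, 17, 2),
}


def yds(tasks):
    """Return {task name: frequency} for tasks given as {name: (a, d, c)}."""
    remaining = dict(tasks)
    frequency = {}
    while remaining:
        # Steps 1 and 3: find the interval with the highest intensity.
        starts = sorted({a for a, _, _ in remaining.values()})
        ends = sorted({d for _, d, _ in remaining.values()})
        best = None
        for z in starts:
            for z_end in ends:
                if z_end <= z:
                    continue
                work = sum(c for a, d, c in remaining.values()
                           if z <= a and d <= z_end)
                g = Fraction(work, z_end - z)
                if best is None or g > best[0]:
                    best = (g, z, z_end)
        g, z, z_end = best

        # Tasks inside the critical interval run at its intensity.
        inside = [name for name, (a, d, _) in remaining.items()
                  if z <= a and d <= z_end]
        for name in inside:
            frequency[name] = g
            del remaining[name]

        # Step 2: cut [z, z_end] out of the time axis for the other tasks.
        def shrink(t, z=z, z_end=z_end):
            if t <= z:
                return t
            if t >= z_end:
                return t - (z_end - z)
            return z

        remaining = {name: (shrink(a), shrink(d), c)
                     for name, (a, d, c) in remaining.items()}
    return frequency


result = yds(TASKS)
for name in sorted(result):
    print(name, result[name])
```

Each round removes at least one task, so the loop runs at most N times. The quadratic search over interval boundaries, combined with a linear pass to sum the work, makes this sketch slower than the O(N^2) critical-interval search mentioned in the remarks.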

DVS: Online Scheduling on One Processor


[Figure: frequency over time for the online schedule of the tasks 3,6,5  2,6,3  0,8,2  6,14,6  10,14,6  11,17,2  12,17,2 ($a_i, d_i, c_i$).]

Continuously update to the best schedule for all arrived tasks
    Time 0: task v3 is executed at 2/8
    Time 2: task v2 arrives
        G([2,6]) = 3/4, G([2,8]) = 4.5/6 = 3/4 => execute v2 at 3/4
    Time 3: task v1 arrives
        G([3,6]) = (5+3-3/4)/3 = 29/12, G([3,8]) < G([3,6]) => execute v2 and v1 at 29/12
    Time 6: task v4 arrives
        G([6,8]) = 1.5/2, G([6,14]) = 7.5/8 => execute v3 and v4 at 15/16
    Time 10: task v5 arrives
        G([10,14]) = 39/16 => execute v4 and v5 at 39/16
    Time 11 and Time 12
        The arrival of v6 and v7 does not change the critical interval
    Time 14:
        G([14,17]) = 4/3 => execute v6 and v7 at 4/3

Remarks on YDS Algorithm


Offline
    The algorithm guarantees the minimal energy consumption while satisfying the timing constraints
    The time complexity is O(N^3), where N is the number of tasks in V
        Finding the critical interval can be done in O(N^2)
        The number of iterations is at most N
    Exercise:
        For periodic real-time tasks with deadline=period, running at constant speed with 100% utilization under EDF has minimum energy consumption while satisfying the timing constraints.

Online
    Compared to the optimal offline solution, the online schedule uses at most 27 times the minimal energy consumption.


Dynamic Power Management (DPM)


Dynamic power management tries to assign optimal power saving states
    Requires hardware support

Example: StrongARM SA1100
    RUN: operational
    IDLE: a SW routine may stop the CPU when not in use, while monitoring interrupts
    SLEEP: shutdown of on-chip activity

state   power
RUN     400 mW
IDLE    50 mW
SLEEP   160 µW

transition      delay    energy
RUN -> IDLE     10 µs    4 µJ
IDLE -> RUN     10 µs    4 µJ
RUN -> SLEEP    90 µs    36 µJ
IDLE -> SLEEP   90 µs    5 µJ
SLEEP -> RUN    160 ms   64 mJ

Reduce Power According to Workload


[Figure: timeline of the application states (busy, waiting, busy) above the corresponding power states (run, shut down, sleep, wake up, run).]
    Tsd: shutdown delay
    Tbs: time before shutdown
    Twu: wakeup delay

Desired: Shutdown only during long idle times
Tradeoff between savings and overhead
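The tradeoff between savings and overhead is often expressed as a break-even time. The following is a minimal sketch under a simplified model that is an assumption, not part of the slides: staying on for an idle period T costs p_on * T, while shutting down costs a fixed transition energy plus p_sleep for the rest of the period. The function names and the numeric values are illustrative.

```python
def break_even_time(p_on, p_sleep, e_transition, t_transition):
    """Shortest idle period for which sleeping costs no more than staying on.

    Staying on:  E = p_on * T
    Sleeping:    E = e_transition + p_sleep * (T - t_transition)
    The period must also be long enough to fit the transitions themselves.
    """
    t_equal = (e_transition - p_sleep * t_transition) / (p_on - p_sleep)
    return max(t_equal, t_transition)


def should_shut_down(idle_time, p_on, p_sleep, e_transition, t_transition):
    """True if the (known or predicted) idle time justifies a shutdown."""
    return idle_time >= break_even_time(p_on, p_sleep, e_transition, t_transition)


# Illustrative values (assumed): 50 mW when on, 160 uW when asleep,
# 64 mJ and 160 ms in total for shutting down and waking up again.
t_be = break_even_time(0.05, 160e-6, 0.064, 0.16)

print(t_be)
print(should_shut_down(2.0, 0.05, 160e-6, 0.064, 0.16))
print(should_shut_down(0.5, 0.05, 160e-6, 0.064, 0.16))
```

In practice the idle time is not known in advance, so a policy built on this test has to rely on a prediction of the idle period.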

The Challenge
Questions:
    When to go to a power-saving state?
    Is an idle period long enough for shutdown?
    Predicting the future

Combining DVFS and DPM


DVS critical frequency (voltage): Running at any frequency/voltage lower than this frequency is not worthwhile for execution.

[Figure, left: power over time for running a task using voltage and frequency scaling, compared with running the task and then sleeping. Figure, right: energy for executing the task as a function of the voltage, with the critical voltage marked.]
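A critical frequency appears as soon as the processor draws power that does not scale with frequency. The following is a minimal sketch, assuming the power model P(f) = p_static + c_dyn * f^3; this model, the function names and the numeric values are illustrative assumptions and are not taken from the slides.

```python
def energy_for_task(cycles, f, p_static, c_dyn):
    """Energy to execute `cycles` at frequency f under P(f) = p_static + c_dyn * f^3."""
    execution_time = cycles / f
    return (p_static + c_dyn * f ** 3) * execution_time


def critical_frequency(p_static, c_dyn):
    """Frequency minimizing the energy per cycle, p_static / f + c_dyn * f^2.

    Setting the derivative -p_static / f^2 + 2 * c_dyn * f to zero gives
    f^3 = p_static / (2 * c_dyn).
    """
    return (p_static / (2 * c_dyn)) ** (1 / 3)


# Illustrative values (assumed), in normalized units.
f_crit = critical_frequency(2.0, 1.0)
e_at_crit = energy_for_task(100, f_crit, 2.0, 1.0)
e_slower = energy_for_task(100, 0.5 * f_crit, 2.0, 1.0)
e_faster = energy_for_task(100, 2.0 * f_crit, 2.0, 1.0)

print(f_crit, e_at_crit, e_slower, e_faster)
```

Under this model, running below the critical frequency stretches the execution time so much that the frequency-independent power outweighs the dynamic savings; finishing at the critical frequency and sleeping afterwards is then the better choice.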

Procrastination Schedule
[Figure: frequency over time (critical frequency: 1.5) for the tasks 3,6,5  2,6,3  0,8,1  7,14,2  10,14,2  13,17,2  15,17,2 ($a_i, d_i, c_i$), comparing two schedules: the YDS algorithm with frequencies rounded up, and procrastinate scheduling.]

Execute by using voltages higher than or equal to the critical voltage only
    apply YDS algorithm
    round up voltages lower than the critical voltage

Procrastinate the execution of tasks to aggregate enough time for sleeping
    Try to reduce the number of times to turn on/off
    Sleep as long as possible
