HC2024 T2 Qualcomm NaderNikfar Final-0824

On-device AI and its thermal implications

Nader Nikfar, Sr. Dir. of Technology

Qualcomm Technologies, Inc. HotChips, Aug. 2024

Snapdragon and Qualcomm branded products are products of Qualcomm Technologies, Inc. and/or its subsidiaries.
Agenda
1. What is on-device AI?
2. Benefits & importance
3. Use-case evolution
4. Thermal implications
5. Potential solutions
6. Summary
What is on-device AI?
AI is transforming …

Intelligence is moving towards edge devices
On-device and AI go hand in hand

• Running machine learning on devices like smartphones, laptops, and cars, instead of in the cloud.
• On-device AI utilizes on-chip processors such as the NPU, CPU, and GPU.
• Current on-device AI can support inference on multiple platforms.
• The cloud is currently necessary for pooling big data and training AI inference algorithms, and it complements on-device processing.

Devices, machines, and things are becoming more intelligent

An SoC consists of several processors integrated into the same die
AI apps enabled by on-device generative AI
Benefits & Importance
Advantages of on-device inference

• Inference running entirely in the cloud will have issues for real-time applications that are latency-sensitive and mission-critical, like autonomous driving.
• Such applications cannot afford the round-trip time, nor can they rely on critical functions operating under variable wireless coverage.

Automotive
Gen AI can be used for ADAS/AD to help improve drive policy by predicting the trajectory and behavior of various agents.
Use-Case Evolution
Timeline: 2015 | 2016-2022 | 2023 | 2023+
Use case: Audio/Speech | Audio/Speech, Camera, Video | LLM-powered assistants, Stable Diffusion/ControlNet | Multi-modal gen AI models
Models: Simple CNN | Transformer/LSTM/RNN/CNN | 10B LLMs/LVMs/LMMs | 10B++ LLMs/LVMs/LMMs
Hardware: Scalar, Vector, and Tensor processing; Micro Tile Inferencing; Transformer Support; Multi-Modal AI
Thermal Implications
Devices

Thermal implications:
• Workload dependent
• Device/platform thermal solution & its required boundary conditions
• Hardware configuration
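
The three factors above can be illustrated with a single lumped thermal node, a standard first-order approximation rather than anything shown in the slides. In this sketch the workload and hardware configuration set the dissipated power, while the platform's thermal solution and boundary conditions appear as a thermal resistance, a thermal capacitance, and the ambient temperature. All function names and numeric values are assumptions chosen for illustration.

```python
import math


def steady_state_temp_c(power_w, ambient_c, r_theta_c_per_w):
    """Steady-state temperature of one lumped node: T = T_ambient + P * R_theta."""
    return ambient_c + power_w * r_theta_c_per_w


def temp_after_c(t_s, power_w, ambient_c, r_theta_c_per_w, c_theta_j_per_c,
                 start_c=None):
    """Temperature after t_s seconds at constant power (first-order RC response)."""
    if start_c is None:
        start_c = ambient_c  # assume the device starts at ambient temperature
    final_c = steady_state_temp_c(power_w, ambient_c, r_theta_c_per_w)
    tau_s = r_theta_c_per_w * c_theta_j_per_c  # thermal time constant in seconds
    return final_c + (start_c - final_c) * math.exp(-t_s / tau_s)


# Hypothetical numbers: a 4 W workload, 25 C ambient, 5 C/W to ambient.
print(steady_state_temp_c(4.0, 25.0, 5.0))  # 45.0
```

Halving the power or the thermal resistance halves the temperature rise above ambient, which is why both workload efficiency and the platform's cooling solution matter.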
10 billion parameter mark

On device: As model complexities grow, potential constraints in processing power, memory, and power.
Cloud: Almost no constraints in processing power, memory, and power.

[Figure: active cooling and load, with levels OFF, LOW, MID]
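
To make the memory constraint concrete, the following sketch estimates the storage needed for model weights alone at the 10 billion parameter mark. The function name and the choice of bit widths are assumptions for illustration; the slides do not state which precisions are used, and the estimate ignores activations and any key-value cache.

```python
def weight_footprint_gb(num_params, bits_per_param):
    """Approximate weight storage in decimal gigabytes (weights only)."""
    return num_params * bits_per_param / 8 / 1e9


# 10 billion parameters at three assumed weight precisions.
for bits in (16, 8, 4):
    print(f"{bits}-bit weights: {weight_footprint_gb(10e9, bits):.1f} GB")
# 16-bit weights: 20.0 GB
# 8-bit weights: 10.0 GB
# 4-bit weights: 5.0 GB
```

Under these assumptions, lower-precision weights are what bring a model of this size within reach of typical mobile memory capacities.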
Do we have a thermal problem now?

LLMs with billions of parameters run within thermal limits on mobile devices and laptops
Potential Solutions
Improved performance & power consumption
Most impactful approach for thermals
Sustained NPU Performance
One hour of continuous NPU usage

>5X higher performance at lower thermals
Intel Core Ultra 7 155H
Thermal Mitigation

• If necessary, mitigation policies similar to those used for CPUs will be applicable.
• On some platforms, and depending on the workload, latency in completing the task may not be as critical.
• Mobile devices can accommodate many useful Gen-AI workloads despite thermal constraints.
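
One common CPU-style mitigation policy is stepwise frequency throttling with hysteresis. The sketch below shows how such a policy could be applied to an NPU; it is an illustration under assumed values, not the policy used on any Qualcomm platform. The class name, the trip and clear temperatures, and the frequency scale levels are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ThrottleGovernor:
    """Hypothetical stepwise throttling governor with hysteresis."""
    trip_c: float = 45.0               # assumed: step down at or above this temperature
    clear_c: float = 40.0              # assumed: step back up at or below this temperature
    levels: tuple = (1.0, 0.75, 0.5)   # assumed frequency scale factors
    level: int = 0                     # index into levels; 0 means unthrottled

    def update(self, temp_c):
        """Take one temperature sample and return the frequency scale to apply."""
        if temp_c >= self.trip_c and self.level < len(self.levels) - 1:
            self.level += 1            # hotter than the trip point: throttle one step
        elif temp_c <= self.clear_c and self.level > 0:
            self.level -= 1            # cooled past the clear point: release one step
        return self.levels[self.level]


gov = ThrottleGovernor()
print([gov.update(t) for t in (38, 46, 47, 43, 39, 39)])
# [1.0, 0.75, 0.5, 0.5, 0.75, 1.0]
```

The gap between the trip and clear temperatures keeps the governor from oscillating when the temperature hovers near a single threshold. Throttling trades task latency for temperature, which is acceptable when the workload is not time-critical.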

Hybrid AI
Distributes and coordinates AI workloads among cloud and edge devices where and when appropriate

Energy synergy:
- If a task is determined to be too complex for the device, it is offloaded to the cloud, avoiding processing on-device.
- If a task is fit for on-device processing, no inference runs in the cloud, so total energy consumption is lower.
• Eliminates round-trip communication to the cloud; on-device inference consumes much less power; and datacenter energy consumption is offloaded, helping datacenters meet their environmental and sustainability goals.
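
The two routing rules above can be sketched as a small decision function. This is a minimal illustration, assuming that task complexity is approximated by model size and that the device limit sits at the 10 billion parameter mark mentioned earlier in the deck; the `Task` type, the `route` function, and the "defer" fallback when no network is available are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical inference request, sized by model parameter count."""
    name: str
    model_params_b: float  # model size in billions of parameters


def route(task, device_max_params_b=10.0, network_available=True):
    """Decide where to run a task: 'device', 'cloud', or 'defer'."""
    if task.model_params_b <= device_max_params_b:
        return "device"    # fits on-device: no cloud inference, no round trip
    if network_available:
        return "cloud"     # too complex for the device: offload
    return "defer"         # assumption: hold the task until connectivity returns


print(route(Task("assistant", 7.0)))       # device
print(route(Task("large-model", 70.0)))    # cloud
```

A real scheduler would likely weigh more than model size, for example current device temperature, battery level, and latency requirements.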

Packaging Innovations

[Figure: package technology integration progression: chiplet package integration and/or 2.5D and 3D active-die integration]

More-than-Moore advances will contribute to improved performance/watt
Continued AI Research
Research is in our DNA

Summary

• On-device AI provides low latency, high performance, lower power consumption, and reliability.
• Power and thermal efficiency are essential for on-device AI.
• On-device AI is still evolving; it is hard to predict how models will evolve in complexity, and therefore hard to accurately predict thermal bottlenecks across mobile platforms.
• Current on-device AI can support inference on multiple platforms.
• As AI model complexities grow, tight power and thermal constraints within mobile devices will drive innovation.
• Continued AI research and engineering developments will lead to further efficiency optimization (lower power, more inference), such as:
  • Microarchitecture improvements
  • Technology node advances
  • Powerful GenAI models becoming more efficient
  • Hybrid AI
Thank you

Nothing in these materials is an offer to sell any of the components or devices referenced herein.

© Qualcomm Technologies, Inc. and/or its affiliated companies. All Rights Reserved.

Qualcomm is a trademark or registered trademark of Qualcomm Incorporated. Other products and brand names may be trademarks or registered trademarks of their respective owners.

References in this presentation to “Qualcomm” may mean Qualcomm Incorporated, Qualcomm Technologies, Inc., and/or other subsidiaries or business units within the Qualcomm corporate structure, as applicable. Qualcomm Incorporated includes our licensing business, QTL, and the vast majority of our patent portfolio. Qualcomm Technologies, Inc., a subsidiary of Qualcomm Incorporated, operates, along with its subsidiaries, substantially all of our engineering, research and development functions, and substantially all of our products and services businesses, including our QCT semiconductor business. Snapdragon and Qualcomm branded products are products of Qualcomm Technologies, Inc. and/or its subsidiaries. Qualcomm patented technologies are licensed by Qualcomm Incorporated.

For more information, visit us at: qualcomm.com & qualcomm.com/blog
