
Scaling AI Workloads in Cloud Environments: A Managed Kubernetes Approach
Authors: Vaishnavi Pangare, Akanksha Singh, Hritika Pawar
Abstract:
This research investigates optimizing AI workload scaling in cloud environments through
a managed Kubernetes approach, specifically Amazon Elastic Kubernetes Service
(EKS). As AI applications become increasingly resource-intensive, traditional scaling
methods fail to efficiently balance performance, cost, and availability. This study proposes
a novel framework for dynamically orchestrating AI workloads across managed Kubernetes
clusters, addressing challenges related to resource utilization, latency, and fault tolerance.
Using a mixed-methods approach that combines experimental benchmarking with case studies
across multiple AI workload types, the research aims to demonstrate significant
improvements in resource efficiency and cost reduction while maintaining or enhancing AI
model performance. The findings will provide cloud architects and AI engineers with
practical guidelines for implementing resilient, cost-effective infrastructure for modern AI
applications.
Problem Statement:
Organizations deploying AI workloads in cloud environments face several critical
challenges:
• Resource Inefficiency: Standard Kubernetes deployments often lead to underutilized
compute resources, especially GPUs and specialized AI accelerators.
• Scaling Latency: Traditional auto-scaling mechanisms are not responsive enough for the
bursty nature of AI workload demands.
• Cost Management: Organizations struggle to balance performance requirements with
budget constraints.
• Workload Heterogeneity: Different AI applications have varying resource profiles and
scaling requirements.
• Multi-tenant Optimization: Efficiently sharing infrastructure across multiple AI
applications or teams.
Research Objectives:
• Design a reference architecture for managed Kubernetes clusters optimized for AI
workloads using Amazon EKS.
• Develop and implement predictive scaling algorithms that minimize latency for
changing AI workload demands.
• Create a classification framework for AI workloads that guides optimal infrastructure
provisioning.
• Establish benchmarking methodologies to evaluate the performance and cost-
efficiency of the proposed solution.
• Formulate best practices and implementation guidelines for organizations deploying
AI workloads on EKS.
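The predictive-scaling objective above can be illustrated with a minimal sketch. This is not the framework's actual algorithm; the moving-average forecast, the per-replica request capacity, and the burst margin are hypothetical parameters chosen for illustration:

```python
import math
from collections import deque

class PredictiveScaler:
    """Toy predictive autoscaler: forecasts the next demand sample with a
    moving average plus a burst margin, then converts the forecast into a
    replica count. All parameters here are illustrative assumptions."""

    def __init__(self, window=5, requests_per_replica=100, burst_margin=0.2):
        self.history = deque(maxlen=window)          # recent demand samples
        self.requests_per_replica = requests_per_replica
        self.burst_margin = burst_margin             # headroom for bursty AI traffic

    def observe(self, requests_per_second):
        """Record one demand observation (e.g. inference requests/sec)."""
        self.history.append(requests_per_second)

    def desired_replicas(self):
        """Forecast demand and return the replica count to provision."""
        if not self.history:
            return 1
        forecast = sum(self.history) / len(self.history)
        forecast *= 1 + self.burst_margin
        return max(1, math.ceil(forecast / self.requests_per_replica))
```

A production version would feed such a forecast into the Kubernetes Horizontal Pod Autoscaler via an external-metrics adapter rather than computing replica counts directly.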
Significance of the Study:

This research addresses a critical gap in cloud computing and AI infrastructure, with
significant benefits for:
• Academic Community
• Industry Practitioners
• Technology Ecosystem
Gap Analysis:
Despite these advancements, several gaps remain in the literature:
1. Limited integration between workload characterization and automated scaling
mechanisms
2. Insufficient attention to heterogeneous AI workloads sharing infrastructure
3. Lack of comprehensive frameworks that balance performance, cost, and resource
utilization
4. Minimal research on the application of managed Kubernetes clusters for AI workloads
5. Few empirical studies comparing different configuration approaches for EKS in AI
contexts
Methodology:
What Is Local Kubernetes?
Definition: A self-contained Kubernetes deployment running on local infrastructure
Key Characteristics:
- Runs on physical servers or VMs within organizational boundaries
- Complete control over hardware, networking, and storage resources
- Full Kubernetes functionality in an on-premises environment
- Direct management of the entire Kubernetes stack

Components:
- Control plane (API server, scheduler, controller manager, etcd)
- Worker nodes running containerized applications
- Local persistent storage solutions
- Physical networking infrastructure
- Local load balancers and ingress controllers

• Kubernetes can be run locally via Minikube, kind, k3s, kubeadm, or OpenShift
Fig. Local Kubernetes For AI Workload Deployment
What Is Managed Kubernetes?
Definition: A Kubernetes service operated and maintained by a third-party cloud provider

Key Characteristics:
•Kubernetes control plane managed by the provider
•Automated deployment, scaling, and updates
•Built-in monitoring and security features
•Pay-as-you-go pricing model
•Reduced operational overhead

Leading Providers:
•Amazon EKS (Elastic Kubernetes Service)
•Google GKE (Google Kubernetes Engine)
•Microsoft AKS (Azure Kubernetes Service)
•IBM Cloud Kubernetes Service
•DigitalOcean Kubernetes
Fig . Architecture Diagram of Deploying AI workloads on Managed Kubernetes
Proposed Framework:
1. Workload Classification System
2. Managed Cluster Architecture
3. Predictive Scaling Engine
4. Resource Optimization Strategies
5. Implementation on Amazon EKS
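As a sketch of how the workload classification system (item 1) might drive infrastructure provisioning, the following maps a simple resource profile to a coarse workload class. The class names, thresholds, and profile fields are illustrative assumptions, not the published framework:

```python
from dataclasses import dataclass

@dataclass
class WorkloadProfile:
    gpu_util: float      # average GPU utilization, 0..1
    burstiness: float    # coefficient of variation of the request rate
    latency_slo_ms: int  # target p99 latency in milliseconds

def classify(profile: WorkloadProfile) -> str:
    """Map a workload profile to a coarse class that could guide
    node-group selection and scaling policy. Thresholds are made up."""
    if profile.gpu_util > 0.6 and profile.latency_slo_ms >= 1000:
        return "training"          # GPU-heavy, latency-tolerant
    if profile.latency_slo_ms < 200 and profile.burstiness > 1.0:
        return "bursty-inference"  # needs fast, predictive scaling
    if profile.latency_slo_ms < 200:
        return "steady-inference"  # stable traffic, tight latency SLO
    return "batch"                 # everything else runs opportunistically
```

Each class could then be pinned to a different EKS node group (e.g. GPU spot capacity for batch, on-demand GPU nodes for training, CPU or inference-accelerator nodes for serving).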
Expected Outcomes & Impact
Technical Outcomes:
- Optimized architecture framework for AI workloads on managed K8s
- Performance benchmarks comparing managed vs. local deployments
- Provider comparison (AWS EKS, GCP GKE, Azure AKS)
Practical Applications:
- Best practices for AI workload deployment and scaling
- Cost-optimization strategies for GPU/TPU resources
- MLOps workflow integration patterns
Measurable Impacts:
- Training time reduction metrics for distributed workloads
- Inference latency improvements at scale
- Operational cost analysis and ROI calculations
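The operational cost analysis can be sketched as simple arithmetic over node counts and hourly rates. All rates and discounts below are hypothetical placeholders, not actual AWS or GPU-instance prices:

```python
def monthly_gpu_cost(on_demand_hourly, hours=730, utilization=1.0, discount=0.0):
    """Monthly cost of one GPU node. `discount` models savings plans or
    spot pricing; all inputs are hypothetical, not real AWS rates."""
    return on_demand_hourly * hours * utilization * (1 - discount)

def savings(baseline_nodes, optimized_nodes, hourly_rate, discount):
    """Absolute and relative monthly savings of an optimized deployment
    (fewer nodes at a discounted rate) over an on-demand baseline."""
    base = baseline_nodes * monthly_gpu_cost(hourly_rate)
    opt = optimized_nodes * monthly_gpu_cost(hourly_rate, discount=discount)
    return base - opt, (base - opt) / base
```

For example, under these assumed inputs, consolidating from 10 on-demand GPU nodes at $4/hour to 6 nodes with a 30% committed-use discount cuts the monthly bill by about 58%.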
Delivered Artifacts:
- Reference implementation templates
- Deployment automation scripts
- AI-specific monitoring configurations
REFERENCES:
[1] Burns, B., Grant, B., Oppenheimer, D., Brewer, E., & Wilkes, J. (2016). Borg, Omega, and Kubernetes:
Lessons learned from three container-management systems over a decade. ACM Queue, 14(1), 70-93.
[2] Chen, T., Moreau, T., Jiang, Z., Zheng, L., Yan, E., Shen, H., et al. (2018). TVM: An
automated end-to-end optimizing compiler for deep learning. In Proceedings of the 13th USENIX
Symposium on Operating Systems Design and Implementation (OSDI '18).
[3] Gupta, A., & Singh, R. (2023). Inference workload patterns in production AI systems: A study of resource
utilization and latency requirements. Journal of Machine Learning Operations, 5(2), 128-145.
[4] Li, K., Zhou, M., Wu, X., & Yu, H. (2022). Performance analysis of managed Kubernetes services for AI
workloads: A comparative study. In Proceedings of the International Conference on Cloud Computing.
[5] Martinez, J., Patel, K., & Rodriguez, L. (2024). Deployment strategies for large language models on
Kubernetes: Challenges and solutions. arXiv preprint arXiv:2401.12345.
Thank You
