STWP Presentation
This research addresses a critical gap in cloud computing and AI infrastructure, with
significant benefits for:
• Academic Community
• Industry Practitioners
• Technology Ecosystem
Gap Analysis:
Despite these advancements, several gaps remain in the literature:
1. Limited integration between workload characterization and automated scaling mechanisms
2. Insufficient attention to heterogeneous AI workloads sharing infrastructure
3. Lack of comprehensive frameworks that balance performance, cost, and resource utilization
4. Minimal research on the application of virtual Kubernetes clusters for AI workloads
5. Few empirical studies comparing different configuration approaches for EKS in AI contexts
Methodology:
What Is Local Kubernetes?
Definition: A self-contained Kubernetes deployment running on local infrastructure
Key Characteristics:
- Runs on physical servers or VMs within organizational boundaries
- Complete control over hardware, networking, and storage resources
- Full Kubernetes functionality in an on-premises environment
- Direct management of the entire Kubernetes stack
Components:
- Control plane (API server, scheduler, controller manager, etcd)
- Worker nodes running containerized applications
- Local persistent storage solutions
- Physical networking infrastructure
- Local load balancers and ingress controllers
• Kubernetes can be run locally via Minikube, kind, k3s, kubeadm, or OpenShift
Fig. Local Kubernetes for AI workload deployment
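As a concrete illustration, the sketch below deploys a containerized model server to a local cluster such as Minikube or kind using the official Kubernetes Python client. The image name, labels, and resource figures are placeholder assumptions, not part of this study.

```python
# Minimal sketch: deploy a model-serving container to a local cluster
# (Minikube, kind, etc.) via the official Kubernetes Python client.
# The image and resource figures below are illustrative placeholders.
from kubernetes import client, config

config.load_kube_config()  # uses the current kubectl context, e.g. "minikube"

deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="ai-inference-demo"),
    spec=client.V1DeploymentSpec(
        replicas=1,
        selector=client.V1LabelSelector(match_labels={"app": "ai-inference-demo"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "ai-inference-demo"}),
            spec=client.V1PodSpec(
                containers=[
                    client.V1Container(
                        name="model-server",
                        image="example.com/model-server:latest",  # placeholder
                        resources=client.V1ResourceRequirements(
                            requests={"cpu": "500m", "memory": "1Gi"},
                            limits={"cpu": "2", "memory": "4Gi"},
                        ),
                    )
                ]
            ),
        ),
    ),
)

client.AppsV1Api().create_namespaced_deployment(namespace="default", body=deployment)
```

The same manifest could be written as YAML and applied with kubectl; the client-based form is shown only to keep all examples in one language.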
What Is Virtual (Managed) Kubernetes?
Definition: A Kubernetes service operated and maintained by a third-party cloud provider
Key Characteristics:
• Kubernetes control plane managed by the provider
• Automated deployment, scaling, and updates
• Built-in monitoring and security features
• Pay-as-you-go pricing model
• Reduced operational overhead
Leading Providers:
• Amazon EKS (Elastic Kubernetes Service)
• Google GKE (Google Kubernetes Engine)
• Microsoft AKS (Azure Kubernetes Service)
• IBM Cloud Kubernetes Service
• DigitalOcean Kubernetes
Fig. Architecture diagram of deploying AI workloads on managed Kubernetes
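For the managed case, a minimal sketch of inspecting EKS clusters with boto3 follows, assuming AWS credentials and a default region are already configured; kubeconfig entries are then typically generated with `aws eks update-kubeconfig --name <cluster>`.

```python
# Minimal sketch: enumerate managed EKS clusters with boto3 before pointing
# kubectl or the Kubernetes Python client at them. Assumes AWS credentials
# and a default region are already configured in the environment.
import boto3

eks = boto3.client("eks")

for name in eks.list_clusters()["clusters"]:
    cluster = eks.describe_cluster(name=name)["cluster"]
    print(name, cluster["version"], cluster["status"], cluster["endpoint"])
```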
Proposed Framework:
1. Workload Classification System
2. Managed Cluster Architecture
3. Predictive Scaling Engine (see the sketch after this list)
4. Resource Optimization Strategies
5. Implementation on Amazon EKS
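A minimal sketch of the Predictive Scaling Engine idea (item 3): forecast near-term demand from recent request rates and derive a replica target. The moving-average predictor, window size, and per-replica throughput below are placeholder assumptions, not the framework's final design.

```python
# Illustrative sketch of the Predictive Scaling Engine: forecast requests/sec
# with a sliding-window moving average and convert the forecast to a replica
# count. Window size and per-replica throughput are placeholder assumptions.
from collections import deque

class MovingAveragePredictor:
    def __init__(self, window: int = 6, rps_per_replica: float = 50.0):
        self.samples = deque(maxlen=window)     # recent requests/sec samples
        self.rps_per_replica = rps_per_replica  # assumed per-replica capacity

    def observe(self, rps: float) -> None:
        self.samples.append(rps)

    def desired_replicas(self) -> int:
        if not self.samples:
            return 1
        forecast = sum(self.samples) / len(self.samples)
        return max(1, round(forecast / self.rps_per_replica))

predictor = MovingAveragePredictor()
for rps in [80, 120, 200, 260]:          # synthetic traffic trace
    predictor.observe(rps)
print(predictor.desired_replicas())      # -> 3 with these placeholder numbers
```

In an actual deployment the replica target would feed a HorizontalPodAutoscaler or a custom controller rather than a print statement.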
Expected Outcomes & Impact
Technical Outcomes:
- Optimized architecture framework for AI workloads on managed K8s
- Performance benchmarks comparing managed vs. local deployments
- Provider comparison (AWS EKS, GCP GKE, Azure AKS)
Practical Applications:
- Best practices for AI workload deployment and scaling
- Cost-optimization strategies for GPU/TPU resources
- MLOps workflow integration patterns
Measurable Impacts:
- Training time reduction metrics for distributed workloads
- Inference latency improvements at scale
- Operational cost analysis and ROI calculations
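To make the cost/ROI deliverable concrete, a back-of-the-envelope sketch follows; every rate and count is a placeholder assumption for illustration, not measured data from this work.

```python
# Back-of-the-envelope GPU cost comparison. All rates and counts below are
# placeholder assumptions for illustration, not results of this study.
HOURS_PER_MONTH = 730

on_demand_rate = 4.00   # $/GPU-hour, placeholder
spot_rate = 1.40        # $/GPU-hour, placeholder
gpus = 8

on_demand_cost = on_demand_rate * gpus * HOURS_PER_MONTH   # $23,360/mo
spot_cost = spot_rate * gpus * HOURS_PER_MONTH             # $8,176/mo
savings = on_demand_cost - spot_cost

print(f"on-demand ${on_demand_cost:,.0f}/mo vs spot ${spot_cost:,.0f}/mo; "
      f"savings ${savings:,.0f} ({savings / on_demand_cost:.0%})")
```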
Delivered Artifacts:
- Reference implementation templates
- Deployment automation scripts
- AI-specific monitoring configurations
Thank You