Scaling AI Workloads with Kubernetes: Sharing GPU Resources Across Multiple Containers - Jack Ong
