
Sign up to save your podcasts
Or


Nvidia Distinguished Engineer Kevin Klues noted that low-level systems work is invisible when done well and highly visible when it fails — a dynamic that frames current Kubernetes innovations for AI. At KubeCon + CloudNativeCon North America 2025, Klues and AWS product manager Jesse Butler discussed two emerging capabilities: dynamic resource allocation (DRA) and a new workload abstraction designed for sophisticated AI scheduling.
DRA, now generally available in Kubernetes 1.34, fixes long-standing limitations in GPU requests. Instead of simply asking for a number of GPUs, users can specify types and configurations. Modeled after persistent volumes, DRA allows any specialized hardware to be exposed through standardized interfaces, enabling vendors to deliver custom device drivers cleanly. Butler called it one of the most elegant designs in Kubernetes.
Yet complex AI workloads require more coordination. A forthcoming workload abstraction, debuting in Kubernetes 1.35, will let users define pod groups with strict scheduling and topology rules — ensuring multi-node jobs start fully or not at all. Klues emphasized that this abstraction will shape Kubernetes’ AI trajectory for the next decade and encouraged community involvement.
Learn more from The New Stack about dynamic resource allocation:
Kubernetes Primer: Dynamic Resource Allocation (DRA) for GPU Workloads
Kubernetes v1.34 Introduces Benefits but Also New Blind Spots
Join our community of newsletter subscribers to stay on top of the news and at the top of your game.
By The New Stack4.3
3131 ratings
Nvidia Distinguished Engineer Kevin Klues noted that low-level systems work is invisible when done well and highly visible when it fails — a dynamic that frames current Kubernetes innovations for AI. At KubeCon + CloudNativeCon North America 2025, Klues and AWS product manager Jesse Butler discussed two emerging capabilities: dynamic resource allocation (DRA) and a new workload abstraction designed for sophisticated AI scheduling.
DRA, now generally available in Kubernetes 1.34, fixes long-standing limitations in GPU requests. Instead of simply asking for a number of GPUs, users can specify types and configurations. Modeled after persistent volumes, DRA allows any specialized hardware to be exposed through standardized interfaces, enabling vendors to deliver custom device drivers cleanly. Butler called it one of the most elegant designs in Kubernetes.
Yet complex AI workloads require more coordination. A forthcoming workload abstraction, debuting in Kubernetes 1.35, will let users define pod groups with strict scheduling and topology rules — ensuring multi-node jobs start fully or not at all. Klues emphasized that this abstraction will shape Kubernetes’ AI trajectory for the next decade and encouraged community involvement.
Learn more from The New Stack about dynamic resource allocation:
Kubernetes Primer: Dynamic Resource Allocation (DRA) for GPU Workloads
Kubernetes v1.34 Introduces Benefits but Also New Blind Spots
Join our community of newsletter subscribers to stay on top of the news and at the top of your game.

32,246 Listeners

229,674 Listeners

16,174 Listeners

9 Listeners

3 Listeners

273 Listeners

9,724 Listeners

1,105 Listeners

626 Listeners

154 Listeners

4 Listeners

25 Listeners

10,254 Listeners

551 Listeners

5,576 Listeners

15,506 Listeners