KubeFM

By KubeFM

Discover all the great things happening in the world of Kubernetes, learn (controversial) opinions from the experts and explore the successes (and failures) of running Kubernetes at scale.

· Technology

Rating: 5 (22 ratings)

Download on the App Store

Get it on Google Play

FAQs about KubeFM:

How many episodes does KubeFM have?

The podcast currently has 101 episodes available.

KubeFM episodes:

July 28, 2026
1 Million Tokens Per Second on Kubernetes, with Federico Iezzi
GPU inference throughput depends on more than accelerator generation or count.
Memory bandwidth, model parallelism, cache configuration, and the load generator itself all influence measured throughput.
Federico Iezzi, Customer Engineer at Google Cloud, explains how his team achieved 1 million output tokens per second using Qwen 3.5 27B, vLLM, GKE Autopilot, and NVIDIA B200 GPUs.
The discussion covers:
Why memory bandwidth limits decode performance
How Federico chose between tensor and data parallelism
What changed after enabling multi-token prediction and reducing the KV cache footprint with FP8 quantization.
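As a rough illustration of why memory bandwidth can cap decode performance, the sketch below computes a back-of-the-envelope upper bound. It assumes that every decode step streams all model weights from memory once and emits one token per sequence in the batch, and it ignores KV cache reads, compute time, and communication. The function name and all numeric inputs are hypothetical; this is not the method or the measurements discussed in the episode.

```python
def decode_tokens_per_second_upper_bound(weight_bytes, bandwidth_bytes_per_s, batch_size):
    """Upper bound on decode throughput when weight reads dominate.

    Each decode step reads every weight once and produces one token per
    sequence in the batch, so a step takes at least
    weight_bytes / bandwidth_bytes_per_s seconds.
    """
    seconds_per_step = weight_bytes / bandwidth_bytes_per_s
    return batch_size / seconds_per_step


# Hypothetical inputs: 27e9 parameters stored in 2 bytes each,
# read over a memory system that sustains 8e12 bytes per second.
weight_bytes = 27e9 * 2
bandwidth = 8e12

for batch_size in (1, 16, 256):
    bound = decode_tokens_per_second_upper_bound(weight_bytes, bandwidth, batch_size)
    print(f"batch={batch_size:4d}  upper bound ~ {bound:,.0f} tokens/s")
```

Under these assumptions the bound grows linearly with batch size, which is why batching and a smaller memory footprint matter so much for decode throughput.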
Sponsor
This episode is sponsored by LearnKube. Download the free book, The Technical Guide to Kubernetes Rightsizing, to understand what Prometheus and Grafana cannot tell you about safely reducing requests and limits.
More info
Find all the links and info for this episode here: https://ku.bz/1xD9Md0mb
Interested in sponsoring an episode? Learn more.
48min
May 19, 2026
The Hidden Cost of Slow Autoscaling, with John Ford
Forced platform migrations are usually treated as something to survive. At Scout24, a mandatory OS migration became an opportunity to rethink Kubernetes autoscaling, node provisioning, and infrastructure efficiency.
John Ford explains how Scout24 moved its EKS-based Infinity platform from a polling autoscaler and over-provisioned capacity to Karpenter and Bottlerocket. The result was faster node startup, a safer migration path, and about a 30% infrastructure reduction without major downtime.
In this interview:
Why two-minute node provisioning forced a 25% capacity buffer
How Karpenter made the Bottlerocket migration safer
What broke around EC2 metadata, AWS SDKs, and cgroups
How the new foundation enables Spot, ARM, and GPU workloads
Sponsor
This episode is sponsored by LearnKube — get started on your Kubernetes journey through comprehensive online, in-person or remote training.
More info
Find all the links and info for this episode here: https://ku.bz/DdmVC2_7v
Interested in sponsoring an episode? Learn more.
22min
May 12, 2026
The Namespaces Scaling Trap, with Brian Stack
Most teams scale Kubernetes by thinking about pods and nodes. At Render, Brian Stack ran into a different dimension: hundreds of thousands of namespaces per cluster, multiplied across DaemonSets that list-watch every namespace.
Brian explains how Render traced the issue through Calico and Vector, worked with upstream maintainers, and turned memory profiling into operational wins: lower node costs, lighter API-server load, and faster rollouts.
In this interview:
Why namespaces can become a hidden scaling bottleneck
How DaemonSets multiply memory and control-plane pressure
How profiling, staging clusters, and upstream collaboration freed 7 TiB
Why pushing from an 80% fix to a complete fix can make teams faster
Sponsor
This episode is sponsored by LearnKube — get started on your Kubernetes journey through comprehensive online, in-person or remote training.
More info
Find all the links and info for this episode here: https://ku.bz/0mrvCsXrV
Interested in sponsoring an episode? Learn more.
37min
May 05, 2026
AI Agents Running Kubernetes, with Mike Solomon
What happens when an AI agent stops generating Kubernetes YAML and starts operating the cluster directly?
Mike Solomon, software engineer at AIATELLA, explains how his team moved from a sprawling Helm setup to Markdown-driven infrastructure specs that Claude Code can execute, test, and refine.
You will learn:
Why Helm became hard to maintain for a fast-moving medical infrastructure repo
How Claude debugged Argo, TLS conflicts, kubectl patches, and private registry credentials
How runbooks plus agent memory files capture failures so deployments become reproducible.
It is a practical look at where Kubernetes automation may be heading: less hand-written YAML, more precise intent, and a sharper definition of when the human must stay in the loop.
Sponsor
This episode is sponsored by LearnKube — get started on your Kubernetes journey through comprehensive online, in-person or remote training.
More info
Find all the links and info for this episode here: https://ku.bz/y70mLvWNs
Interested in sponsoring an episode? Learn more.
39min
April 28, 2026
SaaS with Kubernetes Operators and Garbage Collection, with Alexander Held
A single Kubernetes CRD for every service request turns small changes into full-platform reconciliations.
Alexander Held, former platform engineer at Mercedes-Benz Tech Innovation, describes a production refactor from a 2,000-line CRD to purpose-built resources and controllers. He shows how teams can model business workflows as Kubernetes APIs and then use owner references, finalizers, and events to keep platform operations predictable.
You will learn:
Why monolithic CRDs create performance and troubleshooting problems
How controllers turn database provisioning and backups into reconciliation loops
How finalizers clean up external resources such as S3 backups
Why Kubernetes events make platform workflows easier to debug
Sponsor
This episode is sponsored by LearnKube — get started on your Kubernetes journey through comprehensive online, in-person or remote training.
More info
Find all the links and info for this episode here: https://ku.bz/TGy4Qn7Qs
Interested in sponsoring an episode? Learn more.
36min
April 21, 2026
What Hip-Hop Can Teach Us About Kubernetes, with Kelsey Hightower, Eric Abercrombie, and Julius Payne II
Kelsey Hightower, Eric Abercrombie, and Julius Payne II reflect on life after achievement, entering the Kubernetes world for the first time, and how music, creativity, and lived experience shape the way they think about technology.
In this interview:
Why fundamentals, patience, and repetition still matter more than shortcuts
How Kubernetes, community, and confidence intersect for people entering cloud-native work
What hip-hop, production, and storytelling can teach us about ownership, authenticity, and finding your voice
Sponsor
This episode is sponsored by LearnKube — get started on your Kubernetes journey through comprehensive online, in-person or remote training.
More info
Find all the links and info for this episode here: https://ku.bz/czrCCXSLt
Interested in sponsoring an episode? Learn more.
1h 30min
April 07, 2026
Intelligent Kubernetes Load Balancing, with Rohit Agrawal
You're running gRPC services in Kubernetes, load balancing looks fine on the dashboard — but some pods are burning at 80% CPU while others sit idle, and adding more replicas only partially helps.
Rohit Agrawal, a Staff Software Engineer on the traffic platform team at Databricks, explains why this happens and how his team replaced Kubernetes's default networking with a proxy-less, client-side load-balancing system built on the xDS protocol.
In this episode:
Why kube-proxy's Layer 4 routing breaks down under high-throughput gRPC: it picks a backend once per TCP connection, not per request
How Databricks built an Endpoint Discovery Service (EDS) that watches Kubernetes directly and streams real-time pod metadata to every client
How zone-aware spillover cut cross-availability-zone costs without sacrificing availability
Why CPU-based routing failed (monitoring lag creates oscillation) and what signals to use instead
The system has been running in production for three years across hundreds of services, handling millions of requests.
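The difference between picking a backend per connection and per request can be seen in a toy simulation. The sketch below is a simplified model, not the Databricks system or kube-proxy's actual algorithm: it assumes each client holds one long-lived connection (as gRPC does over HTTP/2) and that backends are chosen uniformly at random. All function names and numbers are hypothetical.

```python
import random
from collections import Counter


def per_connection(num_clients, requests_per_client, backends, rng):
    """Layer 4 style: a backend is chosen once when the connection opens."""
    load = Counter({backend: 0 for backend in backends})
    for _ in range(num_clients):
        backend = rng.choice(backends)  # fixed for the connection's lifetime
        load[backend] += requests_per_client
    return load


def per_request(num_clients, requests_per_client, backends, rng):
    """Client-side style: a backend is chosen again for every request."""
    load = Counter({backend: 0 for backend in backends})
    for _ in range(num_clients * requests_per_client):
        load[rng.choice(backends)] += 1
    return load


backends = [f"pod-{i}" for i in range(10)]
rng = random.Random(42)  # fixed seed so repeated runs match

for name, strategy in (("per connection", per_connection), ("per request", per_request)):
    load = strategy(4, 1000, backends, rng)
    idle = sum(1 for count in load.values() if count == 0)
    print(f"{name:15s} busiest={max(load.values()):5d} idle pods={idle}")
```

With four long-lived connections and ten pods, per-connection balancing can reach at most four pods no matter how many requests flow, which matches the hot and idle pods described above. Adding replicas does not help until clients open new connections.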
Sponsor
This episode is sponsored by LearnKube — get started on your Kubernetes journey through comprehensive online, in-person or remote training.
More info
Find all the links and info for this episode here: https://ku.bz/y803JMhBk
Interested in sponsoring an episode? Learn more.
31min
March 31, 2026
That Time I Found a Service Account Token in my Log Files, with Vincent von Büren
You're integrating HashiCorp Vault into your Kubernetes cluster and adding a temporary debug log line to check whether the ServiceAccount token is being passed correctly. Three months later, that log line is still in production — and the token it prints has a 1-year expiry with no audience restrictions.
Vincent von Büren, a platform engineer at ipt in Switzerland, lived through exactly this incident. In this episode, he breaks down why default Kubernetes ServiceAccount tokens are a quiet security risk hiding in plain sight.
You will learn:
What's actually inside a Kubernetes ServiceAccount JWT (issuer, subject, audience, and expiry)
Why tokens with no audience scoping enable replay attacks across internal and external systems
How Vault's Kubernetes auth method and JWT auth method compare, and when to choose each
What projected tokens are, why they dramatically reduce blast radius, and what's holding teams back from using them
Practical steps for auditing which pods actually need API access and disabling auto-mounting everywhere else
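To make the four claims named above concrete, here is a minimal sketch that reads the payload of a JWT using only the Python standard library. It does not verify the signature, so it is suitable for inspection only. The token is built locally for illustration, its claim values are made up, and the helper names `decode_jwt_claims` and `_b64url` are hypothetical.

```python
import base64
import json


def decode_jwt_claims(token: str) -> dict:
    """Return the claims of a JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected header.payload.signature")
    payload = parts[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64url padding
    return json.loads(base64.urlsafe_b64decode(payload))


def _b64url(obj) -> str:
    """Encode a JSON object as unpadded base64url, as JWT segments are."""
    raw = json.dumps(obj).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


# Illustrative token only: every value below is invented for this example.
example_claims = {
    "iss": "https://kubernetes.default.svc",
    "sub": "system:serviceaccount:demo:app",
    "aud": ["vault"],
    "exp": 1767225600,
}
token = ".".join([_b64url({"alg": "none"}), _b64url(example_claims), "signature"])

claims = decode_jwt_claims(token)
for name in ("iss", "sub", "aud", "exp"):
    print(f"{name} = {claims.get(name)}")
```

Anyone who obtains a token can read its claims this way, which is one reason a token printed to a log file should be treated as leaked.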
Sponsor
This episode is sponsored by LearnKube — get started on your Kubernetes journey through comprehensive online, in-person or remote training.
More info
Find all the links and info for this episode here: https://ku.bz/LTnB_Ntbc
Interested in sponsoring an episode? Learn more.
29min
March 24, 2026
GPU Containers as a Service, with Landon Clipp
Running GPU workloads on Kubernetes sounds straightforward until you need to isolate multiple tenants on the same server. The moment you virtualize GPUs for security, you lose access to NVIDIA kernel drivers — and almost every tool in the ecosystem assumes those drivers exist.
Landon Clipp built a GPU-based Containers as a Service platform from scratch, solving each isolation layer — from kernel separation with Kata Containers + QEMU to NVLink fabric partitioning to network policies with Cilium/eBPF — and shares exactly what broke along the way.
In this interview:
Why standard NVIDIA tooling (GPU Operator) fails in multi-tenant setups, and how to use CDI with PCI topology scanning to make GPUs visible to Kubernetes without kernel drivers
How to partition the NVLink fabric between tenants using a trusted service VM running Fabric Manager, and why the physical PCIe wiring differs between Supermicro HGX and NVIDIA DGX systems
Why gVisor doesn't work for GPU workloads — NVIDIA's unstable ioctl ABI means Google has to update gVisor for every driver release, and they only support a handful of GPUs
What caused 8-GPU VMs to take 30+ minutes to boot, and the specific fixes (IOMMUFD, cold plugging, kernel upgrades) that brought it down to minutes
How Cilium network policies enforce tenant isolation at the Kubernetes identity level instead of fragile IP-based rules
Where Containers as a Service fits best: inference workloads where AI teams want to ship an OCI image without managing infrastructure or signing multi-million dollar cluster contracts.
Sponsor
This episode is sponsored by LearnKube — get started on your Kubernetes journey through comprehensive online, in-person or remote training.
More info
Find all the links and info for this episode here: https://ku.bz/jjK_yJTDz
Interested in sponsoring an episode? Learn more.
46min
March 17, 2026
How We Cut Build Debugging Time by 75% with AI, with Ron Matsliah
Build failures in Kubernetes CI/CD pipelines are a silent productivity killer. Developers spend 45+ minutes scrolling through cryptic logs, often just hitting rerun and hoping for the best.
Ron Matsliah, DevOps engineer at Next Insurance, built an AI-powered assistant that cut build debugging time by 75% — not as a dashboard, but delivered directly in Slack where developers already work.
In this episode:
Why combining deterministic rules with AI produces better results than letting an LLM guess alone
How correlating Kubernetes events with build logs catches spot instance terminations that produce misleading errors
Why integrating into existing workflows and building feedback loops from day one drove adoption
The prompt engineering lessons learned from testing with real production data instead of synthetic examples
The takeaway: simple rules plus rich context consistently outperform complex AI queries on their own.
Sponsor
This episode is sponsored by LearnKube — get started on your Kubernetes journey through comprehensive online, in-person or remote training.
More info
Find all the links and info for this episode here: https://ku.bz/PDdYfC00w
Interested in sponsoring an episode? Learn more.
21min

More shows like KubeFM

Software Engineering Radio - the podcast for professional software developers by team@se-radio.net (SE-Radio Team)

275 Listeners

The Changelog: Software Development, Open Source by Changelog Media

289 Listeners

Security Now (Audio) by TWiT

2,009 Listeners

Software Engineering Daily by Software Engineering Daily

626 Listeners

LINUX Unplugged by Jupiter Broadcasting

274 Listeners

The Enterprise AI Show by Massive Studios

149 Listeners

Talk Python To Me by Michael Kennedy

583 Listeners

Soft Skills Engineering by Jamison Dance and Dave Smith

288 Listeners

Thoughtworks Technology Podcast by Thoughtworks

43 Listeners

Late Night Linux by The Late Night Linux Family

170 Listeners

Kubernetes Podcast from Google by Abdel Sghiouar, Kaslin Fields

181 Listeners

AWS Podcast by Amazon Web Services

203 Listeners

The Stack Overflow Podcast by The Stack Overflow Podcast

62 Listeners

2.5 Admins by The Late Night Linux Family

98 Listeners

Oxide and Friends by Oxide Computer Company

65 Listeners