September 16, 2025

VerticalPodAutoscaler Went Rogue: It Took Down Our Cluster, with Thibault Jamet

Running 30 Kubernetes clusters serving 300,000 requests per second sounds impressive until your Vertical Pod Autoscaler goes rogue and starts evicting critical system pods in an endless loop.

Thibault Jamet shares the technical details of debugging a complex VPA failure at Adevinta, where webhook timeouts triggered continuous pod evictions across their multi-tenant Kubernetes platform.

You will learn:

VPA architecture deep dive - How the recommender, updater, and mutating webhook components interact and what happens when the webhook fails
Hidden Kubernetes limits - How default QPS and burst rate limits in the Kubernetes Go client can cause widespread failures, and why these aren't well documented in Helm charts
Monitoring strategies for autoscaling - What metrics to track for webhook latency and pod eviction rates to catch similar issues before they become critical
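
To make the QPS and burst point concrete, here is a minimal sketch of the token-bucket behaviour that this kind of client-side rate limit implies. It is an illustration under stated assumptions, not the code from the incident: the values 5 and 10 are the defaults of the Kubernetes Go client (client-go), while the `TokenBucket` class, the `send_times` helper, and the batch of 60 queued requests are hypothetical names and numbers chosen for the example. The clock is simulated, so nothing actually sleeps.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TokenBucket:
    """Deterministic token bucket driven by a simulated clock (hypothetical helper)."""

    qps: float   # tokens added per second
    burst: int   # bucket capacity, also the initial number of tokens

    def __post_init__(self) -> None:
        self.tokens = float(self.burst)
        self.now = 0.0  # simulated time in seconds

    def wait(self) -> float:
        """Take one token, advancing the simulated clock if the bucket is empty.

        Returns the simulated time at which the request may be sent.
        """
        if self.tokens < 1.0:
            # Advance time just far enough for one full token to accumulate.
            self.now += (1.0 - self.tokens) / self.qps
            self.tokens = 1.0
        self.tokens -= 1.0
        return self.now


def send_times(n_requests: int, qps: float, burst: int) -> List[float]:
    """Release times for n_requests that are all queued at t=0."""
    bucket = TokenBucket(qps=qps, burst=burst)
    return [bucket.wait() for _ in range(n_requests)]


if __name__ == "__main__":
    # client-go defaults: QPS=5, Burst=10. The 60 requests are an assumption.
    times = send_times(60, qps=5.0, burst=10)
    print(f"10th request released at t={times[9]:.1f}s")
    print(f"last request released at t={times[-1]:.1f}s")
```

In this sketch the first 10 requests go out immediately and every later one waits another 0.2 seconds, so the last of the 60 queued requests is released roughly 10 seconds after the first. A caller with a deadline of that order, such as an admission webhook, may no longer get its answer in time.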

Sponsor

This episode is brought to you by Testkube—where teams run millions of performance tests in real Kubernetes infrastructure. From air-gapped environments to massive scale deployments, orchestrate every testing tool in one platform. Check it out at testkube.io

More info

Find all the links and info for this episode here: https://ku.bz/rf1pbWXdN
Interested in sponsoring an episode? Learn more.

By KubeFM
