


Running 30 Kubernetes clusters serving 300,000 requests per second sounds impressive until your Vertical Pod Autoscaler goes rogue and starts evicting critical system pods in an endless loop.
Thibault Jamet shares the technical details of debugging a complex VPA failure at Adevinta, where webhook timeouts triggered continuous pod evictions across their multi-tenant Kubernetes platform.
You will learn:
VPA architecture deep dive - How the recommender, updater, and mutating webhook components interact and what happens when the webhook fails
Hidden Kubernetes limits - How default QPS and burst rate limits in the Kubernetes Go client can cause widespread failures, and why these aren't well documented in Helm charts
Monitoring strategies for autoscaling - What metrics to track for webhook latency and pod eviction rates to catch similar issues before they become critical
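
To make the rate-limit point concrete, here is a minimal sketch of how a client-side token bucket delays a batch of API calls. It assumes the fallback values client-go applies when a client leaves them unset (5 QPS, burst of 10). The helper `secondsToSend` and the batch sizes are hypothetical and chosen for illustration. They are not figures from the episode, and the model ignores server latency and retries.

```go
package main

import "fmt"

// Assumed values: the fallbacks client-go uses when QPS and Burst are not
// configured on the client. A given VPA deployment may override them.
const (
	defaultQPS   = 5.0
	defaultBurst = 10
)

// secondsToSend is a hypothetical helper. It estimates how long a token-bucket
// limiter makes a client wait before the n-th request is released, assuming the
// bucket starts full with `burst` tokens and refills at `qps` tokens per second.
func secondsToSend(n int, qps float64, burst int) float64 {
	if n <= burst {
		// The initial burst absorbs the whole batch without waiting.
		return 0
	}
	// Every request beyond the burst must wait for a refilled token.
	return float64(n-burst) / qps
}

func main() {
	// Illustrative batch sizes only.
	for _, n := range []int{10, 100, 1000} {
		wait := secondsToSend(n, defaultQPS, defaultBurst)
		fmt.Printf("%4d requests: ~%.0fs of client-side waiting\n", n, wait)
	}
}
```

Under these assumptions, a component that needs more API calls than its burst allows spends most of its time queued on the client side, which is one way a webhook could exceed its timeout without the API server itself being slow.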
Sponsor
This episode is brought to you by Testkube—where teams run millions of performance tests in real Kubernetes infrastructure. From air-gapped environments to massive scale deployments, orchestrate every testing tool in one platform. Check it out at testkube.io
More info
Find all the links and info for this episode here: https://ku.bz/rf1pbWXdN
Interested in sponsoring an episode? Learn more.
By KubeFM
