Platform Engineering Playbook Podcast

Cloudflare Outage November 2025: When a Rust Panic Took Down 20% of the Internet



A routine database permissions change triggered Cloudflare's worst outage since 2019, taking down ChatGPT, X, Shopify, Discord, and 20% of the internet for nearly 6 hours. Jordan and Alex dissect the technical chain reaction from ClickHouse metadata exposure to a Rust panic in the FL2 proxy, examining how a feature set of roughly 60 entries grew to more than 200 and exceeded a hardcoded memory limit. It was the third major cloud outage in 30 days, after AWS and Azure, and it raises critical questions about infrastructure concentration risk and about why internal configuration needs the same defensive programming as external input.
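To make the defensive-programming point concrete, here is a minimal Rust sketch of treating an internally generated feature file like untrusted input. It is not Cloudflare's code: `MAX_FEATURES`, `FeatureSet`, `ConfigError`, `parse_features`, and `reload` are hypothetical names, the one-feature-per-line file format is invented for illustration, and the limit of 200 is assumed from the ">200" figure in the summary above. The idea is that an oversized file yields an error value and the last known-good configuration stays active, instead of the process panicking.

```rust
/// Assumed cap on preallocated feature slots (illustrative value only).
const MAX_FEATURES: usize = 200;

/// Hypothetical error type for a rejected feature file.
#[derive(Debug, PartialEq)]
enum ConfigError {
    TooManyFeatures { found: usize, limit: usize },
}

/// Hypothetical in-memory representation of a loaded feature file.
#[derive(Debug, Clone, PartialEq)]
struct FeatureSet {
    names: Vec<String>,
}

/// Parse one feature name per line and validate the count.
/// Returns an error rather than panicking when the limit is exceeded.
fn parse_features(raw: &str) -> Result<FeatureSet, ConfigError> {
    let names: Vec<String> = raw
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty())
        .map(String::from)
        .collect();

    if names.len() > MAX_FEATURES {
        return Err(ConfigError::TooManyFeatures {
            found: names.len(),
            limit: MAX_FEATURES,
        });
    }
    Ok(FeatureSet { names })
}

/// Try to apply a new file; on failure keep the current configuration
/// and hand the error back so the caller can log or alert on it.
fn reload(current: FeatureSet, raw: &str) -> (FeatureSet, Option<ConfigError>) {
    match parse_features(raw) {
        Ok(next) => (next, None),
        Err(err) => (current, Some(err)),
    }
}
```

A caller would invoke `reload` each time a new file arrives and keep serving with whichever `FeatureSet` comes back. The design choice worth noting is that the limit check produces a `Result` the caller must handle, so a bad file degrades to "stale configuration plus an alert" rather than a crashed proxy.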

Perfect for senior platform engineers, SREs, and DevOps engineers with 5+ years of experience who want to level up their platform engineering skills.

Episode URL: https://platformengineeringplaybook.io/podcasts/00030-cloudflare-outage-november-2025


Platform Engineering Playbook Podcast, by vibesre