Ten Thousand Feet, the Vervint Podcast

Cloud Outages: Lessons in Resilience and Error Budgets



In this episode of Ten Thousand Feet, Nate Sherman and Matt Glenn dive into the recent wave of major cloud outages impacting AWS, Azure, and Cloudflare, exploring what went wrong and why these failures are so disruptive. They discuss the growing risks of globally applied changes, the importance of error budgets, and strategies for building resilience in modern infrastructure. The conversation also covers best practices in site reliability engineering, monitoring, and alerting, as well as the role of AI and automation in change management. Packed with insights for architects, SREs, and IT leaders, this episode offers practical guidance on balancing speed, reliability, and risk in today’s cloud-driven world.


By Vervint

