Screaming in the Cloud

Is It Broken Everywhere or Just for Me with Omri Sass



When your website stops working at 3 AM, you need to answer one question fast: is it my code, or is a big cloud provider having problems? Omri Sass from Datadog explains updog.ai, a tool that monitors whether major services like AWS, Cloudflare, and others are actually working. Instead of relying on user-submitted problem reports the way Downdetector does, updog uses real data from thousands of computers to detect when services go down. Omri shares why this took 6 years to build, how they process massive amounts of data with machine learning, and why cloud providers have been strangely upset about these tools existing.



About Omri: 

Omri Sass is a Director of Product Management at Datadog, where he leads and supports a team of 25+ product managers driving initiatives across Bits AI SRE, Data Observability, Service Management, and most recently, the launch of updog.ai. Outside of work, Omri is an avid sci-fi reader, a dedicated yoga practitioner, and happily outmatched by his cat.


Show Highlights:

(02:12) What is Updog and How Does It Work
(03:38) Why Knowing If It's a Global Problem Matters
(04:01) The Problem With Testing Every Endpoint Yourself
(05:52) How Datadog Discovered EC2 Outages From Their Own Systems
(10:38) When AWS Regions Go Down and Cascade Failures
(13:13) What Happens When Services Rebuild Completely
(16:29) The Most Important Learning During a 3 AM Incident
(20:11) Why This Took So Long to Build
(23:40) When Datadog Going Down Isn't Critical Path
(25:22) How They Picked Which AWS Services to Monitor
(27:07) What Comes Next for Updog
(30:11) Where to Find Omri and Updog


Links:
 

Datadog: datadoghq.com

Omri’s LinkedIn: https://www.linkedin.com/in/omri-sass-65632a14/

Sponsored by:
duckbillhq.com


Screaming in the Cloud, by Corey Quinn


