This is Fine! A podcast about resilience engineering and software

Root Cause Analysis vs. Resilience Engineering w/special guest Lorin Hochstein



A history of the 5 whys and root cause analysis from papers

Some critiques of the 5 whys:

From John Allspaw: https://www.oreilly.com/radar/the-infinite-hows/

From Alan J Card: https://qualitysafety.bmj.com/content/26/8/671



James Reason and the Swiss Cheese Model: 

https://pmc.ncbi.nlm.nih.gov/articles/PMC8514562/

James Reason’s book Human Error: https://bookshop.org/p/books/human-error/9e06d8a100a07537?ean=9780521314190&next=t



And a classic from Sidney Dekker (et al.) on the implications of complexity for safety investigations:

https://www.sciencedirect.com/science/article/abs/pii/S0925753511000105?via%3Dihub



We always recommend the Howie Guide: https://howie-guide.pagerduty.com/

STAMP is starting to get popular: https://functionalsafetyengineer.com/introduction-to-stamp/

Google’s STAMP paper: https://www.usenix.org/publications/loginonline/evolution-sre-google

Google’s STAMP discussion on ProdCast: https://sre.google/prodcast/#season4-episode7

And the STAMP presentation at SREcon: https://www.usenix.org/conference/srecon25americas/presentation/klein

Nancy Leveson’s Google Scholar profile is always worth browsing: https://scholar.google.com/citations?user=78y4sEcAAAAJ&hl=en

Allspaw’s LinkedIn post that we quoted: https://www.linkedin.com/posts/jallspaw_important-reminders-about-learning-effectively-activity-7378775591447183360-c_eD


Lorin’s Law: https://surfingcomplexity.blog/2017/06/24/a-conjecture-on-why-reliable-systems-fail/

Want to talk more about this subject? We’re doing a live event co-sponsored by RISF and you can sign up for it here: https://resilienceinsoftware.org/networks/events/146485



By Colette Alexander and Clint Byrum
