PurePerformance

Learning from Incidents is what good SREs do with Laura Nolan



Incidents happen! According to Laura Nolan, who was an SRE at Google and Slack, healthy organizations should take proper time to analyze and learn from them. Doing so improves future incident response as well as overall system resiliency.

Tune in to this episode to hear Laura’s tips and tricks on what makes a good SRE organization. It starts with doing good write-ups of incidents and researching the incident reports of software and services that you are considering using. We also spent a good amount of time discussing root cause analysis, where she highlighted an incident that happened during her time at Google and what she learned about outdated alerting.

Thanks, Laura, for a great discussion and lots of insights.

Here are the additional links we discussed during the podcast:
  • Laura on LinkedIn: https://www.linkedin.com/in/laura-nolan-bb7429/
  • Laura on Twitter: https://twitter.com/lauralifts
  • Incident Template talk @ SRECon: https://www.usenix.org/conference/srecon22emea/presentation/nolan-break
  • What SRE could be talk @ SRECon: https://www.usenix.org/conference/srecon22emea/presentation/nolan-sre
  • Howie Post-Incident Guide: https://www.jeli.io/howie/welcome
  • My philosophy on Alerting article: https://docs.google.com/document/d/199PqyG3UsyXlwieHaqbGiWVa8eMWi8zzAn0YfcApr8Q/edit

By PurePerformance

Rating: 5 (9 ratings)

