The InfoQ Podcast

Ryan Kitchens on Learning from Incidents at Netflix, the Role of SRE, and Sociotechnical Systems



In today’s podcast we sit down with Ryan Kitchens, a senior site reliability engineer and member of the CORE team at Netflix. This team is responsible for the entire lifecycle of incident management at Netflix, from incident response to memorialising an issue.
Why listen to this podcast:
- Top-level metrics can be used as a proxy for user experience, and can be used to determine that an issue should be alerted on and investigated. For example, at Netflix, if the customer playback initiation “streams per second” metric declines rapidly, this may be an indication that something has broken.
- Focusing on how things go right can provide valuable insight into the resilience within your system, e.g. what people are doing every day that helps overcome incidents. Finding sources of resilience is somewhat “the story of the incident you didn’t have”.
- When conducting an incident postmortem, simply reconstructing an incident is often not sufficient to determine what needs to be fixed; there is no single root cause in complex socio-technical systems such as those found at Netflix and most modern web-based organisations. Instead, teams must dig a little deeper and look for what went well, what contributed to the problem, and where the recurring patterns are.
- Resilience engineering is a multidisciplinary field that was established in the early 2000s, and the associated community that has emerged is both academic and deeply practical. Although much resilience engineering focuses on domains such as aviation, surgery and military agencies, there is much overlap with the domain of software engineering.
- Make sure that support staff within an organisation have a feedback loop into the product team, as these people providing support often know where all of the hidden problems are, the nuances of the systems, and the workarounds.
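As a rough illustration of the first point above, a rapid decline in a top-level metric can be detected by comparing each new sample against a rolling average of recent samples. This is a minimal sketch, not a description of how Netflix alerts on its streams-per-second metric: the class name `RapidDropDetector`, the window size, the 30% threshold, and the sample readings are all hypothetical.

```python
from collections import deque


class RapidDropDetector:
    """Flags a sharp drop of a top-level metric relative to its recent average."""

    def __init__(self, window=5, drop_ratio=0.3):
        self.window = window          # number of recent samples forming the baseline
        self.drop_ratio = drop_ratio  # fractional decline that counts as "rapid"
        self.samples = deque(maxlen=window)

    def observe(self, value):
        """Record a sample; return True if it fell sharply below the baseline."""
        alert = False
        # Only judge a sample once a full window of history is available.
        if len(self.samples) == self.window:
            baseline = sum(self.samples) / self.window
            alert = baseline > 0 and value < baseline * (1 - self.drop_ratio)
        self.samples.append(value)
        return alert


detector = RapidDropDetector(window=5, drop_ratio=0.3)
readings = [1000, 1010, 990, 1005, 995, 1000, 600]  # made-up streams-per-second values
alerts = [detector.observe(v) for v in readings]
```

With these made-up readings, only the final sample (600 against a baseline of about 1000) triggers an alert. A real system would likely also need to account for daily and weekly traffic patterns, which this sketch ignores.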
More on this: Quickly scan our curated show notes on InfoQ: https://bit.ly/2LLwk8T
You can also subscribe to the InfoQ newsletter to receive weekly updates on the hottest topics from professional software development. bit.ly/24x3IVq
Subscribe: www.youtube.com/infoq
Like InfoQ on Facebook: bit.ly/2jmlyG8
Follow on Twitter: twitter.com/InfoQ
Follow on LinkedIn: www.linkedin.com/company/infoq
Check the landing page on InfoQ: https://bit.ly/2LLwk8T

The InfoQ Podcast, by InfoQ

4.8

37 ratings
