The InfoQ Podcast

Tanya Reilly on Site Reliability Engineering and the Evolution of the New York City Fire Code


Listen Later

This week on the InfoQ Podcast, Wes Reisz talks to Tanya Reilly (Principal Engineer at Squarespace and previously a staff SRE at Google). Tanya discusses her research into how the fire code evolved in New York and draws on some of the parallels she sees in software. Along the way, she discusses what it means to be an SRE, what effective aspects of the role might look like, and her opinions on what we as an industry should be doing to prevent disasters. This podcast features discussion on paved roads, prevention, testing, firefighting (in software), and reliability questions to ask throughout the software lifecycle.
Why listen to this podcast:
- Teams increasingly are responsible for the entire software lifecycle. When this happens, they think about the software differently because they know their the ones that will get paged if it fails. This idea is at the core of the “You Build It, You Run It” philosophy in DevOps.
- The role of SRE is to define how to do things in a really reliable way. The focus is to make the majority of the operations work go away, and, for the things that can’t go away, it’s as easy as possible.
- At the very start of a project (when you’re writing the initial design), you should be thinking about the dependencies for a system and how will those that follow with be able to determine that. A great way to do this is to offer an API that people will want to use and then instrument it.
- We can learn a lot from the growth of fire safety regulations as metaphors for software, including: fireproof interior walls, socializing best practices, software inspections, and circuit breakers are all examples.
- The work SREs do varies in many places. SREs range from making recommendations on patterns to library creators in other areas. Occasionally, SREs are firefighters of last resort. In these cases, they’re the last resort though.
- We use error budgets and SLOs to quantify how many much risk we’re comfortable taking. It’s used to inform how much less (or more risk) we’re willing to take on.
- We need to consider software reliability throughout the full cycle of software development. When you build systems. Think about as if there will not be someone on call for it .
You can also subscribe to the InfoQ newsletter to receive weekly updates on the hottest topics from professional software development. bit.ly/24x3IVq
Subscribe: www.youtube.com/infoq
Like InfoQ on Facebook: bit.ly/2jmlyG8
Follow on Twitter: twitter.com/InfoQ
Follow on LinkedIn: www.linkedin.com/company/infoq
...more
View all episodesView all episodes
Download on the App Store

The InfoQ PodcastBy InfoQ

  • 4.8
  • 4.8
  • 4.8
  • 4.8
  • 4.8

4.8

37 ratings


More shows like The InfoQ Podcast

View all
Hanselminutes with Scott Hanselman by Scott Hanselman

Hanselminutes with Scott Hanselman

377 Listeners

Software Engineering Radio - the podcast for professional software developers by se-radio@computer.org

Software Engineering Radio - the podcast for professional software developers

272 Listeners

.NET Rocks! by Carl Franklin and Richard Campbell

.NET Rocks!

246 Listeners

The Changelog: Software Development, Open Source by Changelog Media

The Changelog: Software Development, Open Source

282 Listeners

The Cloudcast by Massive Studios

The Cloudcast

152 Listeners

Thoughtworks Technology Podcast by Thoughtworks

Thoughtworks Technology Podcast

42 Listeners

Software Engineering Daily by Software Engineering Daily

Software Engineering Daily

627 Listeners

Soft Skills Engineering by Jamison Dance and Dave Smith

Soft Skills Engineering

270 Listeners

AWS Podcast by Amazon Web Services

AWS Podcast

203 Listeners

Engineering Culture by InfoQ by InfoQ

Engineering Culture by InfoQ

12 Listeners

CoRecursive: Coding Stories by Adam Gordon Bell - Software Developer

CoRecursive: Coding Stories

189 Listeners

Kubernetes Podcast from Google by Abdel Sghiouar, Kaslin Fields

Kubernetes Podcast from Google

181 Listeners

Practical AI by Practical AI LLC

Practical AI

189 Listeners

The Stack Overflow Podcast by The Stack Overflow Podcast

The Stack Overflow Podcast

64 Listeners

The Pragmatic Engineer by Gergely Orosz

The Pragmatic Engineer

52 Listeners