Reliability Enablers

#48 Cutting Down "Toil" aka Manual Work in Software



Sebastian and I scoured Chapter 5 of the Site Reliability Engineering (2016) book to find nuggets of wisdom on how to reduce toil.

We hit the jackpot with concepts like:

* what toil is, according to a 5-point set of criteria

* why even care about toil?

* where you can find toil in your software system

* Google’s goal for how much work (%) should be toil

* the fact that toil isn’t always all that bad

Don’t have time to listen to what we learned or added to the concepts?

Check out the takeaways toward the end of this email.

But first…

Before we jump into the takeaways, here’s a new segment I’m trying out for newsletters. I’ll highlight a new reliability tool that I think could help you.

Do you struggle to visualize your Kubernetes workloads?

In that case, have you heard of kube-ops-view?

It helps you visualize your complex K8s clusters and everything inside them.

For a deeper rundown, see my LinkedIn post about kube-ops-view, which shares a few more details.

Back to our original programming…

Here are key takeaways from our chat

* Define and Identify Toil

Regularly evaluate your tasks. Identify work that is manual, repetitive, and potentially automatable. Recognize it as toil and prioritize its reduction.

* Prioritize Automation

Look for repetitive tasks in your workflow and automate them using tools and scripts to reduce manual interventions and increase efficiency.

* Embrace the Role of an SRE

Realize that the role of an SRE is to improve system reliability proactively. Focus on long-term improvements rather than just responding to immediate issues.

* Address Common Sources of Toil

Identify frequent sources of toil like context switching, on-call duties, and release processes. Implement solutions to automate and streamline these areas.

* Adopt a Toil Elimination Mindset

Cultivate a mindset focused on eliminating toil. Regularly discuss and explore automation opportunities with your team to improve processes.

* Develop a Culture of Continuous Improvement

Encourage a culture that values reducing manual, repetitive work. Advocate for proactive problem-solving and continuous process enhancement within teams.
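To make the first takeaway concrete, here is a minimal Python sketch of a toil audit. It flags a task as toil when it is manual, repetitive, and automatable (the three traits named above), then works out what share of the week goes to toil. The Task fields, the sample tasks, and their hours are all made up for illustration. The 50% ceiling is the goal the SRE book advertises for Google's SRE teams; treat it as a default you can change, not a rule for your team.

```python
from dataclasses import dataclass


@dataclass
class Task:
    # Hypothetical record of one recurring piece of work.
    name: str
    hours_per_week: float
    manual: bool
    repetitive: bool
    automatable: bool

    def is_toil(self) -> bool:
        # Simplified check using only the three traits from the takeaway.
        return self.manual and self.repetitive and self.automatable


def toil_share(tasks):
    """Return the fraction of total weekly hours spent on toil (0.0 to 1.0)."""
    total = sum(t.hours_per_week for t in tasks)
    if total == 0:
        return 0.0
    toil = sum(t.hours_per_week for t in tasks if t.is_toil())
    return toil / total


def reduction_candidates(tasks):
    """Return toil tasks, most time-consuming first."""
    toil_tasks = [t for t in tasks if t.is_toil()]
    return sorted(toil_tasks, key=lambda t: t.hours_per_week, reverse=True)


# Assumed ceiling: the SRE book's advertised goal of keeping toil below 50%.
TOIL_CEILING = 0.5

# Made-up sample week.
tasks = [
    Task("restart stuck workers", 6.0, True, True, True),
    Task("manual release checklist", 10.0, True, True, True),
    Task("design review", 8.0, True, False, False),
    Task("capacity planning project", 16.0, False, False, False),
]

share = toil_share(tasks)
status = "over" if share > TOIL_CEILING else "within"
print(f"Toil share: {share:.0%} ({status} the ceiling)")
for task in reduction_candidates(tasks):
    print(f"Automate next: {task.name} ({task.hours_per_week}h/week)")
```

Sorting by hours per week is just one way to prioritize. You might instead weigh how painful a task is or how cheap it would be to automate.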

Until next time, happy toil hunting!



This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit read.srepath.com

Reliability Enablers, by Ash Patel & Sebastian Vietz
