The Cloudcast

From QAos to Chaos Engineering


Listen Later

Benjamin Wilms (@MrBWilms, co-founder/CEO of @Steadybit) talks about the importance of resilience for SREs, DevOps, and developers through chaos engineering platforms

SHOW: 661

CLOUD NEWS OF THE WEEK - http://bit.ly/cloudcast-cnotw

CHECK OUT OUR NEW PODCAST - "CLOUDCAST BASICS"

SHOW SPONSORS:

  • Datadog Synthetic Monitoring: Frontend and Backend Modern Monitoring
  • Ensure frontend issues don’t impair user experience by detecting user-facing issues with API and browser tests with a free 14 day Datadog trial. Listeners of The Cloudcast will also receive a free Datadog T-shirt. 
  • Granulate, an Intel company - Autonomous, continuous, workload optimization
  • gProfiler from Granulate - Production profiling, made easy
  • CDN77 - Content Delivery Network Optimized for Video
  • 85% of users stop watching a video because of stalling and rebuffering. Rely on CDN77 to deliver a seamless online experience to your audience. Ask for a free trial with no duration or traffic limits.

SHOW NOTES:

  • Steadybit (homepage)
  • Steadybit wants developers involved in Chaos engineering before production (TechCrunch)

Topic 1 - Benjamin, give everyone a quick introduction.

Topic 2 - Let’s start with the concept of chaos engineering. In its simplest form, chaos engineering intentionally takes down parts of a test or production environment (typically after software has shipped) randomly so teams, typically SRE’s/ops/dev, are forced to make the applications more resilient over time. It’s not a matter of if systems will go down, it’s a matter of when. This makes the systems better over time. Benjamin, you have a consulting background in this area that ultimately led to founding Steadybit. What were the limitations to this approach?

Topic 3 - What you’re talking about is a more proactive approach to downtime. I’ll call this resilience engineering and it requires a shift in mindset in an organization. How do you get developers onboard to embrace the need? Are we asking developers to share responsibility for outages with the SRE organization?

Topic 4 - On the surface, the obvious benefit is reduced downtime. That can be hard to quantify in business value. Outages can be measured, a lack of outages is harder to quantify. Does this become an issue in convincing an organization to embrace this methodology?

Topic 5 - When you say we are going to move chaos engineering into the CI/CD pipeline, what does that mean? Is this code that is added? Testing simulations that have to be passed? Real time failures of databases or nodes or simulated? What are the common use cases?

FEEDBACK?

  • Email: show at the cloudcast dot net
  • Twitter: @thecloudcastnet
...more
View all episodesView all episodes
Download on the App Store

The CloudcastBy Massive Studios

  • 4.6
  • 4.6
  • 4.6
  • 4.6
  • 4.6

4.6

147 ratings


More shows like The Cloudcast

View all
Hanselminutes with Scott Hanselman by Scott Hanselman

Hanselminutes with Scott Hanselman

377 Listeners

Software Engineering Radio - the podcast for professional software developers by se-radio@computer.org

Software Engineering Radio - the podcast for professional software developers

273 Listeners

The Changelog: Software Development, Open Source by Changelog Media

The Changelog: Software Development, Open Source

283 Listeners

a16z Podcast by Andreessen Horowitz

a16z Podcast

1,023 Listeners

Thoughtworks Technology Podcast by Thoughtworks

Thoughtworks Technology Podcast

42 Listeners

Talk Python To Me by Michael Kennedy

Talk Python To Me

592 Listeners

Software Engineering Daily by Software Engineering Daily

Software Engineering Daily

624 Listeners

AWS Podcast by Amazon Web Services

AWS Podcast

203 Listeners

Gartner ThinkCast by Gartner

Gartner ThinkCast

110 Listeners

DataFramed by DataCamp

DataFramed

266 Listeners

Kubernetes Podcast from Google by Abdel Sghiouar, Kaslin Fields

Kubernetes Podcast from Google

181 Listeners

Practical AI by Practical AI LLC

Practical AI

189 Listeners

The Stack Overflow Podcast by The Stack Overflow Podcast

The Stack Overflow Podcast

64 Listeners

The Real Python Podcast by Real Python

The Real Python Podcast

141 Listeners

The Pragmatic Engineer by Gergely Orosz

The Pragmatic Engineer

53 Listeners