The Cloudcast

DevOps and Incident Response Evolution



Chris Riley (@hoardinginfo, DevOps Advocate, @Splunk) talks about the state of DevOps, the evolution of Incident Response with Machine Learning, Service vs. Site Reliability, and using Incident Response to improve development quality.

SHOW: 439

SHOW SPONSOR LINKS:

  • Datadog Homepage - Modern Monitoring and Analytics
  • Try Datadog yourself by starting a free, 14-day trial today. Listeners of this podcast will also receive a free Datadog T-shirt.
  • MongoDB Homepage - The most popular database for modern applications
  • MongoDB Atlas - MongoDB-as-a-Service on AWS, Azure and GCP

CLOUD NEWS OF THE WEEK - http://bit.ly/cloudcast-cnotw

SHOW NOTES:

  • VictorOps (now Splunk) Blog
  • Chris Riley at DevOps.com
  • Developers Eating the World Podcast

Topic 1 - Welcome to the show. Tell everyone a little about yourself, you’ve been active in the DevOps space for quite some time.

Topic 2 - About a year ago we had your peer and good friend of the show, Josh Atwell, on to talk about the State of DevOps in 2019. What are your thoughts on the changes over the last 12 months, and where are we headed in 2020?

Topic 3 - One item in particular that has drawn my attention is your discussions on Incident Response and Machine Learning. Can you tell everyone a little bit about that and why you believe it will be valuable going forward?

Topic 4 - In a way, this feels like a transition into the next evolution of our model. First we had separate dev and ops and no one talked, then we put them together, then every device and app started spitting out logs and alerts, and the next thing you knew, we were drowning in data. The complexity of the systems has grown exponentially. Fair?

Topic 5 - You recently did a post over on the VictorOps blog about SRE and the meaning of the “S” in that acronym. You propose that, more and more, it should stand for Service Reliability Engineer rather than the more traditional Site Reliability Engineer, especially as we move into a subscription-based world. Can you explain your thoughts there?

Topic 6 - When I think Incident Response, I think production environments. As part of VictorOps I’m sure you see a lot of use cases and have solved some pretty unique customer problems. How can this be applied outside of production, say for application testing or quality before hitting production? Is that a valid approach?

FEEDBACK?

  • Email: show at thecloudcast dot net
  • Twitter: @thecloudcastnet
The Cloudcast, by Massive Studios
