Screaming in the Cloud

Episode 22: The Chaos Engineering experiment that is us-east-1



Convincing a company to embrace Chaos Engineering is an uphill battle. When a site already keeps breaking, Gremlin’s plan is to break things intentionally. How do you introduce chaos as a step toward making things better?

Today, we’re talking to Ho Ming Li, lead solutions architect at Gremlin. He takes a strategic approach to deliver holistic solutions, often diving into the intersection of people, process, business, and technology. His goal is to enable everyone to build more resilient software by means of Chaos Engineering practices.

Some of the highlights of the show include:

  • Ho Ming Li previously worked as a technical account manager (TAM) at Amazon Web Services (AWS) to offer guidance on architectural/operational best practices
  • Differences between the solutions architect and TAM roles at AWS, and the transition between them
  • Role of TAM as the voice and face of AWS for customers
  • Ultimate goal is to bring services back up and make sure customers are happy
  • Amazon Leadership Principles: It is mutually beneficial for customers to get what they want, be happy with the service, and for AWS to achieve success with them
  • Chaos Engineering isn’t about breaking things to prove a point
  • Chaos Engineering takes a scientific approach
  • Other than during carefully staged disaster recovery (DR) exercises, DR plans usually don’t work
  • Availability Theater: A passive data center is not enough; you have to exercise the DR plan
  • Chaos Engineering brings DR exercises down to a level where you can run them regularly to build resiliency
  • Start small when dealing with availability
  • Chaos Engineering is a journey of verifying, validating, and catching surprises in a safe environment
  • Get started with Chaos Engineering by asking: What could go wrong?
  • Embrace failure and prepare for it; business process resilience
  • Gremlin’s GameDay and Chaos Conf allow people to share experiences
Links:

    • Ho Ming Li on Twitter
    • Gremlin
    • Gremlin on Twitter
    • Gremlin on Facebook
    • Gremlin on Instagram
    • Gremlin: It’s GameDay
    • Chaos Engineering Slack
    • Chaos Conf
    • Amazon Leadership Principles
    • Adrian Cockcroft and Availability Theater
    • DigitalOcean
Screaming in the Cloud, by Corey Quinn
