OpenObservability Talks

SRE at Google: Planet-scale observability - OpenObservability Talks S2E05



Have you ever wondered how services are operated at Google’s scale? Here’s your opportunity to find out. Ramón will share how his SRE team runs Google’s identity services, and the elaborate end-to-end observability they rely on to meet strict SLAs. We’ll also get a glimpse of the birthplace of Kubernetes, OpenCensus, Dapper, Monarch and other cornerstones of today’s cloud-native DevOps and observability.

Ramón Medrano Llamas (@rmedranollamas) is a staff site reliability engineer at Google, focused on user identity and authentication. He concentrates on the reliability aspects of new Google products and new features of existing products, ensuring that they meet the same high bar as every other Google service. Before joining Google in 2013, he worked at CERN designing and developing distributed systems for physics. He holds a master’s degree in computer science and is pursuing a PhD in distributed systems.

The episode was live-streamed on 26 October 2021 and the video is available at https://youtube.com/live/jVTZf1SXZrg


Show Notes:

  • scale and size of Google Identity services operation
  • evolution from monitoring to observability
  • telemetry collection
  • SRE job description is changing
  • Google Dapper
  • Google Census
  • operating end-to-end observability at scale
  • flexibility vs. runbook in SRE
  • how SRE at Google is different
  • transition from monolith to microservices architecture (MSA)
  • Linux Foundation launching a DevOps bootcamp
  • Parca OSS launched
  • how to introduce SRE culture

Resources:

  • Dapper paper: Dapper, a Large-Scale Distributed Systems Tracing Infrastructure
  • Borg paper: Large-scale cluster management at Google with Borg
  • Monarch paper: Monarch: Google’s Planet-Scale In-Memory Time Series Database
  • SRE books
  • Systemantics
OpenObservability Talks, by Dotan Horovits
