Data Engineering Podcast

Honeycomb Data Infrastructure with Sam Stokes - Episode 20



Summary

One source of data that often gets overlooked is the systems that we use to run our businesses. This data is not used to directly provide value to customers or to understand the functioning of the business, but it is still a critical component of a successful system. Sam Stokes is an engineer at Honeycomb, where he helps to build a platform that captures all of the events and context that occur in your production environments and uses them to answer your questions about what is happening in your system right now. In this episode he discusses the challenges inherent in capturing and analyzing event data, the tools that his team is using to make it possible, and how this type of knowledge can be used to improve your critical infrastructure.

Preamble
  • Hello and welcome to the Data Engineering Podcast, the show about modern data infrastructure
  • When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at dataengineeringpodcast.com/linode and get a $20 credit to try out their fast and reliable Linux virtual servers for running your data pipelines or trying out the tools you hear about on the show.
  • Go to dataengineeringpodcast.com to subscribe to the show, sign up for the newsletter, read the show notes, and get in touch.
  • You can help support the show by checking out the Patreon page which is linked from the site.
  • To help other people find the show you can leave a review on iTunes or Google Play Music and tell your friends and co-workers.
  • A few announcements:
    • There is still time to register for the O’Reilly Strata Conference in San Jose, CA, March 5th-8th. Use the link dataengineeringpodcast.com/strata-san-jose to register and save 20%.
    • The O’Reilly AI Conference is also coming up. Happening April 29th-30th in New York, it will give you a solid understanding of the latest breakthroughs and best practices in AI for business. Go to dataengineeringpodcast.com/aicon-new-york to register and save 20%.
    • If you work with data or want to learn more about how the projects you have heard about on the show get used in the real world, then join me at the Open Data Science Conference in Boston from May 1st through the 4th. It has become one of the largest events for data scientists, data engineers, and data-driven businesses to get together and learn how to be more effective. To save 60% on your tickets go to dataengineeringpodcast.com/odsc-east-2018 and register.

  • Your host is Tobias Macey and today I’m interviewing Sam Stokes about his work at Honeycomb, a modern platform for observability of software systems

Interview
  • Introduction
  • How did you get involved in the area of data management?
  • What is Honeycomb and how did you get started at the company?
  • Can you start by giving an overview of your data infrastructure and the path that an event takes from ingest to graph?
  • What are the characteristics of the event data that you are dealing with and what challenges does it pose in terms of processing it at scale?
  • In addition to the complexities of ingesting and storing data with a high degree of cardinality, being able to quickly analyze it for customer reporting poses a number of difficulties. Can you explain how you have built your systems to facilitate highly interactive usage patterns?
  • A high degree of visibility into a running system is desirable for developers and systems administrators, but they are not always willing or able to invest the effort to fully instrument the code or servers that they want to track. What have you found to be the most difficult aspects of data collection, and do you have any tooling to simplify the implementation for users?
  • How does Honeycomb compare to other systems that are available off the shelf or as a service, and when is it not the right tool?
  • What have been some of the most challenging aspects of building, scaling, and marketing Honeycomb?

Contact Info
  • @samstokes on Twitter
  • Blog
  • samstokes on GitHub

Parting Question
  • From your perspective, what is the biggest gap in the tooling or technology for data management today?

Links
  • Honeycomb
  • Retriever
  • Monitoring and Observability
  • Kafka
  • Column Oriented Storage
  • Elasticsearch
  • Elastic Stack
  • Django
  • Ruby on Rails
  • Heroku
  • Kubernetes
  • Launch Darkly
  • Splunk
  • Datadog
  • Cynefin Framework
  • Go-Lang
  • Terraform
  • AWS

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA
