Data Engineering Podcast

Honeycomb Data Infrastructure with Sam Stokes - Episode 20



Summary

One of the sources of data that often gets overlooked is the systems that we use to run our businesses. This data is not used to directly provide value to customers or understand the functioning of the business, but it is still a critical component of a successful system. Sam Stokes is an engineer at Honeycomb, where he helps to build a platform that captures all of the events and context that occur in your production environments and uses them to answer your questions about what is happening in your system right now. In this episode he discusses the challenges inherent in capturing and analyzing event data, the tools that his team is using to make it possible, and how this type of knowledge can be used to improve your critical infrastructure.

Preamble
  • Hello and welcome to the Data Engineering Podcast, the show about modern data infrastructure
  • When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at dataengineeringpodcast.com/linode and get a $20 credit to try out their fast and reliable Linux virtual servers for running your data pipelines or trying out the tools you hear about on the show.
  • Go to dataengineeringpodcast.com to subscribe to the show, sign up for the newsletter, read the show notes, and get in touch.
  • You can help support the show by checking out the Patreon page which is linked from the site.
  • To help other people find the show you can leave a review on iTunes or Google Play Music, and tell your friends and co-workers.
  • A few announcements:
    • There is still time to register for the O’Reilly Strata Conference in San Jose, CA, March 5th-8th. Use the link dataengineeringpodcast.com/strata-san-jose to register and save 20%.
    • The O’Reilly AI Conference is also coming up. Happening April 29th to the 30th in New York, it will give you a solid understanding of the latest breakthroughs and best practices in AI for business. Go to dataengineeringpodcast.com/aicon-new-york to register and save 20%.
    • If you work with data or want to learn more about how the projects you have heard about on the show get used in the real world, then join me at the Open Data Science Conference in Boston from May 1st through the 4th. It has become one of the largest events for data scientists, data engineers, and data-driven businesses to get together and learn how to be more effective. To save 60% on your tickets go to dataengineeringpodcast.com/odsc-east-2018 and register.

  • Your host is Tobias Macey and today I’m interviewing Sam Stokes about his work at Honeycomb, a modern platform for observability of software systems

Interview
  • Introduction
  • How did you get involved in the area of data management?
  • What is Honeycomb and how did you get started at the company?
  • Can you start by giving an overview of your data infrastructure and the path that an event takes from ingest to graph?
  • What are the characteristics of the event data that you are dealing with and what challenges does it pose in terms of processing it at scale?
  • In addition to the complexities of ingesting and storing data with a high degree of cardinality, being able to quickly analyze it for customer reporting poses a number of difficulties. Can you explain how you have built your systems to facilitate highly interactive usage patterns?
  • A high degree of visibility into a running system is desirable for developers and systems administrators, but they are not always willing or able to invest the effort to fully instrument the code or servers that they want to track. What have you found to be the most difficult aspects of data collection, and do you have any tooling to simplify the implementation for users?
  • How does Honeycomb compare to other systems that are available off the shelf or as a service, and when is it not the right tool?
  • What have been some of the most challenging aspects of building, scaling, and marketing Honeycomb?

Contact Info
  • @samstokes on Twitter
  • Blog
  • samstokes on GitHub

Parting Question
  • From your perspective, what is the biggest gap in the tooling or technology for data management today?

Links
  • Honeycomb
  • Retriever
  • Monitoring and Observability
  • Kafka
  • Column Oriented Storage
  • Elasticsearch
  • Elastic Stack
  • Django
  • Ruby on Rails
  • Heroku
  • Kubernetes
  • Launch Darkly
  • Splunk
  • Datadog
  • Cynefin Framework
  • Go-Lang
  • Terraform
  • AWS

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA
