Streaming Audio: Apache Kafka® & Real-Time Data

Handling 2 Million Apache Kafka Messages Per Second at Honeycomb


Listen Later

How many messages can Apache Kafka® process per second? At Honeycomb, it's easily over one million messages. 
 
In this episode,  get a taste of how Honeycomb uses Kafka on massive scale. Liz Fong-Jones (Principal Developer Advocate, Honeycomb) explains how Honeycomb manages Kafka-based telemetry ingestion pipelines and scales Kafka clusters. 

And what is Honeycomb? Honeycomb is an observability platform that helps you visualize, analyze, and improve cloud application quality and performance. Their data volume has grown by a factor of 10 throughout the pandemic, while the total cost of ownership has only gone up by 20%. 

But how, you ask? As a developer advocate for site reliability engineering (SRE) and observability, Liz works alongside the platform engineering team on optimizing infrastructure for reliability and cost. Two years ago, the team was facing the prospect of growing from 20 Kafka brokers to 200 Kafka brokers as data volume increased. The challenge was to scale and shuffle data between the number of brokers while maintaining cost efficiency.

The Honeycomb engineering team has experimented with using sc1 or st1 EBS hard disks to store the majority of longer-term archives and keep only the latest hours of data on NVMe instance storage. However, this approach to cost reduction was not ideal, which resulted in needing to keep data that is older than 24 hours on SSD. The team began to explore and adopt Zstandard compression to decrease bandwidth and disk size; however, the clusters were still struggling to keep up. 

When Confluent Platform 6.0 rolled out Tiered Storage, the team saw it as a feature to help them break away from being storage bound. Before bringing the feature into production, the team did a proof of concept, which helped them gain confidence as they watched Kafka tolerate broker death and reduce latencies in fetching historical data. Tiered Storage now shrinks their clusters significantly so that they can hold on to local NVMe SSD and the tiered data is only stored once on Amazon S3, rather than consuming SSD on all replicas. In combination with the AWS Im4gn instance, Tiered Storage allows the team to scale for long-term growth. 

Honeycomb also saved 87% on the cost per megabyte of Kafka throughput by optimizing their Kafka clusters.

EPISODE LINKS

  • Tiered Storage
  • Introducing Confluent Platform 6.0
  • Scaling Kafka at Honeycomb
  • Watch the video version of this podcast
  • Kris Jenkins Twitter
  • Streaming Audio Playlist 
  • Join the Confluent Community
  • Learn more with Kafka tutorials, resources, and guides at Confluent Developer
  • Live demo: Intro to Event-Driven Microservices with Confluent
  • Use PODCAST100 to get an additional $100 of free Confluent Cloud usage (details)  
...more
View all episodesView all episodes
Download on the App Store

Streaming Audio: Apache Kafka® & Real-Time DataBy Confluent, founded by the original creators of Apache Kafka®

  • 4.8
  • 4.8
  • 4.8
  • 4.8
  • 4.8

4.8

44 ratings


More shows like Streaming Audio: Apache Kafka® & Real-Time Data

View all
Planet Money by NPR

Planet Money

30,892 Listeners

The Changelog: Software Development, Open Source by Changelog Media

The Changelog: Software Development, Open Source

285 Listeners

Talk Python To Me by Michael Kennedy

Talk Python To Me

586 Listeners

Software Engineering Daily by Software Engineering Daily

Software Engineering Daily

631 Listeners

AWS Podcast by Amazon Web Services

AWS Podcast

201 Listeners

DataFramed by DataCamp

DataFramed

268 Listeners

Tech Lead Journal by Henry Suryawirawan

Tech Lead Journal

12 Listeners

System Design by Wes and Kevin

System Design

93 Listeners

Postgres FM by Nikolay Samokhvalov and Michael Christofides

Postgres FM

20 Listeners

Kubernetes for Humans by Komodor

Kubernetes for Humans

2 Listeners

Learn System Design by Ben Kitchell

Learn System Design

32 Listeners