Confluent Developer ft. Tim Berglund, Adi Polak & Viktor Gamov

Running Apache Kafka Efficiently on the Cloud ft. Adithya Chandra


Listen Later

Focused on optimizing Apache Kafka® performance with maximized efficiency, Confluent’s Product Infrastructure team has been actively exploring opportunities for scaling out Kafka clusters. They are able to run Kafka workloads with half the typical memory usage while saving infrastructure costs, which they have tested and now safely rolled out across Confluent Cloud. 

After spending seven years at Amazon Web Services (AWS) working on search services and Amazon Aurora as a software engineer, Adithya Chandra decided to apply his expertise in cluster management, load balancing, elasticity, and performance of search and storage clusters to the Confluent team.

Last year, Confluent shipped Tiered Storage, which moves eligible data to remote storage from a Kafka broker. As most of the data moves to remote storage, we can upgrade to better storage volumes backed by solid-state drives (SSDs). SSDs are capable of higher throughput compared to hard disk drives (HDDs), capable of fast, random IO, yet more expensive per provisioned gigabyte. Given that SSDs are useful at random IO and can support higher throughput, Confluent started investigating whether it was possible to run Kafka with lesser RAM, which is comparatively much more expensive per gigabyte compared to SSD. Instance types in the cloud had the same CPU but half the memory was 20% cheaper.

In this episode, Adithya covers how to run Kafka more efficiently on Confluent Cloud and dives into the following:

  • Memory allocation on an instance running Kafka
  • What is a JVM heap? Why should it be sized? How much is enough? What are the downsides of a small heap?
  • Memory usage of Datadog, Kubernetes, and other processes, and allocating memory correctly
  • What is the ideal page cache size? What is a page cache used for? Are there any parameters that can be tuned? How does Kafka use the page cache?
  • Testing via the simulation of a variety of workloads using Trogdor
  • High-throughput, high-connection, and high-partition tests and their results
  • Available cloud hardware and finding the best fit, including choosing the number of instance types, migrating from one instance to another, and using nodepools to migrate brokers safely, one by one
  • What do you do when your preferred hardware is not available? Can you run hybrid Kafka clusters if the preferred instance is not widely available?
  • Building infrastructure that allows you to perform testing easily and that can support newer hardware faster (ARM processors, SSDs, etc.)

EPISODE LINKS

  • Watch the video version of this podcast
  • Join the Confluent Community
  • Learn more with Kafka tutorials, resources, and guides at Confluent Developer
  • Live demo: Kafka streaming in 10 minutes on Confluent Cloud

SEASON 2
Hosted by Tim Berglund, Adi Polak and Viktor Gamov
Produced and Edited by Noelle Gallagher, Peter Furia and Nurie Mohamed
Music by Coastal Kites
Artwork by Phil Vo

  • 🎧 Subscribe to Confluent Developer wherever you listen to podcasts.
  • ▶️ Subscribe on YouTube, and hit the 🔔 to catch new episodes.
  • 👍 If you enjoyed this, please leave us a rating.
  • 🎧 Confluent also has a podcast for tech leaders: "Life Is But A Stream" hosted by our friend, Joseph Morais.
...more
View all episodesView all episodes
Download on the App Store

Confluent Developer ft. Tim Berglund, Adi Polak & Viktor GamovBy Confluent

  • 4.8
  • 4.8
  • 4.8
  • 4.8
  • 4.8

4.8

43 ratings


More shows like Confluent Developer ft. Tim Berglund, Adi Polak & Viktor Gamov

View all
Software Engineering Radio by se-radio@computer.org

Software Engineering Radio

273 Listeners

Economist Podcasts by The Economist

Economist Podcasts

4,194 Listeners

Motley Fool Money by The Motley Fool

Motley Fool Money

3,219 Listeners

a16z Podcast by Andreessen Horowitz

a16z Podcast

1,080 Listeners

Thoughtworks Technology Podcast by Thoughtworks

Thoughtworks Technology Podcast

42 Listeners

Software Engineering Daily by Software Engineering Daily

Software Engineering Daily

626 Listeners

The Official SaaStr Podcast: SaaS | Founders | Investors by SaaStr

The Official SaaStr Podcast: SaaS | Founders | Investors

173 Listeners

AWS Podcast by Amazon Web Services

AWS Podcast

205 Listeners

Data Engineering Podcast by Tobias Macey

Data Engineering Podcast

143 Listeners

The Diary Of A CEO with Steven Bartlett by DOAC

The Diary Of A CEO with Steven Bartlett

8,232 Listeners

The Journal. by The Wall Street Journal & Spotify Studios

The Journal.

5,980 Listeners

Waveform: The MKBHD Podcast by Vox Media Podcast Network

Waveform: The MKBHD Podcast

5,967 Listeners

Morning Brew Daily by Morning Brew

Morning Brew Daily

2,987 Listeners

Grit by Kleiner Perkins

Grit

189 Listeners

The AI Daily Brief: Artificial Intelligence News and Analysis by Nathaniel Whittemore

The AI Daily Brief: Artificial Intelligence News and Analysis

547 Listeners