Confluent Developer ft. Tim Berglund, Adi Polak & Viktor Gamov

From Batch to Real-Time: Tips for Streaming Data Pipelines with Apache Kafka ft. Danica Fine


Listen Later

Implementing an event-driven data pipeline can be challenging, but doing so within the context of a legacy architecture is even more complex. Having spent three years building a streaming data infrastructure and being on the first team at a financial organization to implement Apache Kafka® event-driven data pipelines, Danica Fine (Senior Developer Advocate, Confluent) shares about the development process and how ksqlDB and Kafka Connect became instrumental to the implementation.

By moving away from batch processing to streaming data pipelines with Kafka, data can be distributed with increased data scalability and resiliency. Kafka decouples the source from the target systems, so you can react to data as it changes while ensuring accurate data in the target system. 

In order to transition from monolithic micro-batching applications to real-time microservices that can integrate with a legacy system that has been around for decades, Danica and her team started developing Kafka connectors to connect to various sources and target systems. 

  1. Kafka connectors: Building two major connectors for the data pipeline, including a source connector to connect the legacy data source to stream data into Kafka, and another target connector to pipe data from Kafka back into the legacy architecture. 
  2. Algorithm: Implementing Kafka Streams applications to migrate data from a monolithic architecture to a stream processing architecture. 
  3. Data join: Leveraging Kafka Connect and the JDBC source connector to bring in all data streams to complete the pipeline.
  4. Streams join: Using ksqlDB to join streams—the legacy data system continues to produce streams while the Kafka data pipeline is another stream of data. 

As a final tip, Danica suggests breaking algorithms into process steps. She also describes how her experience relates to the data pipelines course on Confluent Developer and encourages anyone who is interested in learning more to check it out. 

EPISODE LINKS

  • Data Pipelines course
  • Introduction to Streaming Data Pipelines with Apache Kafka and ksqlDB
  • Guided Exercise on Building Streaming Data Pipelines
  • Migrating from a Legacy System to Kafka Streams
  • Watch the video version of this podcast
  • Join the Confluent Community
  • Learn more with Kafka tutorials, resources, and guides at Confluent Developer
  • Live demo: Intro to Event-Driven Microservices with

SEASON 2
Hosted by Tim Berglund, Adi Polak and Viktor Gamov
Produced and Edited by Noelle Gallagher, Peter Furia and Nurie Mohamed
Music by Coastal Kites
Artwork by Phil Vo

  • 🎧 Subscribe to Confluent Developer wherever you listen to podcasts.
  • ▶️ Subscribe on YouTube, and hit the 🔔 to catch new episodes.
  • 👍 If you enjoyed this, please leave us a rating.
  • 🎧 Confluent also has a podcast for tech leaders: "Life Is But A Stream" hosted by our friend, Joseph Morais.
...more
View all episodesView all episodes
Download on the App Store

Confluent Developer ft. Tim Berglund, Adi Polak & Viktor GamovBy Confluent

  • 4.8
  • 4.8
  • 4.8
  • 4.8
  • 4.8

4.8

43 ratings


More shows like Confluent Developer ft. Tim Berglund, Adi Polak & Viktor Gamov

View all
Software Engineering Radio by se-radio@computer.org

Software Engineering Radio

273 Listeners

Economist Podcasts by The Economist

Economist Podcasts

4,194 Listeners

Motley Fool Money by The Motley Fool

Motley Fool Money

3,219 Listeners

a16z Podcast by Andreessen Horowitz

a16z Podcast

1,080 Listeners

Thoughtworks Technology Podcast by Thoughtworks

Thoughtworks Technology Podcast

42 Listeners

Software Engineering Daily by Software Engineering Daily

Software Engineering Daily

626 Listeners

The Official SaaStr Podcast: SaaS | Founders | Investors by SaaStr

The Official SaaStr Podcast: SaaS | Founders | Investors

173 Listeners

AWS Podcast by Amazon Web Services

AWS Podcast

205 Listeners

Data Engineering Podcast by Tobias Macey

Data Engineering Podcast

143 Listeners

The Diary Of A CEO with Steven Bartlett by DOAC

The Diary Of A CEO with Steven Bartlett

8,232 Listeners

The Journal. by The Wall Street Journal & Spotify Studios

The Journal.

5,980 Listeners

Waveform: The MKBHD Podcast by Vox Media Podcast Network

Waveform: The MKBHD Podcast

5,967 Listeners

Morning Brew Daily by Morning Brew

Morning Brew Daily

2,987 Listeners

Grit by Kleiner Perkins

Grit

189 Listeners

The AI Daily Brief: Artificial Intelligence News and Analysis by Nathaniel Whittemore

The AI Daily Brief: Artificial Intelligence News and Analysis

547 Listeners