Confluent Developer ft. Tim Berglund, Adi Polak & Viktor Gamov

Build a Real Time AI Data Platform with Apache Kafka



Is it possible to build a real-time data platform without using stateful stream processing? Forecasty.ai is an artificial intelligence platform for forecasting commodity prices, giving users insight into the future valuations of raw materials. Nearly all AI models are batch-trained once, but precious commodities are linked to ever-fluctuating global financial markets, which require real-time insights. In this episode, Ralph Debusmann (CTO, Forecasty.ai) shares their journey of migrating from a batch machine learning platform to a real-time event streaming system with Apache Kafka® and delves into their approach to making the transition frictionless.

Ralph explains that Forecasty.ai was initially built on top of batch processing. However, updating the models with batch-data syncs was costly and environmentally taxing. There was also the question of scalability: progressing from the 60 commodities on offer to their eventual plan of over 200. Ralph observed that most real-time data platforms are streaming-based and rely on stateful stream processing, using Kafka Streams, Apache Flink®, or even Apache Samza. However, stateful stream processing demands resources, such as teams of stream processing specialists.

With the existing team, Ralph decided to build a real-time data platform without using any sort of stateful stream processing. They keep strictly to the out-of-the-box components, such as Kafka topics, the Kafka Producer API, the Kafka Consumer API, and Kafka connectors, along with a real-time database that processes the data streams and implements the necessary joins inside the database.
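The idea of keeping the consumer stateless and pushing the join into the database can be sketched roughly as follows. This is only an illustration under loud assumptions, not Forecasty.ai's implementation: SQLite stands in for the real-time database, two in-memory lists stand in for records polled from two Kafka topics, and the table names, field names, and sample values are all hypothetical.

```python
import json
import sqlite3

# Hypothetical JSON payloads, shaped like values a Kafka consumer might poll
# from a "prices" topic and a "forecasts" topic (sample data, not real prices).
PRICE_EVENTS = [
    '{"commodity": "copper", "ts": 1, "price": 8450.0}',
    '{"commodity": "copper", "ts": 2, "price": 8475.5}',
    '{"commodity": "wheat", "ts": 1, "price": 610.25}',
]
FORECAST_EVENTS = [
    '{"commodity": "copper", "ts": 2, "forecast": 8500.0}',
    '{"commodity": "wheat", "ts": 1, "forecast": 600.0}',
]


def sink(conn, table, columns, raw_events):
    """Stateless sink: decode each event and insert it as one row.

    No state is kept between events; the database holds everything.
    """
    placeholders = ", ".join("?" for _ in columns)
    rows = [tuple(json.loads(raw)[col] for col in columns) for raw in raw_events]
    conn.executemany(f"INSERT INTO {table} VALUES ({placeholders})", rows)
    conn.commit()
    return len(rows)


def prices_with_forecasts(conn):
    """The join runs inside the database, not in a stream processor."""
    return conn.execute(
        """
        SELECT p.commodity, p.ts, p.price, f.forecast
        FROM prices AS p
        JOIN forecasts AS f
          ON p.commodity = f.commodity AND p.ts = f.ts
        ORDER BY p.commodity, p.ts
        """
    ).fetchall()


# SQLite (in memory) is an assumed stand-in for the real-time database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prices (commodity TEXT, ts INTEGER, price REAL)")
conn.execute("CREATE TABLE forecasts (commodity TEXT, ts INTEGER, forecast REAL)")

sink(conn, "prices", ("commodity", "ts", "price"), PRICE_EVENTS)
sink(conn, "forecasts", ("commodity", "ts", "forecast"), FORECAST_EVENTS)

joined = prices_with_forecasts(conn)
for row in joined:
    print(row)
```

In a real deployment the two lists would presumably be replaced by a Kafka consumer loop or a sink connector. The point of the sketch is only that the application code never holds join state itself.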

Additionally, Ralph shares kash.py, the Python-based Kafka shell he built to handle historical data, discusses the issues the platform needed to overcome to succeed, and explains how they can make the migration from batch processing to stream processing painless for the data science team.

EPISODE LINKS

  • Kafka Streams 101 course
  • The Difference Engine for Unlocking the Kafka Black Box
  • GitHub repo: kash.py
  • Watch the video version of this podcast
  • Kris Jenkins’ Twitter
  • Streaming Audio Playlist 
  • Join the Confluent Community
  • Learn more with Kafka tutorials, resources, and guides at Confluent Developer
  • Live demo: Intro to Event-Driven Microservices with Confluent
  • Use PODCAST100 to get an addi

SEASON 2
Hosted by Tim Berglund, Adi Polak and Viktor Gamov
Produced and Edited by Noelle Gallagher, Peter Furia and Nurie Mohamed
Music by Coastal Kites
Artwork by Phil Vo


Confluent Developer ft. Tim Berglund, Adi Polak & Viktor Gamov, by Confluent
