Confluent Developer ft. Tim Berglund, Adi Polak & Viktor Gamov

Reddit Sentiment Analysis with Apache Kafka-Based Microservices


Listen Later

How do you analyze Reddit sentiment with Apache Kafka® and microservices? Bringing the fresh perspective of someone who is both new to Kafka and the industry, Shufan Liu, nascent Developer Advocate at Confluent, discusses projects he has worked on during his summer internship—a Cluster Linking extension to a conceptual data pipeline project, and a microservice-based Reddit sentiment-analysis project. Shufan demonstrates that it’s possible to quickly get up to speed with the tools in the Kafka ecosystem and to start building something productive early on in your journey.

Shufan's Cluster Linking project extends a demo by Danica Fine (Senior Developer Advocate, Confluent) that uses a Kafka-based data pipeline to address the challenge of automatic houseplant watering. He discusses his contribution to the project and shares details in his blog—Data Enrichment in Existing Data Pipelines Using Confluent Cloud.

The second project Shufan presents is a sentiment analysis system that gathers data from a given subreddit, then assigns the data a sentiment score. He points out that its results would be hard to duplicate manually by simply reading through a subreddit—you really need the assistance of AI. The project consists of four microservices:

  1. A user input service that collects requests in a Kafka topic, which consist of the desired subreddit, along with the dates between which data should be collected
  2. An API polling service that fetches the requests from the user input service, collects the relevant data from the Reddit API, then appends it to a new topic
  3. A sentiment analysis service that analyzes the appended topic from the API polling service using the Python library NLTK; it calculates averages with ksqlDB
  4. A results-displaying service that consumes from a topic with the calculations

Interesting subreddits that Shufan has analyzed for sentiment include gaming forums before and after key releases; crypto and stock trading forums at various meaningful points in time; and sports-related forums both before the season and several games into it. 

EPISODE LINKS

  • Data Enrichment in Existing Data Pipelines Using Confluent Cloud
  • Watch the video version of this podcast
  • Kris Jenkins’ Twitter
  • Streaming Audio Playlist 
  • Join the Confluent Community
  • Learn more with Kafka tutorials, resources, and guides at Confluent Developer
  • Live demo: Intro to Event-Driven Microservices with Confluent

SEASON 2
Hosted by Tim Berglund, Adi Polak and Viktor Gamov
Produced and Edited by Noelle Gallagher, Peter Furia and Nurie Mohamed
Music by Coastal Kites
Artwork by Phil Vo

  • 🎧 Subscribe to Confluent Developer wherever you listen to podcasts.
  • ▶️ Subscribe on YouTube, and hit the 🔔 to catch new episodes.
  • 👍 If you enjoyed this, please leave us a rating.
  • 🎧 Confluent also has a podcast for tech leaders: "Life Is But A Stream" hosted by our friend, Joseph Morais.
...more
View all episodesView all episodes
Download on the App Store

Confluent Developer ft. Tim Berglund, Adi Polak & Viktor GamovBy Confluent

  • 4.8
  • 4.8
  • 4.8
  • 4.8
  • 4.8

4.8

43 ratings


More shows like Confluent Developer ft. Tim Berglund, Adi Polak & Viktor Gamov

View all
Software Engineering Radio by se-radio@computer.org

Software Engineering Radio

271 Listeners

Hanselminutes with Scott Hanselman by Scott Hanselman

Hanselminutes with Scott Hanselman

383 Listeners

The Changelog: Software Development, Open Source by Changelog Media

The Changelog: Software Development, Open Source

289 Listeners

Software Engineering Daily by Software Engineering Daily

Software Engineering Daily

626 Listeners

Talk Python To Me by Michael Kennedy

Talk Python To Me

585 Listeners

Soft Skills Engineering by Jamison Dance and Dave Smith

Soft Skills Engineering

288 Listeners

Thoughtworks Technology Podcast by Thoughtworks

Thoughtworks Technology Podcast

43 Listeners

Python Bytes by Michael Kennedy and Brian Okken

Python Bytes

215 Listeners

Practical AI by Practical AI LLC

Practical AI

210 Listeners

AWS Podcast by Amazon Web Services

AWS Podcast

203 Listeners

The Real Python Podcast by Real Python

The Real Python Podcast

142 Listeners

Dwarkesh Podcast by Dwarkesh Patel

Dwarkesh Podcast

502 Listeners

Big Technology Podcast by Alex Kantrowitz

Big Technology Podcast

493 Listeners

The AI Daily Brief: Artificial Intelligence News and Analysis by Nathaniel Whittemore

The AI Daily Brief: Artificial Intelligence News and Analysis

607 Listeners

Life Is But A Stream by Confluent

Life Is But A Stream

6 Listeners