Streaming Audio: Apache Kafka® & Real-Time Data

Next-Gen Data Modeling, Integrity, and Governance with YODA


Listen Later

In this episode, Kris interviews Doron Porat, Director of Infrastructure at Yotpo, and Liran Yogev, Director of Engineering at ZipRecruiter (formerly at Yotpo), about their experiences and strategies in dealing with data modeling at scale.

Yotpo has a vast and active data lake, comprising thousands of datasets that are processed by different engines, primarily Apache Spark™. They wanted to provide users with self-service tools for generating and utilizing data with maximum flexibility, but encountered difficulties, including poor standardization, low data reusability, limited data lineage, and unreliable datasets.

The team realized that Yotpo's modeling layer, which defines the structure and relationships of the data, needed to be separated from the execution layer, which defines and processes operations on the data.

This separation would give programmers better visibility into data pipelines across all execution engines, storage methods, and formats, as well as more governance control for exploration and automation.

To address these issues, they developed YODA, an internal tool that combines excellent developer experience, DBT, Databricks, Airflow, Looker and more, with a strong CI/CD and orchestration layer.

Yotpo is a B2B, SaaS e-commerce marketing platform that provides businesses with the necessary tools for accurate customer analytics, remarketing, support messaging, and more.

ZipRecruiter is a job site that utilizes AI matching to help businesses find the right candidates for their open roles.

EPISODE LINKS

  • Current 2022 Talk: Next Gen Data Modeling in the Open Data Platform
  • Data Mesh 101
  • Data Mesh Architecture: A Modern Distributed Data Model
  • Watch the video version of this podcast
  • Kris Jenkins’ Twitter
  • Streaming Audio Playlist 
  • Join the Confluent Community
  • Learn more with Kafka tutorials, resources, and guides at Confluent Developer
  • Live demo: Intro to Event-Driven Microservices with Confluent
  • Use PODCAST100 to get an additional $100 of free Confluent Cloud usage (details)
...more
View all episodesView all episodes
Download on the App Store

Streaming Audio: Apache Kafka® & Real-Time DataBy Confluent, founded by the original creators of Apache Kafka®

  • 4.8
  • 4.8
  • 4.8
  • 4.8
  • 4.8

4.8

44 ratings


More shows like Streaming Audio: Apache Kafka® & Real-Time Data

View all
Planet Money by NPR

Planet Money

30,888 Listeners

The Changelog: Software Development, Open Source by Changelog Media

The Changelog: Software Development, Open Source

285 Listeners

Talk Python To Me by Michael Kennedy

Talk Python To Me

586 Listeners

Software Engineering Daily by Software Engineering Daily

Software Engineering Daily

629 Listeners

AWS Podcast by Amazon Web Services

AWS Podcast

201 Listeners

DataFramed by DataCamp

DataFramed

268 Listeners

Tech Lead Journal by Henry Suryawirawan

Tech Lead Journal

12 Listeners

System Design by Wes and Kevin

System Design

93 Listeners

Postgres FM by Nikolay Samokhvalov and Michael Christofides

Postgres FM

20 Listeners

Kubernetes for Humans by Komodor

Kubernetes for Humans

2 Listeners

Learn System Design by Ben Kitchell

Learn System Design

32 Listeners