Confluent Developer ft. Tim Berglund, Adi Polak & Viktor Gamov

How to use Data Contracts for Long-Term Schema Management


Listen Later

Have you ever struggled with managing data long term, especially as the schema changes over time? In order to manage and leverage data across an organization, it’s essential to have well-defined guidelines and standards in place around data quality, enforcement, and data transfer. To get started, Abraham Leal (Customer Success Technical Architect, Confluent) suggests that organizations associate their Apache Kafka® data with a data contract (schema). A data contract is an agreement between a service provider and data consumers. It defines the management and intended usage of data within an organization. In this episode, Abraham talks to Kris about how to use data contracts and schema enforcement to ensure long-term data management.

When an organization sends and stores critical and valuable data in Kafka, more often than not it would like to leverage that data in various valuable ways for multiple business units. Kafka is particularly suited for this use case, but it can be problematic later on if the governance rules aren’t established up front.

With schema registry, evolution is easy due to its robust security guarantees. When managing data pipelines, you can also use GitOps automation features for an extra control layer. It allows you to be creative with topic versioning, upcasting/downcasting the data collected, and adding quality assurance steps at the end of each run to ensure your project remains reliable.

Abraham explains that Protobuf and Avro are the best formats to use rather than XML or JSON because they are built to handle schema evolution. In addition, they have a much lower overhead per-record, so you can save bandwidth and data storage costs by adopting them.

There’s so much more to consider, but if you are thinking about implementing or integrating with your data quality team, Abraham suggests that you use schema registry heavily from the beginning.

If you have more questions, Kris invites you to join the conversation. You can also watch the KOR Financial Current talk Abraham mentions or take Danica Fine’s free course on how to use schema registry on Confluent Developer.

EPISODE LINKS

  • OS project
  • KOR Financial Current Talk
  • The Key Concepts of Schema Registry
  • Schema Evolution and Compatibility
  • Schema Registry Made Simple by Confluent Cloud ft. Magesh Nandakumar
  • Kris Jenkins’ Twitter
  • Watch the video version of this podcast
  • Streaming Audio Playlist 

SEASON 2
Hosted by Tim Berglund, Adi Polak and Viktor Gamov
Produced and Edited by Noelle Gallagher, Peter Furia and Nurie Mohamed
Music by Coastal Kites
Artwork by Phil Vo

  • 🎧 Subscribe to Confluent Developer wherever you listen to podcasts.
  • ▶️ Subscribe on YouTube, and hit the 🔔 to catch new episodes.
  • 👍 If you enjoyed this, please leave us a rating.
  • 🎧 Confluent also has a podcast for tech leaders: "Life Is But A Stream" hosted by our friend, Joseph Morais.
...more
View all episodesView all episodes
Download on the App Store

Confluent Developer ft. Tim Berglund, Adi Polak & Viktor GamovBy Confluent

  • 4.8
  • 4.8
  • 4.8
  • 4.8
  • 4.8

4.8

43 ratings


More shows like Confluent Developer ft. Tim Berglund, Adi Polak & Viktor Gamov

View all
Software Engineering Radio by se-radio@computer.org

Software Engineering Radio

273 Listeners

Economist Podcasts by The Economist

Economist Podcasts

4,194 Listeners

Motley Fool Money by The Motley Fool

Motley Fool Money

3,219 Listeners

a16z Podcast by Andreessen Horowitz

a16z Podcast

1,080 Listeners

Thoughtworks Technology Podcast by Thoughtworks

Thoughtworks Technology Podcast

42 Listeners

Software Engineering Daily by Software Engineering Daily

Software Engineering Daily

626 Listeners

The Official SaaStr Podcast: SaaS | Founders | Investors by SaaStr

The Official SaaStr Podcast: SaaS | Founders | Investors

173 Listeners

AWS Podcast by Amazon Web Services

AWS Podcast

205 Listeners

Data Engineering Podcast by Tobias Macey

Data Engineering Podcast

143 Listeners

The Diary Of A CEO with Steven Bartlett by DOAC

The Diary Of A CEO with Steven Bartlett

8,232 Listeners

The Journal. by The Wall Street Journal & Spotify Studios

The Journal.

5,980 Listeners

Waveform: The MKBHD Podcast by Vox Media Podcast Network

Waveform: The MKBHD Podcast

5,967 Listeners

Morning Brew Daily by Morning Brew

Morning Brew Daily

2,987 Listeners

Grit by Kleiner Perkins

Grit

189 Listeners

The AI Daily Brief: Artificial Intelligence News and Analysis by Nathaniel Whittemore

The AI Daily Brief: Artificial Intelligence News and Analysis

547 Listeners