Data Engineering Podcast

Aligning Business and Data: The Essential Role of Data Modeling


Listen Later

Summary
In this episode of the Data Engineering Podcast Serge Gershkovich, head of product at SQL DBM, talks about the socio-technical aspects of data modeling. Serge shares his background in data modeling and highlights its importance as a collaborative process between business stakeholders and data teams. He debunks common misconceptions that data modeling is optional or secondary, emphasizing its crucial role in ensuring alignment between business requirements and data structures. The conversation covers challenges in complex environments, the impact of technical decisions on data strategy, and the evolving role of AI in data management. Serge stresses the need for business stakeholders' involvement in data initiatives and a systematic approach to data modeling, warning against relying solely on technical expertise without considering business alignment.

Announcements
  • Hello and welcome to the Data Engineering Podcast, the show about modern data management
  • Data migrations are brutal. They drag on for months—sometimes years—burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details.
  • Enterprises today face an enormous challenge: they’re investing billions into Snowflake and Databricks, but without strong foundations, those investments risk becoming fragmented, expensive, and hard to govern. And that’s especially evident in large, complex enterprise data environments. That’s why companies like DirecTV and Pfizer rely on SqlDBM. Data modeling may be one of the most traditional practices in IT, but it remains the backbone of enterprise data strategy. In today’s cloud era, that backbone needs a modern approach built natively for the cloud, with direct connections to the very platforms driving your business forward. Without strong modeling, data management becomes chaotic, analytics lose trust, and AI initiatives fail to scale. SqlDBM ensures enterprises don’t just move to the cloud—they maximize their ROI by creating governed, scalable, and business-aligned data environments. If global enterprises are using SqlDBM to tackle the biggest challenges in data management, analytics, and AI, isn’t it worth exploring what it can do for yours? Visit dataengineeringpodcast.com/sqldbm to learn more.
  • Your host is Tobias Macey and today I'm interviewing Serge Gershkovich about how and why data modeling is a sociotechnical endeavor
Interview
  • Introduction
  • How did you get involved in the area of data management?
  • Can you start by describing the activities that you think of when someone says the term "data modeling"?
    • What are the main groupings of incomplete or inaccurate definitions that you typically encounter in conversation on the topic?
    • How do those conceptions of the problem lead to challenges and bottlenecks in execution?
  • Data modeling is often associated with data warehouse design, but it also extends to source systems and unstructured/semi-structured assets. How does the inclusion of other data localities help in the overall success of a data/domain modeling effort?
  • Another aspect of data modeling that often consumes a substantial amount of debate is which pattern to adhere to (star/snowflake, data vault, one big table, anchor modeling, etc.). What are some of the ways that you have found effective to remove that as a stumbling block when first developing an organizational domain representation?
  • While the overall purpose of data modeling is to provide a digital representation of the business processes, there are inevitable technical decisions to be made. What are the most significant ways that the underlying technical systems can help or hinder the goals of building a digital twin of the business?
  • What impact (positive and negative) are you seeing from the introduction of LLMs into the workflow of data modeling?
    • How does tool use (e.g. MCP connection to warehouse/lakehouse) help when developing the transformation logic for achieving a given domain representation? 
  • What are the most interesting, innovative, or unexpected ways that you have seen organizations address the data modeling lifecycle?
  • What are the most interesting, unexpected, or challenging lessons that you have learned while working with organizations implementing a data modeling effort?
  • What are the overall trends in the ecosystem that you are monitoring related to data modeling practices?
Contact Info
  • LinkedIn
Parting Question
  • From your perspective, what is the biggest gap in the tooling or technology for data management today?
Links
  • sqlDBM
  • SAP
  • Joe Reis
  • ERD == Entity Relation Diagram
  • Master Data Management
  • dbt
  • Data Contracts
  • Data Modeling With Snowflake book by Serge (affiliate link)
  • Type 2 Dimension
  • Data Vault
  • Star Schema
  • Anchor Modeling
  • Ralph Kimball
  • Bill Inmon
  • Sixth Normal Form
  • MCP == Model Context Protocol
The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA
...more
View all episodesView all episodes
Download on the App Store

Data Engineering PodcastBy Tobias Macey

  • 4.5
  • 4.5
  • 4.5
  • 4.5
  • 4.5

4.5

140 ratings


More shows like Data Engineering Podcast

View all
Software Engineering Radio by se-radio@computer.org

Software Engineering Radio

271 Listeners

The Changelog: Software Development, Open Source by Changelog Media

The Changelog: Software Development, Open Source

291 Listeners

Software Engineering Daily by Software Engineering Daily

Software Engineering Daily

624 Listeners

The Cloudcast by Massive Studios

The Cloudcast

155 Listeners

Talk Python To Me by Michael Kennedy

Talk Python To Me

588 Listeners

Thoughtworks Technology Podcast by Thoughtworks

Thoughtworks Technology Podcast

41 Listeners

Super Data Science: ML & AI Podcast with Jon Krohn by Jon Krohn

Super Data Science: ML & AI Podcast with Jon Krohn

301 Listeners

Python Bytes by Michael Kennedy and Brian Okken

Python Bytes

214 Listeners

Syntax - Tasty Web Development Treats by Wes Bos & Scott Tolinski - Full Stack JavaScript Web Developers

Syntax - Tasty Web Development Treats

984 Listeners

DataFramed by DataCamp

DataFramed

268 Listeners

Practical AI by Practical AI LLC

Practical AI

211 Listeners

AWS Podcast by Amazon Web Services

AWS Podcast

203 Listeners

The Stack Overflow Podcast by The Stack Overflow Podcast

The Stack Overflow Podcast

62 Listeners

The Real Python Podcast by Real Python

The Real Python Podcast

142 Listeners

Latent Space: The AI Engineer Podcast by swyx + Alessio

Latent Space: The AI Engineer Podcast

96 Listeners