Data Engineering Podcast

Branches, Diffs, and SQL: How Dolt Powers Agentic Workflows



Summary 
In this episode, Tim Sehn, founder and CEO of DoltHub, talks about Dolt - the world's first version-controlled SQL database - and why Git-style semantics belong at the heart of data systems and AI workflows. Tim explains how Dolt combines a MySQL/Postgres-compatible interface with a novel storage engine built on a Prolly Tree to enable fast, row-level branching, merging, and diffs of both schema and data. He digs into real production use cases: powering applications that expose version control to end users, reproducible ML feature stores, managing massive configuration for games, and enabling safe agentic writes via branch-based review flows. He compares Dolt's approach to LakeFS, Neon, and PlanetScale, and explores the developer workflows unlocked by decentralized clones, full audit logs, and PR-style data reviews.
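To make the "row-level diff and merge" idea concrete, here is a toy sketch in Python. It is not how Dolt is implemented (the episode describes a Prolly Tree storage engine). It only illustrates the semantics, assuming a table can be modeled as a dictionary from primary key to row. The names `row_diff` and `three_way_merge` and the sample data are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Row = Tuple                 # the non-key column values of one row
Table = Dict[int, Row]      # primary key -> row (toy model of a table)


def row_diff(old: Table, new: Table) -> Dict[int, Tuple[Optional[Row], Optional[Row]]]:
    """Return {pk: (old_row, new_row)} for every added, removed, or modified row."""
    changes = {}
    for pk in old.keys() | new.keys():
        before, after = old.get(pk), new.get(pk)
        if before != after:
            changes[pk] = (before, after)
    return changes


def three_way_merge(base: Table, ours: Table, theirs: Table) -> Tuple[Table, List[int]]:
    """Apply their changes since `base` onto `ours`; return (merged, conflicting pks)."""
    merged = dict(ours)
    conflicts = []
    ours_changes = row_diff(base, ours)
    for pk, (_, their_row) in row_diff(base, theirs).items():
        if pk in ours_changes and ours_changes[pk][1] != their_row:
            conflicts.append(pk)        # both sides changed the row differently
            continue
        if their_row is None:
            merged.pop(pk, None)        # row deleted on their branch
        else:
            merged[pk] = their_row      # row added or modified on their branch
    return merged, sorted(conflicts)


# Hypothetical data: a human edits row 2 on main while an agent adds row 3 on a branch.
base = {1: ("alice", 10), 2: ("bob", 20)}
main = {1: ("alice", 10), 2: ("bob", 25)}
agent = {1: ("alice", 10), 2: ("bob", 20), 3: ("carol", 30)}

review = row_diff(base, agent)                       # what a reviewer would inspect
merged, conflicts = three_way_merge(base, main, agent)
```

In this model the agent's branch can be reviewed as a diff before anything reaches `main`, and a merge only reports a conflict when both sides changed the same row in different ways.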

Announcements 
  • Hello and welcome to the Data Engineering Podcast, the show about modern data management.
  • If you lead a data team, you know this pain: every department needs dashboards, reports, and custom views, and they all come to you. So you're either the bottleneck slowing everyone down, or you're spending all your time building one-off tools instead of doing actual data work. Retool gives you a way to break that cycle. Their platform lets people build custom apps on your company data while keeping it all secure. Type a prompt like 'Build me a self-service reporting tool that lets teams query customer metrics from Databricks' and they get a production-ready app with the permissions and governance built in. They can self-serve, and you get your time back. It's data democratization without the chaos. Check out Retool at dataengineeringpodcast.com/retool today and see how other data teams are scaling self-service. Because let's be honest, we all need to Retool how we handle data requests.
  • Your host is Tobias Macey, and today I'm interviewing Tim Sehn about Dolt, a version-controlled database engine, and its applications for agentic workflows.

Interview
 
  • Introduction
  • How did you get involved in the area of data management?
  • Can you describe what Dolt is and the story behind it?
  • What are the key use cases that you are focused on solving by adding version control to the database layer?
  • There are numerous projects related to different aspects of versioning in different data contexts (e.g. LakeFS, Datomic, etc.). What are the versioning semantics that you are focused on?
  • You position Dolt as "the database for AI". How does data versioning relate to AI use cases?
  • What types of AI systems are able to make best use of Dolt's versioning capabilities?
  • Can you describe how Dolt and Doltgres are implemented?
  • How have the design and scope of the project changed since you first started working on it?
  • What are some of the architecture and integration patterns around relational databases that change when you introduce version control semantics as a core primitive?
  • What are some anti-patterns that you have seen teams develop around Dolt's versioning functionality?
  • What are the most interesting, innovative, or unexpected ways that you have seen Dolt used?
  • What are the most interesting, unexpected, or challenging lessons that you have learned while working on Dolt?
  • When is Dolt the wrong choice?
  • What do you have planned for the future of Dolt?

Contact Info
 
  • LinkedIn

Parting Question
 
  • From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements
 
  • Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.

Links
 
  • Dolt
  • DoltHub
  • Stockmarket Data
  • LakeFS
  • Datomic
  • Git
  • MySQL
  • Prolly Tree
  • Neon
  • Django
  • Feature Store
  • MCP Server
  • Nessie
  • Iceberg
  • PlanetScale
  • O(NlogN) Big O Complexity
  • B-Tree
  • Git Merge
  • Git Rebase
  • AST == Abstract Syntax Tree
  • Supabase
  • CockroachDB
  • Document Database
  • MongoDB
  • Gastown
  • Beads

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA

Data Engineering Podcast, by Tobias Macey
