The GeekNarrator

DuckDB Internals with Mark Raasveldt



Deep Dive into DuckDB with CTO Mark Raasveldt

Explore database internals with The GeekNarrator podcast. In this episode, host Kaivalya Apte talks with Mark Raasveldt, CTO of DuckDB Labs, about his journey from database enthusiast to creator of DuckDB. They discuss how DuckDB, an analytical database, differs from other databases, the design decisions behind it, and its internal mechanisms. The episode also covers the advantages of DuckDB for analytics, the motivation behind its ACID compliance, and how DuckDB handles ingestion, transaction isolation, mutations, and queries. Listen to learn how your data workloads can benefit from DuckDB.
00:00 Introduction and Guest Introduction
00:44 Guest's Journey into Databases
03:40 The Birth of DuckDB
04:30 Challenges with Existing Databases
05:15 Technical Difficulties
05:16 Why Existing Databases Fall Short for Data Scientists
09:16 The Role of SQLite and Its Limitations
13:59 Defining DuckDB
16:48 Comparing DuckDB with Other Analytical Databases
19:50 Deployment Models for DuckDB
22:47 Data Ingestion into DuckDB
22:51 Data Ingestion in DuckDB
30:24 How DuckDB Handles Updates and Mutations
35:35 Understanding Column Granularity and Rewrites
35:58 Implications of Compression on Data Updates
36:38 Trade-offs in Row Group Size
37:32 Benefits of Column Storage Model
38:15 Row Groups and Parallelism
39:02 Choosing Row Group Size: An Experimental Approach
40:00 Handling Data Type Changes in Columns
41:00 Internal Data Structures in DuckDB
42:21 Reading Data: Point Lookups, Aggregations, and Joins
47:22 Optimization for Full Table Scans
53:49 Understanding ACID Compliance in DuckDB
55:49 Multi-Version Concurrency Control (MVCC) in DuckDB
59:50 Use Cases and Applications of DuckDB
01:01:42 The Story Behind DuckDB's Name
01:02:34 Future Vision for DuckDB
References:
DuckDB: https://duckdb.org/
Mark's blog: https://mytherin.github.io/
===============================================================================
For a discount on the courses below:
Appsync: https://appsyncmasterclass.com/?affiliateId=41c07a65-24c8-4499-af3c-b853a3495003
Testing serverless: https://testserverlessapps.com/?affiliateId=41c07a65-24c8-4499-af3c-b853a3495003
Production-Ready Serverless: https://productionreadyserverless.com/?affiliateId=41c07a65-24c8-4499-af3c-b853a3495003
Click the "Add Discount" button and enter the discount code "geeknarrator" to get 20% off.
===============================================================================
Follow me on LinkedIn and Twitter: https://www.linkedin.com/in/kaivalyaapte/ and https://twitter.com/thegeeknarrator
If you like this episode, please hit the like button and share it with your network.
Also please subscribe if you haven't yet.
Database internals series: https://youtu.be/yV_Zp0Mi3xs
Popular playlists:
Realtime streaming systems: https://www.youtube.com/playlist?list=PLL7QpTxsA4se-mAKKoVOs3VcaP71X_LA-
Software Engineering: https://www.youtube.com/playlist?list=PLL7QpTxsA4sf6By03bot5BhKoMgxDUU17
Distributed systems and databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4sfLDUnjBJXJGFhhz94jDd_d
Modern databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4scSeZAsCUXijtnfW5ARlrsN
Stay Curious! Keep Learning!
Cheers,
The GeekNarrator


The GeekNarrator, by Kaivalya Apte