The GeekNarrator

How would you design a database on Object Storage?



Join Kaivalya Apte and Simon Hørup Eskildsen from Turbopuffer as they talk about the complexities of building a database on top of object storage. Discover the key challenges, the nuances of various storage formats, and the critical trade-offs involved.

Learn from Simon's rich experience, from his time at Shopify to creating Turbopuffer. This episode covers everything—from approaches to write-ahead logs to multi-tenancy and object storage advancements. Perfect for database enthusiasts and those keen on first-principles thinking!
00:00 Introduction
00:17 Simon's Background and Journey to Turbopuffer
02:42 Challenges in Database Scalability
04:21 Experimenting with Vector Databases
05:02 Cost Implications of Vector Databases
05:52 Architectural Considerations for Search Workloads
07:39 Building a Database on Object Storage
16:14 Designing a Simple Database on Object Storage
26:01 Handling Multiple Writers and Consistency
31:26 Trade-offs in Write Operations
32:36 Optimizing MySQL Write Performance
34:03 Batching Writes in Object Storage
35:08 Time-Based vs Size-Based Batching
36:32 Understanding Amplification in Databases
42:26 Challenges with Cold Queries
44:02 Building and Persisting B-Trees
50:53 Separating Workloads in Databases
56:07 Multi-Tenancy Challenges
01:00:39 Choosing Storage Formats
01:06:10 Key Innovations in Object Storage Databases
Important links:
- https://github.com/sirupsen/napkin-math (numbers)
- https://turbopuffer.com/
- https://turbopuffer.com/architecture
- https://sirupsen.com/napkin/problem-10-mysql-transactions-per-second
- https://sirupsen.com (my blog, napkin math)
- https://sirupsen.com/subscribe (napkin math newsletter)
- https://github.com/rkyv/rkyv (rkyv, Rust)
Become a member of The GeekNarrator to get access to member-only videos, notes, and a monthly 1:1 with me.
Like building stuff? Try out CodeCrafters and build amazing real-world systems like Redis, Kafka, and SQLite. Use the link below to sign up and get 40% off a paid subscription.
https://app.codecrafters.io/join?via=geeknarrator
If you like this episode, please hit the like button and share it with your network.
Also please subscribe if you haven't yet.
Database internals series: https://youtu.be/yV_Zp0Mi3xs
Popular playlists:
Realtime streaming systems: https://www.youtube.com/playlist?list=PLL7QpTxsA4se-mAKKoVOs3VcaP71X_LA-
Software Engineering: https://www.youtube.com/playlist?list=PLL7QpTxsA4sf6By03bot5BhKoMgxDUU17
Distributed systems and databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4sfLDUnjBJXJGFhhz94jDd_d
Modern databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4scSeZAsCUXijtnfW5ARlrsN
Stay Curious! Keep Learning!

The GeekNarrator, by Kaivalya Apte
