Intellectually Curious

DuckDB v1.3.0: The Spatial Join Breakthrough — From Nested Loops to an On-the-Fly R-tree



Spatial joins connect data by location. In this episode we unpack the dedicated spatial join operator introduced in DuckDB v1.3.0: how it buffers the smaller table, builds an in-memory R-tree over it, and probes that tree with rows from the larger table, and why this yields dramatic speedups (e.g., a 58M-row join against 310 neighborhoods dropping from ~30 minutes to under 30 seconds, roughly 60x or better). We trace the journey from brute-force nested-loop joins to IE-join optimizations with bounding boxes, discuss current limits and ongoing work (larger-than-memory builds, more parallelism), and highlight implications for geospatial analysis.
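To make the build-then-probe idea concrete, here is a minimal Python sketch. It is not DuckDB's implementation: the function names (`build_rtree`, `probe`, `spatial_join`), the fanout of 4, and the simple sort-and-pack bulk load are all assumptions chosen for illustration. Regions are plain rectangles, so a bounding-box hit is already an exact match; with real polygons, each hit would still need an exact geometry test.

```python
import random

def contains(box, x, y):
    """True if point (x, y) lies inside box = (xmin, ymin, xmax, ymax)."""
    return box[0] <= x <= box[2] and box[1] <= y <= box[3]

def union(boxes):
    """Smallest box covering all given boxes."""
    x0, y0, x1, y1 = zip(*boxes)
    return (min(x0), min(y0), max(x1), max(y1))

def build_rtree(items, fanout=4):
    """Bulk-load a packed R-tree from (bbox, payload) pairs.

    A node is (bbox, children, payload); leaves have children=None.
    Packing is a simple sort by lower-left corner (an assumption).
    """
    if not items:
        return None
    ordered = sorted(items, key=lambda it: (it[0][0], it[0][1]))
    nodes = [(bbox, None, payload) for bbox, payload in ordered]
    while len(nodes) > 1:
        parents = []
        for i in range(0, len(nodes), fanout):
            group = nodes[i:i + fanout]
            parents.append((union([n[0] for n in group]), group, None))
        nodes = parents
    return nodes[0]

def probe(tree, x, y):
    """Return payloads of all leaf boxes containing (x, y)."""
    hits = []
    stack = [tree] if tree is not None else []
    while stack:
        bbox, children, payload = stack.pop()
        if not contains(bbox, x, y):
            continue  # prune this whole subtree
        if children is None:
            hits.append(payload)
        else:
            stack.extend(children)
    return hits

def spatial_join(points, regions):
    """Build the tree on the small side (regions), stream the large side."""
    tree = build_rtree([(bbox, name) for name, bbox in regions])
    return [(pid, name) for pid, x, y in points for name in probe(tree, x, y)]

def nested_loop_join(points, regions):
    """Brute-force baseline: test every point against every region."""
    return [(pid, name) for pid, x, y in points
            for name, bbox in regions if contains(bbox, x, y)]

if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical data: a 10x10 grid of unit-square "neighborhoods".
    regions = [(f"r{i}_{j}", (i, j, i + 1, j + 1))
               for i in range(10) for j in range(10)]
    points = [(k, rng.uniform(0, 10), rng.uniform(0, 10)) for k in range(1000)]
    fast = sorted(spatial_join(points, regions))
    slow = sorted(nested_loop_join(points, regions))
    assert fast == slow
    print(len(fast), "matches; both strategies agree")
```

The nested loop always does one test per (point, region) pair, while each probe skips every subtree whose bounding box misses the point. That pruning is the source of the gap when one side is large and the other is small enough to index in memory.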


Note: This podcast was AI-generated, and sometimes AI can make mistakes. Please double-check any critical information.

Sponsored by Embersilk LLC


Intellectually Curious, by Mike Breault