
Is your Python code held together with duct tape and prayers? Sam and Shifra untangle the spaghetti and walk you through what it actually means to write clean, maintainable data code, and which tools will get you there. From the humble origins of Pandas to the blazing speed of Polars and the SQL simplicity of DuckDB, this episode is your guide to leveling up without burning down your codebase.
🌊 Check out the deep dive here: https://youtu.be/htGazioOVvM
We talk about:
- What spaghetti code actually is (and why we've all written it)
- The real limitations of Pandas at scale (single threading, row storage, and bloated data types)
- One-line bolt-on fixes with PyArrow and NVIDIA RAPIDS cuDF
- Why Polars feels like the dplyr of Python
- What makes DuckDB the SQLite for analytics
- Polars vs DuckDB: how to pick the right tool for your team
- Future you is a different person, and other habits of engineers who sleep at night
Follow Saturdata, your favorite weekend data podcast:
Spotify: https://open.spotify.com/show/5QolhKm1jDZzVuHO0S9ZBo?si=910efb23833f4fc1
LinkedIn: https://www.linkedin.com/company/saturdata
Instagram: @SaturdataPod
#Saturdata #Pandas #Polars #DuckDB #DataEngineering
Chapters:
0:00 - Intro
0:47 - The spaghetti code confession
3:23 - All the pasta shapes of bad code
5:14 - Trial by fire: how you actually learn to write good code
8:34 - The pandas origin story
12:19 - What's wrong with pandas (we still love you though)
17:48 - The PyArrow bolt-on: a one-line glow-up
21:51 - GPU-powered dataframes with RAPIDS cuDF
25:15 - Running out of RAM and spilling the tea
31:22 - Enter Polars: the polar bear to pandas' panda
42:14 - DuckDB: the cute duck who does SQL fast
50:24 - So which tool should you actually use?
56:46 - Future you is a different person: tips for writing better code
59:16 - Comment the why, not the what
By Saturdata Podcast