Talk Python To Me

#503: The PyArrow Revolution



Pandas is at the core of virtually all data science done in Python, that is, virtually all data science. Since its beginning, Pandas has been based on NumPy. But changes are afoot to update those internals, and you can now optionally use PyArrow. PyArrow comes with a ton of benefits, including its columnar format, which makes answering analytical questions faster, support for a range of high-performance file formats, inter-machine data streaming, faster file I/O, and more. Reuven Lerner is here to give us the lowdown on the PyArrow revolution.

Episode sponsors

NordLayer
Auth0
Talk Python Courses

Links from the show
Reuven: github.com/reuven
Apache Arrow: github.com
Parquet: parquet.apache.org
Feather format: arrow.apache.org
Python Workout Book (45% off with code talkpython45): manning.com
Pandas Workout Book (45% off with code talkpython45): manning.com
Pandas: pandas.pydata.org
PyArrow CSV docs: arrow.apache.org
Future string inference in Pandas: pandas.pydata.org
Pandas NA/nullable dtypes: pandas.pydata.org
Pandas `.iloc` indexing: pandas.pydata.org
DuckDB: duckdb.org
Pandas user guide: pandas.pydata.org
Pandas GitHub issues: github.com
Watch this episode on YouTube: youtube.com
Episode transcripts: talkpython.fm

--- Stay in touch with us ---
Subscribe to Talk Python on YouTube: youtube.com
Talk Python on Bluesky: @talkpython.fm at bsky.app
Talk Python on Mastodon: talkpython
Michael on Bluesky: @mkennedy.codes at bsky.app
Michael on Mastodon: mkennedy

Talk Python To Me by Michael Kennedy

4.8 (577 ratings)

