Data Journeys

#23: Wes McKinney - The Creator of Pandas



Wes McKinney is the creator and "Benevolent Dictator for Life" (BDFL) of the open-source pandas package for data analysis in Python, and has also authored two editions of the reference book Python for Data Analysis. Wes is also one of the co-creators of the Apache Arrow project, which is currently his main focus. Most recently, he founded Ursa Labs, a not-for-profit open source development group in partnership with RStudio.

 

He describes himself as a problem-solver, and is particularly interested in improving the usability of data tools for programmers, accelerating data access and in-memory data processing performance, and improving data system interoperability.

 

In my conversation with Wes today, we focused on getting to know Wes on a more personal level, discussing his background and interests to get some insight into the living legend of open source he has become.

 

  • [3:48] How did coming from four generations of newspapermen impact Wes’s upbringing?
  • [6:00] What kind of hobbies was he interested in growing up, and what is the origin of his interest in computers?
  • [11:08] How did he come to run a GoldenEye 007 world record website, and update and maintain it by hand?
  • [16:10] Wes’s high school career as a mathlete, and how an early interest in math contributed to his approach to programming.
  • [18:15] How Wes brings the rigor he learned in mathematics to software engineering.
  • [19:50] How languages and math scratch the same itch for composition.
  • [21:00] About learning enough German to complete a PHP programming internship in Munich.
  • [23:00] How Wes’s experience using data in his first year working post-undergrad set him down the path to Pandas.
  • [25:00] What went into his decision to take leave from grad school to build Pandas?
  • [27:00] The legendary tweet where Wes expressed his sense of purpose and motivation in building Pandas.
  • [29:52] Why Wes’s work is motivated by the desire to free up people’s time to realize their full potential.
  • [30:51] Zero to One - Peter Thiel
  • [31:40] Why is solving basic efficiency problems, like reading CSV files, so important?
  • [34:12] How community management has played such a huge role in making Pandas so successful compared to other tools.
  • [39:00] The importance of seeing peers in an open source project as people with good intentions and more than just a GitHub profile.
  • [46:00] How do the incentives of an open source project influence prioritization in a project?
  • [51:45] How Wes’s newest project, Ursa Labs, is tackling the problem of funding in open source software development.
  • [56:20] Wes’s goals for Ursa Labs over the next five years.

 

AJ’s Twitter: https://twitter.com/ajgoldstein393

Wes’s Twitter: https://twitter.com/wesmckinn

Wes’s personal website: http://wesmckinney.com

Wes’s LinkedIn: https://www.linkedin.com/in/wesmckinn/


Data Journeys, by AJ Goldstein