Jim Psota is the Co-Founder and CTO of Panjiva, which was named one of the top 10 Most Innovative Data Science Companies in the World by Fast Company in 2018. Panjiva has mapped the global supply chain using a combination of over 1 billion shipping transactions and machine learning, and the company was recently acquired by S&P Global.
Jim has spoken about artificial intelligence and entrepreneurship at Harvard Business School, MIT, and The White House, and at numerous academic and industry conferences. He also serves on the World Economic Forum’s Working Group for Artificial Intelligence and has done Ph.D. research in computer science at MIT.
Some of the topics we discuss in today’s episode include:
What Jim learned from starting Panjiva from a data-first approach
Brian and Jim’s thoughts on problem solving driven by use cases and people vs. data and AI
3 things Jim wants teams to get right with data products
Jim and Brian’s thoughts on “blackbox” analytics that try to mask complex underlying data to make the UX easier
How Jim balances the messiness of 20+ third-party data sources, designing a good product, and billions of data points
Resources and Links:
Jim Psota
Jim Psota on Twitter
Panjiva
Quotes from Jim Psota
“When you’re dealing with essentially resolving 1.5 billion records, you could think of that you need to compute 1.5 billion squared pairs of potential similarities.”
“It’s much more fulfilling to be building for a person or a set of people that you’ve actually talked to… The engineers are going to develop a much better product and frankly be much happier having that connection to the user.”
“We have crossed a pretty cool threshold where a lot of value can be created now because we have this nice combination of data availability, strong algorithms, and compute power.”
“In our case and many other companies’ cases, taking third-party data, no matter where you’re getting your data, there’s going to be issues with it, there’s going to be delays, format changes, granularity differences.”
“As much as possible, we try to use the tools of data science to actually correct the data deficiency or impute or whatever technique is actually going to be better than nothing, but then say this was imputed or this is reported versus imputed…then over time, the user starts to understand if it’s gray italics [the data] was imputed, and if it’s black regular text, that’s reported data, for example.”
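To make the last quote concrete, here is a minimal Python sketch of the "reported versus imputed" idea: fill in missing values, but carry a flag so the interface can style them differently. This is not Panjiva's implementation. The `Value` class, the `fill_missing` and `render` helpers, the sample numbers, and the choice of a simple median as the imputation method are all assumptions made for illustration.

```python
from dataclasses import dataclass
from statistics import median
from typing import List, Optional


@dataclass
class Value:
    amount: float
    imputed: bool  # True when we filled the value in, False when the source reported it


def fill_missing(raw: List[Optional[float]]) -> List[Value]:
    """Replace missing entries with the median of the reported ones, keeping a flag."""
    reported = [x for x in raw if x is not None]
    if not reported:
        raise ValueError("need at least one reported value to impute from")
    fallback = median(reported)  # assumed imputation method, chosen for simplicity
    return [Value(fallback, True) if x is None else Value(x, False) for x in raw]


def render(value: Value) -> str:
    """Label imputed values; a real UI might use gray italics instead of a suffix."""
    text = f"{value.amount:g}"
    return f"{text} (imputed)" if value.imputed else text


# Hypothetical sample: the second record is missing from the source data.
values = fill_missing([120.0, None, 80.0, 100.0])
labels = [render(v) for v in values]
```

The point of the design is that the flag travels with the number, so any screen showing the value can tell the user whether it was reported or estimated.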