Infinite Curiosity Pod with Prateek Joshi

LLM Data Frontiers



Curtis Northcutt is the cofounder and CEO of Cleanlab, a data curation platform for LLMs. Cleanlab has raised $30M in funding from Bain Capital Ventures, Menlo, Databricks, and TQ. Northcutt was previously the cofounder and CTO of ChipBrain, and he holds a PhD in Computer Science from MIT.

(00:07) Data Curation in the Context of LLMs
(01:14) Connection between Language Models and Computer Science
(03:14) Importance of Data Curation for LLMs
(04:06) Challenges in Data Curation for LLMs
(06:09) Confident Learning and its Concept
(09:42) Cleanlab and its Role
(12:42) Role of Open Source Datasets and Tooling
(15:08) Balancing Data and Privacy in Regulated Industries
(17:25) Feasibility of Federated Learning
(20:35) Decentralized Compute and Aggregating Compute Clusters
(25:19) Determining Model Size for Data Representation
(27:09) Advice for ML Engineers in Handling Data Curation
(30:20) Rapid Fire Round

Curtis's favorite book: The Bible (in the context of marketing)

--------
Where to find Prateek Joshi:

Newsletter: https://prateekjoshi.substack.com 
Website: https://prateekj.com 
LinkedIn: https://www.linkedin.com/in/prateek-joshi-91047b19 
Twitter: https://twitter.com/prateekvjoshi 


Infinite Curiosity Pod with Prateek Joshi, by Prateek Joshi
Rating: 4.9 (8 ratings)

