Infinite Curiosity Pod with Prateek Joshi

LLM Data Frontiers


Listen Later

Curtis Northcutt is the cofounder and CEO of Cleanlab, a data curation platform for LLMs. They have raised $30M in funding from Bain Capital Ventures, Menlo, Databricks, and TQ. He was previously the cofounder and CTO of ChipBrain. He has a PhD in Computer Science from MIT.

(00:07) Data Curation in the Context of LLMs
(01:14) Connection between Language Models and Computer Science
(03:14) Importance of Data Curation for LLMs
(04:06) Challenges in Data Curation for LLMs
(06:09) Confident Learning and its Concept
(09:42) CleanLab and its Role
(12:42) Role of Open Source Datasets and Tooling
(15:08) Balancing Data and Privacy in Regulated Industries
(17:25) Feasibility of Federated Learning
(20:35) Decentralized Compute and Aggregating Compute Clusters
(25:19) Determining Model Size for Data Representation
(27:09) Advice for ML Engineers in Handling Data Curation
(30:20) Rapid Fire Round

Curtis's favorite book: The Bible (in the context of marketing)

--------
Where to find Prateek Joshi:

Newsletter: https://prateekjoshi.substack.com 
Website: https://prateekj.com 
LinkedIn: https://www.linkedin.com/in/prateek-joshi-91047b19 
Twitter: https://twitter.com/prateekvjoshi 

...more
View all episodesView all episodes
Download on the App Store

Infinite Curiosity Pod with Prateek JoshiBy Prateek Joshi

  • 4.9
  • 4.9
  • 4.9
  • 4.9
  • 4.9

4.9

8 ratings


More shows like Infinite Curiosity Pod with Prateek Joshi

View all
Notes on the Week Ahead by Dr. David Kelly

Notes on the Week Ahead

189 Listeners

a16z Podcast by Andreessen Horowitz

a16z Podcast

1,034 Listeners

The Knowledge Project with Shane Parrish by Shane Parrish

The Knowledge Project with Shane Parrish

2,645 Listeners

The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence) by Sam Charrington

The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

441 Listeners

Super Data Science: ML & AI Podcast with Jon Krohn by Jon Krohn

Super Data Science: ML & AI Podcast with Jon Krohn

298 Listeners

FYI - For Your Innovation by ARK Invest

FYI - For Your Innovation

394 Listeners

Last Week in AI by Skynet Today

Last Week in AI

287 Listeners

All-In with Chamath, Jason, Sacks & Friedberg by All-In Podcast, LLC

All-In with Chamath, Jason, Sacks & Friedberg

9,189 Listeners

Dwarkesh Podcast by Dwarkesh Patel

Dwarkesh Podcast

417 Listeners

Barron's Live by Barron's Live

Barron's Live

195 Listeners

No Priors: Artificial Intelligence | Technology | Startups by Conviction

No Priors: Artificial Intelligence | Technology | Startups

121 Listeners

Latent Space: The AI Engineer Podcast by swyx + Alessio

Latent Space: The AI Engineer Podcast

75 Listeners

The AI Daily Brief (Formerly The AI Breakdown): Artificial Intelligence News and Analysis by Nathaniel Whittemore

The AI Daily Brief (Formerly The AI Breakdown): Artificial Intelligence News and Analysis

485 Listeners

The Ben & Marc Show by Marc Andreessen, Ben Horowitz

The Ben & Marc Show

135 Listeners

AI + a16z by a16z

AI + a16z

31 Listeners