Infinite Curiosity Pod with Prateek Joshi

LLM Data Frontiers



Curtis Northcutt is the cofounder and CEO of Cleanlab, a data curation platform for LLMs. The company has raised $30M in funding from Bain Capital Ventures, Menlo, Databricks, and TQ. Curtis was previously the cofounder and CTO of ChipBrain, and he holds a PhD in Computer Science from MIT.

(00:07) Data Curation in the Context of LLMs
(01:14) Connection between Language Models and Computer Science
(03:14) Importance of Data Curation for LLMs
(04:06) Challenges in Data Curation for LLMs
(06:09) Confident Learning and its Concept
(09:42) Cleanlab and its Role
(12:42) Role of Open Source Datasets and Tooling
(15:08) Balancing Data and Privacy in Regulated Industries
(17:25) Feasibility of Federated Learning
(20:35) Decentralized Compute and Aggregating Compute Clusters
(25:19) Determining Model Size for Data Representation
(27:09) Advice for ML Engineers in Handling Data Curation
(30:20) Rapid Fire Round

Curtis's favorite book: The Bible (in the context of marketing)

--------
Where to find Prateek Joshi:

Newsletter: https://prateekjoshi.substack.com 
Website: https://prateekj.com 
LinkedIn: https://www.linkedin.com/in/prateek-joshi-91047b19 
Twitter: https://twitter.com/prateekvjoshi 
