July 30, 2024

For data-hungry tech companies, YouTube is a gold mine

11 minutes

Companies competing in the chatbot wars are using something known in the industry as “the Pile” to train their large language models. It’s a trove of open-source data made up of text scraped from all around the internet, including Wikipedia and European Parliament proceedings. Annie Gilbertson, investigative reporter for Proof News, recently took a deep dive into the Pile and discovered something else: a dataset called “YouTube Subtitles.” Marketplace’s Lily Jamali spoke with Gilbertson about her investigation and how YouTube creators feel about their content being used without their consent.

Marketplace Tech

By Marketplace

4.4

7777 ratings
