
Sign up to save your podcasts
Or
Modern AI is trained on a huge fraction of the internet, especially at the cutting edge, with the best models trained on close to all the high quality data we’ve got.[1] And data is really important! You can scale up compute, you can make algorithms more efficient, or you can add infrastructure around a model to make it more useful, but on the margin, great datasets are king. And, naively, we’re about to run out of fresh data to use.
It's rumored that the top firms are looking for ways to get around the data wall. One possible approach is having LLMs create their own data to train on, for which there is kinda-sorta a precedent from, e.g. modern chess AIs learning by playing games against themselves.[2] Or just finding ways to make AI dramatically more sample efficient with the data we’ve already got: the [...]
The original text contained 3 footnotes which were omitted from this narration.
---
First published:
Source:
Narrated by TYPE III AUDIO.
Modern AI is trained on a huge fraction of the internet, especially at the cutting edge, with the best models trained on close to all the high quality data we’ve got.[1] And data is really important! You can scale up compute, you can make algorithms more efficient, or you can add infrastructure around a model to make it more useful, but on the margin, great datasets are king. And, naively, we’re about to run out of fresh data to use.
It's rumored that the top firms are looking for ways to get around the data wall. One possible approach is having LLMs create their own data to train on, for which there is kinda-sorta a precedent from, e.g. modern chess AIs learning by playing games against themselves.[2] Or just finding ways to make AI dramatically more sample efficient with the data we’ve already got: the [...]
The original text contained 3 footnotes which were omitted from this narration.
---
First published:
Source:
Narrated by TYPE III AUDIO.
26,446 Listeners
2,388 Listeners
7,910 Listeners
4,133 Listeners
87 Listeners
1,462 Listeners
9,095 Listeners
87 Listeners
389 Listeners
5,429 Listeners
15,174 Listeners
474 Listeners
121 Listeners
75 Listeners
459 Listeners