
A recent study predicts that tech companies will exhaust the supply of publicly available training data for AI language models sometime between 2026 and 2032, threatening the current pace of progress in AI development. The tens of trillions of words available online, which are used to make AI systems smarter, will soon run out, forcing companies to rely on less reliable synthetic data or to tap into sensitive information. Either path could lead to degraded performance, bias, and unfairness in AI systems. Researchers are exploring ways to address the shortage, which also raises questions about the future of human-generated content and online accessibility.