


A recent study predicts that tech companies will exhaust the supply of publicly available training data for AI language models sometime between 2026 and 2032, threatening the current pace of AI development. The tens of trillions of words online that are used to train AI systems will soon run out, forcing companies to rely on less reliable synthetic data or to tap into sensitive information. Either route could lead to degraded performance, bias, and unfairness in AI systems. Researchers are exploring ways to address the shortage, which also raises questions about the future of human-generated content and online accessibility.
By Dr. Tony Hoang
Rating: 4.6 (99 ratings)
