


A recent study predicts that tech companies will exhaust the supply of publicly available training data for AI language models sometime between 2026 and 2032, threatening the current pace of AI development. The tens of trillions of words of online text used to train AI systems will soon run out, forcing companies to rely on less reliable synthetic data or to tap into sensitive information. This could lead to degraded performance, bias, and unfairness in AI systems. Researchers are exploring ways to address the shortage, which also raises questions about the future of human-generated content and online accessibility.
By Dr. Tony Hoang
