


In facial recognition and AI development, computers are trained on massive sets of data: millions of pictures gathered from all over the web. There are only a few publicly available datasets, and a lot of organizations use them. And they are problematic. Molly speaks with Vinay Prabhu, chief scientist at UnifyID. He and Abeba Birhane at University College Dublin recently studied these academic datasets. Most of the pictures are gathered without consent, people can be identified in them, and the datasets contain racist and pornographic images and text. Ultimately, the researchers said, maybe it’s not the data that’s the problem. Maybe it’s the whole field.
By Marketplace