In facial recognition and AI development, computers are trained on massive datasets: millions of pictures gathered from all over the web. Only a few of these datasets are publicly available, and a lot of organizations rely on them. And they are problematic. Molly speaks with Vinay Prabhu, chief scientist at UnifyID, who, together with Abeba Birhane of University College Dublin, recently studied these academic datasets. Most of the pictures are gathered without consent, people can be identified in them, and the collections contain racist and pornographic images and text. Ultimately, the researchers said, maybe it’s not the data that’s the problem. Maybe it’s the whole field.
By Marketplace
