


In facial recognition and AI development, computers are trained on massive sets of data: millions of pictures gathered from all over the web. There are only a few publicly available datasets, and a lot of organizations use them. And they are problematic. Molly speaks with Vinay Prabhu, chief scientist at UnifyID. He and Abeba Birhane at University College Dublin recently studied these academic datasets. Most of the pictures are gathered without consent, people can be identified in them, and the datasets contain racist and pornographic images and text. Ultimately, the researchers said, maybe it's not the data that's the problem. Maybe it's the whole field.
By Marketplace · 4.5 (1,256 ratings)
