The AI Podcast

MLCommons’ David Kanter, NVIDIA’s Daniel Galvez on Publicly Accessible Datasets - Ep. 167

04.12.2022 - By NVIDIA

In deep learning and machine learning, having a large enough dataset is key to training a system and getting it to produce results.

So what does an ML researcher do when there just isn’t enough publicly accessible data?

Enter the MLCommons Association, a global engineering consortium with the aim of making ML better for everyone.

MLCommons recently announced the general availability of two datasets to help advance ML research: the People’s Speech Dataset, a 30,000-hour English-language conversational speech dataset, and the Multilingual Spoken Words Corpus, an audio speech dataset with over 340,000 keywords in 50 languages.

On this episode of NVIDIA’s AI Podcast, host Noah Kravitz spoke with David Kanter, founder and executive director of MLCommons, and NVIDIA senior AI developer technology engineer Daniel Galvez about the democratization of access to speech technology and how MLCommons is helping advance the research and development of machine learning for everyone.

https://blogs.nvidia.com/blog/2022/04/13/mlcommons/
