Two Voice Devs

Episode 212 - Data Labeling for Developers



Join Mark and Allen, your Two Voice Devs, as they dig into data labeling for machine learning model training. Whether you're a seasoned data scientist or a developer just starting to explore AI, you need to understand data labeling to build effective models. In this episode, they explore data labeling techniques that range from manual labeling for simple voice apps to automated approaches using open-source libraries like Snorkel. They show how labeled data powers everything from chatbots and voice assistants to spam filters and advanced language models like BERT, walk through practical examples, and highlight the role developers play in preparing and refining data for optimal model performance.


Timestamps:

[00:00:00] Introduction

[00:00:18] What is data labeling?

[00:03:41] Jovo example: Manual data labeling for voice apps and chatbots

[00:08:01] Labeling with slots and entities

[00:13:54] BERT example: Automated labeling during model training

[00:18:52] BERT inference and fine-tuning

[00:25:36] Snorkel: Programmatic data labeling with Python

[00:29:43] Snorkel example and labeling functions

[00:31:23] Leveraging LLMs for data labeling and augmentation

[00:33:24] The role of developers in data labeling

[00:35:02] Call to action: Share your data labeling experiences!


Thumbnail by Imagen 3 with prompt:

Cartoon ink and paint, with a touch of tech.

Scene: Two podcast hosts, sitting in front of microphones, smiling and engaging in conversation.

Both hosts are male, caucasian, software developers in their early 50s, wearing glasses, and are clean shaven.

The host on the left is wearing a blue t-shirt and a brown flat cap.

The host on the right is wearing a light blue polo shirt.

Warm, inviting lighting.

Background:

Individual data items, represented by squares or circles. Some are marked with a red A while others are marked with a green B. There are dotted lines grouping some of them together.

Negative prompt:

beards


#DataScience #MachineLearning #ML #AI #DataLabeling #DataTraining #ModelTraining #Jovo #BERT #Snorkel #Developers #SoftwareDevelopment #VoiceFirst


Two Voice Devs, by Mark and Allen
