Linear Digressions

Pre-training language models for natural language processing problems


When you build a model for natural language processing (NLP), such as a recurrent neural network, it helps a ton if you’re not starting from zero. In other words, if you can draw on other datasets to build up an understanding of word meanings, and then use your training dataset just for subject-specific refinements, you’ll get farther than using your training dataset for everything. This idea of starting from pre-trained resources has an analogue in computer vision, where ImageNet-based initializations for the first few layers of a CNN have become the new standard. A similar progression is underway in NLP, where simple(r) embeddings like word2vec are giving way to more advanced pre-training methods that aim to capture a more sophisticated understanding of word meanings, contexts, language structure, and more.
Relevant links:
https://thegradient.pub/nlp-imagenet/
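
To make the pre-training idea concrete, here is a minimal sketch (not from the episode) of the most common pattern: initialize an embedding layer from pre-trained word2vec-style vectors, then fine-tune it alongside a small task-specific model. The GloVe model name, the Classifier class, and all hyperparameters below are illustrative assumptions, and it assumes gensim and PyTorch are installed.

# Sketch of the pre-training pattern described above. The pre-trained
# vector set, classifier architecture, and hyperparameters are
# illustrative choices, not details from the episode.
import gensim.downloader as api
import torch
import torch.nn as nn

# Step 1: draw on a large external corpus -- load pre-trained word
# vectors instead of learning word meanings from scratch.
# ("word2vec-google-news-300" is another model gensim can download.)
word_vectors = api.load("glove-wiki-gigaword-50")  # KeyedVectors, 50-dim

# Step 2: initialize the embedding layer from those vectors.
# freeze=False lets the task-specific training data make the
# "subject-specific refinements" the episode mentions.
weights = torch.tensor(word_vectors.vectors)
embedding = nn.Embedding.from_pretrained(weights, freeze=False)

# Step 3: only the small task-specific model on top (plus the
# embedding fine-tuning) is learned from your labeled dataset.
class Classifier(nn.Module):
    def __init__(self, embedding, hidden_size=64, n_classes=2):
        super().__init__()
        self.embedding = embedding
        self.rnn = nn.LSTM(embedding.embedding_dim, hidden_size,
                           batch_first=True)
        self.head = nn.Linear(hidden_size, n_classes)

    def forward(self, token_ids):           # token_ids: (batch, seq_len)
        embedded = self.embedding(token_ids)
        _, (h, _) = self.rnn(embedded)
        return self.head(h[-1])             # logits: (batch, n_classes)

model = Classifier(embedding)

The more advanced pre-training methods the episode points to go a step further than this sketch: rather than fixed per-word vectors, they pre-train whole language models whose representations depend on context, which is exactly the progression described above.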

By Ben Jaffe and Katie Malone
