Journal Club

Dark Secrets of BERT, Radioactive Data, and Vanishing Gradients



Today on the show, Lan presents a blog post revealing the dark secrets of BERT. The work uses visualizations of self-attention patterns before and after fine-tuning to probe what happens inside a fine-tuned BERT. George brings a novel technique to the show, "radioactive data", a marriage of data and steganography. This work from Facebook AI Research gives us a way to tell whether someone has been training models on our data. Last but not least, Kyle discusses the work "Learning Important Features Through Propagating Activation Differences."


Journal Club, by Data Skeptic
