Hey guys, in this episode I talk about the pre-training process behind state-of-the-art NLP and computer vision transformer architectures. Since 2017 we have trained NLP networks (BERT, GPT, ELECTRA) with self-supervised language-modeling objectives such as masked language modeling, and since 2022 we can also train vision networks (MAE) with the same masked-prediction procedure. This kind of self-supervised pre-training enables us to train accurate models that really understand semantics and context without labeled data. I also talk about a tabular transformer architecture (TabTransformer, 2020) that uses the same approach to achieve state-of-the-art results compared to ensemble methods.
Instagram: https://www.instagram.com/podcast.lifewithai/
LinkedIn: https://www.linkedin.com/company/life-with-ai
BERT paper: https://arxiv.org/pdf/1810.04805.pdf
GPT3 paper: https://arxiv.org/pdf/2005.14165.pdf
ELECTRA paper: https://arxiv.org/pdf/2003.10555.pdf
MAE paper: https://arxiv.org/pdf/2111.06377.pdf
TabTransformer paper: https://arxiv.org/pdf/2012.06678.pdf
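For listeners curious what "masked" pretraining means in practice, here is a minimal sketch of the BERT-style masking step: a fraction of tokens is hidden, and the model's only training signal is to recover them from the surrounding context, so no human labels are needed. This toy helper (`mask_tokens` is a name I made up for illustration, not from any of the papers above) only shows the data-corruption side, not the model itself.

```python
import random

def mask_tokens(tokens, mask_prob=0.15, mask_token="[MASK]", seed=0):
    """Hide a random fraction of tokens, BERT-style.

    Returns the corrupted sequence (what the model sees) and a dict
    mapping masked positions to the original tokens (what the model
    must predict). This is the self-supervised signal: the labels
    come from the text itself.
    """
    rng = random.Random(seed)  # seeded for reproducibility
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            targets[i] = tok           # prediction target
            masked.append(mask_token)  # hidden from the model
        else:
            masked.append(tok)
    return masked, targets

sentence = "the cat sat on the mat".split()
corrupted, targets = mask_tokens(sentence, mask_prob=0.3)
print(corrupted, targets)
```

MAE applies the same idea to images by masking patches instead of word tokens (at a much higher masking ratio), and TabTransformer applies it to categorical table columns.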