Hey guys, in this episode I talk about the pre-training process behind state-of-the-art NLP and computer vision transformer architectures. Since 2017 we have trained NLP networks (BERT, GPT, ELECTRA) with self-supervised language-modeling objectives such as masked language modeling, and since 2022 we can also train vision networks (MAE) with the same masked-prediction procedure. This kind of self-supervised pre-training enables us to train accurate models that really understand semantics and context without labeled data. I also talk about a tabular transformer architecture (TabTransformer, 2020) that uses the same approach to achieve state-of-the-art results compared to ensemble methods.
Instagram: https://www.instagram.com/podcast.lifewithai/
LinkedIn: https://www.linkedin.com/company/life-with-ai
BERT paper: https://arxiv.org/pdf/1810.04805.pdf
GPT3 paper: https://arxiv.org/pdf/2005.14165.pdf
ELECTRA paper: https://arxiv.org/pdf/2003.10555.pdf
MAE paper: https://arxiv.org/pdf/2111.06377.pdf
TabTransformer paper: https://arxiv.org/pdf/2012.06678.pdf
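For listeners curious what "masked" pretraining means in practice, here is a minimal sketch of the BERT-style masking step: a fraction of tokens is hidden, and the model's only training signal is to recover them from the surrounding context, so no human labels are needed. This toy helper (`mask_tokens` is a name I made up for illustration, not from any of the papers above) only shows the data-corruption side, not the model itself.

```python
import random

def mask_tokens(tokens, mask_prob=0.15, mask_token="[MASK]", seed=0):
    """Hide a random fraction of tokens, BERT-style.

    Returns the corrupted sequence (what the model sees) and a dict
    mapping masked positions to the original tokens (what the model
    must predict). This is the self-supervised signal: the labels
    come from the text itself.
    """
    rng = random.Random(seed)  # seeded for reproducibility
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            targets[i] = tok           # prediction target
            masked.append(mask_token)  # hidden from the model
        else:
            masked.append(tok)
    return masked, targets

sentence = "the cat sat on the mat".split()
corrupted, targets = mask_tokens(sentence, mask_prob=0.3)
print(corrupted, targets)
```

MAE applies the same idea to images by masking patches instead of word tokens (at a much higher masking ratio), and TabTransformer applies it to categorical table columns.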