


In this episode of Generative AI 101, we explore the intricate process of training Large Language Models (LLMs). Imagine training a brilliant student with the entire internet as their textbook: books, academic papers, Wikipedia, social media posts, and code repositories. We'll cover the stages of data collection, cleaning, and tokenization. Learn how transformers, with their self-attention mechanisms, help these models understand and generate coherent text. Discover the training process using powerful GPUs or TPUs and techniques like distributed and mixed-precision training. We'll also address the challenges, including heavy computational demands and the need to ensure data diversity. Finally, understand how fine-tuning these models for specific tasks makes them even more capable.
Connect with Emily Laird on LinkedIn
By Emily Laird
4.6
2020 ratings
