
In this episode of Generative AI 101, we explore the intricate process of training Large Language Models (LLMs). Imagine teaching a brilliant student with the entire internet as their textbook: books, academic papers, Wikipedia, social media posts, and code repositories. We'll cover the stages of data collection, cleaning, and tokenization. Learn how transformers, with their self-attention mechanisms, help these models understand and generate coherent text. Discover the training process on powerful GPUs or TPUs, along with techniques like distributed and mixed-precision training. We'll also address the challenges, including the enormous computational resources required and the difficulty of ensuring data diversity. Finally, understand how fine-tuning these models for specific tasks makes them even more capable.
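The episode itself is audio-only, but for readers who want to see two of the steps it mentions on the page, here is a minimal sketch of toy tokenization followed by single-head scaled dot-product self-attention. It uses NumPy, a whitespace tokenizer, and random weights purely for illustration; every name and number below is an assumption for the demo, not material from the show.

```python
# Illustrative sketch (not from the episode): toy tokenization plus
# single-head scaled dot-product self-attention, the core operation
# inside a transformer. Real LLMs use learned subword tokenizers
# (e.g. BPE) and many attention heads; all values here are made up.
import numpy as np

rng = np.random.default_rng(0)

# --- Tokenization: map text to integer token IDs (toy whitespace version) ---
text = "training large language models"
vocab = {word: i for i, word in enumerate(sorted(set(text.split())))}
token_ids = np.array([vocab[w] for w in text.split()])  # one ID per word

# --- Embedding: each token ID becomes a dense vector ---
d_model = 8                                   # tiny embedding size for the demo
embeddings = rng.normal(size=(len(vocab), d_model))
x = embeddings[token_ids]                     # shape: (seq_len, d_model)

# --- Self-attention: every token attends to every other token ---
W_q = rng.normal(size=(d_model, d_model))     # these would be learned during
W_k = rng.normal(size=(d_model, d_model))     # training; random here just so
W_v = rng.normal(size=(d_model, d_model))     # the sketch runs

Q, K, V = x @ W_q, x @ W_k, x @ W_v
scores = Q @ K.T / np.sqrt(d_model)           # similarity between positions
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)  # softmax: each row sums to 1
output = weights @ V                          # context-mixed representations

print(weights.round(2))   # how strongly each token attends to every other
print(output.shape)       # (seq_len, d_model)
```

Scaling this one operation across many layers, many heads, and billions of tokens is what creates the GPU/TPU, distributed-training, and mixed-precision demands the episode goes on to discuss.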
Connect with Emily Laird on LinkedIn
By Emily Laird · 4.6 (2020 ratings)
