
In this episode of Generative AI 101, we explore the intricate process of training Large Language Models (LLMs). Imagine training a brilliant student with the entire internet as their textbook: books, academic papers, Wikipedia, social media posts, and code repositories. We'll cover the stages of data collection, cleaning, and tokenization. Learn how transformers, with their self-attention mechanisms, help these models understand and generate coherent text. Discover the training process using powerful GPUs or TPUs and techniques like distributed and mixed-precision training. We'll also address the challenges, including the need for computational resources and ensuring data diversity. Finally, understand how fine-tuning these models for specific tasks makes them even more capable.
Connect with Emily Laird on LinkedIn
By Emily Laird
