747: Technical Intro to Transformers and LLMs, with Kirill Eremenko

01.09.2024 - By Jon Krohn

Attention and transformers in LLMs, the five stages of data processing, and a brand-new Large Language Models A-Z course: Kirill Eremenko joins host Jon Krohn to explore what goes into well-crafted LLMs, what makes transformers so powerful, and how to succeed as a data scientist in this new age of generative AI.

This episode is brought to you by Intel and HPE Ezmeral Software Solutions (https://hpe.com/ezmeral/chatwithyourdata), and by Prophets of AI (https://prophetsofai.com), the leading agency for AI experts. Interested in sponsoring a SuperDataScience Podcast episode? Visit https://passionfroot.me/superdatascience for sponsorship information.

In this episode you will learn:

• Supply and demand in AI recruitment [08:30]

• Kirill and Hadelin's new course on LLMs, “Large Language Models (LLMs), Transformers & GPT A-Z” [15:37]

• The learning difficulty in understanding LLMs [19:46]

• The basics of LLMs [22:00]

• The five building blocks of transformer architecture [36:29]

- 1: Input embedding [44:10]

- 2: Positional encoding [50:46]

- 3: Attention mechanism [54:04]

- 4: Feedforward neural network [1:16:17]

- 5: Linear transformation and softmax [1:19:16]

• Inference vs training time [1:29:12]

• Why transformers are so powerful [1:49:22]
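
For readers who want to see how the five building blocks listed above fit together, here is a rough sketch of one forward pass in plain Python. It is not taken from the episode or the course. The weights are random, and the sizes (`vocab_size`, `d_model`, `d_ff`), the sinusoidal positional encoding, the single causal attention head, and the omission of residual connections and layer normalization are all simplifying assumptions made for illustration.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def rand_matrix(rng, rows, cols):
    """Small random weights; a stand-in for trained parameters (assumption)."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def positional_encoding(seq_len, d_model):
    """Sinusoidal encoding: sine on even dimensions, cosine on odd ones."""
    pe = []
    for pos in range(seq_len):
        row = []
        for i in range(d_model):
            angle = pos / (10000 ** (2 * (i // 2) / d_model))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        pe.append(row)
    return pe


def forward(token_ids, vocab_size=10, d_model=8, d_ff=16, seed=0):
    """Return next-token probabilities for a list of integer token ids."""
    rng = random.Random(seed)

    # 1: Input embedding. Look up one vector per token id.
    embedding = rand_matrix(rng, vocab_size, d_model)
    x = [embedding[t] for t in token_ids]

    # 2: Positional encoding. Add position information to each vector.
    pe = positional_encoding(len(token_ids), d_model)
    x = [[a + b for a, b in zip(xr, pr)] for xr, pr in zip(x, pe)]

    # 3: Attention mechanism. Single head, causal mask (assumption).
    wq, wk, wv = (rand_matrix(rng, d_model, d_model) for _ in range(3))
    q, k, v = matmul(x, wq), matmul(x, wk), matmul(x, wv)
    scores = matmul(q, [list(col) for col in zip(*k)])
    weights = []
    for i, row in enumerate(scores):
        # Position i may only attend to positions j <= i.
        masked = [s / math.sqrt(d_model) if j <= i else float("-inf")
                  for j, s in enumerate(row)]
        weights.append(softmax(masked))
    x = matmul(weights, v)

    # 4: Feedforward neural network. Two linear maps with a ReLU between.
    w1 = rand_matrix(rng, d_model, d_ff)
    w2 = rand_matrix(rng, d_ff, d_model)
    hidden = [[max(0.0, h) for h in row] for row in matmul(x, w1)]
    x = matmul(hidden, w2)

    # 5: Linear transformation and softmax over the vocabulary,
    # applied to the last position to get next-token probabilities.
    w_out = rand_matrix(rng, d_model, vocab_size)
    logits = matmul(x[-1:], w_out)[0]
    return softmax(logits)


probs = forward([1, 4, 2])
print(len(probs), round(sum(probs), 6))
```

Because the weights are untrained, the resulting distribution is meaningless. The point is only the order of the five stages and the shapes flowing between them.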

Additional materials: www.superdatascience.com/747
