
This YouTube transcript from a Stanford CS229 lecture provides an overview of building large language models (LLMs). It outlines key components for training LLMs, emphasizing architecture, training loss, data, evaluation, and system considerations. The lecture distinguishes between pre-training, focused on modeling internet text, and post-training, aimed at creating AI assistants. The discussion covers essential concepts like tokenization, evaluation metrics such as perplexity, and the critical role of data acquisition and scaling laws in LLM development. Furthermore, it touches upon post-training techniques like supervised fine-tuning and reinforcement learning from human feedback (RLHF), including its simplification through Direct Preference Optimization (DPO). Finally, the transcript briefly introduces system-level optimizations for efficient GPU utilization in training these large models.
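
As a quick illustration of the perplexity metric mentioned above: perplexity is the exponential of the average per-token negative log-likelihood, so a perplexity of N roughly means the model is as uncertain as if it were choosing uniformly among N tokens at each step. Below is a minimal Python sketch of that computation; the token log-probabilities are invented for illustration and are not taken from the lecture.

```python
import math

def perplexity(token_log_probs):
    """Perplexity = exp of the average negative log-likelihood per token.
    Lower values mean the model assigns higher probability to the text."""
    avg_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_nll)

# Toy example: log-probabilities a model might assign to each observed token.
# These values are made up purely for illustration.
log_probs = [math.log(0.25), math.log(0.10), math.log(0.50), math.log(0.05)]
print(f"Perplexity: {perplexity(log_probs):.2f}")  # ~6.33
```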