
Join us for an insightful conversation with Sebastian Raschka, a renowned machine learning expert and author who has significantly contributed to AI education through his book "Build a Large Language Model from Scratch." Sebastian shares his journey in machine learning, offers advice for newcomers to the field, discusses the latest advancements in reasoning models, and explores the future of model architectures.
TOPICS DISCUSSED:
1. Learning AI from Scratch: Sebastian discusses effective approaches to learning AI today, emphasizing the importance of finding balance between theory and practical projects, and maintaining focus despite the overwhelming amount of available resources.
2. Reading Scientific Papers: Insights on how Sebastian approaches scientific literature, his method for filtering relevant papers, and how he extracts valuable information without getting lost in the flood of new research.
3. Reasoning Models: An exploration of reasoning models, their practical applications, and how they differ from traditional LLMs in providing step-by-step solutions for complex problems.
4. Future of Model Architectures: Sebastian discusses the evolution of transformer architectures, state space models like Mamba, and Google's Titans architecture, offering his perspective on where architectural innovation is heading.
5. Multi-GPU Training Environments: Practical insights into the challenges of training large models on multiple GPUs, including hardware considerations and the realities of resource-constrained environments (see the code sketch after this list).
6. Open-Source Contributions: Sebastian shares his experience working with the PyTorch Lightning founders at Lightning AI and discusses how open-source projects can be sustainable while balancing commercial interests.
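
To make the multi-GPU topic concrete, here is a minimal sketch of distributed data-parallel training in PyTorch, the framework behind Lightning AI's tooling. This is not code from the episode: the toy model, dataset, and hyperparameters are illustrative placeholders, and it assumes a machine with CUDA GPUs and a torchrun launch.

```python
# Minimal multi-GPU sketch using PyTorch DistributedDataParallel (DDP).
# Hypothetical toy model and data; launch with: torchrun --nproc_per_node=2 train_ddp.py
import os
import torch
import torch.nn as nn
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, TensorDataset, DistributedSampler

def main():
    dist.init_process_group(backend="nccl")        # torchrun starts one process per GPU
    local_rank = int(os.environ["LOCAL_RANK"])     # set by torchrun
    torch.cuda.set_device(local_rank)

    model = nn.Linear(128, 1).cuda()               # stand-in for a real model
    model = DDP(model, device_ids=[local_rank])

    data = TensorDataset(torch.randn(1024, 128), torch.randn(1024, 1))
    sampler = DistributedSampler(data)             # shards the dataset across ranks
    loader = DataLoader(data, batch_size=32, sampler=sampler)

    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()
    for epoch in range(2):
        sampler.set_epoch(epoch)                   # reshuffle shards each epoch
        for x, y in loader:
            x, y = x.cuda(), y.cuda()
            optimizer.zero_grad()
            loss_fn(model(x), y).backward()        # DDP all-reduces gradients here
            optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

torchrun starts one process per GPU, the sampler gives each process its own slice of the data, and DDP averages gradients during the backward pass so all replicas stay in sync.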
INSIGHTS:
- Find a project that excites you to stay motivated when learning AI, and balance learning theory with practical application
- Reasoning models excel at tasks requiring step-by-step solutions, particularly for code and math problems
- The ability to toggle reasoning capabilities on and off is becoming a standard feature in modern LLMs (see the snippet below)
- The pre-training paradigm may be reaching saturation, with more opportunities in post-training approaches
- Open-source contributions create synergies that benefit both companies and the broader community
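
On the insight about toggling reasoning on and off, the snippet below is a hypothetical illustration rather than anything shown in the episode. It assumes a Hugging Face chat model whose chat template accepts an enable_thinking flag, as the Qwen3 family's does; the model name is only an example.

```python
# Hypothetical illustration: toggling a model's reasoning trace at inference time.
# Assumes a chat model whose template supports an `enable_thinking` flag
# (e.g. the Qwen3 family); the model name below is an example, not from the episode.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen3-0.6B"  # example checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "What is 17 * 24? Show your steps."}]
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,  # flip to False to suppress the step-by-step reasoning trace
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```

With enable_thinking=True, such models emit a step-by-step reasoning trace (wrapped in think tags in Qwen3's case) before the final answer; setting it to False skips the trace, trading accuracy on hard problems for faster responses.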
FURTHER POINTERS:
- Article on reasoning models: State of LLM Reasoning and Inference Scaling
- Sebastian's book: Build a Large Language Model from Scratch
- Lightning AI platform
CONTACT INFO:
- GitHub: Sebastian Raschka
- LinkedIn: Sebastian Raschka
CHAPTERS
00:46 Introduction to Sebastian's career
02:27 Learning AI from scratch in 2025
07:47 Managing information overload and learning resources
10:48 Approaching scientific papers effectively
14:02 Reading papers with a purpose
17:38 Reasoning models and their applications
27:26 Future of LLM integration in applications
29:35 Future of model architectures beyond transformers
37:36 Evolution of pre-training and post-training approaches
40:18 Multi-GPU environments and challenges
48:44 Balancing open source with commercial interests
55:24 Closing recommendations
Follow us at AI Ketchup for bi-weekly stories of AI builders and founders turning ideas into successful tech products.
By Elina Lesyk