Latent Space: The AI Engineer Podcast

⚡️GPT 4.1: The New OpenAI Workhorse



We’ll keep this brief because we’re on a tight turnaround: GPT 4.1, previously known as the Quasar and Optimus models, is now live as the natural update for 4o/4o-mini (and the research preview of GPT 4.5). Though it is a general-purpose model family, the headline features are:

Coding abilities (o1-level on SWE-bench and SWE-Lancer, but only OK on Aider)

Instruction Following (with a very notable prompting guide)

Long Context up to 1M tokens (with new MRCR and Graphwalks benchmarks)

Vision (simply o1-level)

Cheaper Pricing (cheaper than 4o, greatly improved prompt caching savings)

We caught up with returning guest Michelle Pokrass and Josh McGrath to get more detail on each!

Chapters
  • 00:00:00 Introduction and Guest Welcome
  • 00:00:57 GPT 4.1 Launch Overview
  • 00:01:54 Developer Feedback and Model Names
  • 00:02:53 Model Naming and Starry Themes
  • 00:03:49 Confusion Over GPT 4.1 vs 4.5
  • 00:04:47 Distillation and Model Improvements
  • 00:05:45 Omnimodel Architecture and Future Plans
  • 00:06:43 Core Capabilities of GPT 4.1
  • 00:07:40 Training Techniques and Long Context
  • 00:08:37 Challenges in Long Context Reasoning
  • 00:09:34 Context Utilization in Models
  • 00:10:31 Graph Walks and Model Evaluation
  • 00:11:31 Real Life Applications of Graph Tasks
  • 00:12:30 Multi-Hop Reasoning Benchmarks
  • 00:13:30 Agentic Workflows and Backtracking
  • 00:14:28 Graph Traversals for Agent Planning
  • 00:15:24 Context Usage in API and Memory Systems
  • 00:16:21 Model Performance in Long Context Tasks
  • 00:17:17 Instruction Following and Real World Data
  • 00:18:12 Challenges in Grading Instructions
  • 00:19:09 Instruction Following Techniques
  • 00:20:09 Prompting Techniques and Model Responses
  • 00:21:05 Agentic Workflows and Model Persistence
  • 00:22:01 Balancing Persistence and User Control
  • 00:22:56 Evaluations on Model Edits and Persistence
  • 00:23:55 XML vs JSON in Prompting
  • 00:24:50 Instruction Placement in Context
  • 00:25:49 Optimizing for Prompt Caching
  • 00:26:49 Chain of Thought and Reasoning Models
  • 00:27:46 Choosing the Right Model for Your Task
  • 00:28:46 Coding Capabilities of GPT 4.1
  • 00:29:41 Model Performance in Coding Tasks
  • 00:30:39 Understanding Coding Model Differences
  • 00:31:36 Using Smaller Models for Coding
  • 00:32:33 Future of Coding in OpenAI
  • 00:33:28 Internal Use and Success Stories
  • 00:34:26 Vision and Multi-Modal Capabilities
  • 00:35:25 Screen vs Embodied Vision
  • 00:36:22 Vision Benchmarks and Model Improvements
  • 00:37:19 Model Deprecation and GPU Usage
  • 00:38:13 Fine-Tuning and Preference Steering
  • 00:39:12 Upcoming Reasoning Models
  • 00:40:10 Creative Writing and Model Humor
  • 00:41:07 Feedback and Developer Community
  • 00:42:03 Pricing and Blended Model Costs
  • 00:44:02 Conclusion and Wrap-Up
Latent Space: The AI Engineer Podcast, by swyx + Alessio
