What’s happening with the latest releases of large language models? Is the industry hitting the limits of the scaling laws, and do the current benchmarks provide reliable performance assessments? This week on the show, Jodie Burchell returns to discuss the current state of LLM releases.
The recent release of GPT-5 has been a wake-up call for the LLM industry. We discuss how scaling these systems is reaching a point of diminishing returns. Jodie also shares how many AI model assessments and benchmarks are flawed. We also take a sober look at the productivity gains from using these tools for software development within companies.
We discuss how newer developers should consider additional factors when looking at the current job market. Jodie digs into how economic changes and rising interest rates are influencing layoffs and hiring freezes. Then we share a wide collection of resources for you to continue exploring these topics.
This episode is sponsored by InfluxData.
Course Spotlight: Exploring Python Closures: Examples and Use Cases
Learn about Python closures: function-like objects with extended scope used for decorators, factories, and stateful functions.
00:00:00 – Introduction
00:03:00 – Recent conferences and talks
00:04:18 – What’s going on with LLMs?
00:06:06 – What happened with the GPT-5 release?
00:08:14 – Simon Willison - 2025 in LLMs so far
00:09:00 – How did we get here?
00:10:37 – OpenAI’s and scaling laws
00:12:25 – Pivoting to post-training
00:16:01 – Some history of AI eras
00:17:54 – Issues with measuring performance and benchmarks
00:22:19 – Chatbot Arena
00:24:06 – Languages are finite
00:26:22 – LLMs and the illusion of humanity
00:30:41 – Sponsor: InfluxData
00:31:34 – Types of solutions to move past these limits
00:36:57 – Does AI actually boost developer productivity?
00:44:19 – Agentic AI Programming with Python
00:48:02 – Results of non-programmers vibe coding
00:50:18 – Back to the concept of overfitting
00:52:52 – The money involved in training
00:56:50 – Video Course Spotlight
00:58:21 – Deepseek and new methods of training
01:01:02 – Quantizing and fitting on a local machine
01:04:48 – The layoffs and the economic changes
01:10:32 – AI implementation failures
01:21:01 – Don’t doubt yourself as a developer
01:24:06 – What are you excited about in the world of Python?
01:25:39 – What do you want to learn next?
01:26:42 – What’s the best way to follow your work online?
01:27:04 – Thanks and goodbye

Listener Survey - Help Shape the Future of the Real Python Podcast
EuroPython 2025 - July 14th-20th 2025 - Prague, Czech Republic & Remote
Episode #232: Exploring Modern Sentiment Analysis Approaches in Python
GPT-5: Overdue, overhyped and underwhelming. And that’s not the worst of it.
GPT 5’s Rocky Launch Highlights AI Disillusionment - IEEE Spectrum
2025 in LLMs so far, illustrated by Pelicans on Bicycles — Simon Willison
Attention is All You Need - Google
Scaling laws for neural language models - OpenAI
What if AI Doesn’t Get Much Better Than This? - Cal Newport
Hiltzik: AI hype is fading fast - Los Angeles Times
Does AI Actually Boost Developer Productivity? (100k Devs Study) - Yegor Denisov-Blanch, Stanford - YouTube
Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity - METR
Amazon Cloud Chief: Replacing Junior Staff With AI Is ‘Dumbest’ Idea - Business Insider
20 LLM evaluation benchmarks and how they work
MMLU - Measuring Massive Multitask Language Understanding
HellaSwag: Can a Machine Really Finish Your Sentence?
Mechanical Turk - Wikipedia
Amazon Mechanical Turk
Chatbot Arena - LMArena
LLMs Can’t Reason - The Reversal Curse, The Alice In Wonderland Test, And The ARC - AGI Challenge - CustomGPT
Mirror, mirror: LLMs and the illusion of humanity - Jodie Burchell - NDC Oslo 2024 - YouTube
Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens
Context Rot: How Increasing Input Tokens Impacts LLM Performance - YouTube
AWS CEO says no more programmers in 2 years - Tech Industry - Blind
MIT report: 95% of generative AI pilots at companies are failing - Fortune
Agentic AI Programming with Python - Talk Python To Me Podcast
Vibe coding through the GPT-5 mess - The Verge
Overfitting - Wikipedia
Andrej Karpathy - Busy Person’s Intro to LLMs - YouTube
AI Isn’t Taking Your Job – The Economy Is - Andrew Stiefel
Commonwealth Bank backtracks on AI job cuts, apologizes for ‘error’ as call volumes rise - ABC News
Klarna CEO Reverses Course By Hiring More Humans, Not AI | Entrepreneur
Has Duolingo Lost Its Streak? - Matt Jones - Medium
McDonald’s removes AI drive-throughs after order errors
OpenAI Usage Plummets in the Summer, When Students Aren’t Cheating on Homework
What Happened When I Tried to Replace Myself with ChatGPT in My English Classroom - Literary Hub
Learning to code in the age of AI — Sheena O’Connell - YouTube
Jodie Burchell - The JetBrains Blog
Jodie Burchell’s Blog - Standard error
Jodie Burchell (@t-redactyl.bsky.social) — Bluesky
Jodie Burchell 🇦🇺🇩🇪 (@[email protected]) - Fosstodon
JetBrains: Essential tools for software developers and teams

Level up your Python skills with our expert-led courses:
Python Decorators 101
A History of Python Versions and Features
Exploring Python Closures: Examples and Use Cases

Support the podcast & join our community of Pythonistas