Last Week in AI

#201 - GPT 4.5, Sonnet 3.7, Grok 3, Phi 4


Our 201st episode with a summary and discussion of last week's big AI news!

Recorded on 03/02/2025

Join our brand new Discord here! https://discord.gg/nTyezGSKwP

Hosted by Andrey Kurenkov and guest host Sharon Zhou

Feel free to email us your questions and feedback at [email protected] and/or [email protected]

Read our text newsletter and comment on the podcast at https://lastweekin.ai/.

In this episode:

- The release of GPT-4.5 from OpenAI, Anthropic's Claude 3.7, and Grok 3 from xAI, comparing their features, costs, and capabilities.
- Discussion of new tools and applications, including Sesame's new voice assistant and Google's AI coding assistant, Gemini Code Assist, highlighting their unique benefits.
- OpenAI's continued user growth despite competition, pricing for Google's text-to-video model, and HP acquiring Humane and shutting down its AI Pin.
- Insights into new research on alignment and specification gaming in LLMs, including papers on fine-tuning causing broad misalignment and Google's multi-agent system for scientific collaboration.

Timestamps + Links:

  • (00:00:00) Intro / Banter
  • (00:01:36) News Preview
  • Tools & Apps
      • (00:02:33) OpenAI announces GPT-4.5, warns it’s not a frontier AI model
      • (00:07:22) Anthropic launches a new AI model that ‘thinks’ as long as you want
      • (00:11:14) New Grok 3 release tops LLM leaderboards
      • (00:16:43) Sesame is the first voice assistant I’ve ever wanted to talk to more than once
      • (00:18:30) Google launches a free AI coding assistant with very high usage caps
      • (00:20:45) Rabbit shows off the AI agent it should have launched with
      • (00:22:23) Mistral’s Le Chat tops 1M downloads in just 14 days
  • Applications & Business
      • (00:24:06) OpenAI Tops 400 Million Users Despite DeepSeek’s Emergence
      • (00:27:37) Google’s new AI video model Veo 2 will cost 50 cents per second
      • (00:29:52) HP is buying Humane and shutting down the AI Pin
  • Projects & Open Source
      • (00:31:44) Microsoft launches next-gen Phi AI models
      • (00:33:47) OpenAI introduces SWE-Lancer: A Benchmark for Evaluating Model Performance on Real-World Freelance Software Engineering Work
      • (00:37:12) SWE-Bench+: Enhanced Coding Benchmark for LLMs
  • Research & Advancements
      • (00:40:00) Towards an AI co-scientist
      • (00:42:52) Magma: A Foundation Model for Multimodal AI Agents
  • Policy & Safety
      • (00:47:32) Demonstrating specification gaming in reasoning models
      • (00:51:03) Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs
Last Week in AI, by Skynet Today
