Last Week in AI

#201 - GPT 4.5, Sonnet 3.7, Grok 3, Phi 4


Our 201st episode with a summary and discussion of last week's big AI news!

Recorded on 03/02/2025

Join our brand new Discord here! https://discord.gg/nTyezGSKwP

Hosted by Andrey Kurenkov and guest host Sharon Zhou

Feel free to email us your questions and feedback at [email protected] and/or [email protected]

Read our text newsletter and comment on the podcast at https://lastweekin.ai/.

In this episode:

- The release of GPT-4.5 from OpenAI, Claude 3.7 from Anthropic, and Grok 3 from xAI, with a comparison of their features, costs, and capabilities.
- Discussion of new tools and applications, including Sesame's new voice assistant and Google's AI coding assistant, Gemini Code Assist, highlighting their unique benefits.
- OpenAI's continued user growth despite competition, pricing for Google's text-to-video model Veo 2, and HP acquiring Humane and shutting down its AI Pin.
- New research, including Google's multi-agent AI co-scientist system and papers on specification gaming in reasoning models and on narrow fine-tuning causing broad misalignment.

Timestamps + Links:

  • (00:00:00) Intro / Banter 
  • (00:01:36) News Preview
  • Tools & Apps
      • (00:02:33) OpenAI announces GPT-4.5, warns it’s not a frontier AI model
      • (00:07:22) Anthropic launches a new AI model that ‘thinks’ as long as you want
      • (00:11:14) New Grok 3 release tops LLM leaderboards
      • (00:16:43) Sesame is the first voice assistant I’ve ever wanted to talk to more than once
      • (00:18:30) Google launches a free AI coding assistant with very high usage caps
      • (00:20:45) Rabbit shows off the AI agent it should have launched with
      • (00:22:23) Mistral’s Le Chat tops 1M downloads in just 14 days
  • Applications & Business
      • (00:24:06) OpenAI Tops 400 Million Users Despite DeepSeek’s Emergence
      • (00:27:37) Google’s new AI video model Veo 2 will cost 50 cents per second
      • (00:29:52) HP is buying Humane and shutting down the AI Pin
  • Projects & Open Source
      • (00:31:44) Microsoft launches next-gen Phi AI models
      • (00:33:47) OpenAI introduces SWE-Lancer: A Benchmark for Evaluating Model Performance on Real-World Freelance Software Engineering Work
      • (00:37:12) SWE-Bench+: Enhanced Coding Benchmark for LLMs
  • Research & Advancements
      • (00:40:00) Towards an AI co-scientist
      • (00:42:52) Magma: A Foundation Model for Multimodal AI Agents
  • Policy & Safety
      • (00:47:32) Demonstrating specification gaming in reasoning models
      • (00:51:03) Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs

See Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.


Last Week in AI, by Skynet Today
