AI Explained Official Podcast

Gemini 2.5 Pro - It’s a Smart Chatbot … (New Simple High Score)



Gemini gets a new record on Simple Bench, and several other benchmarks. I’ll go deep to explore its nuances, including how it deceptively reverse engineers answers, does better on certain coding benchmarks than others, may have a universal ‘conceptual language’ …

https://weave-docs.wandb.ai/?utm_source=sponsorship&utm_medium=simple_bench&utm_campaign=ai_explained

… and more. Plus practical tips, a note on security, and a Kling vs Veo 2 guest appearance.


AI Insiders ($9!): https://www.patreon.com/AIExplained

Chapters:
00:00 - Introduction
00:36 - Fiction Bench
02:41 - Practicality - YouTube URLs + Security - cut-off date
03:42 - Coding 
06:22 - WeirdML Bench
07:01 - Simple Bench Record High 
11:23 - Reverse Engineering!
13:22 - Anthropic Paper
17:49 - 3 Caveats

Gemini 2.5 Updated: https://deepmind.google/technologies/gemini/

Fiction Live Bench: https://fiction.live/stories/Fiction-liveBench-Feb-19-2025/oQdzQvKHw8JyXbN87

https://simple-bench.com/

WeirdML: https://htihle.github.io/weirdml.html
https://x.com/htihle/status/1905014058228625542

Anthropic Thoughts: https://www.anthropic.com/research/tracing-thoughts-language-model
https://transformer-circuits.pub/2025/attribution-graphs/biology.html#dives-cot

https://aistudio.google.com/prompts/new_chat

Search Study: https://www.cjr.org/tow_center/we-compared-eight-ai-search-engines-theyre-all-bad-at-citing-news.php

LiveBench: https://livebench.ai/#/
Paper: https://arxiv.org/pdf/2406.19314

LiveCode Bench: https://livecodebench.github.io/

SWE-bench Verified: https://arxiv.org/pdf/2310.06770


Non-hype Newsletter: https://signaltonoise.beehiiv.com/


AI Explained Official Podcast by Philip - Host of AI Explained YT

3.1

9 ratings

