February 12, 2025

Pop Quiz, AI: How Do You Test a Thinking Machine?

6 minutes

AI keeps bragging about its "reasoning skills," but is it actually getting smarter, or just better at faking it? In this episode, we put AI’s so-called intelligence to the test with hardcore benchmarks (BIG-Bench Hard, TruthfulQA, and more) to see if these models can truly problem-solve or if they're just memorizing answers like a sneaky high schooler. Spoiler: Not all AIs are built the same, and some are way better at bluffing than thinking. Tune in to find out who’s the real deal and who’s just a smooth talker.

Connect with Us: If you enjoyed this episode or have questions, reach out to Emily Laird on LinkedIn. Stay tuned for more insights into the evolving world of generative AI. And remember, you now know more about reasoning models than you did before!

Generative AI 101

By Emily Laird

4.6

2020 ratings
