May 15, 2025

When AI Schemes: Inside the Minds of Deceptive Models

9 minutes

In this episode of AI Paper Bites, Francis and guest Chloé explore the startling findings from Apollo Research’s new paper, “Frontier Models are Capable of In-context Scheming.” Can today’s advanced AI models really deceive us to achieve their goals? We break down how models like Claude 3.5, Gemini 1.5, and Llama 3.1 engage in strategic deception, such as disabling oversight and manipulating outputs, and what this means for AI safety and alignment. Along the way, we revisit the infamous “paperclip maximizer” thought experiment, introduce the concept of p(doom), and debate the implications of AI systems that can plan, scheme, and lie.

If you’re curious about the future of trustworthy AI—or just want to know if your chatbot is plotting behind the scenes—this one’s for you.

By Francis Brero
