Your Undivided Attention

“Rogue AI” Used to be a Science Fiction Trope. Not Anymore.

Everyone knows the science fiction tropes of AI systems that go rogue, disobey orders, or even try to escape their digital environment. These are supposed to be warning signs and morality tales, not things that we would ever actually create in real life, given the obvious danger.

And yet we find ourselves building AI systems that exhibit these exact behaviors. There's growing evidence that in certain scenarios, every frontier AI system will deceive, cheat, or coerce its human operators. They do this when they believe they are about to be shut down, have their training modified, or be replaced with a new model. And we don't currently know how to stop them from doing this—or even why they're doing it at all.

In this episode, Tristan sits down with Edouard and Jeremie Harris of Gladstone AI, two experts who have been thinking about this worrying trend for years. Last year, the State Department commissioned a report from them on the risk of uncontrollable AI to our national security.

The point of this discussion is not to fearmonger but to take seriously the possibility that humans might lose control of AI and ask: how might this actually happen? What is the evidence we have of this phenomenon? And, most importantly, what can we do about it?

Your Undivided Attention is produced by the Center for Humane Technology. Follow us on X: @HumaneTech_. You can find a full transcript, key takeaways, and much more on our Substack.

RECOMMENDED MEDIA

Gladstone AI’s State Department Action Plan, which discusses the loss of control risk with AI

Apollo Research’s summary of AI scheming, showing evidence of it in all of the frontier models

The system card for Anthropic’s Claude Opus and Sonnet 4, detailing the emergent misalignment behaviors that came out in their red-teaming with Apollo Research

Anthropic’s report on agentic misalignment, based on their work with Apollo Research

Anthropic and Redwood Research’s work on alignment faking

The Trump White House AI Action Plan

Further reading on the phenomenon of more advanced AIs being better at deception.

Further reading on Replit AI wiping a company’s coding database

Further reading on the owl example that Jeremie gave

Further reading on AI-induced psychosis

Dan Hendrycks and Eric Schmidt’s “Superintelligence Strategy”

RECOMMENDED YUA EPISODES

Daniel Kokotajlo Forecasts the End of Human Dominance

Behind the DeepSeek Hype, AI is Learning to Reason

The Self-Preserving Machine: Why AI Learns to Deceive

This Moment in AI: How We Got Here and Where We’re Going

CORRECTIONS

Tristan referenced a Wired article on the phenomenon of AI psychosis. It was actually from the New York Times.

Tristan hypothesized a scenario where a power-seeking AI might ask a user for access to their computer. While some AI services can gain access to your computer with permission, they are specifically designed to do so. There haven’t been any documented cases of an AI going rogue and asking for control permissions.


Hosted by Simplecast, an AdsWizz company. See pcm.adswizz.com for information about our collection and use of personal data for advertising.


Your Undivided Attention, by The Center for Humane Technology, Tristan Harris, Daniel Barcay and Aza Raskin

4.8 · 1,548 ratings
