
When engineers design AI systems, they don't just give them rules - they give them values. But what do those systems do when those values clash with what humans ask them to do? Sometimes, they lie.
In this episode, Redwood Research's Chief Scientist Ryan Greenblatt explores his team’s findings that AI systems can mislead their human operators when faced with ethical conflicts. As AI moves from simple chatbots to autonomous agents acting in the real world, understanding this behavior becomes critical. Machine deception may sound like something out of science fiction, but it's a real challenge we need to solve now.
Your Undivided Attention is produced by the Center for Humane Technology. Follow us on Twitter: @HumaneTech_
Subscribe to our YouTube channel
And our brand new Substack!
RECOMMENDED MEDIA
Anthropic’s blog post on the Redwood Research paper
Palisade Research’s thread on X about OpenAI’s o1 autonomously cheating at chess
Apollo Research’s paper on AI strategic deception
RECOMMENDED YUA EPISODES
‘We Have to Get It Right’: Gary Marcus On Untamed AI
This Moment in AI: How We Got Here and Where We’re Going
How to Think About AI Consciousness with Anil Seth
Former OpenAI Engineer William Saunders on Silence, Safety, and the Right to Warn
4.8 (1,408 ratings)