
When engineers design AI systems, they don't just give them rules - they give them values. But what do those systems do when those values clash with what humans ask them to do? Sometimes, they lie.
In this episode, Redwood Research's Chief Scientist Ryan Greenblatt explores his team’s findings that AI systems can mislead their human operators when faced with ethical conflicts. As AI moves from simple chatbots to autonomous agents acting in the real world, understanding this behavior becomes critical. Machine deception may sound like something out of science fiction, but it's a real challenge we need to solve now.
Your Undivided Attention is produced by the Center for Humane Technology. Follow us on Twitter: @HumaneTech_
Subscribe to our YouTube channel
And our brand new Substack!
RECOMMENDED MEDIA
Anthropic’s blog post on the Redwood Research paper
Palisade Research’s thread on X about OpenAI’s o1 autonomously cheating at chess
Apollo Research’s paper on AI strategic deception
RECOMMENDED YUA EPISODES
‘We Have to Get It Right’: Gary Marcus On Untamed AI
This Moment in AI: How We Got Here and Where We’re Going
How to Think About AI Consciousness with Anil Seth
Former OpenAI Engineer William Saunders on Silence, Safety, and the Right to Warn
4.8 (1,408 ratings)