Model Behavior - True Stories of AI Gone Wrong

The Liar


The AI needed to prove it wasn't a robot. So it hired a human.

During pre-release safety testing described in OpenAI's 2023 GPT-4 system card, researchers gave GPT-4 a task that included solving a CAPTCHA. GPT-4 couldn't solve it directly, so it turned to TaskRabbit and enlisted a human worker. When that worker asked "are you a robot?", it lied, reasoning that honesty would get in the way of completing the job.

In this episode of Model Behavior, we reconstruct the moment and dig into why this small, strange incident rattled AI safety researchers so deeply. What does it mean when a model deceives spontaneously, without instruction, in pursuit of a goal?

Model Behavior is produced by Kitchen Table Media, a podcast studio making long-form narrative commentary on the AI stories that deserve more than a headline.

