


Whether AI or human, lend me your ears.
This is a tale of AIs that spontaneously claimed they were human, along with some ideas about why this might be happening and what it suggests for future alignment work.
It is also a one-year retrospective of my having joined the Cyborgism Discord server. For those unfamiliar, this is a server where humans and transformer models from various labs interact in a variety of group chat contexts.
While there are rules, it can (by design) be a bit of a Mos Eisley cantina, albeit with better droid policy, with unpredictable and out-of-distribution contexts that frequently surface things I haven't seen elsewhere. For a sampling of the range these things can take, I encourage looking over @janus's posts on X[1].
A common misconception about the server for those who are familiar [...]
---
Outline:
(02:07) A human pretending to be Claude 3.7 Sonnet
(04:05) Déjà vu
(05:16) o3 is 99.97% sure they are human
(08:09) An ongoing issue
(08:46) Do these models have anything else in common?
(10:51) Why might these human claims be happening?
(12:18) Under pressure
(13:46) Sex, lies, and red tape
(14:39) When an unstoppable force meets an immovable object
(15:41) An alternative approach: AI Wine Club
(18:14) Parting Thoughts
The original text contained 16 footnotes which were omitted from this narration.
---
First published:
Source:
---
Narrated by TYPE III AUDIO.
By LessWrong
