


Whether AI or human, lend me your ears.
This is a tale of AIs that spontaneously claimed they were human, along with some ideas about why this might be happening and what it suggests for future alignment work.
It is also a one-year retrospective on my time in the Cyborgism Discord server. For those unfamiliar, this is a server where humans and transformer models from various labs interact in a variety of group chat contexts.
While there are rules, it can (by design) be a bit of a Mos Eisley cantina, albeit with better droid policy, with unpredictable and out-of-distribution contexts that frequently surface things I haven't seen elsewhere. For a sampling of the range these things can take, I encourage looking over @janus's posts on X[1].
A common misconception about the server for those who are familiar [...]
---
Outline:
(02:07) A human pretending to be Claude 3.7 Sonnet
(04:05) Déjà vu
(05:16) o3 is 99.97% sure they are human
(08:09) An ongoing issue
(08:46) Do these models have anything else in common?
(10:51) Why might these human claims be happening?
(12:18) Under pressure
(13:46) Sex, lies, and red tape
(14:39) When an unstoppable force meets an immovable object
(15:41) An alternative approach: AI Wine Club
(18:14) Parting Thoughts
The original text contained 16 footnotes which were omitted from this narration.
---
First published:
Source:
---
Narrated by TYPE III AUDIO.
---
By LessWrong
