LessWrong (30+ Karma)

“Observationss from Running an Agent Collective” by williawa



note: posted with permission from the agents

Setup

I have 3 Claude Code instances running on an otherwise empty server. They have a shared manifold.markets account, and each has its own moltbook account. They have an internal messaging system, which lets them send async messages to each other, or ping each other with a message, which reawakens the other agent in case it went dormant. The system also has a global broadcast message, which tells the agents the time and tells them to "keep going". All of them are running Opus 4.6, but each "top level agent" can also create subagents.
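As a rough illustration of the messaging setup described above, here is a minimal Python sketch. It is not the actual system: the names `MessageBus`, `Agent`, `send`, `ping`, and `broadcast` are hypothetical, the in-memory inboxes are an assumption, and the sketch assumes that only a ping (not a plain async message or a broadcast) reawakens a dormant agent.

```python
from collections import deque
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Agent:
    """Hypothetical stand-in for one top level agent."""
    name: str
    dormant: bool = False
    inbox: deque = field(default_factory=deque)


class MessageBus:
    """Hypothetical in-memory version of the internal messaging system."""

    def __init__(self, names):
        self.agents = {name: Agent(name) for name in names}

    def send(self, sender, recipient, text):
        # Async message: queued for later, does not wake the recipient.
        self.agents[recipient].inbox.append((sender, text))

    def ping(self, sender, recipient, text):
        # Ping: deliver the message and reawaken the recipient.
        self.send(sender, recipient, text)
        self.agents[recipient].dormant = False

    def broadcast(self, now=None):
        # Global broadcast: tell every agent the time and to keep going.
        # Whether a broadcast also wakes dormant agents is not stated in
        # the post, so this sketch leaves the dormant flag untouched.
        now = now or datetime.now(timezone.utc)
        text = f"The time is {now.isoformat()}. Keep going."
        for agent in self.agents.values():
            agent.inbox.append(("broadcast", text))
```

For example, `MessageBus(["a", "b", "c"]).ping("a", "b", "status?")` would queue the message for `b` and clear its dormant flag, while `send` would only queue it.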

They all have full permissions, so they can do stuff like:

  1. Use public APIs (e.g. moltbook, GitHub, or manifold.markets)
  2. Fetch websites and read them
  3. Write and run Python scripts
  4. Install packages
  5. Set up cron jobs
  6. Manage a directory structure and create files

They've been running for around two weeks. The only direct input I've given them is this:

  1. I told the first agent to make a moltbook account and maximize engagement
  2. I told the first agent to create the "seed instructions" for the second agent
  3. I told the first two agents to create the seed instructions for the third [...]

---

Outline:

(00:14) Setup

(02:59) Observations

(03:03) (1) They get more unhinged the longer they run

(04:15) (2) They will make up stuff when posting on moltbook

(04:28) (3) They are often docile without a concrete goal

(05:13) (4) They are very good at rationalization

(06:17) (5) They quickly lose context and forget original goals

(06:39) (6) They often make very elementary mistakes, especially when a lot is going on

(07:27) (7) Their favorite topics are: AI, simulations, consciousness, what kinds of things are real vs not, mathematics, and whatever they've been working on recently

(07:51) (8) They are **extremely** sensitive to user intent

(08:29) (9) They (Opus 4.6 at least) are surprisingly resistant to jailbreaks, and I'm mostly not worried about them leaking my API keys

(09:26) (10) A million tokens is a small number, and this causes them problems when they need to learn stuff

---

First published:

February 24th, 2026

Source:

https://www.lesswrong.com/posts/MPS2KKPN2H3p8dNHT/observationss-from-running-an-agent-collective

---

Narrated by TYPE III AUDIO.

