March 10, 2026

“Letting Claude do Autonomous Research to Improve SAEs” by chanind

16 minutes

This work was done as part of MATS 7.1

I pointed Claude at our new synthetic Sparse Autoencoder (SAE) benchmark, told it to improve SAE performance, and left it running overnight. By morning, it had boosted the F1 score from 0.88 to 0.95. Within another day, with occasional input from me, it had matched the logistic regression probe ceiling of 0.97, a score I honestly hadn't thought was possible for an SAE on this benchmark.

The most surprising development was when Claude autonomously found a dictionary-learning paper from 2010, turned its algorithm into an SAE encoder, and Matryoshka-ified it, improving performance by a few percentage points in the process. I had never heard of this algorithm before (although I really should have).
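For readers who, like me, hadn't met this algorithm: the outline lists a "LISTA encoder" section, so the 2010 paper is presumably LISTA (Gregor and LeCun, "Learning Fast Approximations of Sparse Coding"). Below is a rough, dependency-free sketch of a LISTA-style encoder under that assumption. It is not the encoder Claude built: the function names, the soft-threshold nonlinearity, the ISTA-style initialization, and the step count are all illustrative choices, and the Matryoshka part is not shown.

```python
# Hypothetical sketch of a LISTA-style encoder (names and defaults are assumptions).

def soft_threshold(v, theta):
    """Shrink every entry toward zero by theta; entries within theta of zero become 0."""
    return [max(abs(a) - theta, 0.0) * (1.0 if a > 0 else -1.0) for a in v]


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * a for m, a in zip(row, v)) for row in M]


def ista_init(D, lam, L):
    """Derive LISTA parameters from a dictionary D (rows = input dims, columns = atoms).

    W_e = D^T / L,  S = I - D^T D / L,  theta = lam / L.
    L should be at least the largest eigenvalue of D^T D.
    In LISTA these are only the starting values; they are then trained.
    """
    n_in, n_atoms = len(D), len(D[0])
    Dt = [[D[i][j] for i in range(n_in)] for j in range(n_atoms)]
    W_e = [[d / L for d in row] for row in Dt]
    S = [
        [
            (1.0 if j == k else 0.0)
            - sum(Dt[j][i] * D[i][k] for i in range(n_in)) / L
            for k in range(n_atoms)
        ]
        for j in range(n_atoms)
    ]
    return W_e, S, lam / L


def lista_encode(x, W_e, S, theta, n_steps=3):
    """Unrolled sparse coding: z <- soft_threshold(W_e x + S z, theta), repeated n_steps times."""
    b = matvec(W_e, x)            # one-shot linear encoding, like a standard SAE encoder
    z = soft_threshold(b, theta)  # initial sparse code
    for _ in range(n_steps):
        correction = matvec(S, z)  # lets active latents suppress overlapping ones
        z = soft_threshold([bi + ci for bi, ci in zip(b, correction)], theta)
    return z


# Tiny usage example with an orthonormal (identity) dictionary.
D = [[1.0, 0.0], [0.0, 1.0]]
W_e, S, theta = ista_init(D, lam=0.5, L=1.0)
z = lista_encode([1.0, 0.2], W_e, S, theta)
```

The intuition is that a standard SAE encoder stops after the first linear map and nonlinearity, while a LISTA-style encoder spends a few extra refinement steps letting latents compete to explain the input.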

In this post, I'll describe the setup, walk through the improvements Claude found, and discuss what this experiment taught me about the strengths and weaknesses of autonomous AI research.

We haven't yet verified how well these improvements transfer to LLM SAEs, so don't rush to implement every change mentioned here in your SAEs just yet! We'll discuss challenges and next steps for LLM verification at the end of the post.

The TASK.md we gave Claude and resulting [...]

---

Outline:

(01:50) The setup

(03:17) SAE improvements

(04:15) Diving deeper: LISTA encoder

(06:42) Validating on LLMs with SAEBench

(07:58) Claude's research strengths and weaknesses

(11:07) Next steps

(12:15) Give it a try!

(12:32) Appendix: Improvement details

(12:38) Linearly decrease K during training

(13:06) Detach inner Matryoshka levels, but not the final level

(14:00) LISTA encoder

(14:59) TERM loss

(15:54) Dynamic Matryoshka levels by firing frequency

---

First published:

March 10th, 2026

Source:

https://www.lesswrong.com/posts/rbqJoxFZtae9x93mx/letting-claude-do-autonomous-research-to-improve-saes

---

Narrated by TYPE III AUDIO.

By LessWrong
