LessWrong (30+ Karma)

“Narrow finetuning is different” by cloud, Stewy Slocum


Epistemic status: an informal note.

It is common to use finetuning on a narrow data distribution, or narrow finetuning (NFT), to study AI safety. In these experiments, a model is trained on a very specific type of data, then evaluated for broader properties, such as a capability or general disposition.

Ways that narrow finetuning is different

Narrow finetuning differs from the training procedures that frontier AI companies use, such as pretraining on internet data or posttraining on a diverse mixture of data and tasks. Here are some of the ways it differs:

  1. Underspecification of broader behavior - training a model on a narrow data distribution means that most of the model's behavior (behavior outside the training distribution) is not incorporated in the loss. This means that all sorts of undesired, degenerate, or unusual behavior can arise that would normally be prevented by the loss function (e.g., as in emergent [...]

---

Outline:

(00:31) Ways that narrow finetuning is different

(02:08) Anecdote

(03:05) Examples

(03:37) Counterpoints

(04:54) Takeaways

The original text contained 1 footnote which was omitted from this narration.

---

First published:

August 5th, 2025

Source:

https://www.lesswrong.com/posts/7emjxGADozzm7uwKL/narrow-finetuning-is-different

---

Narrated by TYPE III AUDIO.

