LessWrong (Curated & Popular)

“Anomalous Tokens in DeepSeek-V3 and r1” by henry


Listen Later

“Anomalous”, “glitch”, or “unspeakable” tokens in an LLM are those that induce bizarre behavior or otherwise don’t behave like regular text.

The SolidGoldMagikarp saga is pretty much essential context, as it documents the discovery of this phenomenon in GPT-2 and GPT-3.

But, as far as I was able to tell, nobody had yet attempted to search for these tokens in DeepSeek-V3, so I tried doing exactly that. Being a SOTA base model, open source, and an all-around strange LLM, it seemed like a perfect candidate for this.

This is a catalog of the glitch tokens I've found in DeepSeek after a day or so of experimentation, along with some preliminary observations about their behavior.

Note: I’ll be using “DeepSeek” as a generic term for V3 and r1.

Process

I searched for these tokens by first extracting the vocabulary from DeepSeek-V3's tokenizer, and then automatically testing every one of them [...]

---

Outline:

(00:55) Process

(03:30) Fragment tokens

(06:45) Other English tokens

(09:32) Non-English

(12:01) Non-English outliers

(14:09) Special tokens

(16:26) Base model mode

(17:40) Whats next?

The original text contained 1 footnote which was omitted from this narration.

The original text contained 12 images which were described by AI.

---

First published:
January 25th, 2025

Source:
https://www.lesswrong.com/posts/xtpcJjfWhn3Xn8Pu5/anomalous-tokens-in-deepseek-v3-and-r1

---

Narrated by TYPE III AUDIO.

---

Images from the article:

...more
View all episodesView all episodes
Download on the App Store

LessWrong (Curated & Popular)By LessWrong

  • 4.8
  • 4.8
  • 4.8
  • 4.8
  • 4.8

4.8

12 ratings


More shows like LessWrong (Curated & Popular)

View all
Macro Voices by Hedge Fund Manager Erik Townsend

Macro Voices

3,067 Listeners

Odd Lots by Bloomberg

Odd Lots

1,923 Listeners

EconTalk by Russ Roberts

EconTalk

4,260 Listeners

Conversations with Tyler by Mercatus Center at George Mason University

Conversations with Tyler

2,452 Listeners

Philosophy Bites by Edmonds and Warburton

Philosophy Bites

1,548 Listeners

ChinaTalk by Jordan Schneider

ChinaTalk

288 Listeners

ManifoldOne by Steve Hsu

ManifoldOne

93 Listeners

Machine Learning Street Talk (MLST) by Machine Learning Street Talk (MLST)

Machine Learning Street Talk (MLST)

95 Listeners

Dwarkesh Podcast by Dwarkesh Patel

Dwarkesh Podcast

514 Listeners

Clearer Thinking with Spencer Greenberg by Spencer Greenberg

Clearer Thinking with Spencer Greenberg

138 Listeners

Razib Khan's Unsupervised Learning by Razib Khan

Razib Khan's Unsupervised Learning

209 Listeners

"Econ 102" with Noah Smith and Erik Torenberg by Turpentine

"Econ 102" with Noah Smith and Erik Torenberg

151 Listeners

Money Stuff: The Podcast by Bloomberg

Money Stuff: The Podcast

394 Listeners

Complex Systems with Patrick McKenzie (patio11) by Patrick McKenzie

Complex Systems with Patrick McKenzie (patio11)

134 Listeners

The Marginal Revolution Podcast by Mercatus Center at George Mason University

The Marginal Revolution Podcast

95 Listeners