February 12, 2024

“Skepticism about DeepMind’s ‘Grandmaster-Level’ Chess Without Search” by Arjun Panickssery


5 minutes

First, addressing misconceptions that could come from the title or from the paper's framing as relevant to LLM scaling:

The model didn't learn from observing many move traces from high-level games. Instead, they trained a 270M-parameter model to map (board state, legal move) pairs to Stockfish 16's predicted win probability after playing the move. This can be described as imitating the play of an oracle that reflects Stockfish's ability at 50 milliseconds per move. Then they evaluated a system that made the model's highest-value move for a given position.
This setup has some quirks that required workarounds during play:
1. The board states were encoded in FEN notation, which doesn't provide information about which previous board states have occurred; this is relevant in a small number of situations because players can claim an immediate draw when a board state is repeated three times.
2. The model is a classifier, so [...]
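As a rough illustration of the setup described above, here is a minimal Python sketch, not the paper's code. `pick_move` plays whichever legal move a value function scores highest, and `is_threefold` shows why repetition has to be tracked outside a single FEN string: it needs the list of earlier positions. The names `pick_move`, `position_key`, `is_threefold`, and `toy_value` are all hypothetical, and `toy_value` is a hard-coded stand-in for the trained model with made-up numbers.

```python
from collections import Counter
from typing import Callable, Iterable

# Hypothetical stand-in type for the trained model:
# it maps (FEN string, move) to a predicted win probability.
ValueFn = Callable[[str, str], float]


def pick_move(fen: str, legal_moves: Iterable[str], value_fn: ValueFn) -> str:
    """Return the legal move with the highest predicted win probability."""
    return max(legal_moves, key=lambda move: value_fn(fen, move))


def position_key(fen: str) -> str:
    """Keep the first four FEN fields (pieces, side to move, castling,
    en passant) and drop the halfmove clock and fullmove number."""
    return " ".join(fen.split()[:4])


def is_threefold(history: Iterable[str]) -> bool:
    """Approximate repetition check over a list of earlier FEN strings.
    A single FEN cannot answer this; the history must be kept separately."""
    counts = Counter(position_key(fen) for fen in history)
    return any(n >= 3 for n in counts.values())


# Toy values for illustration only; a real system would query the model.
TOY_SCORES = {"e2e4": 0.54, "d2d4": 0.53, "a2a3": 0.48}


def toy_value(fen: str, move: str) -> float:
    return TOY_SCORES.get(move, 0.0)


START = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
best = pick_move(START, ["a2a3", "d2d4", "e2e4"], toy_value)
```

With these toy scores, `best` is `"e2e4"`. The point of the sketch is that the system never looks ahead at play time: it scores each legal move once and takes the maximum.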

---

First published:

February 12th, 2024

Source:

https://www.lesswrong.com/posts/PXRi9FMrJjyBcEA3r/skepticism-about-deepmind-s-grandmaster-level-chess-without

---

Narrated by TYPE III AUDIO.


LessWrong (30+ Karma)

By LessWrong
