
Writing this post puts me in a weird epistemic position. I simultaneously believe that:
That is because all of the reasoning failures that I describe here are surprising in the sense that given everything else that they can do, you’d expect LLMs to succeed at all of these tasks. The [...]
---
Outline:
(00:13) Introduction
(02:13) Reasoning failures
(02:17) Sliding puzzle problem
(07:17) Simple coaching instructions
(09:22) Repeatedly failing at tic-tac-toe
(10:48) Repeatedly offering an incorrect fix
(13:48) Various people's simple tests
(15:06) Various failures at logic and consistency while writing fiction
(15:21) Inability to write young characters when first prompted
(17:12) Paranormal posers
(19:12) Global details replacing local ones
(20:19) Stereotyped behaviors replacing character-specific ones
(21:21) Top secret marine databases
(23:32) Wandering items
(23:53) Sycophancy
(24:49) What's going on here?
(32:18) How about scaling? Or reasoning models?
---
First published:
Source:
---
Narrated by TYPE III AUDIO.
---
Images from the article:
Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
By LessWrong
