
Sign up to save your podcasts
Or


Writing this post puts me in a weird epistemic position. I simultaneously believe that:
That is because all of the reasoning failures that I describe here are surprising in the sense that given everything else that they can do, you’d expect LLMs to succeed at all of these tasks. The [...]
---
Outline:
(00:13) Introduction
(02:13) Reasoning failures
(02:17) Sliding puzzle problem
(07:17) Simple coaching instructions
(09:22) Repeatedly failing at tic-tac-toe
(10:48) Repeatedly offering an incorrect fix
(13:48) Various people's simple tests
(15:06) Various failures at logic and consistency while writing fiction
(15:21) Inability to write young characters when first prompted
(17:12) Paranormal posers
(19:12) Global details replacing local ones
(20:19) Stereotyped behaviors replacing character-specific ones
(21:21) Top secret marine databases
(23:32) Wandering items
(23:53) Sycophancy
(24:49) What's going on here?
(32:18) How about scaling? Or reasoning models?
---
First published:
Source:
---
Narrated by TYPE III AUDIO.
---
Images from the article:
Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
By LessWrongWriting this post puts me in a weird epistemic position. I simultaneously believe that:
That is because all of the reasoning failures that I describe here are surprising in the sense that given everything else that they can do, you’d expect LLMs to succeed at all of these tasks. The [...]
---
Outline:
(00:13) Introduction
(02:13) Reasoning failures
(02:17) Sliding puzzle problem
(07:17) Simple coaching instructions
(09:22) Repeatedly failing at tic-tac-toe
(10:48) Repeatedly offering an incorrect fix
(13:48) Various people's simple tests
(15:06) Various failures at logic and consistency while writing fiction
(15:21) Inability to write young characters when first prompted
(17:12) Paranormal posers
(19:12) Global details replacing local ones
(20:19) Stereotyped behaviors replacing character-specific ones
(21:21) Top secret marine databases
(23:32) Wandering items
(23:53) Sycophancy
(24:49) What's going on here?
(32:18) How about scaling? Or reasoning models?
---
First published:
Source:
---
Narrated by TYPE III AUDIO.
---
Images from the article:
Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.

26,365 Listeners

2,437 Listeners

9,046 Listeners

4,153 Listeners

92 Listeners

1,595 Listeners

9,911 Listeners

90 Listeners

70 Listeners

5,470 Listeners

16,097 Listeners

536 Listeners

131 Listeners

95 Listeners

520 Listeners