November 09, 2024

“LLMs Look Increasingly Like General Reasoners” by eggsyntax

6 minutes

Summary

Four months after my post 'LLM Generality is a Timeline Crux', results on o1-preview update me significantly toward LLMs being capable of general reasoning, and hence of scaling straight to AGI.

Previous post

In June of 2024, I made a post, 'LLM Generality is a Timeline Crux', in which I argue [...]

Reasons to update

In the original post, I gave the three main pieces of evidence against LLMs doing general reasoning that I found most compelling: blocksworld, planning/scheduling, and ARC-AGI (see original for details). All three of those seem significantly weakened in light of recent research.

Most dramatically, a new paper on blocksworld has recently been published by some of the same highly LLM-skeptical researchers (Valmeekam et al., led by Subbarao Kambhampati): 'LLMs Still Can’t Plan; Can LRMs? A Preliminary Evaluation of OpenAI’s o1 on PlanBench'.

The planning/scheduling evidence seemed weaker almost immediately after the post [...]

---

Outline:

(00:04) Summary

(00:21) Previous post

(00:32) Reasons to update

(03:30) Updated probability estimates

---

First published:

November 7th, 2024

Source:

https://www.lesswrong.com/posts/wN4oWB4xhiiHJF9bS/llms-look-increasingly-like-general-reasoners

---

Narrated by TYPE III AUDIO.

By LessWrong