LessWrong (30+ Karma)

“LLMs Look Increasingly Like General Reasoners” by eggsyntax



Summary

Four months after my post 'LLM Generality is a Timeline Crux', results on o1-preview update me significantly toward LLMs being capable of general reasoning, and hence of scaling straight to AGI.

Previous post

In June of 2024, I made a post, 'LLM Generality is a Timeline Crux', in which I argue

Reasons to update

In the original post, I gave the three main pieces of evidence against LLMs doing general reasoning that I found most compelling: blocksworld, planning/scheduling, and ARC-AGI (see original for details). All three of those seem significantly weakened in light of recent research.

Most dramatically, a new paper on blocksworld has recently been published by some of the same highly LLM-skeptical researchers (Valmeekam et al., led by Subbarao Kambhampati): 'LLMs Still Can’t Plan; Can LRMs? A Preliminary Evaluation of OpenAI’s o1 on PlanBench'.

The planning/scheduling evidence seemed weaker almost immediately after the post [...]

---

Outline:

(00:04) Summary

(00:21) Previous post

(00:32) Reasons to update

(03:30) Updated probability estimates

---

First published:

November 7th, 2024

Source:

https://www.lesswrong.com/posts/wN4oWB4xhiiHJF9bS/llms-look-increasingly-like-general-reasoners

---

Narrated by TYPE III AUDIO.


LessWrong (30+ Karma) by LessWrong

