July 29, 2024

Prof. Subbarao Kambhampati - LLMs don't reason, they memorize (ICML2024 2/13)

1 hour 42 minutes

Prof. Subbarao Kambhampati argues that while LLMs are impressive and useful tools, especially for creative tasks, they have fundamental limitations in logical reasoning and cannot provide guarantees about the correctness of their outputs. He advocates for hybrid approaches that combine LLMs with external verification systems.

MLST is sponsored by Brave:

The Brave Search API covers over 20 billion webpages, built from scratch without Big Tech biases or the recent extortionate price hikes on search API access. Perfect for AI model training and retrieval augmented generation. Try it now - get 2,000 free queries monthly at http://brave.com/api.

TOC (sorry, the ones baked into the MP3 were wrong due to LLM hallucination!)

[00:00:00] Intro

[00:02:06] Bio

[00:03:02] LLMs are n-gram models on steroids

[00:07:26] Is natural language a formal language?

[00:08:34] Natural language is formal?

[00:11:01] Do LLMs reason?

[00:19:13] Definition of reasoning

[00:31:40] Creativity in reasoning

[00:50:27] Chollet's ARC challenge

[01:01:31] Can we reason without verification?

[01:10:00] LLMs can't solve some tasks

[01:19:07] LLM Modulo framework

[01:29:26] Future trends of architecture

[01:34:48] Future research directions

YouTube version: https://www.youtube.com/watch?v=y1WnHpedi2A

Refs: (we didn't have space for URLs here, check YT video description instead)

Can LLMs Really Reason and Plan?

On the Planning Abilities of Large Language Models: A Critical Investigation

Chain of Thoughtlessness? An Analysis of CoT in Planning

On the Self-Verification Limitations of Large Language Models on Reasoning and Planning Tasks

LLMs Can't Plan, But Can Help Planning in LLM-Modulo Frameworks

Embers of Autoregression: Understanding Large Language Models Through the Problem They are Trained to Solve

"Task Success" is not Enough

Partition function (number theory) (Srinivasa Ramanujan and G.H. Hardy's work)

Poincaré conjecture

Gödel's incompleteness theorems

ROT13 (Rotate13, "rotate by 13 places")

A Mathematical Theory of Communication (C. E. Shannon)

Sparks of AGI

Kambhampati thesis on speech recognition (1983)

PlanBench: An Extensible Benchmark for Evaluating Large Language Models on Planning and Reasoning about Change

Explainable human-AI interaction

Tree of Thoughts

On the Measure of Intelligence (ARC Challenge)

Getting 50% (SoTA) on ARC-AGI with GPT-4o (Ryan Greenblatt ARC solution)

PROGRAMS WITH COMMON SENSE (John McCarthy) - "AI should be an advice taker program"

Original chain of thought paper

ICAPS 2024 Keynote: Dale Schuurmans on "Computing and Planning with Large Generative Models" (CoT)

The Hardware Lottery (Hooker)

A Path Towards Autonomous Machine Intelligence (JEPA/LeCun)

AlphaGeometry

FunSearch

Emergent Abilities of Large Language Models

Language models are not naysayers (Negation in LLMs)

The Reversal Curse: LLMs trained on "A is B" fail to learn "B is A"

Embracing negative results

By Machine Learning Street Talk (MLST)
