Machine Learning Street Talk (MLST)

Subbarao Kambhampati - Do o1 models search?



Join Prof. Subbarao Kambhampati and host Tim Scarfe for a deep dive into OpenAI's O1 model and the future of AI reasoning systems.


* How O1 likely uses reinforcement learning similar to AlphaGo, with hidden reasoning tokens that users pay for but never see

* The evolution from traditional Large Language Models to more sophisticated reasoning systems

* The concept of "fractal intelligence" in AI - where models work brilliantly sometimes but fail unpredictably

* Why O1's improved performance comes with substantial computational costs

* The ongoing debate between single-model approaches (OpenAI) vs hybrid systems (Google)

* The critical distinction between AI as an intelligence amplifier vs autonomous decision-maker


SPONSOR MESSAGES:

***

CentML offers competitive pricing for GenAI model deployment, with flexible options to suit a wide range of models, from small to large-scale deployments.

https://centml.ai/pricing/


Tufa AI Labs is a brand-new research lab in Zurich, started by Benjamin Crouzier, focussed on o-series style reasoning and AGI. Are you interested in working on reasoning, or getting involved in their events?


Go to https://tufalabs.ai/

***


TOC:

1. **O1 Architecture and Reasoning Foundations**

[00:00:00] 1.1 Fractal Intelligence and Reasoning Model Limitations

[00:04:28] 1.2 LLM Evolution: From Simple Prompting to Advanced Reasoning

[00:14:28] 1.3 O1's Architecture and AlphaGo-like Reasoning Approach

[00:23:18] 1.4 Empirical Evaluation of O1's Planning Capabilities


2. **Monte Carlo Methods and Model Deep-Dive**

[00:29:30] 2.1 Monte Carlo Methods and MARCO-O1 Implementation

[00:31:30] 2.2 Reasoning vs. Retrieval in LLM Systems

[00:40:40] 2.3 Fractal Intelligence Capabilities and Limitations

[00:45:59] 2.4 Mechanistic Interpretability of Model Behavior

[00:51:41] 2.5 O1 Response Patterns and Performance Analysis


3. **System Design and Real-World Applications**

[00:59:30] 3.1 Evolution from LLMs to Language Reasoning Models

[01:06:48] 3.2 Cost-Efficiency Analysis: LLMs vs O1

[01:11:28] 3.3 Autonomous vs Human-in-the-Loop Systems

[01:16:01] 3.4 Program Generation and Fine-Tuning Approaches

[01:26:08] 3.5 Hybrid Architecture Implementation Strategies


Transcript: https://www.dropbox.com/scl/fi/d0ef4ovnfxi0lknirkvft/Subbarao.pdf?rlkey=l3rp29gs4hkut7he8u04mm1df&dl=0


REFS:

[00:02:00] Monty Python (1975)

Witch trial scene: flawed logical reasoning.

https://www.youtube.com/watch?v=zrzMhU_4m-g


[00:04:00] Cade Metz (2024)

Microsoft–OpenAI partnership evolution and control dynamics.

https://www.nytimes.com/2024/10/17/technology/microsoft-openai-partnership-deal.html


[00:07:25] Kojima et al. (2022)

Zero-shot chain-of-thought prompting ('Let's think step by step').

https://arxiv.org/pdf/2205.11916


[00:12:50] DeepMind Research Team (2023)

Multi-bot game solving with external and internal planning.

https://deepmind.google/research/publications/139455/


[00:15:10] Silver et al. (2016)

AlphaGo: Monte Carlo Tree Search combined with policy and value networks.

https://www.nature.com/articles/nature16961


[00:16:30] Kambhampati, S. et al. (2024)

Evaluates O1's planning in "Strawberry Fields" benchmarks.

https://arxiv.org/pdf/2410.02162


[00:29:30] Alibaba AIDC-AI Team (2024)

MARCO-O1: Chain-of-Thought + MCTS for improved reasoning.

https://arxiv.org/html/2411.14405


[00:31:30] Kambhampati, S. (2024)

Explores LLM "reasoning vs retrieval" debate.

https://arxiv.org/html/2403.04121v2


[00:37:35] Wei, J. et al. (2022)

Chain-of-thought prompting (introduces last-letter concatenation).

https://arxiv.org/pdf/2201.11903


[00:42:35] Barbero, F. et al. (2024)

Transformer attention and "information over-squashing."

https://arxiv.org/html/2406.04267v2


[00:46:05] Ruis, L. et al. (2024)

Influence functions to understand procedural knowledge in LLMs.

https://arxiv.org/html/2411.12580v1


(truncated - continued in shownotes/transcript doc)
