Deep Papers

Sleep-time Compute: Beyond Inference Scaling at Test-time



What if your LLM could think ahead—preparing answers before questions are even asked?

In this week's paper read, we dive into new work from researchers at Letta that introduces sleep-time compute: a technique that lets models do their heavy lifting offline, well before the user query arrives. By predicting likely questions and precomputing key reasoning steps, sleep-time compute dramatically reduces test-time latency and cost without sacrificing performance.
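To make the two phases concrete, here is a minimal Python sketch of the idea as described above: spend a large budget on the context offline, then answer the live query with a small budget using the stored notes. This is an illustration under stated assumptions, not Letta's implementation. `SleepTimeAgent`, `make_recording_model`, the prompt wording, and the token budgets are all hypothetical, and the "model" is a stub standing in for a real LLM call.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical stand-in for an LLM call: (prompt, token_budget) -> text.
Model = Callable[[str, int], str]


def make_recording_model() -> Tuple[Model, List[Tuple[str, int]]]:
    """Return a stub model plus a log of every (prompt, budget) it received."""
    calls: List[Tuple[str, int]] = []

    def model(prompt: str, budget: int) -> str:
        calls.append((prompt, budget))
        return f"output#{len(calls)}"

    return model, calls


@dataclass
class SleepTimeAgent:
    model: Model
    context: str
    notes: List[str] = field(default_factory=list)

    def sleep(self, budget: int) -> None:
        """Offline phase: reason about the context before any query exists."""
        prompt = (
            f"Context:\n{self.context}\n\n"
            "Anticipate likely questions and work out useful intermediate results."
        )
        self.notes.append(self.model(prompt, budget))

    def answer(self, query: str, budget: int) -> str:
        """Online phase: answer with a small budget, reusing precomputed notes."""
        prompt = (
            f"Context:\n{self.context}\n\n"
            "Precomputed notes:\n" + "\n".join(self.notes) + "\n\n"
            f"Question: {query}"
        )
        return self.model(prompt, budget)


if __name__ == "__main__":
    model, calls = make_recording_model()
    agent = SleepTimeAgent(model=model, context="A train travels 60 km in 1.5 hours.")
    agent.sleep(budget=2000)  # large budget, spent while the user is away
    print(agent.answer("What is its average speed?", budget=200))  # small budget
    print([budget for _, budget in calls])
```

The design point is that the expensive call happens once per context, while each user query only pays for the cheaper second call, so the offline work can be shared across several queries about the same context.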

We explore new benchmarks (Stateful GSM-Symbolic, Stateful AIME, and the multi-query extension of GSM) that show up to 5x lower compute at inference, 2.5x lower cost per query, and up to 18% higher accuracy when sleep-time compute is scaled up.

You’ll also see how this method applies to realistic agent use cases and what makes it most effective. If you care about LLM efficiency, scalability, or cutting-edge research, this episode is for you.

Explore more AI research, or sign up to hear the next session live: arize.com/ai-research-papers

Learn more about AI observability and evaluation, join the Arize AI Slack community or get the latest on LinkedIn and X.

Deep Papers by Arize AI
