March 06, 2026

EP28: How to Control a Stochastic Agent with Stefano Soatto (VP AWS / Prof. UCLA)

1 hour 2 minutes

Stefano Soatto, VP for AI at AWS, Professor at UCLA, and the person responsible for agentic AI at AWS, joins us to explain why building reliable AI agents is fundamentally a control theory problem.

Stefano sees LLMs as stochastic dynamical systems that need to be controlled, not just prompted. He introduces "strands coding," a new framework AWS is building that sits between vibe coding and spec coding: you write a skeleton with AI functions constrained by pre- and post-conditions, verifying intent before a single line of code is generated. The surprising part: even as AI coding adoption goes up, developer trust in the output is going down.
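To make the idea of a skeleton with pre- and post-conditions concrete, here is a minimal Python sketch. It is not the AWS framework's API: the names `constrained`, `fake_model`, and `sort_numbers` are hypothetical, and the "model" is a deterministic stub standing in for a stochastic generator. The point is only that the contract is written first, and a candidate output is accepted only if it satisfies that contract.

```python
from collections import Counter
from typing import Callable, List


def constrained(pre: Callable[[List[int]], bool],
                post: Callable[[List[int], List[int]], bool],
                max_attempts: int = 3):
    """Wrap a (possibly stochastic) generator with pre- and post-condition checks.

    Hypothetical helper for illustration; not part of any real framework.
    """
    def decorator(generate):
        def wrapper(xs: List[int]) -> List[int]:
            # Reject inputs that fall outside the stated contract.
            if not pre(xs):
                raise ValueError("precondition violated")
            # Resample until a candidate satisfies the postcondition.
            for _ in range(max_attempts):
                candidate = generate(xs)
                if post(xs, candidate):
                    return candidate
            raise RuntimeError("no candidate satisfied the postcondition")
        return wrapper
    return decorator


def make_fake_model():
    """Stand-in for a model call: wrong on the first call, correct afterwards."""
    calls = {"n": 0}

    def fake_model(xs: List[int]) -> List[int]:
        calls["n"] += 1
        return list(reversed(xs)) if calls["n"] == 1 else sorted(xs)
    return fake_model


def is_sorted_permutation(xs: List[int], out: List[int]) -> bool:
    # Postcondition: output is in non-decreasing order and has the same elements.
    ordered = all(a <= b for a, b in zip(out, out[1:]))
    return ordered and Counter(out) == Counter(xs)


@constrained(pre=lambda xs: len(xs) > 0, post=is_sorted_permutation)
def sort_numbers(xs: List[int]) -> List[int]:
    return _model(xs)


_model = make_fake_model()
result = sort_numbers([3, 1, 2])  # first candidate is rejected, second accepted
```

The first candidate from the stub fails the postcondition and is discarded, so the caller only ever sees an output that meets the contract. A failure surfaces as an exception rather than a silent wrong answer.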

We go deep into the philosophy of models and the world. Stefano argues that the dichotomy between "language models" and "world models" doesn't really exist: a reasoning engine trained on rich enough data is a world model. He walks us through why naive realism is indefensible, how reverse diffusion was originally intended to show that models can't be identical to reality, and why that matters now.

We also discuss three types of information (Shannon, algorithmic, and conceptual) and why algorithmic information is the one that actually matters to agents. Synthetic data doesn't add Shannon information, but it adds algorithmic information, which is why it works. Intelligence isn't about scaling to Solomonoff's universal induction; it's about learning to solve new problems fast.

Takeaways:

Vibe coding is local feedback control with high cognitive load; spec coding is open-loop global control with silent failures. Neither scales well alone.
Trust in AI-generated code is declining even as adoption rises.
The distinction between next-token prediction and world model is mostly nomenclature - reasoning engines operating on multimodal data are world models.
Algorithmic information, not Shannon information, is what matters in the agentic setting.
Intelligence isn't minimizing inference uncertainty - it's minimizing time to solve unforeseen tasks.
The intent gap between user and model cannot be fully automated or delegated.

Timeline

(00:13) Introduction and guest welcome

(01:12) How the agentic era changed machine learning

(06:11) Vibe coding one year later

(07:23) Vibe vs. spec vs. strands coding

(14:30) Why English is not a programming language

(16:36) Constrained generation and agent choreography

(20:44) Diffusion models vs. autoregressive models

(25:59) The platonic representation hypothesis and naive realism

(31:14) Synthetic data and the information bottleneck

(36:22) Three types of information: Shannon, algorithmic, conceptual

(38:47) Scaling laws and Solomonoff induction

(42:14) World models and the Gibsonian vs. Marrian approach

(49:00) Encoding vs. generation and JEPA-style training

(55:50) Are language models already world models?

(59:13) Closing thoughts on trust, education, and responsibility

Music:

"Kid Kodi" - Blue Dot Sessions - via Free Music Archive - CC BY-NC 4.0.
"Palms Down" - Blue Dot Sessions - via Free Music Archive - CC BY-NC 4.0. Changes: trimmed

About

The Information Bottleneck is hosted by Ravid Shwartz-Ziv and Allen Roush, featuring in-depth conversations with leading AI researchers about the ideas shaping the future of machine learning.

By Ravid Shwartz-Ziv & Allen Roush
