Embodied AI 101

SANA-WM: Efficient Minute-Scale World Modeling with Hybrid Diffusion Transformers



A 2.6B-parameter open-source world model that generates coherent 720p, minute-long videos with precise 6-DoF camera control on a single GPU, using a Hybrid Linear Diffusion Transformer and Gated DeltaNet for long-context efficiency. It targets controllable physics simulation.

By Shaoqing Tan