RoboPapers

Ep#12: VaViM and VaVAM: Autonomous Driving through Video Generative Modeling



How can world models be used to train autonomous driving systems? Learn by watching this episode with Florent Bartoccioni!

This episode explores the potential of large-scale generative video models to enhance autonomous driving capabilities, introducing an open-source autoregressive video model (VaViM) and a companion video-action model (VaVAM). VaViM is a simple autoregressive model that predicts frames using spatio-temporal token sequences, while VaVAM leverages the learned representations to generate driving trajectories through imitation learning. Together, they offer a complete perception-to-action pipeline.
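To make the two-stage pipeline concrete, here is a minimal toy sketch of its shape: an autoregressive next-token step over spatio-temporal video tokens (the VaViM role) followed by a head that maps the resulting representation to driving waypoints (the VaVAM role). All names, dimensions, and the mean-embedding "attention" stand-in are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

VOCAB = 256             # size of the spatio-temporal token vocabulary (assumed)
TOKENS_PER_FRAME = 16   # tokens per video frame (assumed)
EMBED = 32              # embedding width (assumed)

# Random matrices standing in for trained transformer weights.
embed_table = rng.normal(size=(VOCAB, EMBED))
lm_head = rng.normal(size=(EMBED, VOCAB))
action_head = rng.normal(size=(EMBED, 2))  # one (x, y) waypoint per step

def predict_next_token(token_seq):
    """VaViM-like step: score the next spatio-temporal token from the
    mean embedding of the sequence so far (toy stand-in for causal
    attention)."""
    h = embed_table[token_seq].mean(axis=0)
    logits = h @ lm_head
    return int(np.argmax(logits))

def rollout_frame(token_seq, n_new=TOKENS_PER_FRAME):
    """Autoregressively extend the token sequence by one imagined frame."""
    seq = list(token_seq)
    for _ in range(n_new):
        seq.append(predict_next_token(seq))
    return seq

def decode_trajectory(token_seq, horizon=4):
    """VaVAM-like step: map the video representation to a short sequence
    of driving waypoints (an imitation-learned head in the real model)."""
    h = embed_table[token_seq].mean(axis=0)
    return np.stack([h @ action_head for _ in range(horizon)])

# Usage: observe one frame of tokens, imagine the next frame, then
# decode a trajectory from the combined representation.
obs = rng.integers(0, VOCAB, size=TOKENS_PER_FRAME).tolist()
seq = rollout_frame(obs)        # 32 tokens: observed frame + imagined frame
traj = decode_trajectory(seq)   # shape (4, 2): four (x, y) waypoints
```

The point of the sketch is the interface, not the model: the video model and the action model share one token representation, which is what makes the perception-to-action pipeline end-to-end.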

Project Site

Original Post on X

arXiv



This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit robopapers.substack.com

RoboPapers, by Chris Paxton and Michael Cho