RoboPapers

Ep#12: VaViM and VaVAM: Autonomous Driving through Video Generative Modeling



How can world models be used to train autonomous driving systems? Learn by watching this episode with Florent Bartoccioni!

This episode explores the potential of large-scale generative video models to enhance autonomous driving capabilities, introducing an open-source autoregressive video model (VaViM) and a companion video-action model (VaVAM). VaViM is a simple autoregressive model that predicts frames using spatio-temporal token sequences, while VaVAM leverages the learned representations to generate driving trajectories through imitation learning. Together, they offer a complete perception-to-action pipeline.
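To make the two-stage pipeline concrete, here is a minimal toy sketch of its shape: an autoregressive next-token step over spatio-temporal video tokens (the VaViM role) followed by a head that maps the resulting representation to driving waypoints (the VaVAM role). All names, dimensions, and the mean-embedding "attention" stand-in are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

VOCAB = 256             # size of the spatio-temporal token vocabulary (assumed)
TOKENS_PER_FRAME = 16   # tokens per video frame (assumed)
EMBED = 32              # embedding width (assumed)

# Random matrices standing in for trained transformer weights.
embed_table = rng.normal(size=(VOCAB, EMBED))
lm_head = rng.normal(size=(EMBED, VOCAB))
action_head = rng.normal(size=(EMBED, 2))  # one (x, y) waypoint per step

def predict_next_token(token_seq):
    """VaViM-like step: score the next spatio-temporal token from the
    mean embedding of the sequence so far (toy stand-in for causal
    attention)."""
    h = embed_table[token_seq].mean(axis=0)
    logits = h @ lm_head
    return int(np.argmax(logits))

def rollout_frame(token_seq, n_new=TOKENS_PER_FRAME):
    """Autoregressively extend the token sequence by one imagined frame."""
    seq = list(token_seq)
    for _ in range(n_new):
        seq.append(predict_next_token(seq))
    return seq

def decode_trajectory(token_seq, horizon=4):
    """VaVAM-like step: map the video representation to a short sequence
    of driving waypoints (an imitation-learned head in the real model)."""
    h = embed_table[token_seq].mean(axis=0)
    return np.stack([h @ action_head for _ in range(horizon)])

# Usage: observe one frame of tokens, imagine the next frame, then
# decode a trajectory from the combined representation.
obs = rng.integers(0, VOCAB, size=TOKENS_PER_FRAME).tolist()
seq = rollout_frame(obs)        # 32 tokens: observed frame + imagined frame
traj = decode_trajectory(seq)   # shape (4, 2): four (x, y) waypoints
```

The point of the sketch is the interface, not the model: the video model and the action model share one token representation, which is what makes the perception-to-action pipeline end-to-end.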

Project Site

Original Post on X

arXiv



This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit robopapers.substack.com

RoboPapers, by Chris Paxton and Michael Cho