Embodied AI 101

MolmoBot: A Vision-Language Model for Zero-Shot Robot Manipulation


A vision-language model (VLM) for zero-shot robot manipulation, trained entirely in simulation with no real-world data. It achieves a 79.2% success rate on real-world tabletop tasks, outperforming the π₀.₅ baseline at 39.2%.

By Shaoqing Tan