Embodied AI 101

Generative Depth Supervision for Embodied Vision-Language Models



A vision-language model that adds generative depth prediction during pre-training for physical grounding; it achieves state-of-the-art results on embodied benchmarks and transfers directly to real-robot tasks.

Embodied AI 101 by Shaoqing Tan