CollabVR: Collaborative Video Reasoning with Vision-Language and Video Generation Models
A closed-loop framework that couples Vision-Language Models with Video Generation Models at step-level granularity. It mitigates long-horizon drift and mid-clip errors in goal-directed video reasoning for robotic planning.
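To make "closed-loop at step-level granularity" concrete, here is a minimal sketch of what such a loop could look like. It is an illustration under stated assumptions, not the CollabVR implementation: the function `run_closed_loop`, the three callables (`propose_step`, `generate_clip`, `verify_clip`), the retry budget, and the toy stubs are all hypothetical names introduced here. The idea shown is only that a vision-language model proposes one step at a time, a video generation model renders a short clip for that step, and the clip is checked before the next step is planned, so an error is caught at the step where it occurs instead of propagating through a long rollout.

```python
from typing import Callable, List, Optional

# Hypothetical interfaces (assumptions, not from the source):
#   propose_step(goal, accepted_clips) -> next step text, or None when done
#   generate_clip(step, accepted_clips, attempt) -> a clip (a string stands in here)
#   verify_clip(step, clip) -> True if the clip correctly carries out the step


def run_closed_loop(
    goal: str,
    propose_step: Callable[[str, List[str]], Optional[str]],
    generate_clip: Callable[[str, List[str], int], str],
    verify_clip: Callable[[str, str], bool],
    max_steps: int = 8,
    max_retries: int = 2,
) -> List[str]:
    """Plan, generate, and verify one step at a time; return accepted clips."""
    accepted: List[str] = []
    for _ in range(max_steps):
        step = propose_step(goal, accepted)
        if step is None:  # planner reports the goal is reached
            break
        for attempt in range(max_retries + 1):
            clip = generate_clip(step, accepted, attempt)
            if verify_clip(step, clip):
                accepted.append(clip)  # only verified clips extend the context
                break
        else:
            # Retry budget exhausted for this step: stop rather than
            # build later steps on top of an unverified clip.
            break
    return accepted


# Toy stand-ins so the sketch runs without any models.
PLAN = ["reach", "grasp", "lift"]


def propose(goal: str, accepted: List[str]) -> Optional[str]:
    return PLAN[len(accepted)] if len(accepted) < len(PLAN) else None


def generate(step: str, accepted: List[str], attempt: int) -> str:
    return f"{step}#try{attempt}"


def verify(step: str, clip: str) -> bool:
    # Simulate a mid-clip error on the first "grasp" attempt.
    return clip != "grasp#try0"


clips = run_closed_loop("lift the cup", propose, generate, verify)
print(clips)
```

In this toy run the first "grasp" clip is rejected and regenerated, while the other steps pass on the first attempt. Conditioning each new step only on verified clips is the design choice that limits drift over long horizons in this sketch.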