RoboPapers

Ep#48: VisualMimic: Visual Humanoid Loco-Manipulation via Motion Tracking and Generation



To be useful, robots must often move around and interact with objects in previously unseen environments. The interaction part matters: to do it, they must perceive the world and act on it using onboard sensing.

Enter VisualMimic. Shaofeng Yin and Yanjie Ze show us how to use visual sim-to-real to train policies for diverse loco-manipulation tasks, policies that can even handle outdoor environments.

Learn more in Episode #48 of RoboPapers today, hosted by Michael Cho and Chris Paxton.

Abstract:

Humanoid loco-manipulation in unstructured environments demands tight integration of egocentric perception and whole-body control. However, existing approaches either depend on external motion capture systems or fail to generalize across diverse tasks. We introduce VisualMimic, a visual sim-to-real framework that unifies egocentric vision with hierarchical whole-body control for humanoid robots. VisualMimic combines a task-agnostic low-level keypoint tracker -- trained from human motion data via a teacher-student scheme -- with a task-specific high-level policy that generates keypoint commands from visual and proprioceptive input. To ensure stable training, we inject noise into the low-level policy and clip high-level actions using human motion statistics. VisualMimic enables zero-shot transfer of visuomotor policies trained in simulation to real humanoid robots, accomplishing a wide range of loco-manipulation tasks such as box lifting, pushing, football dribbling, and kicking. Beyond controlled laboratory settings, our policies also generalize robustly to outdoor environments. Videos are available on the project page.
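
The abstract mentions two stabilizing tricks: injecting noise into the low-level policy and clipping high-level actions using human motion statistics. As a rough illustration only, here is a minimal Python sketch of what that could look like. Everything in it is an assumption rather than the authors' method: the function names are hypothetical, the bounds are taken as mean plus or minus k standard deviations per dimension, and the noise is Gaussian noise added to the keypoint command before it reaches the low-level tracker. The paper may define both differently.

```python
import random
import statistics


def motion_bounds(human_commands, k=3.0):
    """Per-dimension (low, high) bounds from human keypoint commands.

    Assumption: bounds are mean +/- k * population std for each dimension.
    """
    bounds = []
    for dim_values in zip(*human_commands):
        mean = statistics.fmean(dim_values)
        std = statistics.pstdev(dim_values)
        bounds.append((mean - k * std, mean + k * std))
    return bounds


def clip_command(command, bounds):
    """Clamp each dimension of a high-level command into its bounds."""
    return [min(max(c, low), high) for c, (low, high) in zip(command, bounds)]


def add_noise(command, sigma, rng):
    """Perturb a command with zero-mean Gaussian noise (assumed noise model)."""
    return [c + rng.gauss(0.0, sigma) for c in command]


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical 2-D keypoint commands standing in for human motion data.
    human = [[0.0, 0.0], [2.0, 4.0]]
    bounds = motion_bounds(human, k=1.0)
    raw = [5.0, -1.0]  # an out-of-range high-level action
    safe = clip_command(raw, bounds)
    print("clipped command:", safe)
    print("noisy command:", add_noise(safe, sigma=0.05, rng=rng))
```

The intent of a scheme like this is that the high-level policy can only request keypoint targets resembling those the tracker saw during training on human motion, while the noise makes the tracker tolerant of imperfect commands.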

Project Page: https://visualmimic.github.io/

arXiv: https://arxiv.org/abs/2509.20322



This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit robopapers.substack.com

RoboPapers, by Chris Paxton and Michael Cho