RoboPapers

Ep#31: Vision in Action: Learning Active Perception from Human Demonstrations



Most robots work from a fixed location, with cameras mounted wherever the task at hand requires. That makes camera placement a key part of task setup, and it makes the task itself unnecessarily difficult. Ideally, a robot would move its camera intelligently to gather all the information it needs to perform a task.

In “Vision in Action,” the authors use a flexible 6-DoF “neck” that moves the robot’s camera to gather the information needed for the task, learning where to look from what a human operator is actually looking at.

Learn more by watching Episode #31 of RoboPapers with Haoyu Xiong, co-hosted by Michael Cho and Chris Paxton.

Abstract:

We present Vision in Action (ViA), an active perception system for bimanual robot manipulation. ViA learns task-relevant active perceptual strategies (e.g., searching, tracking, and focusing) directly from human demonstrations. On the hardware side, ViA employs a simple yet effective 6-DoF robotic neck to enable flexible, human-like head movements. To capture human active perception strategies, we design a VR-based teleoperation interface that creates a shared observation space between the robot and the human operator. To mitigate VR motion sickness caused by latency in the robot’s physical movements, the interface uses an intermediate 3D scene representation, enabling real-time view rendering on the operator side while asynchronously updating the scene with the robot’s latest observations. Together, these design elements enable the learning of robust visuomotor policies for three complex, multi-stage bimanual manipulation tasks involving visual occlusions, significantly outperforming baseline systems.
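The decoupling described in the abstract, where the operator's view is rendered immediately from an intermediate 3D scene while robot observations arrive later, can be illustrated with a minimal sketch. This is not the authors' implementation: the point-cloud scene, the yaw-only pinhole camera, and the names `SceneBuffer` and `render_view` are all assumptions made for illustration.

```python
import math
import threading
from typing import List, Tuple

Point = Tuple[float, float, float]   # world frame: x right, y up, z forward
Pixel = Tuple[float, float]


class SceneBuffer:
    """Hypothetical thread-safe holder for the latest world-frame point cloud."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._points: List[Point] = []

    def update(self, points: List[Point]) -> None:
        # Called whenever a (possibly delayed) robot observation arrives.
        with self._lock:
            self._points = list(points)

    def snapshot(self) -> List[Point]:
        # Called by the renderer; never waits for the robot to move.
        with self._lock:
            return list(self._points)


def render_view(points: List[Point], head_position: Point, yaw: float,
                focal: float = 500.0, width: int = 640,
                height: int = 480) -> List[Pixel]:
    """Project points into a pinhole camera placed at the operator's head pose.

    Assumption: the head rotates only by `yaw` (radians) about the vertical axis.
    """
    cx, cy, cz = head_position
    cos_y, sin_y = math.cos(yaw), math.sin(yaw)
    pixels: List[Pixel] = []
    for x, y, z in points:
        dx, dy, dz = x - cx, y - cy, z - cz
        # Express the point in the camera frame (inverse of the head rotation).
        x_cam = dx * cos_y - dz * sin_y
        z_cam = dx * sin_y + dz * cos_y
        if z_cam <= 0.0:
            continue  # point is behind the camera
        u = width / 2 + focal * x_cam / z_cam
        v = height / 2 - focal * dy / z_cam
        pixels.append((u, v))
    return pixels


scene = SceneBuffer()
scene.update([(0.0, 0.0, 2.0)])  # first robot observation

# The operator moves their head; the view re-renders from the stored scene
# without waiting for the physical neck to catch up.
centered = render_view(scene.snapshot(), (0.0, 0.0, 0.0), 0.0)
shifted = render_view(scene.snapshot(), (0.5, 0.0, 0.0), 0.0)

# Later, a fresh observation replaces the stored scene.
scene.update([(0.0, 0.0, 2.0), (1.0, 0.0, 4.0)])
refreshed = render_view(scene.snapshot(), (0.5, 0.0, 0.0), 0.0)
```

In this sketch the render path only reads the most recent snapshot, so its latency is independent of how long the robot takes to move and deliver a new observation.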

Project Page

arXiv



This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit robopapers.substack.com

RoboPapers, by Chris Paxton and Michael Cho