From Passive Observer to Active Critic: Reinforcement Learning Elicits Process Reasoning for Robotic Manipulation
Uses reinforcement learning to improve process reasoning capabilities in robotic manipulation policies, shifting the model from passive observation to active critique.