LIBERO-Para: Paraphrase Robustness in Robotic Manipulation
Reveals that vision-language-action models (VLAs) are fragile to paraphrased instructions: success rates drop by 22-52% because the models misidentify the task. Introduces PRIDE, a metric that weights success by paraphrase difficulty, on LIBERO benchmark manipulation tasks.
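To make the idea of weighting success by paraphrase difficulty concrete, here is a minimal sketch. It is not the PRIDE definition, which this summary does not give; it only illustrates a generic difficulty-weighted success rate. The function name `difficulty_weighted_success`, the per-episode difficulty scores, and the normalization by total difficulty are all assumptions made for illustration.

```python
from typing import Sequence


def difficulty_weighted_success(
    successes: Sequence[bool], difficulties: Sequence[float]
) -> float:
    """Hypothetical difficulty-weighted success rate (NOT the paper's PRIDE formula).

    Each episode contributes its difficulty score if it succeeded and zero
    otherwise; the sum is normalized by the total difficulty, so solving a
    hard paraphrase counts more than solving an easy one.
    """
    if len(successes) != len(difficulties):
        raise ValueError("successes and difficulties must have the same length")
    total = sum(difficulties)
    if total <= 0:
        raise ValueError("total difficulty must be positive")
    earned = sum(d for s, d in zip(successes, difficulties) if s)
    return earned / total


# Assumed toy data: the policy succeeds only on the two easy paraphrases.
successes = [True, True, False, False]
difficulties = [0.25, 0.25, 0.75, 0.75]

plain_rate = sum(successes) / len(successes)
weighted_rate = difficulty_weighted_success(successes, difficulties)
print(plain_rate, weighted_rate)
```

In this toy case the unweighted success rate is 0.5, while the weighted score is lower because both failures fall on the harder paraphrases. That gap is the kind of effect a difficulty-aware metric is meant to expose.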