Seventy3

[Episode 108] PAE: Autonomously Learning New Web Navigation Skills


Seventy3: Using NotebookLM to turn papers into podcasts, so everyone can keep learning alongside AI.

Today's topic: Proposer-Agent-Evaluator (PAE): Autonomous Skill Discovery For Foundation Model Internet Agents

Summary

This research introduces Proposer-Agent-Evaluator (PAE), a system that lets foundation model agents autonomously learn new web navigation skills. PAE combines a context-aware task proposer that suggests tasks, an agent policy that attempts them, and an autonomous evaluator whose judgments provide the feedback used for reinforcement learning. In experiments on challenging real-world and simulated websites, PAE significantly improves zero-shot generalization over existing methods and achieves state-of-the-art performance among open-source models. The design builds on the asymmetric capabilities of large language models and aims at more robust and adaptable AI agents. The researchers have open-sourced their code and models to encourage further exploration.
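
The propose, act, evaluate cycle described in the summary can be illustrated with a small toy sketch. This is not the authors' implementation: every name here (`propose_tasks`, `ToyPolicy`, `evaluate`, `train`) is hypothetical, the "website" is just a list of page names, and a simple reward-weighted preference update stands in for the reinforcement learning step. In the real system, the three roles would be played by foundation models.

```python
import random
from dataclasses import dataclass, field


def propose_tasks(context):
    """Hypothetical proposer: derives practice tasks from context about the site."""
    return [f"open the {page} page" for page in context["pages"]]


@dataclass
class ToyPolicy:
    """Hypothetical agent policy: one preference weight per (task, action) pair."""
    actions: list
    weights: dict = field(default_factory=dict)

    def act(self, task, rng):
        # Sample an action in proportion to its current preference weight.
        w = [self.weights.get((task, a), 1.0) for a in self.actions]
        return rng.choices(self.actions, weights=w, k=1)[0]

    def reinforce(self, task, action, reward, lr=1.0):
        # Reward-weighted update (an assumption, standing in for the RL step).
        key = (task, action)
        self.weights[key] = self.weights.get(key, 1.0) + lr * reward


def evaluate(task, action):
    """Hypothetical evaluator: 1.0 if the action completed the task, else 0.0."""
    return 1.0 if task == f"open the {action} page" else 0.0


def train(context, rounds=200, seed=0):
    rng = random.Random(seed)
    policy = ToyPolicy(actions=list(context["pages"]))
    tasks = propose_tasks(context)              # proposer suggests tasks
    for _ in range(rounds):
        task = rng.choice(tasks)
        action = policy.act(task, rng)          # agent attempts the task
        reward = evaluate(task, action)         # evaluator judges the attempt
        policy.reinforce(task, action, reward)  # feedback updates the policy
    return policy


if __name__ == "__main__":
    policy = train({"pages": ["cart", "search", "account"]})
    for key, weight in sorted(policy.weights.items()):
        print(key, weight)
```

No human-written task list or reward label appears in the loop: tasks come from the proposer and rewards from the evaluator, which is the sense in which the skill discovery is autonomous. In this toy setting, only the weight of the action that matches each task should grow over training.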

Original paper: https://arxiv.org/abs/2412.13194


Seventy3, by 任雨山