December 31, 2024

[Episode 92] Agentless: Agents for Software Development

26 minutes

Seventy3: Turning papers into podcasts with NotebookLM, so everyone can keep learning alongside AI.

Today's topic: Agentless: Demystifying LLM-based Software Engineering Agents

Summary

This research paper introduces AGENTLESS, an approach to automated software development that eschews complex autonomous agents. Instead, AGENTLESS follows a simpler three-phase process of localization, repair, and patch validation, using large language models (LLMs) in each phase. The authors benchmark AGENTLESS against existing agent-based systems on SWE-bench Lite and report surprisingly high performance at low cost. They also analyze SWE-bench Lite itself, identify problematic issues in it, and build a refined dataset, SWE-bench Lite-S, for more robust evaluation. Finally, the study highlights that OpenAI has adopted AGENTLESS and reports its superior performance on OpenAI's SWE-bench Verified benchmark.
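
To make the three-phase idea concrete, here is a minimal Python sketch of a fixed localization, repair, and validation pipeline. It is an illustration under stated assumptions, not the authors' implementation: the names `LLM`, `Patch`, `localize`, `repair`, `validate`, and `solve` are all hypothetical, the repository is modeled as a plain dictionary from file path to file content, and the model is any function that maps a prompt string to a completion string. The paper's actual prompts, patch format, and selection strategy are not reproduced here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Hypothetical stand-in for a model call: prompt in, completion out.
LLM = Callable[[str], str]


@dataclass
class Patch:
    path: str          # file the patch rewrites
    new_content: str   # full replacement content (a simplification)


def localize(llm: LLM, issue: str, repo: Dict[str, str]) -> str:
    """Phase 1: ask the model which file most likely needs to change."""
    listing = "\n".join(sorted(repo))
    prompt = f"Issue:\n{issue}\n\nFiles:\n{listing}\n\nName the one file to edit."
    answer = llm(prompt).strip()
    if answer not in repo:
        raise ValueError(f"model named an unknown file: {answer!r}")
    return answer


def repair(llm: LLM, issue: str, path: str, source: str, n_samples: int) -> List[Patch]:
    """Phase 2: sample several candidate rewrites of the located file."""
    prompt = f"Issue:\n{issue}\n\nFile {path}:\n{source}\n\nReturn the fixed file."
    return [Patch(path, llm(prompt)) for _ in range(n_samples)]


def validate(
    candidates: List[Patch],
    repo: Dict[str, str],
    run_tests: Callable[[Dict[str, str]], bool],
) -> Optional[Patch]:
    """Phase 3: return the first candidate whose patched repo passes the tests."""
    for patch in candidates:
        patched = {**repo, patch.path: patch.new_content}
        if run_tests(patched):
            return patch
    return None


def solve(
    llm: LLM,
    issue: str,
    repo: Dict[str, str],
    run_tests: Callable[[Dict[str, str]], bool],
    n_samples: int = 3,
) -> Optional[Patch]:
    """Run the three phases in a fixed order; the model never picks the next step."""
    path = localize(llm, issue, repo)
    candidates = repair(llm, issue, path, repo[path], n_samples)
    return validate(candidates, repo, run_tests)
```

The point of the sketch is the control flow: the order of steps is hard-coded in `solve`, so the model only answers narrow questions and never decides which tool to use or what to do next, which is the contrast with autonomous agents that the summary draws.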

Original paper: https://arxiv.org/abs/2407.01489