Learning GenAI via SOTA Papers

EP154: [FS-Researcher] Giving AI agents a file system


The paper "FS-Researcher: Test-Time Scaling for Long-Horizon Research Tasks with File-System-Based Agents" introduces a novel dual-agent framework designed to overcome the context window limitations of large language models (LLMs) during complex, long-horizon research tasks.

The core innovation of FS-Researcher is a persistent, file-system-based workspace that serves as external memory. This lets the agents store and organize far more information than fits in a single model context window. The framework operates in two distinct stages, each handled by its own agent:

  • Context Builder: Acts as a "digital librarian" that browses the internet, takes structured notes, and archives raw sources into a hierarchical knowledge base.
  • Report Writer: Composes the final report section by section, using the knowledge base as its sole source of facts.
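The two stages above can be sketched in a few lines of Python. This is only an illustrative sketch, not the paper's implementation: the function names, the `knowledge_base/` directory layout, and the `notes.md` / `source.txt` file names are all assumptions made for the example, and the "findings" are placeholders standing in for what a real agent would collect while browsing.

```python
from pathlib import Path
import tempfile


def build_context(workspace: Path, findings: dict[str, dict[str, str]]) -> None:
    """Stage 1 (hypothetical): save a structured note and the raw source per topic."""
    for topic, item in findings.items():
        topic_dir = workspace / "knowledge_base" / topic
        topic_dir.mkdir(parents=True, exist_ok=True)
        (topic_dir / "notes.md").write_text(item["note"], encoding="utf-8")
        (topic_dir / "source.txt").write_text(item["raw"], encoding="utf-8")


def write_report(workspace: Path, outline: list[str]) -> str:
    """Stage 2 (hypothetical): compose section by section, reading facts only from disk."""
    sections = []
    for topic in outline:
        notes = workspace / "knowledge_base" / topic / "notes.md"
        if notes.exists():
            body = notes.read_text(encoding="utf-8")
        else:
            # The writer does not invent content for topics with no notes.
            body = "(no notes in knowledge base)"
        sections.append(f"{topic}\n{body}")
    return "\n\n".join(sections)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        ws = Path(tmp)
        build_context(ws, {"background": {"note": "Placeholder note.",
                                          "raw": "Placeholder page text."}})
        print(write_report(ws, ["background", "methods"]))
```

The point of the split is that the writer never sees the web directly: anything missing from the knowledge base shows up as a visible gap rather than as an unsupported claim.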

The Role of the File System

The workspace uses control files (such as todos, checklists, and logs) to track progress and coordinate between agent sessions. Because this state lives on disk rather than in the model's context, agents can revisit earlier work and fix errors across multiple sessions, much as a human researcher would.
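A minimal sketch of how control files can carry state between sessions, assuming a JSON todo list and a plain-text log. The file names `todos.json` and `log.txt`, the function names, and the `max_tasks` limit are hypothetical choices for this example; the paper's actual file formats are not described here.

```python
import json
from pathlib import Path


def init_workspace(workspace: Path, tasks: list[str]) -> None:
    """Create the control files once; later sessions only update them."""
    workspace.mkdir(parents=True, exist_ok=True)
    todos = [{"task": task, "done": False} for task in tasks]
    (workspace / "todos.json").write_text(json.dumps(todos, indent=2), encoding="utf-8")
    (workspace / "log.txt").write_text("", encoding="utf-8")


def run_session(workspace: Path, session_id: int, max_tasks: int) -> int:
    """Resume from disk, finish up to max_tasks pending items, and persist the result."""
    todo_path = workspace / "todos.json"
    todos = json.loads(todo_path.read_text(encoding="utf-8"))
    finished = 0
    with (workspace / "log.txt").open("a", encoding="utf-8") as log:
        for item in todos:
            if finished == max_tasks:
                break
            if not item["done"]:
                # A real agent would perform the research step here.
                item["done"] = True
                finished += 1
                log.write(f"session {session_id}: completed {item['task']}\n")
    todo_path.write_text(json.dumps(todos, indent=2), encoding="utf-8")
    return finished
```

Each call to `run_session` starts with no memory of earlier calls and learns what is left to do only by reading `todos.json`, which is the property that lets a fresh agent session pick up where the previous one stopped.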

Performance and Scaling

Experimental results on benchmarks such as DeepResearch Bench and DeepConsult show that FS-Researcher achieves state-of-the-art (SOTA) report quality when compared with both proprietary and open-source systems.

A major finding of the paper is the validation of test-time scaling: the quality of the final report correlates positively with the computation (number of iteration rounds) allocated to the Context Builder. As more rounds are invested in building the knowledge base, the resulting reports become more evidence-grounded and comprehensive.
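One way to picture the round budget as a knob is a loop that runs the Context Builder for a fixed number of rounds and records how large the knowledge base is after each one. The sketch below is an assumption-laden toy: `scale_context_building` and `stub_round` are hypothetical names, and the stub simply adds one note per round, so it illustrates the control flow only and says nothing about real report quality.

```python
from typing import Callable


def scale_context_building(
    run_round: Callable[[int, set[str]], set[str]],
    rounds: int,
) -> list[int]:
    """Run `rounds` Context Builder rounds; return the knowledge-base size after each."""
    knowledge: set[str] = set()
    sizes: list[int] = []
    for index in range(rounds):
        # Each round sees what is already stored and returns new notes to add.
        knowledge |= run_round(index, knowledge)
        sizes.append(len(knowledge))
    return sizes


def stub_round(index: int, knowledge: set[str]) -> set[str]:
    """Stand-in for real browsing (hypothetical): every round finds one new note."""
    return {f"note-{index}"}


if __name__ == "__main__":
    print(scale_context_building(stub_round, rounds=3))
```

In a real experiment, `run_round` would be an actual agent session and the recorded quantity would be a report-quality score from the benchmark rather than a note count.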

Learning GenAI via SOTA Papers, by Yun Wu