ToolGen: Unified Tool Retrieval and Calling via Generation
This research paper introduces ToolGen, a framework that lets large language models (LLMs) retrieve and call external tools directly by representing each tool as a unique token in the model's vocabulary. ToolGen addresses the limitations of traditional tool-use pipelines, which often rely on a separate retrieval mechanism and are constrained by context length. The paper describes a three-stage training process: tool memorization, retrieval training, and end-to-end agent tuning. Together, these stages allow an LLM to learn and use a large number of tools. Experimental results show that ToolGen outperforms existing approaches in both tool retrieval and autonomous task completion, which suggests that folding retrieval into generation is a promising direction for AI agents.
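The core idea, one vocabulary token per tool, can be illustrated with a toy sketch. This is not the paper's implementation: the `<<tool_name>>` token format, the helper names `extend_vocab` and `pick_tool`, the tool names, and the hand-written scores are all assumptions made for illustration. Restricting the choice to tool tokens is likewise just one simple way to picture "retrieval as generation".

```python
from typing import Dict, List


def extend_vocab(vocab: Dict[str, int], tool_names: List[str]) -> Dict[str, int]:
    """Return a copy of vocab with one new token id per tool.

    Assumes existing ids are contiguous (0..n-1), so len() gives the next free id.
    """
    extended = dict(vocab)
    for name in tool_names:
        token = f"<<{name}>>"  # hypothetical token format, chosen for this sketch
        if token not in extended:
            extended[token] = len(extended)
    return extended


def pick_tool(scores: List[float], vocab: Dict[str, int], tool_names: List[str]) -> str:
    """Pick the highest-scoring tool token, ignoring all non-tool tokens."""
    tool_ids = {vocab[f"<<{name}>>"]: name for name in tool_names}
    best_id = max(tool_ids, key=lambda token_id: scores[token_id])
    return tool_ids[best_id]


# Toy base vocabulary and two hypothetical tools.
base_vocab = {"hello": 0, "weather": 1, "paris": 2}
tools = ["get_weather", "search_flights"]
vocab = extend_vocab(base_vocab, tools)

# Stand-in for a model's next-token scores, one per vocabulary entry.
scores = [0.9, 0.1, 0.2, 0.7, 0.3]
chosen = pick_tool(scores, vocab, tools)
print(chosen)
```

In this sketch the ordinary word "hello" has the highest raw score, but only tool tokens are eligible, so selecting a tool reduces to choosing among the tokens that were added to the vocabulary.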
Link to paper
Check their GitHub