April 22, 2026

EP160: [AgentSys] Securing AI agents with hierarchical memory

26 minutes

The paper introduces AGENTSYS, a novel framework designed to protect Large Language Model (LLM) agents from indirect prompt injection (IPI) attacks through explicit hierarchical memory management. Conventional LLM agents are vulnerable because they indiscriminately accumulate all tool outputs and reasoning traces in their context window, allowing malicious instructions to persist across multiple reasoning steps and degrading decision-making through verbose, non-essential content.

Key features of the AGENTSYS architecture include:

Hierarchical Isolation: The system organizes agents into a tree structure where a main agent spawns short-lived worker agents for tool invocations.
Memory Management: Raw external data and subtask reasoning traces are confined to isolated worker contexts and never enter the main agent's memory.
Schema-Validated Communication: The main agent defines a specific "intent" (a JSON-like schema) for each tool call, and worker agents distill raw outputs into compact, validated return values that must pass a syntactic gate.
Mediated Recursion: Any recursive tool calls within subtasks are gated by an LLM-based validator and a sanitize-restart mechanism to handle potentially adversarial content.

Evaluations on benchmarks like AgentDojo and ASB show that AGENTSYS achieves state-of-the-art security, reaching a 0.78% attack success rate (ASR) on AgentDojo while improving benign utility (64.36% compared to 63.54% for undefended baselines). By keeping the main agent's working memory clean and focused, AGENTSYS effectively prevents attack persistence and utility degradation in complex, multi-step workflows.

...more

View all episodes

By Yun Wu

April 22, 2026

EP160: [AgentSys] Securing AI agents with hierarchical memory

26 minutes

Key features of the AGENTSYS architecture include:

Hierarchical Isolation: The system organizes agents into a tree structure where a main agent spawns short-lived worker agents for tool invocations.
Memory Management: Raw external data and subtask reasoning traces are confined to isolated worker contexts and never enter the main agent's memory.
Schema-Validated Communication: The main agent defines a specific "intent" (a JSON-like schema) for each tool call, and worker agents distill raw outputs into compact, validated return values that must pass a syntactic gate.
Mediated Recursion: Any recursive tool calls within subtasks are gated by an LLM-based validator and a sanitize-restart mechanism to handle potentially adversarial content.

...more

Share EP160: [AgentSys] Securing AI agents with hierarchical memory

Sign up to save your podcasts

EP160: [AgentSys] Securing AI agents with hierarchical memory

EP160: [AgentSys] Securing AI agents with hierarchical memory