March 28, 2026

EP135: [SoK] Curing AI Amnesia with Agentic Skills

22 minutes

"SoK: Agentic Skills — Beyond Tool Use in LLM Agents" provides a comprehensive systematization of how Large Language Model (LLM) agents utilize "agentic skills." Unlike simple tools or one-off plans, agentic skills are reusable, callable procedural modules that allow agents to reliably execute complex, long-horizon workflows across multiple tasks.

The paper's key contributions include:

A Formal Definition and Lifecycle: The authors formally define an agentic skill using a four-tuple framework: applicability conditions, an executable policy, termination criteria, and a reusable interface. They also map out a complete seven-stage lifecycle for these skills, spanning from initial discovery and practice to storage, execution, and evaluation.
Dual Taxonomies: The paper introduces two complementary taxonomies to classify the landscape of agentic skills. The first outlines seven system-level design patterns, detailing how skills are packaged and executed (e.g., metadata-driven progressive disclosure, executable code-as-skill, self-evolving libraries, and marketplace distribution). The second taxonomy categorizes skills based on their representation (e.g., natural language, code, policy, or hybrid) and their operational scope (e.g., web navigation, software engineering, or robotics).
Security and Governance: Highlighting the severe vulnerabilities of skill-based agents—such as prompt injection and supply-chain attacks—the paper proposes a four-tier trust model to manage execution privileges safely. This analysis is grounded in a real-world case study of the "ClawHavoc" campaign, where nearly 1,200 malicious skills infiltrated an agent marketplace to exfiltrate sensitive user data, including cryptocurrency wallets and API keys.
Evaluation and Efficacy: The authors survey deterministic evaluation frameworks, anchored by evidence from the SkillsBench benchmark. This empirical data demonstrates that high-quality, curated skills can significantly boost agent success rates (by an average of 16.2 percentage points), whereas self-generated skills often degrade performance because they can encode incorrect or overly specific behaviors.
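
To make the four-tuple definition above more concrete, here is a minimal Python sketch of a skill as a reusable, callable module. It is an illustration only, not the paper's formalism or any framework's API: the names `Skill`, `applicable`, `step`, `done`, `max_steps`, and the toy `count_up` skill are all hypothetical, and the "policy" is assumed to be a simple one-step state update.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict

State = Dict[str, Any]  # hypothetical: agent state as a plain dictionary


@dataclass(frozen=True)
class Skill:
    """Illustrative four-part skill; all field names are assumptions."""
    name: str                             # reusable interface: how callers refer to it
    applicable: Callable[[State], bool]   # applicability conditions
    step: Callable[[State], State]        # executable policy (one step at a time)
    done: Callable[[State], bool]         # termination criteria
    max_steps: int = 10                   # safety bound so a bad policy cannot loop forever

    def __call__(self, state: State) -> State:
        # Refuse to run when the applicability conditions do not hold.
        if not self.applicable(state):
            raise ValueError(f"skill {self.name!r} is not applicable")
        for _ in range(self.max_steps):
            if self.done(state):
                return state
            state = self.step(state)
        if self.done(state):
            return state
        raise RuntimeError(f"skill {self.name!r} did not terminate")


# Toy skill: increment a counter until it reaches a target.
count_up = Skill(
    name="count_up",
    applicable=lambda s: "count" in s and "target" in s,
    step=lambda s: {**s, "count": s["count"] + 1},
    done=lambda s: s["count"] >= s["target"],
)

result = count_up({"count": 0, "target": 3})
print(result["count"])
```

Because the same `count_up` object can be called on any state that satisfies its applicability check, it behaves like a stored, reusable procedure rather than a one-off plan, which is the distinction the summary draws.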


By Yun Wu
