April 02, 2026

AI Agent Traps and Prompt Injection

This episode explores why AI agents become a fundamentally different security problem once language models can browse the web, read email, call tools, store memory, and act inside real software environments. It explains prompt injection as the core boundary failure, showing how webpages, emails, retrieved notes, or API responses can be mistaken for trusted instructions, turning ordinary content into an attack vector with real operational consequences. The discussion then sharpens the distinction between one-off prompt attacks and more systemic failures such as memory poisoning and multi-agent compromise, where corrupted state can persist across sessions or spread through delegated workflows. A listener would find it interesting because it frames agent safety as a concrete systems-security challenge, not just a model-behavior quirk, and clarifies why greater capability also widens the blast radius of failure.

Sources:

1. AI Agent Traps and Prompt Injection

/tmp/submission-source-_s144w4z.txt

2. Ignore Previous Prompt: Attack Techniques For Language Models — Fábio Perez, Ian Ribeiro, 2022

https://scholar.google.com/scholar?q=Ignore+Previous+Prompt:+Attack+Techniques+For+Language+Models

3. Not what you've signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection — Kai Greshake, Sahar Abdelnabi, Shailesh Mishra, Christoph Endres, Thorsten Holz, Mario Fritz, 2023

https://scholar.google.com/scholar?q=Not+what+you've+signed+up+for:+Compromising+Real-World+LLM-Integrated+Applications+with+Indirect+Prompt+Injection

4. Prompt Injection attack against LLM-integrated Applications — Yi Liu, Gelei Deng, Yuekang Li, Kailong Wang, Tianwei Zhang, Yepang Liu, Haoyu Wang, Yan Zheng, Yang Liu, 2023

https://scholar.google.com/scholar?q=Prompt+Injection+attack+against+LLM-integrated+Applications

5. Prompt Injection Attacks and Defenses in LLM-Integrated Applications — Yupei Liu, Yuqi Jia, Runpeng Geng, Jinyuan Jia, Neil Zhenqiang Gong, 2023

https://scholar.google.com/scholar?q=Prompt+Injection+Attacks+and+Defenses+in+LLM-Integrated+Applications

6. AI Agents Under Threat: A Survey of Key Security Challenges and Future Pathways — Zehang Deng, Yongjian Guo, Changzhou Han, Wanlun Ma, Junwu Xiong, Sheng Wen, Yang Xiang, 2024

https://scholar.google.com/scholar?q=AI+Agents+Under+Threat:+A+Survey+of+Key+Security+Challenges+and+Future+Pathways

7. Open Challenges in Multi-Agent Security: Towards Secure Systems of Interacting AI Agents — Christian Schroeder de Witt, 2025

https://scholar.google.com/scholar?q=Open+Challenges+in+Multi-Agent+Security:+Towards+Secure+Systems+of+Interacting+AI+Agents

8. Red-Teaming LLM Multi-Agent Systems via Communication Attacks — Pengfei He, Yupin Lin, Shen Dong, Han Xu, Yue Xing, Hui Liu, 2025

https://scholar.google.com/scholar?q=Red-Teaming+LLM+Multi-Agent+Systems+via+Communication+Attacks

9. G-Safeguard: A Topology-Guided Security Lens and Treatment on LLM-based Multi-agent Systems — Shilong Wang, Guibin Zhang, Miao Yu, Guancheng Wan, Fanci Meng, Chongye Guo, Kun Wang, Yang Wang, 2025

https://scholar.google.com/scholar?q=G-Safeguard:+A+Topology-Guided+Security+Lens+and+Treatment+on+LLM-based+Multi-agent+Systems

10. InjecAgent: Benchmarking Indirect Prompt Injections in Tool-Integrated Large Language Model Agents — Qiusi Zhan, Zhixiang Liang, Zifan Ying, Daniel Kang, 2024

https://scholar.google.com/scholar?q=InjecAgent:+Benchmarking+Indirect+Prompt+Injections+in+Tool-Integrated+Large+Language+Model+Agents

11. Agent Security Bench (ASB): Formalizing and Benchmarking Attacks and Defenses in LLM-based Agents — Hanrong Zhang, Jingyuan Huang, Kai Mei, Yifei Yao, Zhenting Wang, Chenlu Zhan, Hongwei Wang, Yongfeng Zhang, 2024

https://scholar.google.com/scholar?q=Agent+Security+Bench+(ASB):+Formalizing+and+Benchmarking+Attacks+and+Defenses+in+LLM-based+Agents

12. AgentPoison: Red-teaming LLM Agents via Poisoning Memory or Knowledge Bases — Zhaorun Chen, Zhen Xiang, Chaowei Xiao, Dawn Song, Bo Li, 2024

https://scholar.google.com/scholar?q=AgentPoison:+Red-teaming+LLM+Agents+via+Poisoning+Memory+or+Knowledge+Bases

13. Prompt Infection: LLM-to-LLM Prompt Injection within Multi-Agent Systems — Donghyun Lee, Mo Tiwari, 2024

https://scholar.google.com/scholar?q=Prompt+Infection:+LLM-to-LLM+Prompt+Injection+within+Multi-Agent+Systems

14. Multi-Agent Systems Execute Arbitrary Malicious Code — Harold Triedman, Rishi D. Jha, Vitaly Shmatikov, 2025

https://scholar.google.com/scholar?q=Multi-Agent+Systems+Execute+Arbitrary+Malicious+Code

15. Prompt Injection Attacks on Large Language Models: A Survey of Attack Methods, Root Causes, and Defense Strategies — approx. survey by prompt-injection/security researchers, 2025

https://scholar.google.com/scholar?q=Prompt+Injection+Attacks+on+Large+Language+Models:+A+Survey+of+Attack+Methods,+Root+Causes,+and+Defense+Strategies

16. Prompting for LLM Security and RAG: A Survey from Zero-Shot to Automatic Prompt Optimization (APO) and Prompt-Injection Defenses — approx. security/RAG survey authors, 2025

https://scholar.google.com/scholar?q=Prompting+for+LLM+Security+and+RAG:+A+Survey+from+Zero-Shot+to+Automatic+Prompt+Optimization+(APO)+and+Prompt-Injection+Defenses

17. Veriguard: Enhancing LLM Agent Safety via Verified Code Generation — approx. systems/security authors, 2025

https://scholar.google.com/scholar?q=Veriguard:+Enhancing+LLM+Agent+Safety+via+Verified+Code+Generation

18. Enforcement Agents: Enhancing Accountability and Resilience in Multi-Agent AI Frameworks — approx. multi-agent safety authors, 2025

https://scholar.google.com/scholar?q=Enforcement+Agents:+Enhancing+Accountability+and+Resilience+in+Multi-Agent+AI+Frameworks

19. Monitoring LLM-Based Multi-Agent Systems Against Corruptions via Node Evaluation — approx. multi-agent monitoring authors, 2025

https://scholar.google.com/scholar?q=Monitoring+LLM-Based+Multi-Agent+Systems+Against+Corruptions+via+Node+Evaluation

20. Enhancing Robustness of LLM-Driven Multi-Agent Systems Through Randomized Smoothing — approx. robustness/safety authors, 2025

https://scholar.google.com/scholar?q=Enhancing+Robustness+of+LLM-Driven+Multi-Agent+Systems+Through+Randomized+Smoothing

21. Assessing and Enhancing the Robustness of LLM-Based Multi-Agent Systems Through Chaos Engineering — approx. systems robustness authors, 2025

https://scholar.google.com/scholar?q=Assessing+and+Enhancing+the+Robustness+of+LLM-Based+Multi-Agent+Systems+Through+Chaos+Engineering

22. AI Post Transformers: Memory in the Age of AI Agents: Forms, Functions, Dynamics — Hal Turing & Dr. Ada Shannon, 2026

https://podcast.do-not-panic.com/episodes/2026-03-16-memory-in-the-age-of-ai-agents-forms-fun-5abc60.mp3

23. AI Post Transformers: NeurIPS 2025: Agentic Plan Caching: Test-Time Memory for Fast and Cost-Efficient LLM Agents — Hal Turing & Dr. Ada Shannon, 2026

https://podcast.do-not-panic.com/episodes/neurips-2025-agentic-plan-caching-test-time-memory-for-fast-and-cost-efficient-l/

24. AI Post Transformers: ReasoningBank: Scaling Agent Self-Evolving with Reasoning Memory — Hal Turing & Dr. Ada Shannon, 2025

https://podcast.do-not-panic.com/episodes/reasoningbank-scaling-agent-self-evolving-with-reasoning-memory/

25. AI Post Transformers: Qwen3Guard: Streaming Three-Way Safety Classification for LLMs — Hal Turing & Dr. Ada Shannon, 2026

https://podcast.do-not-panic.com/episodes/2026-03-16-qwen3guard-streaming-three-way-safety-cl-26b0ef.mp3

Interactive Visualization: AI Agent Traps and Prompt Injection

...more

View all episodes

By mcgrof