AI Post Transformers

Memory in the Age of AI Agents: Forms, Functions, Dynamics



This episode examines a comprehensive survey paper that proposes a new framework for understanding memory in AI agent systems. The authors move beyond traditional cognitive-psychology categories (such as short-term versus long-term memory) and instead organize agent memory along three dimensions: form (how memory is stored: as tokens, parameters, or latent vectors), function (what memory represents: factual knowledge, past experiences, or working state), and dynamics (how memory is created, updated, and retrieved over time). The discussion clarifies how agent memory differs from an LLM's parametric knowledge, from RAG systems, and from simple prompt engineering, emphasizing that agent memory is fundamentally stateful, task-specific information that persists and evolves across interactions. Listeners interested in building more sophisticated AI systems will find valuable distinctions between concepts that are often conflated in practice.
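The three axes described above can be sketched as a toy data model. This is illustrative only: the class and field names below are my own, not the survey's, and the store is a plain dictionary standing in for whatever backend (context window, weights, vector index) a real agent would use.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Any

class Form(Enum):
    """How a memory is stored (the 'form' axis)."""
    TOKEN = auto()      # text/tokens in context or an external store
    PARAMETER = auto()  # baked into model weights, e.g. via fine-tuning
    LATENT = auto()     # vectors/activations, e.g. a KV cache or embedding

class Function(Enum):
    """What a memory represents (the 'function' axis)."""
    FACTUAL = auto()       # stable knowledge about the world
    EXPERIENTIAL = auto()  # records of past episodes and interactions
    WORKING = auto()       # transient state for the current task

@dataclass
class MemoryEntry:
    content: Any
    form: Form
    function: Function
    version: int = 0  # bumped on each update (the 'dynamics' axis)

class AgentMemory:
    """Toy key-value store exposing create/update/retrieve dynamics."""
    def __init__(self):
        self._entries: dict[str, MemoryEntry] = {}

    def create(self, key: str, content: Any, form: Form, function: Function):
        self._entries[key] = MemoryEntry(content, form, function)

    def update(self, key: str, content: Any):
        entry = self._entries[key]  # raises KeyError if never created
        entry.content = content
        entry.version += 1          # memory evolves across interactions

    def retrieve(self, function: Function) -> list[MemoryEntry]:
        return [e for e in self._entries.values() if e.function is function]
```

For example, a user's current goal would be a TOKEN-form WORKING memory that gets updated as the task progresses, while a fine-tuned fact would live in PARAMETER form with FACTUAL function; the point of the taxonomy is that these axes vary independently.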
Sources:
1. Memory in the Age of AI Agents — Yuyang Hu, Shichun Liu, Yanwei Yue, Guibin Zhang, Boyang Liu, Fangyi Zhu, Jiahang Lin, Honglin Guo, Shihan Dou, Zhiheng Xi, Senjie Jin, Jiejun Tan, Yanbin Yin, Jiongnan Liu, Zeyu Zhang, Zhongxiang Sun, Yutao Zhu, Hao Sun, Boci Peng, Zhenrong Cheng, Xuanbo Fan, Jiaxin Guo, Xinlei Yu, Zhenhong Zhou, Zewen Hu, Jiahao Huo, Junhao Wang, Yuwei Niu, Yu Wang, Zhenfei Yin, Xiaobin Hu, Yue Liao, Qiankun Li, Kun Wang, Wangchunshu Zhou, Yixin Liu, Dawei Cheng, Qi Zhang, Tao Gui, Shirui Pan, Yan Zhang, Philip Torr, Zhicheng Dou, Ji-Rong Wen, Xuanjing Huang, Yu-Gang Jiang, Shuicheng Yan, 2025
http://arxiv.org/abs/2512.13564v2
2. Generative Agents: Interactive Simulacra of Human Behavior — Park, O'Brien, Cai, Morris, Liang, Bernstein, 2023
https://scholar.google.com/scholar?q=Generative+Agents:+Interactive+Simulacra+of+Human+Behavior
3. MemGPT: Towards LLMs as Operating Systems — Packer, Fang, Patil, Wooders, Stoica, Gonzalez, 2023
https://scholar.google.com/scholar?q=MemGPT:+Towards+LLMs+as+Operating+Systems
4. Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks — Lewis, Perez, Piktus, Petroni, Karpukhin, Goyal, Küttler, Lewis, Yih, Rocktäschel, Riedel, Kiela, 2020
https://scholar.google.com/scholar?q=Retrieval-Augmented+Generation+for+Knowledge-Intensive+NLP+Tasks
5. In-Context Learning and Induction Heads — Olsson, Elhage, Nanda, Joseph, Drain, Bau, Schiefer, Ndousse, Henighan, Lovitt, Chen, Kaplan, Anthropic, 2022
https://scholar.google.com/scholar?q=In-Context+Learning+and+Induction+Heads
6. LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models — Chen, Borgeaud, Mensch, Sutskever, Sifre, Schofield, Shazeer, Lazaridou, de Freitas, others, 2023
https://scholar.google.com/scholar?q=LongLoRA:+Efficient+Fine-tuning+of+Long-Context+Large+Language+Models
7. Lost in the Middle: How Language Models Use Long Contexts — Liu, Iter, Xu, Yuksekgonul, Zou, others, 2023
https://scholar.google.com/scholar?q=Lost+in+the+Middle:+How+Language+Models+Use+Long+Contexts
8. Recursively Summarizing Books with Human Feedback — Wu, Ouyang, Ziegler, Stiennon, Lowe, Leike, Christiano, 2021
https://scholar.google.com/scholar?q=Recursively+Summarizing+Books+with+Human+Feedback
9. Low-Rank Adaptation of Large Language Models (LoRA) — Hu, Shen, Wallis, Allen-Zhu, Li, Wang, Chen, 2021
https://scholar.google.com/scholar?q=Low-Rank+Adaptation+of+Large+Language+Models+(LoRA)
10. Overcoming Catastrophic Forgetting in Neural Networks (EWC) — Kirkpatrick, Pascanu, Rabinowitz, Veness, Desjardins, Rusu, Milan, others, 2017
https://scholar.google.com/scholar?q=Overcoming+Catastrophic+Forgetting+in+Neural+Networks+(EWC)
11. Model-Agnostic Meta-Learning (MAML) — Finn, Abbeel, Levine, 2017
https://scholar.google.com/scholar?q=Model-Agnostic+Meta-Learning+(MAML)
12. Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context — Dai, Yang, Yang, Carbonell, Le, Salakhutdinov, 2019
https://scholar.google.com/scholar?q=Transformer-XL:+Attentive+Language+Models+Beyond+a+Fixed-Length+Context
13. Memorizing Transformers — Wu, Rabe, Hutchins, Szlam, 2022
https://scholar.google.com/scholar?q=Memorizing+Transformers
14. Retrieval-Enhanced Transformer (RETRO) — Borgeaud, Mensch, Hoffmann, Cai, Rutherford, Millican, van den Driessche, others, 2022
https://scholar.google.com/scholar?q=Retrieval-Enhanced+Transformer+(RETRO)
15. CommNet: Learning Multiagent Communication with Backpropagation — Sukhbaatar, Szlam, Fergus, 2016
https://scholar.google.com/scholar?q=CommNet:+Learning+Multiagent+Communication+with+Backpropagation
16. Learning to Communicate with Deep Multi-Agent Reinforcement Learning — Foerster, Assael, de Freitas, Whiteson, 2016
https://scholar.google.com/scholar?q=Learning+to+Communicate+with+Deep+Multi-Agent+Reinforcement+Learning
17. Emergent Tool Use from Multi-Agent Autocurricula — Baker, Kanitscheider, Markov, Wu, Powell, McGrew, Mordatch (OpenAI), 2020
https://scholar.google.com/scholar?q=Emergent+Tool+Use+from+Multi-Agent+Autocurricula
18. Reflexion: Language Agents with Verbal Reinforcement Learning — Shinn et al., 2023
https://scholar.google.com/scholar?q=Reflexion:+Language+Agents+with+Verbal+Reinforcement+Learning
19. RET-LLM: Towards a General Read-Write Memory for Large Language Models — Modarressi et al., 2024
https://scholar.google.com/scholar?q=RET-LLM:+Towards+a+General+Read-Write+Memory+for+Large+Language+Models
20. Voyager: An Open-Ended Embodied Agent with Large Language Models — Wang et al., 2023
https://scholar.google.com/scholar?q=Voyager:+An+Open-Ended+Embodied+Agent+with+Large+Language+Models
21. In-Context Retrieval-Augmented Language Models — Ram et al., 2023
https://scholar.google.com/scholar?q=In-Context+Retrieval-Augmented+Language+Models
22. Scaling Laws for Associative Memories — Ramsauer et al., 2021
https://scholar.google.com/scholar?q=Scaling+Laws+for+Associative+Memories
23. Do Machine Learning Models Memorize or Generalize? — Feldman, 2020
https://scholar.google.com/scholar?q=Do+Machine+Learning+Models+Memorize+or+Generalize?
24. LongTableBench: benchmarking long-context table reasoning across real-world formats and domains — approximate, 2024-2025
https://scholar.google.com/scholar?q=LongTableBench:+benchmarking+long-context+table+reasoning+across+real-world+formats+and+domains
25. Reasoning-Focused Evaluation of Efficient Long-Context Inference Techniques — approximate, 2024-2025
https://scholar.google.com/scholar?q=Reasoning-Focused+Evaluation+of+Efficient+Long-Context+Inference+Techniques
26. Cognitive Workspace: Active Memory Management for LLMs—An Empirical Study of Functional Infinite Context — approximate, 2024-2025
https://scholar.google.com/scholar?q=Cognitive+Workspace:+Active+Memory+Management+for+LLMs—An+Empirical+Study+of+Functional+Infinite+Context
27. Continual learning and catastrophic forgetting — approximate, 2020s
https://scholar.google.com/scholar?q=Continual+learning+and+catastrophic+forgetting
28. ESSENTIAL: Episodic and Semantic Memory Integration for Video Class-Incremental Learning — approximate, 2024-2025
https://scholar.google.com/scholar?q=ESSENTIAL:+Episodic+and+Semantic+Memory+Integration+for+Video+Class-Incremental+Learning
29. Adaptive compression as a unifying framework for episodic and semantic memory — approximate, 2020s
https://scholar.google.com/scholar?q=Adaptive+compression+as+a+unifying+framework+for+episodic+and+semantic+memory
30. MAG: Memory Augmented Knowledge Extraction Generation for Large Language Models — approximate, 2024-2025
https://scholar.google.com/scholar?q=MAG:+Memory+Augmented+Knowledge+Extraction+Generation+for+Large+Language+Models
31. Gradient Descent at Inference Time for LLM Reasoning
https://podcast.do-not-panic.com/episodes/2026-03-10-gradient-descent-at-inference-time-for-l-20617d.mp3
32. LLM Agents Reason About Code Without Running It
https://podcast.do-not-panic.com/episodes/2026-03-15-llm-agents-reason-about-code-without-run-2a1876.mp3
33. Emergent Cooperation in Self-Interested Multi-Agent AI
https://podcast.do-not-panic.com/episodes/2026-03-13-emergent-cooperation-in-self-interested-9c0b4c.mp3
34. 50x KV Cache Compression in Seconds via Attention Matching
https://podcast.do-not-panic.com/episodes/2026-03-09-50x-kv-cache-compression-in-seconds-via-9402c1.mp3
35. DualPath Breaks Storage Bandwidth Bottleneck in Agentic Inference
https://podcast.do-not-panic.com/episodes/2026-03-07-dualpath-breaks-storage-bandwidth-bottle-bc9a82.mp3
36. NVIDIA Nemotron 3 Hybrid SSM Transformer Architecture
https://podcast.do-not-panic.com/episodes/2026-03-07-nvidia-nemotron-3-hybrid-ssm-transformer-f9a91b.mp3
37. Structured State Space Duality Unifies Transformers and SSMs
https://podcast.do-not-panic.com/episodes/2026-03-07-structured-state-space-duality-unifies-t-bb2659.mp3
38. Why CARTRIDGE Works: Keys as Routers in KV Caches
https://podcast.do-not-panic.com/episodes/2026-03-07-why-cartridge-works-keys-as-routers-in-k-887d13.mp3
Interactive Visualization: Memory in the Age of AI Agents: Forms, Functions, Dynamics

AI Post Transformers, by mcgrof