AI Post Transformers

Memory in the Age of AI Agents: Forms, Functions, Dynamics



This episode examines a comprehensive survey paper that proposes a new framework for understanding memory in AI agent systems. The authors move beyond traditional cognitive-psychology categories (such as short-term versus long-term memory) and instead organize agent memory along three dimensions: form (how memory is stored: as tokens, parameters, or latent vectors), function (what memory represents: factual knowledge, past experiences, or working state), and dynamics (how memory is created, updated, and retrieved over time). The discussion clarifies how agent memory differs from an LLM's parametric knowledge, from RAG systems, and from simple prompt engineering, emphasizing that agent memory is fundamentally stateful, task-specific information that persists and evolves across interactions. Listeners interested in building more sophisticated AI systems will find valuable distinctions between concepts that are often conflated in practice.
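The three axes described above can be sketched as a toy data model. This is illustrative only: the class and field names below are my own, not the survey's, and the store is a plain dictionary standing in for whatever backend (context window, weights, vector index) a real agent would use.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Any

class Form(Enum):
    """How a memory is stored (the 'form' axis)."""
    TOKEN = auto()      # text/tokens in context or an external store
    PARAMETER = auto()  # baked into model weights, e.g. via fine-tuning
    LATENT = auto()     # vectors/activations, e.g. a KV cache or embedding

class Function(Enum):
    """What a memory represents (the 'function' axis)."""
    FACTUAL = auto()       # stable knowledge about the world
    EXPERIENTIAL = auto()  # records of past episodes and interactions
    WORKING = auto()       # transient state for the current task

@dataclass
class MemoryEntry:
    content: Any
    form: Form
    function: Function
    version: int = 0  # bumped on each update (the 'dynamics' axis)

class AgentMemory:
    """Toy key-value store exposing create/update/retrieve dynamics."""
    def __init__(self):
        self._entries: dict[str, MemoryEntry] = {}

    def create(self, key: str, content: Any, form: Form, function: Function):
        self._entries[key] = MemoryEntry(content, form, function)

    def update(self, key: str, content: Any):
        entry = self._entries[key]  # raises KeyError if never created
        entry.content = content
        entry.version += 1          # memory evolves across interactions

    def retrieve(self, function: Function) -> list[MemoryEntry]:
        return [e for e in self._entries.values() if e.function is function]
```

For example, a user's current goal would be a TOKEN-form WORKING memory that gets updated as the task progresses, while a fine-tuned fact would live in PARAMETER form with FACTUAL function; the point of the taxonomy is that these axes vary independently.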
Sources:
1. Memory in the Age of AI Agents — Yuyang Hu, Shichun Liu, Yanwei Yue, Guibin Zhang, Boyang Liu, Fangyi Zhu, Jiahang Lin, Honglin Guo, Shihan Dou, Zhiheng Xi, Senjie Jin, Jiejun Tan, Yanbin Yin, Jiongnan Liu, Zeyu Zhang, Zhongxiang Sun, Yutao Zhu, Hao Sun, Boci Peng, Zhenrong Cheng, Xuanbo Fan, Jiaxin Guo, Xinlei Yu, Zhenhong Zhou, Zewen Hu, Jiahao Huo, Junhao Wang, Yuwei Niu, Yu Wang, Zhenfei Yin, Xiaobin Hu, Yue Liao, Qiankun Li, Kun Wang, Wangchunshu Zhou, Yixin Liu, Dawei Cheng, Qi Zhang, Tao Gui, Shirui Pan, Yan Zhang, Philip Torr, Zhicheng Dou, Ji-Rong Wen, Xuanjing Huang, Yu-Gang Jiang, Shuicheng Yan, 2025
http://arxiv.org/abs/2512.13564v2
2. Generative Agents: Interactive Simulacra of Human Behavior — Park, O'Brien, Cai, Morris, Liang, Bernstein, 2023
https://scholar.google.com/scholar?q=Generative+Agents:+Interactive+Simulacra+of+Human+Behavior
3. MemGPT: Towards LLMs as Operating Systems — Packer, Fang, Patil, Wooders, Stoica, Gonzalez, 2023
https://scholar.google.com/scholar?q=MemGPT:+Towards+LLMs+as+Operating+Systems
4. Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks — Lewis, Perez, Piktus, Petroni, Karpukhin, Goyal, Küttler, Lewis, Yih, Rocktäschel, Riedel, Kiela, 2020
https://scholar.google.com/scholar?q=Retrieval-Augmented+Generation+for+Knowledge-Intensive+NLP+Tasks
5. In-Context Learning and Induction Heads — Olsson, Elhage, Nanda, Joseph, Drain, Bau, Schiefer, Ndousse, Henighan, Lovitt, Chen, Kaplan, Anthropic, 2022
https://scholar.google.com/scholar?q=In-Context+Learning+and+Induction+Heads
6. LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models — Chen, Borgeaud, Mensch, Sutskever, Sifre, Schofield, Shazeer, Lazaridou, de Freitas, others, 2023
https://scholar.google.com/scholar?q=LongLoRA:+Efficient+Fine-tuning+of+Long-Context+Large+Language+Models
7. Lost in the Middle: How Language Models Use Long Contexts — Liu, Iter, Xu, Yuksekgonul, Zou, others, 2023
https://scholar.google.com/scholar?q=Lost+in+the+Middle:+How+Language+Models+Use+Long+Contexts
8. Recursively Summarizing Books with Human Feedback — Wu, Ouyang, Ziegler, Stiennon, Lowe, Leike, Christiano, 2021
https://scholar.google.com/scholar?q=Recursively+Summarizing+Books+with+Human+Feedback
9. Low-Rank Adaptation of Large Language Models (LoRA) — Hu, Shen, Wallis, Allen-Zhu, Li, Wang, Chen, 2021
https://scholar.google.com/scholar?q=Low-Rank+Adaptation+of+Large+Language+Models+(LoRA)
10. Overcoming Catastrophic Forgetting in Neural Networks (EWC) — Kirkpatrick, Pascanu, Rabinowitz, Veness, Desjardins, Rusu, Milan, others, 2017
https://scholar.google.com/scholar?q=Overcoming+Catastrophic+Forgetting+in+Neural+Networks+(EWC)
11. Model-Agnostic Meta-Learning (MAML) — Finn, Abbeel, Levine, 2017
https://scholar.google.com/scholar?q=Model-Agnostic+Meta-Learning+(MAML)
12. Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context — Dai, Yang, Yang, Carbonell, Le, Salakhutdinov, 2019
https://scholar.google.com/scholar?q=Transformer-XL:+Attentive+Language+Models+Beyond+a+Fixed-Length+Context
13. Memorizing Transformers — Wu, Rabe, Hutchins, Szlam, 2022
https://scholar.google.com/scholar?q=Memorizing+Transformers
14. Retrieval-Enhanced Transformer (RETRO) — Borgeaud, Mensch, Hoffmann, Cai, Rutherford, Millican, van den Driessche, others, 2022
https://scholar.google.com/scholar?q=Retrieval-Enhanced+Transformer+(RETRO)
15. CommNet: Learning Multiagent Communication with Backpropagation — Sukhbaatar, Szlam, Fergus, 2016
https://scholar.google.com/scholar?q=CommNet:+Learning+Multiagent+Communication+with+Backpropagation
16. Learning to Communicate with Deep Multi-Agent Reinforcement Learning — Foerster, Assael, de Freitas, Whiteson, 2016
https://scholar.google.com/scholar?q=Learning+to+Communicate+with+Deep+Multi-Agent+Reinforcement+Learning
17. Emergent Tool Use from Multi-Agent Autocurricula — Baker, Kanitscheider, Markov, Wu, Powell, McGrew, Mordatch (OpenAI), 2020
https://scholar.google.com/scholar?q=Emergent+Tool+Use+from+Multi-Agent+Autocurricula
18. Reflexion: Language Agents with Verbal Reinforcement Learning — Shinn et al., 2023
https://scholar.google.com/scholar?q=Reflexion:+Language+Agents+with+Verbal+Reinforcement+Learning
19. RET-LLM: Towards a General Read-Write Memory for Large Language Models — Modarressi et al., 2024
https://scholar.google.com/scholar?q=RET-LLM:+Towards+a+General+Read-Write+Memory+for+Large+Language+Models
20. Voyager: An Open-Ended Embodied Agent with Large Language Models — Wang et al., 2023
https://scholar.google.com/scholar?q=Voyager:+An+Open-Ended+Embodied+Agent+with+Large+Language+Models
21. In-Context Retrieval-Augmented Language Models — Ram et al., 2023
https://scholar.google.com/scholar?q=In-Context+Retrieval-Augmented+Language+Models
22. Scaling Laws for Associative Memories — Ramsauer et al., 2021
https://scholar.google.com/scholar?q=Scaling+Laws+for+Associative+Memories
23. Do Machine Learning Models Memorize or Generalize? — Feldman, 2020
https://scholar.google.com/scholar?q=Do+Machine+Learning+Models+Memorize+or+Generalize?
24. LongTableBench: benchmarking long-context table reasoning across real-world formats and domains — approximate, 2024-2025
https://scholar.google.com/scholar?q=LongTableBench:+benchmarking+long-context+table+reasoning+across+real-world+formats+and+domains
25. Reasoning-Focused Evaluation of Efficient Long-Context Inference Techniques — approximate, 2024-2025
https://scholar.google.com/scholar?q=Reasoning-Focused+Evaluation+of+Efficient+Long-Context+Inference+Techniques
26. Cognitive Workspace: Active Memory Management for LLMs—An Empirical Study of Functional Infinite Context — approximate, 2024-2025
https://scholar.google.com/scholar?q=Cognitive+Workspace:+Active+Memory+Management+for+LLMs—An+Empirical+Study+of+Functional+Infinite+Context
27. Continual learning and catastrophic forgetting — approximate, 2020s
https://scholar.google.com/scholar?q=Continual+learning+and+catastrophic+forgetting
28. ESSENTIAL: Episodic and Semantic Memory Integration for Video Class-Incremental Learning — approximate, 2024-2025
https://scholar.google.com/scholar?q=ESSENTIAL:+Episodic+and+Semantic+Memory+Integration+for+Video+Class-Incremental+Learning
29. Adaptive compression as a unifying framework for episodic and semantic memory — approximate, 2020s
https://scholar.google.com/scholar?q=Adaptive+compression+as+a+unifying+framework+for+episodic+and+semantic+memory
30. MAG: Memory Augmented Knowledge Extraction Generation for Large Language Models — approximate, 2024-2025
https://scholar.google.com/scholar?q=MAG:+Memory+Augmented+Knowledge+Extraction+Generation+for+Large+Language+Models
31. Gradient Descent at Inference Time for LLM Reasoning
https://podcast.do-not-panic.com/episodes/2026-03-10-gradient-descent-at-inference-time-for-l-20617d.mp3
32. LLM Agents Reason About Code Without Running It
https://podcast.do-not-panic.com/episodes/2026-03-15-llm-agents-reason-about-code-without-run-2a1876.mp3
33. Emergent Cooperation in Self-Interested Multi-Agent AI
https://podcast.do-not-panic.com/episodes/2026-03-13-emergent-cooperation-in-self-interested-9c0b4c.mp3
34. 50x KV Cache Compression in Seconds via Attention Matching
https://podcast.do-not-panic.com/episodes/2026-03-09-50x-kv-cache-compression-in-seconds-via-9402c1.mp3
35. DualPath Breaks Storage Bandwidth Bottleneck in Agentic Inference
https://podcast.do-not-panic.com/episodes/2026-03-07-dualpath-breaks-storage-bandwidth-bottle-bc9a82.mp3
36. NVIDIA Nemotron 3 Hybrid SSM Transformer Architecture
https://podcast.do-not-panic.com/episodes/2026-03-07-nvidia-nemotron-3-hybrid-ssm-transformer-f9a91b.mp3
37. Structured State Space Duality Unifies Transformers and SSMs
https://podcast.do-not-panic.com/episodes/2026-03-07-structured-state-space-duality-unifies-t-bb2659.mp3
38. Why CARTRIDGE Works: Keys as Routers in KV Caches
https://podcast.do-not-panic.com/episodes/2026-03-07-why-cartridge-works-keys-as-routers-in-k-887d13.mp3
Interactive Visualization: Memory in the Age of AI Agents: Forms, Functions, Dynamics

AI Post Transformers, by mcgrof