Tyler Barnes, founding engineer at Mastra, introduces Observational Memory, a new memory system for AI agents that achieves state-of-the-art results on LongMemEval while keeping the context window completely stable.
Unlike semantic recall (which uses RAG and invalidates prompt caching), Observational Memory compresses conversations into dense observations while maintaining a stable, fully cacheable context.
The result: 94.87% accuracy on LongMemEval with GPT-5 mini. This is the highest score recorded by any memory system to date.
In this conversation, Tyler explains how the system works, why it outperforms raw context, and how you can integrate it into your agents in under 20 minutes. We also dive into the research, the benchmarks, and what's next for Observational Memory.
Observational Memory Launch Blog: https://mastra.ai/blog/observational-memory
Full Research Breakdown: https://mastra.ai/research/observational-memory
Tyler Barnes on X: https://x.com/tylbar
Tyler's Announcement Post (Feb 9): https://x.com/tylbar/status/2020924183979397512
Mastra: https://mastra.ai
Learn Mastra in the world's first MCP-Based Course: https://mastra.ai/course
Principles of Building AI Agents (Book): https://mastra.ai/book
Patterns for Building AI Agents (New Book): https://mastra.ai/blog/patterns-book https://docs.google.com/forms/d/e/1FAIpQLSduJjc515f6RZJqtkR2ByqJZrB0iP8B7SUKnjjZE9IajH_I8w/viewform
Mastra is an open-source TypeScript framework designed for building and shipping AI-powered applications and agents with minimal friction. It supports the full lifecycle of agent development, from prototype to production. You can integrate it with frontend and backend stacks (e.g., React, Next.js, Node) or run agents as standalone services. If you're a JavaScript or TypeScript developer looking to build an agentic or AI-powered product without starting from scratch, Mastra provides the scaffolding, tools, and integrations to accelerate that process.
00:26 – The Origin Story
01:14 – Previous Memory Systems: Semantic Recall vs Working Memory
02:23 – How Observational Memory Works
03:52 – Human-Inspired Memory System
06:11 – Buffered Observations
06:32 – Research & Benchmarks
10:34 – Live Demo
13:57 – No More Compaction Hell
15:08 – Performance & Cost Benefits
16:42 – Shipped Code vs Research Papers
17:33 – Future Roadmap & Next Ideas