AI Post Transformers

Recursive Language Models for Arbitrarily Long Prompts



This episode explores a 2025 MIT CSAIL paper on “Recursive Language Models,” which argues that handling very long prompts may be better framed as a systems problem than as a bigger-context-window problem. It explains the distinction between hard context overflow and “context rot,” where models can technically fit long inputs but increasingly fail to use them reliably, challenging the assumption that larger windows automatically mean better memory. The discussion connects this idea to inference-time compute scaling, chain-of-thought, tree search, and agentic AI, showing how a model can iteratively inspect external information, call tools, and update state instead of forcing everything through a single forward pass. The episode offers a concrete alternative to the current long-context arms race and suggests a different path toward more capable, reliable language systems.
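The core mechanism discussed in the episode can be sketched in a few lines: rather than one forward pass over the whole prompt, the model treats the prompt as external data, recurses on pieces that fit its budget, and combines the partial answers. This is a minimal illustrative sketch, not the paper's implementation; `llm` is a hypothetical stand-in for a real model call, and `CHUNK` is an arbitrary budget.

```python
# Hedged sketch of a recursive language model: split an arbitrarily long
# context, answer each piece recursively, then ask the model to combine
# the partial answers. All names here are illustrative assumptions.

CHUNK = 1000  # max characters one "model call" accepts (arbitrary budget)

def llm(prompt: str) -> str:
    """Placeholder for a real model call; here it just tags input size."""
    return f"<answer for {len(prompt)} chars>"

def recursive_answer(query: str, context: str) -> str:
    # Base case: the context fits within one call's budget.
    if len(context) <= CHUNK:
        return llm(f"{query}\n\n{context}")
    # Recursive case: split the context, solve each half, then combine.
    mid = len(context) // 2
    left = recursive_answer(query, context[:mid])
    right = recursive_answer(query, context[mid:])
    return llm(f"{query}\n\nCombine these partial answers:\n{left}\n{right}")
```

Because the recursion bottoms out at a fixed budget, the prompt length the system can process is unbounded even though each individual call stays small, which is the systems-level framing the episode contrasts with simply growing the context window.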
Sources:
1. Recursive Language Models — Alex L. Zhang, Tim Kraska, Omar Khattab, 2025
http://arxiv.org/abs/2512.24601
2. Theoretical expressive power of reasoning models — Merrill & Sabharwal, 2024
https://scholar.google.com/scholar?q=Theoretical+expressive+power+of+reasoning+models
3. Context Rot — Hong et al., 2025
https://scholar.google.com/scholar?q=Context+Rot
4. Baleen: Robust Multi-Hop Reasoning at Scale via Condensed Retrieval — Khattab et al., 2021
https://scholar.google.com/scholar?q=Baleen:+Robust+Multi-Hop+Reasoning+at+Scale+via+Condensed+Retrieval
5. OpenAI context compaction work — OpenAI, 2025
https://scholar.google.com/scholar?q=OpenAI+context+compaction+work
6. Smith long-context compaction work — Smith, 2025
https://scholar.google.com/scholar?q=Smith+long-context+compaction+work
7. Wu et al. task-specific long-context methods — Wu et al., 2021
https://scholar.google.com/scholar?q=Wu+et+al.+task-specific+long-context+methods
8. Wu et al. context compaction / long-context scaffolding — Wu et al., 2025
https://scholar.google.com/scholar?q=Wu+et+al.+context+compaction+/+long-context+scaffolding
9. Anthropic self-delegation / sub-agent work — Anthropic, 2025
https://scholar.google.com/scholar?q=Anthropic+self-delegation+/+sub-agent+work
10. Schroeder et al. self-delegation work — Schroeder et al., 2025
https://scholar.google.com/scholar?q=Schroeder+et+al.+self-delegation+work
11. Sun et al. self-delegation work — Sun et al., 2025
https://scholar.google.com/scholar?q=Sun+et+al.+self-delegation+work
12. Deep research benchmark/work — Chen et al., 2025
https://scholar.google.com/scholar?q=Deep+research+benchmark/work
13. Information aggregation benchmark/work — Bertsch et al., 2025
https://scholar.google.com/scholar?q=Information+aggregation+benchmark/work
14. Code repository understanding benchmark/work — Bai et al., 2025
https://scholar.google.com/scholar?q=Code+repository+understanding+benchmark/work
15. Qwen3 technical report — Yang et al., 2025
https://scholar.google.com/scholar?q=Qwen3+technical+report
16. Core Context Aware Transformers for Long Context Language Modeling — approx. 2024 long-context transformer authors, 2024
https://scholar.google.com/scholar?q=Core+Context+Aware+Transformers+for+Long+Context+Language+Modeling
17. Challenges in Deploying Long-Context Transformers: A Theoretical Peak Performance Analysis — approx. 2024 systems/theory authors, 2024
https://scholar.google.com/scholar?q=Challenges+in+Deploying+Long-Context+Transformers:+A+Theoretical+Peak+Performance+Analysis
18. Recursively Summarizing Enables Long-Term Dialogue Memory in Large Language Models — approx. 2024 dialogue-memory authors, 2024
https://scholar.google.com/scholar?q=Recursively+Summarizing+Enables+Long-Term+Dialogue+Memory+in+Large+Language+Models
19. Augmenting Language Models with Long-Term Memory — approx. LONGMEM authors, 2024
https://scholar.google.com/scholar?q=Augmenting+Language+Models+with+Long-Term+Memory
20. Retrieval Augmented Generation or Long-Context LLMs? A Comprehensive Study and Hybrid Approach — approx. 2024/2025 comparative-study authors, 2024/2025
https://scholar.google.com/scholar?q=Retrieval+Augmented+Generation+or+Long-Context+LLMs?+A+Comprehensive+Study+and+Hybrid+Approach
21. LongRAG: Enhancing Retrieval-Augmented Generation with Long-Context LLMs — approx. LongRAG authors, 2024/2025
https://scholar.google.com/scholar?q=LongRAG:+Enhancing+Retrieval-Augmented+Generation+with+Long-Context+LLMs
22. Let's (Not) Just Put Things in Context: Test-Time Training for Long-Context LLMs — approx. 2025 test-time-training authors, 2025
https://scholar.google.com/scholar?q=Let's+(Not)+Just+Put+Things+in+Context:+Test-Time+Training+for+Long-Context+LLMs
23. Z1: Efficient Test-Time Scaling with Code — approx. Z1 authors, 2025
https://scholar.google.com/scholar?q=Z1:+Efficient+Test-Time+Scaling+with+Code
24. A Survey on Test-Time Scaling in Large Language Models: What, How, Where, and How Well? — approx. 2025 survey authors, 2025
https://scholar.google.com/scholar?q=A+Survey+on+Test-Time+Scaling+in+Large+Language+Models:+What,+How,+Where,+and+How+Well?
25. AI Post Transformers: NVIDIA: TTT-E2E: Unlocking Long-Context Learning via End-to-End Test-Time Training — Hal Turing & Dr. Ada Shannon, 2026
https://podcast.do-not-panic.com/episodes/nvidia-ttt-e2e-unlocking-long-context-learning-via-end-to-end-test-time-training/
26. AI Post Transformers: Generalist Reward Modeling with Inference-Time Scaling — Hal Turing & Dr. Ada Shannon, 2025
https://podcast.do-not-panic.com/episodes/generalist-reward-modeling-with-inference-time-scaling/
27. AI Post Transformers: AI Agent Traps and Prompt Injection — Hal Turing & Dr. Ada Shannon, 2026
https://podcast.do-not-panic.com/episodes/2026-04-02-ai-agent-traps-and-prompt-injection-7ce4ba.mp3
28. AI Post Transformers: Agentic AI and the Next Intelligence Explosion — Hal Turing & Dr. Ada Shannon, 2026
https://podcast.do-not-panic.com/episodes/2026-03-28-agentic-ai-and-the-next-intelligence-exp-d06561.mp3
29. AI Post Transformers: Experiential Reinforcement Learning: Internalizing Reflection for Better Policy Training — Hal Turing & Dr. Ada Shannon, 2026
https://podcast.do-not-panic.com/episodes/experiential-reinforcement-learning-internalizing-reflection-for-better-policy-t/
Interactive Visualization: Recursive Language Models for Arbitrarily Long Prompts

AI Post Transformers, by mcgrof