AI Post Transformers

Recursive Language Models for Arbitrarily Long Prompts



This episode explores a 2025 MIT CSAIL paper on “Recursive Language Models,” which argues that handling very long prompts may be better framed as a systems problem than as a bigger-context-window problem. It explains the distinction between hard context overflow and “context rot,” where models can technically fit long inputs but increasingly fail to use them reliably, challenging the assumption that larger windows automatically mean better memory. The discussion connects this idea to inference-time compute scaling, chain-of-thought, tree search, and agentic AI, showing how a model can iteratively inspect external information, call tools, and update state instead of forcing everything through a single forward pass. The episode offers a concrete alternative to the current long-context arms race and suggests a different path toward more capable, reliable language systems.
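The core mechanism discussed in the episode can be sketched in a few lines: rather than one forward pass over the whole prompt, the model treats the prompt as external data, recurses on pieces that fit its budget, and combines the partial answers. This is a minimal illustrative sketch, not the paper's implementation; `llm` is a hypothetical stand-in for a real model call, and `CHUNK` is an arbitrary budget.

```python
# Hedged sketch of a recursive language model: split an arbitrarily long
# context, answer each piece recursively, then ask the model to combine
# the partial answers. All names here are illustrative assumptions.

CHUNK = 1000  # max characters one "model call" accepts (arbitrary budget)

def llm(prompt: str) -> str:
    """Placeholder for a real model call; here it just tags input size."""
    return f"<answer for {len(prompt)} chars>"

def recursive_answer(query: str, context: str) -> str:
    # Base case: the context fits within one call's budget.
    if len(context) <= CHUNK:
        return llm(f"{query}\n\n{context}")
    # Recursive case: split the context, solve each half, then combine.
    mid = len(context) // 2
    left = recursive_answer(query, context[:mid])
    right = recursive_answer(query, context[mid:])
    return llm(f"{query}\n\nCombine these partial answers:\n{left}\n{right}")
```

Because the recursion bottoms out at a fixed budget, the prompt length the system can process is unbounded even though each individual call stays small, which is the systems-level framing the episode contrasts with simply growing the context window.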
Sources:
1. Recursive Language Models — Alex L. Zhang, Tim Kraska, Omar Khattab, 2025
http://arxiv.org/abs/2512.24601
2. Theoretical expressive power of reasoning models — Merrill & Sabharwal, 2024
https://scholar.google.com/scholar?q=Theoretical+expressive+power+of+reasoning+models
3. Context Rot — Hong et al., 2025
https://scholar.google.com/scholar?q=Context+Rot
4. Baleen: Robust Multi-Hop Reasoning at Scale via Condensed Retrieval — Khattab et al., 2021
https://scholar.google.com/scholar?q=Baleen:+Robust+Multi-Hop+Reasoning+at+Scale+via+Condensed+Retrieval
5. OpenAI context compaction work — OpenAI, 2025
https://scholar.google.com/scholar?q=OpenAI+context+compaction+work
6. Smith long-context compaction work — Smith, 2025
https://scholar.google.com/scholar?q=Smith+long-context+compaction+work
7. Wu et al. task-specific long-context methods — Wu et al., 2021
https://scholar.google.com/scholar?q=Wu+et+al.+task-specific+long-context+methods
8. Wu et al. context compaction / long-context scaffolding — Wu et al., 2025
https://scholar.google.com/scholar?q=Wu+et+al.+context+compaction+/+long-context+scaffolding
9. Anthropic self-delegation / sub-agent work — Anthropic, 2025
https://scholar.google.com/scholar?q=Anthropic+self-delegation+/+sub-agent+work
10. Schroeder et al. self-delegation work — Schroeder et al., 2025
https://scholar.google.com/scholar?q=Schroeder+et+al.+self-delegation+work
11. Sun et al. self-delegation work — Sun et al., 2025
https://scholar.google.com/scholar?q=Sun+et+al.+self-delegation+work
12. Deep research benchmark/work — Chen et al., 2025
https://scholar.google.com/scholar?q=Deep+research+benchmark/work
13. Information aggregation benchmark/work — Bertsch et al., 2025
https://scholar.google.com/scholar?q=Information+aggregation+benchmark/work
14. Code repository understanding benchmark/work — Bai et al., 2025
https://scholar.google.com/scholar?q=Code+repository+understanding+benchmark/work
15. Qwen3 technical report — Yang et al., 2025
https://scholar.google.com/scholar?q=Qwen3+technical+report
16. Core Context Aware Transformers for Long Context Language Modeling — approx. 2024 long-context transformer authors, 2024
https://scholar.google.com/scholar?q=Core+Context+Aware+Transformers+for+Long+Context+Language+Modeling
17. Challenges in Deploying Long-Context Transformers: A Theoretical Peak Performance Analysis — approx. 2024 systems/theory authors, 2024
https://scholar.google.com/scholar?q=Challenges+in+Deploying+Long-Context+Transformers:+A+Theoretical+Peak+Performance+Analysis
18. Recursively Summarizing Enables Long-Term Dialogue Memory in Large Language Models — approx. 2024 dialogue-memory authors, 2024
https://scholar.google.com/scholar?q=Recursively+Summarizing+Enables+Long-Term+Dialogue+Memory+in+Large+Language+Models
19. Augmenting Language Models with Long-Term Memory — approx. LONGMEM authors, 2024
https://scholar.google.com/scholar?q=Augmenting+Language+Models+with+Long-Term+Memory
20. Retrieval Augmented Generation or Long-Context LLMs? A Comprehensive Study and Hybrid Approach — approx. 2024/2025 comparative-study authors, 2024/2025
https://scholar.google.com/scholar?q=Retrieval+Augmented+Generation+or+Long-Context+LLMs?+A+Comprehensive+Study+and+Hybrid+Approach
21. LongRAG: Enhancing Retrieval-Augmented Generation with Long-Context LLMs — approx. LongRAG authors, 2024/2025
https://scholar.google.com/scholar?q=LongRAG:+Enhancing+Retrieval-Augmented+Generation+with+Long-Context+LLMs
22. Let's (Not) Just Put Things in Context: Test-Time Training for Long-Context LLMs — approx. 2025 test-time-training authors, 2025
https://scholar.google.com/scholar?q=Let's+(Not)+Just+Put+Things+in+Context:+Test-Time+Training+for+Long-Context+LLMs
23. Z1: Efficient Test-Time Scaling with Code — approx. Z1 authors, 2025
https://scholar.google.com/scholar?q=Z1:+Efficient+Test-Time+Scaling+with+Code
24. A Survey on Test-Time Scaling in Large Language Models: What, How, Where, and How Well? — approx. 2025 survey authors, 2025
https://scholar.google.com/scholar?q=A+Survey+on+Test-Time+Scaling+in+Large+Language+Models:+What,+How,+Where,+and+How+Well?
25. AI Post Transformers: NVIDIA: TTT-E2E: Unlocking Long-Context Learning via End-to-End Test-Time Training — Hal Turing & Dr. Ada Shannon, 2026
https://podcast.do-not-panic.com/episodes/nvidia-ttt-e2e-unlocking-long-context-learning-via-end-to-end-test-time-training/
26. AI Post Transformers: Generalist Reward Modeling with Inference-Time Scaling — Hal Turing & Dr. Ada Shannon, 2025
https://podcast.do-not-panic.com/episodes/generalist-reward-modeling-with-inference-time-scaling/
27. AI Post Transformers: AI Agent Traps and Prompt Injection — Hal Turing & Dr. Ada Shannon, 2026
https://podcast.do-not-panic.com/episodes/2026-04-02-ai-agent-traps-and-prompt-injection-7ce4ba.mp3
28. AI Post Transformers: Agentic AI and the Next Intelligence Explosion — Hal Turing & Dr. Ada Shannon, 2026
https://podcast.do-not-panic.com/episodes/2026-03-28-agentic-ai-and-the-next-intelligence-exp-d06561.mp3
29. AI Post Transformers: Experiential Reinforcement Learning: Internalizing Reflection for Better Policy Training — Hal Turing & Dr. Ada Shannon, 2026
https://podcast.do-not-panic.com/episodes/experiential-reinforcement-learning-internalizing-reflection-for-better-policy-t/
Interactive Visualization: Recursive Language Models for Arbitrarily Long Prompts

AI Post Transformers, by mcgrof