This episode explores the HyperAgents paper and its central claim: an AI system can improve not only its task behavior but also the procedure it uses to generate future improvements. It explains recursive self-improvement in practical terms, as an outer engineering loop over prompts, code, tools, memory, and evaluators, and contrasts this with standard deep learning, where the learning process itself stays fixed. The discussion covers why freezing the meta-agent creates a conceptual and practical bottleneck, how HyperAgents aim to remove that ceiling by making the improver itself editable, and why that could matter beyond coding, in domains such as reviewing or grading. Listeners will find a clear debate over whether this is a genuine step toward more general self-improving agents or simply cleaner packaging of familiar external scaffolding and control mechanisms.
Sources:
1. HyperAgents — Jenny Zhang, Bingchen Zhao, Wannan Yang, Jakob Foerster, Jeff Clune, Minqi Jiang, Sam Devlin, Tatiana Shavrina, 2026
http://arxiv.org/abs/2603.19461
2. Gödel Machines: Fully Self-Referential Optimal Universal Self-Improvers — Jürgen Schmidhuber, 2003
https://scholar.google.com/scholar?q=Gödel+Machines:+Fully+Self-Referential+Optimal+Universal+Self-Improvers
3. Self-Taught Optimizer (STOP): Recursively Self-Improving Code Generation — Eric Zelikman, Eliana Lorch, Lester Mackey, Adam Tauman Kalai, 2023
https://scholar.google.com/scholar?q=Self-Taught+Optimizer+(STOP):+Recursively+Self-Improving+Code+Generation
4. Darwin Gödel Machine: Open-Ended Evolution of Self-Improving Agents — Jenny Zhang, Shengran Hu, Cong Lu, Robert Lange, Jeff Clune, 2025
https://scholar.google.com/scholar?q=Darwin+Gödel+Machine:+Open-Ended+Evolution+of+Self-Improving+Agents
5. AlphaEvolve: A Coding Agent for Scientific and Algorithmic Discovery — Alexander Novikov, Ngân Vũ, Marvin Eisenberger and colleagues, 2025
https://scholar.google.com/scholar?q=AlphaEvolve:+A+Coding+Agent+for+Scientific+and+Algorithmic+Discovery
6. Self-Referential Meta Learning — Louis Kirsch, Jürgen Schmidhuber, 2022
https://scholar.google.com/scholar?q=Self-Referential+Meta+Learning
7. A Modern Self-Referential Weight Matrix That Learns to Modify Itself — Kazuki Irie, Imanol Schlag, Róbert Csordás, Jürgen Schmidhuber, 2022
https://scholar.google.com/scholar?q=A+Modern+Self-Referential+Weight+Matrix+That+Learns+to+Modify+Itself
8. Gödel Agent: A Self-Referential Agent Framework for Recursive Self-Improvement — Xunjian Yin, Xinyi Wang, Liangming Pan, Xiaojun Wan, William Yang Wang, 2025
https://scholar.google.com/scholar?q=Gödel+Agent:+A+Self-Referential+Agent+Framework+for+Recursive+Self-Improvement
9. Darwin Gödel Machine — Jenny Zhang et al., 2025
https://scholar.google.com/scholar?q=Darwin+Gödel+Machine
10. Self-Referential AI Systems — Louis Kirsch and Jürgen Schmidhuber, 2022
https://scholar.google.com/scholar?q=Self-Referential+AI+Systems
11. Recursive Self-Improvement Can Be Bounded or Self-Accelerating — Lu et al., 2023
https://scholar.google.com/scholar?q=Recursive+Self-Improvement+Can+Be+Bounded+or+Self-Accelerating
12. Polyglot — Gauthier, 2024
https://scholar.google.com/scholar?q=Polyglot
13. Paper Review Benchmark — Zhao et al., 2026
https://scholar.google.com/scholar?q=Paper+Review+Benchmark
14. Genesis — authors unknown, 2024
https://scholar.google.com/scholar?q=Genesis
15. Learning How to Remember: A Meta-Cognitive Management Method for Structured and Transferable Agent Memory — authors unknown, recent
https://scholar.google.com/scholar?q=Learning+How+to+Remember:+A+Meta-Cognitive+Management+Method+for+Structured+and+Transferable+Agent+Memory
16. Memory in the Age of AI Agents — authors unknown, recent
https://scholar.google.com/scholar?q=Memory+in+the+Age+of+AI+Agents
17. Discovering Hierarchical Software Engineering Agents via Bandit Optimization — authors unknown, recent
https://scholar.google.com/scholar?q=Discovering+Hierarchical+Software+Engineering+Agents+via+Bandit+Optimization
18. BOAD: Discovering Hierarchical Software Engineering Agents via Bandit Optimization — authors unknown, recent
https://scholar.google.com/scholar?q=BOAD:+Discovering+Hierarchical+Software+Engineering+Agents+via+Bandit+Optimization
19. Automated Design of Agentic Systems — authors unknown, recent
https://scholar.google.com/scholar?q=Automated+Design+of+Agentic+Systems
20. AI Post Transformers: Experiential Reinforcement Learning: Internalizing Reflection for Better Policy Training — Hal Turing & Dr. Ada Shannon, 2026
https://podcast.do-not-panic.com/episodes/experiential-reinforcement-learning-internalizing-reflection-for-better-policy-t/
21. AI Post Transformers: LLM Agents Reason About Code Without Running It — Hal Turing & Dr. Ada Shannon, 2026
https://podcast.do-not-panic.com/episodes/2026-03-15-llm-agents-reason-about-code-without-run-2a1876.mp3
22. AI Post Transformers: Mem0: Scalable Long-Term Memory for AI Agents — Hal Turing & Dr. Ada Shannon, 2025
https://podcast.do-not-panic.com/episodes/mem0-scalable-long-term-memory-for-ai-agents/
Interactive Visualization: HyperAgents and Metacognitive Self-Improvement