AI Post Transformers

Kosmos AI Scientist for Autonomous Discovery



This episode explores a 2025 paper on “Kosmos,” an AI scientist designed to carry out long-horizon research by combining literature search, hypothesis generation, code-based data analysis, and persistent memory. The discussion argues that the real innovation is not a smarter standalone language model but a software architecture that uses agentic workflows and a structured “world model” to preserve evidence, hypotheses, and task state across many steps. It also clarifies distinctions often blurred in AI discourse, separating AI for scientific discovery from standard deep learning, and distinguishing this kind of world model from the latent simulators used in reinforcement learning. The episode offers a grounded look at what it would take for AI to function like a junior computational scientist, and at where the genuine advances may lie beyond the hype.
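To make the "structured world model" idea concrete, here is a minimal Python sketch of a persistent store that separate agent steps read from and write to. It is not the Kosmos implementation: the class names (`WorldModel`, `Hypothesis`), the status labels, and the placeholder hypothesis and findings are all assumptions made up for illustration. The point is only that evidence, hypotheses, and task state live in an explicit data structure rather than in a model's context window or in a learned latent simulator.

```python
from dataclasses import dataclass, field


@dataclass
class Hypothesis:
    statement: str
    status: str = "open"  # hypothetical labels: "open", "supported", "refuted"
    evidence_ids: list[int] = field(default_factory=list)


@dataclass
class WorldModel:
    """Hypothetical persistent state shared across agent steps."""
    evidence: list[dict] = field(default_factory=list)
    hypotheses: dict[str, Hypothesis] = field(default_factory=dict)
    pending_tasks: list[str] = field(default_factory=list)

    def add_evidence(self, source: str, finding: str) -> int:
        # Store the finding with its provenance and return its index.
        self.evidence.append({"source": source, "finding": finding})
        return len(self.evidence) - 1

    def link(self, name: str, evidence_id: int, status: str) -> None:
        # Attach a piece of evidence to a hypothesis and update its status.
        hyp = self.hypotheses[name]
        hyp.evidence_ids.append(evidence_id)
        hyp.status = status

    def summary(self) -> str:
        # Compact text view that a later step could be given as context.
        lines = [
            f"{name}: {hyp.status} ({len(hyp.evidence_ids)} evidence)"
            for name, hyp in self.hypotheses.items()
        ]
        lines.append(f"pending: {len(self.pending_tasks)}")
        return "\n".join(lines)


def demo() -> WorldModel:
    # All strings below are placeholders, not results from the paper.
    wm = WorldModel()
    wm.hypotheses["H1"] = Hypothesis("placeholder hypothesis")
    wm.pending_tasks += ["literature search for H1", "data analysis for H1"]

    # Step 1: a literature step records what it found and clears its task.
    e0 = wm.add_evidence("literature", "placeholder finding from a paper")
    wm.link("H1", e0, "open")
    wm.pending_tasks.remove("literature search for H1")

    # Step 2: an analysis step builds on the stored state, not on chat history.
    e1 = wm.add_evidence("analysis", "placeholder finding from code")
    wm.link("H1", e1, "supported")
    wm.pending_tasks.remove("data analysis for H1")
    return wm


if __name__ == "__main__":
    print(demo().summary())
```

Because each step only touches the shared store, a run could in principle be paused, inspected, or resumed from the stored state, which is the property the episode highlights for long-horizon work.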
Sources:
1. Kosmos: An AI Scientist for Autonomous Discovery — Ludovico Mitchener, Angela Yiu, Benjamin Chang, Mathieu Bourdenx, Tyler Nadolski, Arvis Sulovari, Eric C. Landsness, Daniel L. Barabasi, Siddharth Narayanan, Nicky Evans, Shriya Reddy, Martha Foiani, Aizad Kamal, Leah P. Shriver, Fang Cao, Asmamaw T. Wassie, Jon M. Laurent, Edwin Melville-Green, Mayk Caldas, Albert Bou, Kaleigh F. Roberts, Sladjana Zagorac, Timothy C. Orr, Miranda E. Orr, Kevin J. Zwezdaryk, Ali E. Ghareeb, Laurie McCoy, Bruna Gomes, Euan A. Ashley, Karen E. Duff, Tonio Buonassisi, Tom Rainforth, Randall J. Bateman, Michael Skarlinski, Samuel G. Rodriques, Michaela M. Hinks, Andrew D. White, 2025
http://arxiv.org/abs/2511.02824
2. The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery — Chris Lu, Cong Lu, Robert Tjarko Lange, Jakob Foerster, Jeff Clune, David Ha and collaborators at Sakana AI, 2024
https://scholar.google.com/scholar?q=The+AI+Scientist:+Towards+Fully+Automated+Open-Ended+Scientific+Discovery
3. Towards an AI co-scientist — Google Research collaborators including teams working on Gemini-based scientific reasoning systems, 2025
https://scholar.google.com/scholar?q=Towards+an+AI+co-scientist
4. Robin: an agentic system for automating scientific discovery in therapeutics — Andrew D. White, Samuel G. Rodriques and collaborators, 2024
https://scholar.google.com/scholar?q=Robin:+an+agentic+system+for+automating+scientific+discovery+in+therapeutics
5. Autonomous chemical research with large language models — Various groups; a representative line includes LLM-driven chemistry agents integrating planning, literature, and lab or simulation tools, 2023-2025
https://scholar.google.com/scholar?q=Autonomous+chemical+research+with+large+language+models
6. Robin — Not fully specified in the excerpt; cited as [1] and described as the authors' previous system, Unknown from excerpt
https://scholar.google.com/scholar?q=Robin
7. The AI Scientist — Sakana AI team; cited as [2], Likely 2024
https://scholar.google.com/scholar?q=The+AI+Scientist
8. AI co-scientist — Google team; cited as [3], Likely 2025
https://scholar.google.com/scholar?q=AI+co-scientist
9. Virtual Lab — Cited as [4]; exact authors not given in excerpt, Unknown from excerpt
https://scholar.google.com/scholar?q=Virtual+Lab
10. Edison Scientific data analysis agent — Cited as [5]; exact authors not given in excerpt, Unknown from excerpt
https://scholar.google.com/scholar?q=Edison+Scientific+data+analysis+agent
11. Edison Scientific literature search agent — Cited as [6]; exact authors not given in excerpt, Unknown from excerpt
https://scholar.google.com/scholar?q=Edison+Scientific+literature+search+agent
12. Planner Matters! An Efficient and Memory-Augmented Multi-agent Framework for Long-horizon GUI Planning — approx. recent multi-agent/planning paper, authors unclear from snippet, 2024/2025
https://scholar.google.com/scholar?q=Planner+Matters!+An+Efficient+and+Memory-Augmented+Multi-agent+Framework+for+Long-horizon+GUI+Planning
13. Memory-Driven Agent Planning for Long-Horizon Tasks via Hierarchical Encoding and Dynamic Retrieval — approx. recent agent-memory paper, authors unclear from snippet, 2024/2025
https://scholar.google.com/scholar?q=Memory-Driven+Agent+Planning+for+Long-Horizon+Tasks+via+Hierarchical+Encoding+and+Dynamic+Retrieval
14. Optimus-1: Hybrid multimodal memory empowered agents excel in long-horizon tasks — approx. Optimus-1 authors, exact names unclear, 2024
https://scholar.google.com/scholar?q=Optimus-1:+Hybrid+multimodal+memory+empowered+agents+excel+in+long-horizon+tasks
15. Hallucination mitigation for retrieval-augmented large language models: a review — approx. review authors unclear, 2024/2025
https://scholar.google.com/scholar?q=Hallucination+mitigation+for+retrieval-augmented+large+language+models:+a+review
16. Grounding fallacies misrepresenting scientific publications in evidence — approx. authors unclear from snippet, 2024/2025
https://scholar.google.com/scholar?q=Grounding+fallacies+misrepresenting+scientific+publications+in+evidence
17. Zero-shot scientific claim verification using LLMs and citation text — approx. authors unclear from snippet, 2024/2025
https://scholar.google.com/scholar?q=Zero-shot+scientific+claim+verification+using+LLMs+and+citation+text
18. Learning fine-grained grounded citations for attributed large language models — approx. authors unclear from snippet, 2024/2025
https://scholar.google.com/scholar?q=Learning+fine-grained+grounded+citations+for+attributed+large+language+models
19. The cost of dynamic reasoning: Demystifying AI agents and test-time scaling from an AI infrastructure perspective — approx. authors unclear from snippet, 2025
https://scholar.google.com/scholar?q=The+cost+of+dynamic+reasoning:+Demystifying+AI+agents+and+test-time+scaling+from+an+AI+infrastructure+perspective
20. The illusion of diminishing returns: Measuring long horizon execution in LLMs — approx. authors unclear from snippet, 2024/2025
https://scholar.google.com/scholar?q=The+illusion+of+diminishing+returns:+Measuring+long+horizon+execution+in+LLMs
21. AI Post Transformers: Agentic AI and the Next Intelligence Explosion — Hal Turing & Dr. Ada Shannon, 2026
https://podcast.do-not-panic.com/episodes/2026-03-28-agentic-ai-and-the-next-intelligence-exp-d06561.mp3
22. AI Post Transformers: Mem0: Scalable Long-Term Memory for AI Agents — Hal Turing & Dr. Ada Shannon, 2025
https://podcast.do-not-panic.com/episodes/mem0-scalable-long-term-memory-for-ai-agents/
23. AI Post Transformers: LeWorldModel: Stable Joint-Embedding World Models from Pixels — Hal Turing & Dr. Ada Shannon, 2026
https://podcast.do-not-panic.com/episodes/2026-03-25-leworldmodel-stable-joint-embedding-worl-650f9f.mp3
24. AI Post Transformers: Hallucination to Truth: A Review of Fact-Checking and Factuality Evaluation in Large Language Model — Hal Turing & Dr. Ada Shannon, 2025
https://podcast.do-not-panic.com/episodes/hallucination-to-truth-a-review-of-fact-checking-and-factuality-evaluation-in-la/
25. AI Post Transformers: MetaGraph: knowledge graphs from financial NLP — Hal Turing & Dr. Ada Shannon, 2025
https://podcast.do-not-panic.com/episodes/metagraph-knowledge-graphs-from-financial-nlp/
26. AI Post Transformers: Survey of Emerging Topics in AI and Robotics — Hal Turing & Dr. Ada Shannon, 2025
https://podcast.do-not-panic.com/episodes/survey-of-emerging-topics-in-ai-and-robotics/
27. AI Post Transformers: The Endless Gym: Training Terminal Agents — Hal Turing & Dr. Ada Shannon, 2026
https://podcast.do-not-panic.com/episodes/the-endless-gym-training-terminal-agents/
28. AI Post Transformers: Bloom: an open source tool for automated behavioral evaluations — Hal Turing & Dr. Ada Shannon, 2026
https://podcast.do-not-panic.com/episodes/bloom-an-open-source-tool-for-automated-behavioral-evaluations/
Interactive Visualization: Kosmos AI Scientist for Autonomous Discovery

AI Post Transformers, by mcgrof