This episode explores TUMIX, a test-time scaling framework that turns a single strong language model into a team of specialized agents with different tool-use strategies: plain-text reasoning, code execution, search, and hybrids of the three. It explains the paper’s core argument that better reasoning may come not from repeatedly sampling one model, but from diversifying computational pathways and letting those agents iteratively refine one another’s answers under roughly cost-matched settings. The discussion situates TUMIX within prior work on inference-time compute, program-aided reasoning, and tool-using agents, while also probing whether the approach is genuinely novel or mostly a systems-level formalization of practices already emerging in industry. Listeners will find it interesting for its concrete framing of a major open question in AI: how to orchestrate tools and agent diversity to improve reasoning without exploding latency and cost.
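As a rough illustration of the loop described above (a toy sketch, not the paper's actual implementation), the idea is: heterogeneous agents answer the same question, each refinement round lets every agent see the others' candidate answers, and a final aggregation step picks the majority answer. All agent functions below are hypothetical stand-ins for LLM calls:

```python
from collections import Counter

def text_agent(question, peer_answers):
    # Stand-in for a plain-text chain-of-thought agent; deliberately
    # off by one so the refinement rounds have something to correct.
    base = 41
    if peer_answers:
        consensus, _ = Counter(peer_answers).most_common(1)[0]
        if consensus != base:
            return consensus  # defer to the peer consensus
    return base

def code_agent(question, peer_answers):
    # Stand-in for an agent that writes and executes code.
    # eval() is safe here only because the toy question is trusted input.
    return eval(question)  # e.g. "6 * 7" -> 42

def search_agent(question, peer_answers):
    # Stand-in for a search-grounded agent; here it simply "knows" the answer.
    return 42

def tumix_round_loop(question, agents, rounds=2):
    answers = []
    for _ in range(rounds):
        # Each round, every agent answers with the previous round's
        # candidate answers visible as shared context.
        answers = [agent(question, answers) for agent in agents]
    # Final aggregation: majority vote over the last round's answers.
    return Counter(answers).most_common(1)[0][0]

print(tumix_round_loop("6 * 7", [text_agent, code_agent, search_agent]))  # -> 42
```

In the first round the text agent answers 41 while the code and search agents answer 42; in the second round the text agent sees the peer majority and defers, so the final vote is unanimous. The real system replaces these stubs with tool-equipped LLM agents and a learned or heuristic answer-selection step.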
Sources:
1. TUMIX: Multi-Agent Test-Time Scaling with Tool-Use Mixture — Yongchao Chen, Jiefeng Chen, Rui Meng, Ji Yin, Na Li, Chuchu Fan, Chi Wang, Tomas Pfister, Jinsung Yoon, 2025
http://arxiv.org/abs/2510.01279
2. PAL: Program-aided Language Models — Gao et al., 2022
https://scholar.google.com/scholar?q=PAL:+Program-aided+Language+Models
3. ReAct: Synergizing Reasoning and Acting in Language Models — Shunyu Yao, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik Narasimhan, Yuan Cao, 2023
https://scholar.google.com/scholar?q=ReAct:+Synergizing+Reasoning+and+Acting+in+Language+Models
4. Mixture-of-Agents Enhances Large Language Model Capabilities — Wang et al., 2024
https://scholar.google.com/scholar?q=Mixture-of-Agents+Enhances+Large+Language+Model+Capabilities
5. Search-Augmented Factuality in Language Models: Challenges and Opportunities for Retrieval-Grounded Generation — survey literature on retrieval-augmented and search-grounded generation, 2023–2025
https://scholar.google.com/scholar?q=Search-Augmented+Factuality+in+Language+Models:+Challenges+and+Opportunities+for+Retrieval-Grounded+Generation
6. Automatic Prompt Engineer — Zhou et al., 2022
https://scholar.google.com/scholar?q=Automatic+Prompt+Engineer
7. DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines — Omar Khattab, Keshav Santhanam, and collaborators, 2023
https://scholar.google.com/scholar?q=DSPy:+Compiling+Declarative+Language+Model+Calls+into+Self-Improving+Pipelines
8. TextGrad: Automatic 'Differentiation' via Text — Yuksekgonul et al., 2024
https://scholar.google.com/scholar?q=TextGrad:+Automatic+'Differentiation'+via+Text
9. ADAS: Automated Design of Agentic Systems — Hu et al., 2024
https://scholar.google.com/scholar?q=ADAS:+Automated+Design+of+Agentic+Systems
10. Self-MoA — Li et al., 2025
https://scholar.google.com/scholar?q=Self-MoA
11. Symbolic-MoE — authors not identified, 2025
https://scholar.google.com/scholar?q=Symbolic-MoE
12. DEI — authors not identified, 2025
https://scholar.google.com/scholar?q=DEI
13. SciMaster — authors not identified, 2025
https://scholar.google.com/scholar?q=SciMaster
14. GSA — authors not identified, 2025
https://scholar.google.com/scholar?q=GSA
15. Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters — Snell et al., 2024
https://scholar.google.com/scholar?q=Scaling+LLM+Test-Time+Compute+Optimally+can+be+More+Effective+than+Scaling+Model+Parameters
16. Language Models Can Solve Computer Tasks — Kim et al., 2023
https://scholar.google.com/scholar?q=Language+Models+Can+Solve+Computer+Tasks
17. Program-of-Thoughts Prompting: Disentangling Computation from Reasoning for Numerical Reasoning Tasks — Chen et al., 2022
https://scholar.google.com/scholar?q=Program-of-Thoughts+Prompting:+Disentangling+Computation+from+Reasoning+for+Numerical+Reasoning+Tasks
18. Humanity's Last Exam — Phan et al., 2025
https://scholar.google.com/scholar?q=Humanity's+Last+Exam
19. GPQA: A Graduate-Level Google-Proof Q&A Benchmark — Rein et al., 2024
https://scholar.google.com/scholar?q=GPQA:+A+Graduate-Level+Google-Proof+Q&A+Benchmark
20. OpenAI/Gemini Deep Research comparison paper or report — Comanici et al., 2025
https://scholar.google.com/scholar?q=OpenAI/Gemini+Deep+Research+comparison+paper+or+report
21. DeepSeek-R1 or related RL reasoning paper — Guo et al., 2025
https://scholar.google.com/scholar?q=DeepSeek-R1+or+related+RL+reasoning+paper
22. Recent work showing Code Interpreter underuse in OpenAI models — Chen et al., 2024
https://scholar.google.com/scholar?q=Recent+work+showing+Code+Interpreter+underuse+in+OpenAI+models
23. Simple Test-Time Scaling — Muennighoff et al., 2025
https://scholar.google.com/scholar?q=Simple+Test-Time+Scaling
24. Faster and Better LLMs via Latency-Aware Test-Time Scaling — authors not identified, 2024–2025
https://scholar.google.com/scholar?q=Faster+and+Better+LLMs+via+Latency-Aware+Test-Time+Scaling
25. Thought Calibration: Efficient and Confident Test-Time Scaling — authors not identified, 2024–2025
https://scholar.google.com/scholar?q=Thought+Calibration:+Efficient+and+Confident+Test-Time+Scaling
26. Reasoning Aware Self-Consistency: Leveraging Reasoning Paths for Efficient LLM Sampling — authors not identified, 2024–2025
https://scholar.google.com/scholar?q=Reasoning+Aware+Self-Consistency:+Leveraging+Reasoning+Paths+for+Efficient+LLM+Sampling
27. Latent Self-Consistency for Reliable Majority-Set Selection in Short- and Long-Answer Reasoning — authors not identified, 2024–2025
https://scholar.google.com/scholar?q=Latent+Self-Consistency+for+Reliable+Majority-Set+Selection+in+Short-+and+Long-Answer+Reasoning
28. Universal Self-Consistency for Large Language Model Generation — Chen et al., 2023
https://scholar.google.com/scholar?q=Universal+Self-Consistency+for+Large+Language+Model+Generation
29. The Hidden Strength of Disagreement: Unraveling the Consensus-Diversity Tradeoff in Adaptive Multi-Agent Systems — authors not identified, 2024–2025
https://scholar.google.com/scholar?q=The+Hidden+Strength+of+Disagreement:+Unraveling+the+Consensus-Diversity+Tradeoff+in+Adaptive+Multi-Agent+Systems
30. Stay Focused: Problem Drift in Multi-Agent Debate — authors not identified, 2024–2025
https://scholar.google.com/scholar?q=Stay+Focused:+Problem+Drift+in+Multi-Agent+Debate
31. Why Do Multi-Agent LLM Systems Fail? — authors not identified, 2024–2025
https://scholar.google.com/scholar?q=Why+Do+Multi-Agent+LLM+Systems+Fail?
32. LLM-Based Agents for Tool Learning: A Survey — W. Xu et al., 2024–2025
https://scholar.google.com/scholar?q=LLM-Based+Agents+for+Tool+Learning:+A+Survey
33. AI Post Transformers: Multiagent Debate Improves Language Model Reasoning — Hal Turing & Dr. Ada Shannon, 2025
https://podcast.do-not-panic.com/episodes/multiagent-debate-improves-language-model-reasoning/
34. AI Post Transformers: Experimental Comparison of Agentic and Enhanced RAG — Hal Turing & Dr. Ada Shannon, 2026
https://podcast.do-not-panic.com/episodes/2026-04-14-experimental-comparison-of-agentic-and-e-37d8bc.mp3
35. AI Post Transformers: MEMSEARCHER: Reinforcement Learning for LLM Memory Management — Hal Turing & Dr. Ada Shannon, 2026
https://podcast.do-not-panic.com/episodes/2026-04-04-memsearcher-reinforcement-learning-for-l-e9ad84.mp3
36. AI Post Transformers: Generalist Reward Modeling with Inference-Time Scaling — Hal Turing & Dr. Ada Shannon, 2025
https://podcast.do-not-panic.com/episodes/generalist-reward-modeling-with-inference-time-scaling/
Interactive Visualization: TUMIX Multi-Agent Test-Time Scaling with Tools