AI Odyssey

By Anlie Arnaudy, Daniel Herbera and Guillaume Fournier

AI Odyssey is your journey through the vast and evolving world of artificial intelligence. Powered by AI, this podcast breaks down both the foundational concepts and the cutting-edge developments in t... more

· Technology

Download on the App Store

Download on the App Store

Get it on Google Play

FAQs about AI Odyssey:

How many episodes does AI Odyssey have?

The podcast currently has 52 episodes available.

AI Odyssey episodes:

October 24, 2025 The Vision Hack: How a Picture Solved AI's Biggest Memory Problem
The biggest bottleneck for AIs handling massive documents—the context window—just got a radical fix. DeepSeek AI's DeepSeek-GOCR uses a counterintuitive trick: it turns text into an image to compress it by up to 10 times without losing accuracy. That means your AI can suddenly read the equivalent of 20 million tokens (entire codebases or legal troves) efficiently! This episode dives into the elegant vision-based solution, the power of its Mixture of Experts architecture, and why some experts believe all AI input should become an image.
Original Research: DeepSeek-GOCR is a breakthrough by the DeepSeek AI team.
Content generated with the help of Google's NotebookLM.
Link to the Original Research Paper: https://deepseek.ai/blog/deepseek-ocr-context-compression

...more
15min
October 22, 2025 Smarter Agents, Less Budget: Reinforcement Learning with Tree Search
Training AI agents using Reinforcement Learning (RL) to handle complex, multi-turn tasks is notoriously difficult.Traditional methods face two major hurdles: high computational costs (generating numerous interaction scenarios, or "rollouts," is expensive) and sparse supervision (rewards are only given at the very end of a task, making it hard for the agent to learn which specific steps were useful).
In this episode, we explore "Tree Search for LLM Agent Reinforcement Learning," by researchers from Xiamen University, AMAP (Alibaba Group), and the Southern University of Science and Technology. They introduce a novel approach called Tree-GRPO (Tree-based Group Relative Policy Optimization) that fundamentally changes how agents explore possibilities.
Tree-GRPO replaces inefficient "chain-based" sampling with a tree-search strategy. By allowing different trajectories to share common prefixes (the initial steps of an interaction), the method significantly increases the number of scenarios explored within the same budget. Crucially, the tree structure allows the system to derive step-by-step "process supervision signals," even when only the final outcome reward is available. The results demonstrate superior performance over traditional methods, with some models achieving better results using only a quarter of the training budget.
📄 Paper: Tree Search for LLM Agent Reinforcement Learning https://arxiv.org/abs/2509.21240

...more
1min
October 11, 2025 Beyond the AI Agent Builders Hype
Everyone's talking about AI agents that can automate complex tasks. But what happens when a cool demo meets the real world? We dive into hard-won, and often surprising, lessons from builders on the front lines. Discover why your first strategic choice isn't about a tool, but an entire ecosystem; why more agents can actually make things worse; and why the most critical skill is shifting from "prompt engineering" to "context engineering." This episode cuts through the noise to reveal what it really takes to build reliable AI agents that deliver value.
...more
15min
October 04, 2025 AI That Quietly Helps: Overhearing Agents
In this IA Odyssey episode, we unpack “overhearing agents”—AI systems that listen to human activity (audio, text, or video) and step in only when help is useful, like surfacing a diagram during a class discussion, prepping trail options while a family plans a hike, or pulling case notes in a medical consult.
While conversational AI (like chatbots) requires direct user engagement, overhearing agents continuously monitor ambient activities, such as human-to-human conversations, and intervene only to provide contextual assistance without interruption. Examples include silently providing data during a medical consultation or scheduling meetings as colleagues discuss availability.
The paper introduces a clear taxonomy for how these agents activate: always-on, user-initiated, post-hoc analysis, or rule-based triggers. This framework helps developers think about when and how an AI should “step in” without becoming intrusive.
Original paper: https://arxiv.org/pdf/2509.16325
Credits: Episode notes synthesized with Google’s NotebookLM to analyze and summarize the paper; all insights credit the original authors.
...more
1min
September 28, 2025 Beyond Single Agents: The Future of Multi-Agent LLMs
Can large language models achieve more when they collaborate instead of working alone? In this episode, we dive into “LLM Multi-Agent Systems: Challenges and Open Problems” by Shanshan Han, Qifan Zhang, Yuhang Yao, Weizhao Jin, and Zhaozhuo Xu.
We explore how multi-agent systems—where AI agents specialize, debate, and share knowledge—can tackle complex problems beyond the reach of a single model. The paper highlights open challenges such as:
• Optimizing task allocation across diverse agents
• Enhancing reasoning through debates and iterative loops
• Managing layered context and memory across multiple agents
• Ensuring security, privacy, and coordination in shared memory systems
We also discuss how these systems could reshape blockchain applications, from fraud detection to smarter contract negotiation.
This episode was generated with the help of Google’s NotebookLM.
Read the full paper here: https://arxiv.org/abs/2402.03578
...more
1min
September 20, 2025 AI's Guessing Game
Ever wondered why AI chatbots sometimes state things with complete confidence, only for you to find out it's completely wrong? This phenomenon, known as "hallucination," is a major roadblock to trusting AI. A recent paper from OpenAI explores why this happens, and the answer is surprisingly simple: we're training them to be good test-takers rather than honest partners.

This description is based on the paper "Why Language Models Hallucinate" by authors Adam Tauman Kalai, Ofir Nachum, Santosh S. Vempala, and Edwin Zhang. Content was generated using Google's NotebookLM.

Link to the original paper: https://openai.com/research/why-language-models-hallucinate

...more
1min
September 13, 2025 From Search Buddy to Personal Agent
Ever feel like your AI assistants don't really get you? We're diving into how AI is moving beyond generic answers to offer truly personalized experiences. This episode explores the journey from Retrieval-Augmented Generation (RAG), a fancy term for AIs that look things up before they speak, to sophisticated AI Agents that can understand your unique needs, plan tasks, and act on your behalf. It's the next step in making AI a genuine partner in our digital lives.
This description was generated using Google's NotebookLM, based on the work of Xiaopeng Li, Pengyue Jia, and their co-authors.
Read the original paper here:
https://arxiv.org/abs/2504.10147
...more
1min
September 08, 2025 Smarter LLM Routing: Balancing Cost and Performance
How can we get the best out of large language models without breaking the budget? This episode dives into Adaptive LLM Routing under Budget Constraints by Pranoy Panda, Raghav Magazine, Chaitanya Devaguptapu, Sho Takemori, and Vishal Sharma. The authors reimagine the problem of choosing the right LLM for each query as a contextual bandit task, learning from user feedback rather than costly full supervision. Their new method, PILOT, combines human preference data with online learning to route queries efficiently—achieving up to 93% of GPT-4’s performance at just 25% of its cost.
We also look at their budget-aware strategy, modeled as a multi-choice knapsack problem, that ensures smarter allocation of expensive queries to stronger models while keeping overall costs low.
Original paper: https://arxiv.org/abs/2508.21141
This podcast description was generated with the help of Google’s NotebookLM.
...more
23min
August 30, 2025 Nano Banana & the Future of Visual Creativity
Google’s latest breakthrough, Gemini 2.5 Flash Image—nicknamed “Nano Banana”—is reshaping what’s possible in digital art and beyond. From keeping characters consistent across scenes to natural-language editing and even blending multiple images, this model is lowering the barrier to creation like never before. Imagine building entire fantasy worlds or accelerating scientific research without the traditional costs and time sinks.
But with this power comes profound questions: How do we handle the risks of fakes, hallucinations, and lost trust in what we see? What happens to human artists when machines can produce in seconds what once took weeks?
In this episode of IA Odyssey, we dive into the promises and perils of Gemini 2.5 Flash Image, exploring how it may democratize creativity, shift the role of artists, and force us all to rethink authenticity in the age of AI.
Original content generated with the help of Google’s NotebookLM.
...more
5min
July 19, 2025 From Agents to Teammates: Building Cohesive AI Squads
Meet the Aime framework—ByteDance’s fresh take on multi-agent systems that lets AI teammates think on their feet instead of following brittle, pre-planned scripts. A dynamic planner keeps adjusting the big picture, an Actor Factory spins up just-right specialist agents on demand, and a shared progress board keeps everyone in sync. In tests ranging from general reasoning (GAIA) to software bug-fixing (SWE-Bench) and live web navigation (WebVoyager), Aime consistently out-performed hand-tuned rivals—showing that flexible, reactive collaboration beats static role-play every time.
This episode of IA Odyssey unpacks how Yexuan Shi and colleagues replace rigid “plan-and-execute” pipelines with fluid teamwork, why it matters for real-world tasks, and where adaptive agent swarms might head next.
Source paper: https://arxiv.org/abs/2507.11988

Content generated with help from Google’s NotebookLM.
...more
16min

FAQs about AI Odyssey:

How many episodes does AI Odyssey have?

The podcast currently has 52 episodes available.