
Sign up to save your podcasts
Or


Tavily built a Deep Research Agent with production in mind. Something they could actually scale. So they did the unsexy work. They went through millions of agent logs, found where tokens were being wasted, and optimized each section of the system.
The result surprised them: they cut token consumption by more than half (!), then tested quality and discovered they topped the DeepResearch Bench without even trying.
In this YAAP episode, Yuval sits down with Dean from Tavily to break down how they built it, what they did differently from the usual top approaches, and which design choices made better results possible with far fewer tokens.
What you’ll learn:
By AI21Tavily built a Deep Research Agent with production in mind. Something they could actually scale. So they did the unsexy work. They went through millions of agent logs, found where tokens were being wasted, and optimized each section of the system.
The result surprised them: they cut token consumption by more than half (!), then tested quality and discovered they topped the DeepResearch Bench without even trying.
In this YAAP episode, Yuval sits down with Dean from Tavily to break down how they built it, what they did differently from the usual top approaches, and which design choices made better results possible with far fewer tokens.
What you’ll learn: