April 09, 2026

How Hex builds AI agents that reason like human data analysts | Izzy Miller, AI Engineer

1 hour 8 minutes

Izzy Miller is an AI engineer at Hex, an AI analytics platform that was one of the first companies to ship data agents to real paying users. Today, Hex runs a multi-agent system with nearly 100K tokens of tools, and Izzy is building a 90-day simulation to evaluate whether those agents actually get smarter over time. In this conversation, he walks through the harness decisions that shaped their architecture, the failure modes Hex is seeing at scale, and what it takes to build an eval that no current model can pass.

We also discuss:

Why data agents are harder to verify than coding agents
Under the hood of Hex’s agents
How Hex is unifying separate agents
Why most eval sets are bad
The 90-day simulation for long-horizon evals
How Izzy went from marketing to AI engineer

References:

Andon Labs
Anthropic
Barry McCardel
ChatGPT
Claude Code
Claude Sonnet 4.6
dbt
GPT-3.5 Turbo
GPT-5.3 Codex Spark
GPT-5.4
Hex
LangChain
LangSmith
Looker
OpenAI
Opus 4.6
Satya Nadella
Snowflake
Vending Machine

Where to find Izzy:

LinkedIn
Twitter/X

Where to find Harrison:

LinkedIn
Twitter/X

Where to find LangChain:

Website
Docs

Send feedback or questions to [email protected]

Timestamps:

01:35 Where Hex's notebook agent started
03:46 The moment Hex knew it was time for agents
07:36 Why data agents are harder to verify than coding agents
09:30 How Hex is unifying separate agents
13:28 Under the hood of the notebook agent
15:41 The harness features that are now holding the agent back
17:41 Why Hex built their own orchestrator
18:59 Managing nearly 100K tokens of tools
20:49 Ephemeral queries and agent behavior trade-offs
24:46 The UX problem with showing agents' thinking
27:28 Why verification is harder than transparency for data agents
31:00 Memory, context conflicts, and collapse modes
34:38 How Hex built their internal eval system
39:29 Why most eval sets are bad
44:30 The 900% quota eval that every model fails
46:55 Model upgrades and the "in distribution" debate
51:34 How Izzy went from marketer to AI engineer
59:59 The 90-day simulation for long-horizon evals

By LangChain

