


Alex Ratner co-founded Snorkel AI out of Chris Ré's Stanford lab and helped establish data-centric AI as a field. Today, Snorkel is a $1.3B company shipping thousands of data sets and environments a week to frontier labs and vertical AI teams like Harvey.
In this conversation, he argues our ability to build AI agents has outpaced our ability to measure them. That gap is what's keeping most enterprise agents stuck in demo purgatory.
If you can't measure it, you can't improve it. And you can't deploy it.
In this conversation:
Chapters
(00:00) Intro: Alex Ratner and Snorkel AI
(02:50) What the evaluation gap actually is
(06:05) Moravec's paradox and the jagged frontier
(08:46) Where AI agents fall down in enterprise work
(10:40) Big Law Bench: benchmarking Harvey's legal agents
(12:00) The three axes: input, autonomy horizon, output
(18:31) Snorkel's $3M Open Benchmarks Grant
(22:33) From "janitorial" to epicenter: 15 years of data-centric AI
(29:26) The expert-agentic data era
(34:54) The false dichotomy between data and environments
(40:05) DoorDash Tasks and expert data at scale
Connect with Alex Ratner:
Connect with Conor:
More episodes: https://chainofthought.show
Thanks to Galileo — download their free 165-page guide to mastering multi-agent systems at galileo.ai/mastering-multi-agent-systems
By Conor Bronsdon
