May 05, 2026

Episode 9: Mystery Solved!

59 minutes

This week's episode of "The Odd Couple" features just Caitlin and Hannah, since I had to go to Georgetown to talk about Claude Code at a faculty retreat. Before we get to the description: during the ice breaker at the start, Hannah brought up the podcast's opening theme song. For those who don't recognize the lyrics, that's Mac Miller's "Small Worlds," sung by my two nephews.

So what is this episode about? One of the themes I have been emphasizing in my talks on AI Agents and on my Substack is that AI Agents have separated two things that were historically bundled: the production of research and the verification of its results. Since AI Agents can now produce so many aspects of a research project autonomously -- that is, without much direction from the human researcher -- one of the researcher's new tasks is to verify what they produce.

If you remember from a few weeks ago, Claude Code had almost instantly worked up the county-level marriage data into a county panel of marriage rates and marriage counts by year. We brought Hannah Sayre, a recent college graduate and current economic consultant, into the project to help us work through the latter task of "human verification". Had Claude done it correctly? How do we verify that it is correct? And if it is not correct, why not, and how generalizable is that inaccuracy? Hannah was our eyes and ears, our boots on the ground. She independently investigated the same question, the same task we gave Claude, to help us determine on the back end whether Claude had indeed found the same irregularities in the original marriage dataset, and if so, what autonomous decisions he had made.

In this episode, Hannah walks us through what she found, and she and Caitlin discuss those findings and begin the work of conceptualizing the process of verification in a world of AI Agents. While not definitive, this is a chance for others to hear about the problem in more specific terms. I at least anticipate that all of us will have to wrestle with verification going forward in ways we were not expecting, and maybe are not even prepared for, at least not universally. That is especially true if AI Agents shrink project teams through automation, leaving fewer people available to do the actual verification, and we will have to work out how best to respond to that smaller scale.

Scott's Mixtape Substack is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.

Thanks again for tuning in! We hope you are having as much fun with this as we are!

Get full access to Scott's Mixtape Substack at causalinf.substack.com/subscribe
