


In this episode, a16z General Partner Martin Casado sits down with Sujay Jayakar, co-founder and Chief Scientist at Convex, to talk about his team’s latest work benchmarking AI agents on full-stack coding tasks. From designing Fullstack-Bench to the quirks of agent behavior, the two dig into what’s actually hard about autonomous software development, and why robust evals, along with guardrails like type safety, matter more than ever. They also get tactical: which models perform best for real-world app building? How should developers think about trajectory management and variance across runs? And what changes when you treat your toolchain like part of the prompt? Whether you're a hobbyist developer or building the next generation of AI-powered devtools, Sujay’s systems-level insights are not to be missed.
Drawing from Sujay’s work developing the Fullstack-Bench, they cover:
Learn More:
Introducing Fullstack-Bench
Follow everyone on X:
Sujay Jayakar
Martin Casado
Check out everything a16z is doing with artificial intelligence here, including articles, projects, and more podcasts.
Hosted by Simplecast, an AdsWizz company. See pcm.adswizz.com for information about our collection and use of personal data for advertising.
By a16z
