Building Out Loud

Episode 10: Evals with Ed Biden



Building Out Loud: Debugging AI Agents, Evals, and When to Move Beyond Prototyping Tools (with Ed Biden of Hustle Badger)

Randy Silver and Faith Forster are joined by Ed Biden, co-founder of Hustle Badger, to follow up on Faith’s progress building an AI-powered discovery product. 
Faith shares insights from roughly 20 interviews and survey feedback: the product’s commercial-outcomes value proposition resonates strongly, but there is a clear need to improve trust in the underlying data, which means getting her “army of agents” working reliably before polishing features.
They discuss the temptation to implement feedback immediately versus waiting for patterns to emerge before prioritising. Faith details practical debugging issues, including missing notifications (e.g., when Perplexity credits run out), JSON parsing problems, and OpenAI producing no output. Ed explains four categories of AI building tools (LLMs, workflow builders, prototyping tools, IDEs) plus an emerging fifth category, then introduces evals and observability: traces, clustering failure modes, and automated tests.
Faith plans to try Langfuse, improve dashboards, rethink agent orchestration with planning agents, and continue toward beta users while balancing automation with team control.
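
For listeners who want a concrete picture of "clustering failure modes", here is a minimal Python sketch. It is not the code from the episode and not Langfuse's API: the trace format, the `classify_output` and `summarise_failures` helpers, and the failure labels are all assumptions made for illustration. It tags each agent output with a failure mode (two of the labels mirror issues Faith mentions: no output and JSON that will not parse) and counts how often each one occurs.

```python
import json
from collections import Counter


def classify_output(raw):
    """Return a failure-mode label for one agent output, or "ok".

    The labels are hypothetical; a real project would derive them
    from reading its own traces.
    """
    if raw is None or not raw.strip():
        return "empty_output"          # the model returned nothing
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        return "invalid_json"          # output could not be parsed as JSON
    if not isinstance(parsed, dict):
        return "unexpected_shape"      # valid JSON, but not an object
    return "ok"


def summarise_failures(traces):
    """Count failure modes across trace dicts that carry an 'output' key."""
    return Counter(classify_output(t.get("output")) for t in traces)


# Hypothetical traces, hand-written for this example.
traces = [
    {"agent": "research", "output": '{"summary": "fine"}'},
    {"agent": "research", "output": ""},
    {"agent": "pricing", "output": "Sure! Here is the JSON: {"},
    {"agent": "pricing", "output": None},
]

summary = summarise_failures(traces)
for mode, count in summary.most_common():
    print(f"{mode}: {count}")
```

Counts like these are one way to decide which failure mode to fix first, and each label can later become an automated test that runs against new traces.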

Check out Hustle Badger’s Evals course: https://youtu.be/TA9LJJddlNE
Or watch their introductory video on Evals: https://youtu.be/7OcrV7VSvW4

00:00 Welcome and Guest Intro
00:49 Discovery Progress Update
01:33 Resisting Quick Fixes
02:57 Agent Debugging Woes
03:43 Moving Beyond Replit
05:03 AI Tooling Landscape
08:30 Swarms and Subagents
09:16 Evals and Observability
15:05 Building Better Evals
18:21 Faith’s Next Steps and Roadmap
21:29 Replit vs IDE Comfort
23:05 Wrap Up and Teaser


Building Out Loud, by Discoveree.app