Humans of Reliability

You Can’t Fix What You Don’t Measure: Observability in the Age of AI with Conor Bronsdon


Only 50% of companies monitor their ML systems. Building observability for AI is not simple: it goes well beyond 200 OK pings. In this episode, Sylvain Kalache sits down with Conor Bronsdon (Galileo) to unpack why observability, monitoring, and human feedback are the missing links to making large language models (LLMs) reliable in production.

Conor dives into the shift from traditional test-driven development to evaluation-driven development, where metrics like context adherence, completeness, and action advancement replace binary pass-fail checks. He also shares how teams can blend human-in-the-loop feedback, automated guardrails, and small language models to keep AI accurate, compliant, and cost-efficient at scale.
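As a rough illustration of the idea (not taken from the episode or from Galileo's API), an evaluation-driven check scores a model response on a metric such as context adherence rather than asserting a binary pass/fail; the metric, threshold, and function names below are hypothetical.

```python
# Minimal sketch of an evaluation-driven check (hypothetical metric and threshold,
# not Galileo's actual API): score a response instead of asserting pass/fail.

def context_adherence(response: str, context: str) -> float:
    """Toy metric: fraction of response sentences that share words with the context."""
    sentences = [s.strip() for s in response.split(".") if s.strip()]
    if not sentences:
        return 0.0
    context_words = set(context.lower().split())
    grounded = sum(1 for s in sentences if set(s.lower().split()) & context_words)
    return grounded / len(sentences)

def evaluate(response: str, context: str, threshold: float = 0.8) -> dict:
    """Return a score and a verdict rather than a hard pass/fail assertion."""
    score = context_adherence(response, context)
    return {"context_adherence": score, "passed": score >= threshold}

if __name__ == "__main__":
    ctx = "The incident started at 09:12 UTC after a failed deploy."
    print(evaluate("The incident began after a failed deploy.", ctx))
```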


Humans of Reliability, by Rootly