
Interested in being a guest? Email us at [email protected]
Ensuring AI systems actually work as intended might be the most crucial challenge facing developers and enterprises today. As these powerful tools become more embedded in our daily workflows and critical business processes, their non-deterministic nature presents unique reliability challenges unlike anything we've faced with traditional software.
Conor Bronsdon from Galileo.ai offers a compelling framework for understanding and addressing these challenges. The fundamental issue? LLMs don't follow the predictable input-output relationships we've come to expect from software. "They have this massive amount of data they've been trained on... and this is where the magic piece comes in, where they can create and do things outside of your expectations," Conor explains. While this unpredictability enables AI's most impressive capabilities, it also introduces significant risks.
The conversation explores common failure modes organizations encounter when deploying AI in production: tool execution errors, security vulnerabilities, context management problems, and inconsistent content quality. These aren't just theoretical concerns; they're practical challenges facing enterprises like Comcast, JP Morgan, and other Galileo customers working to harness AI reliably at scale.
Rather than treating AI as a mysterious black box, Conor advocates for a structured approach to reliability through evaluation, observation, and guardrails. By using purpose-built small language models that can operate with minimal latency and cost, organizations can implement 100% sampling of AI interactions while protecting against harmful outputs. This creates a continuous improvement cycle where production data feeds back into system refinement.
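To make that loop concrete, here is a minimal Python sketch of the pattern described: every response from the primary model is scored by a small, fast evaluator before it reaches the user, every interaction is logged for the improvement cycle, and low-scoring outputs are blocked. The names (call_llm, evaluate_response, log_interaction) and the 0.8 threshold are hypothetical placeholders for illustration, not Galileo's actual API.

```python
import json
import time

RISK_THRESHOLD = 0.8  # hypothetical cutoff; tune per application


def call_llm(prompt: str) -> str:
    """Stand-in for the primary (large) model call."""
    return f"Model answer to: {prompt}"


def evaluate_response(prompt: str, response: str) -> float:
    """Stand-in for a purpose-built small language model that scores a
    response for quality/safety. Cheap and fast enough to run on 100%
    of traffic rather than a sampled subset."""
    return 0.95  # stubbed score in [0, 1]


def log_interaction(record: dict) -> None:
    """Append every interaction to a log; in production this is what
    feeds observability dashboards and system refinement."""
    with open("interactions.jsonl", "a") as f:
        f.write(json.dumps(record) + "\n")


def guarded_completion(prompt: str) -> str:
    response = call_llm(prompt)
    score = evaluate_response(prompt, response)  # runs on every call
    log_interaction({
        "ts": time.time(),
        "prompt": prompt,
        "response": response,
        "score": score,
        "blocked": score < RISK_THRESHOLD,
    })
    if score < RISK_THRESHOLD:
        # Guardrail: never surface a low-scoring answer to the user.
        return "Sorry, I can't provide a reliable answer to that."
    return response


if __name__ == "__main__":
    print(guarded_completion("Summarize our refund policy."))
```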
Perhaps most insightful is Conor's framing of AI as "a junior async digital employee": highly capable, but requiring proper context, feedback, and guidance to perform effectively. This mental model helps bridge the gap between AI's technical capabilities and the practical needs of organizations deploying it. The goal isn't to constrain AI's potential but to channel it productively within appropriate boundaries.
Listen on: Apple Podcasts | Spotify
Support the show
More at https://linktr.ee/EvanKirstel
By Evan Kirstel