
This episode is sponsored by AGNTCY. Unlock agents at scale with an open Internet of Agents.
Visit https://agntcy.org/ and add your support.
How is Coxwave Redefining AI Evaluation?
In this episode of Eye on AI, host Craig Smith is joined by Yeop Lee, Head of Product at Coxwave. Together they explore how teams move beyond accuracy-only metrics to outcome-focused evaluation with Coxwave's Align. We look at how Align measures satisfaction, trust, and task completion across chat, email, and voice, how LLM-as-judge pairs with human review, and how product teams search conversations to find hidden failure patterns that block adoption.
Learn how leading companies design an evaluation stack that guides prompts, agents, and UX, which pitfalls to avoid when shipping updates, and which metrics matter most for success, including completion rate, CSAT, retention, and cost per resolution. You will also hear how to run experiment tracking with model and prompt change logs, set up governance that prevents regressions, and choose between SaaS and on-premises deployments that meet security and compliance needs.
Stay Updated:
Craig Smith on X: https://x.com/craigss
Eye on A.I. on X: https://x.com/EyeOn_AI
By Craig S. Smith
Rated 4.7 (5555 ratings)
