
Sign up to save your podcasts
Or


This episode is sponsored by AGNTCY. Unlock agents at scale with an open Internet of Agents.
Visit https://agntcy.org/ and add your support. How is Coxwave Redefining AI Evaluation?
In this episode of Eye on AI, host Craig Smith is joined by Yeop Lee, Head of Product at Coxwave. Together they explore how teams move beyond accuracy-only metrics to outcome focused evaluation with Coxwave's Align. We look at how Align measures satisfaction, trust, and task completion across chat, email, and voice, how LLM as judge pairs with human review, and how product teams search conversations to find hidden failure patterns that block adoption.
Learn how leading companies design an evaluation stack that guides prompts, agents, and UX, which pitfalls to avoid when shipping updates, and which metrics matter most for success, including completion rate, CSAT, retention, and cost per resolution. You will also hear how to run experiment tracking with model and prompt change logs, set up governance that prevents regressions, and choose between SaaS and on premise deployments that meet security and compliance needs.
Stay Updated: Craig Smith on X: https://x.com/craigss Eye on A.I. on X: https://x.com/EyeOn_AI
By Craig S. Smith4.7
5555 ratings
This episode is sponsored by AGNTCY. Unlock agents at scale with an open Internet of Agents.
Visit https://agntcy.org/ and add your support. How is Coxwave Redefining AI Evaluation?
In this episode of Eye on AI, host Craig Smith is joined by Yeop Lee, Head of Product at Coxwave. Together they explore how teams move beyond accuracy-only metrics to outcome focused evaluation with Coxwave's Align. We look at how Align measures satisfaction, trust, and task completion across chat, email, and voice, how LLM as judge pairs with human review, and how product teams search conversations to find hidden failure patterns that block adoption.
Learn how leading companies design an evaluation stack that guides prompts, agents, and UX, which pitfalls to avoid when shipping updates, and which metrics matter most for success, including completion rate, CSAT, retention, and cost per resolution. You will also hear how to run experiment tracking with model and prompt change logs, set up governance that prevents regressions, and choose between SaaS and on premise deployments that meet security and compliance needs.
Stay Updated: Craig Smith on X: https://x.com/craigss Eye on A.I. on X: https://x.com/EyeOn_AI

478 Listeners

174 Listeners

341 Listeners

153 Listeners

214 Listeners

90 Listeners

132 Listeners

95 Listeners

154 Listeners

209 Listeners

591 Listeners

268 Listeners

26 Listeners

35 Listeners

40 Listeners