


This episode is sponsored by AGNTCY. Unlock agents at scale with an open Internet of Agents.
Visit https://agntcy.org/ and add your support.
Why should AI that learns from the messy real world still obey strict logic, and what does it take to make that reliability hold up in production? In this episode of Eye on AI, host Craig Smith sits down with Sunita Sarawagi to unpack how large-scale learning can be combined with explicit rules and constraints so models stay trustworthy. We cover where ingesting real-world data fails without structure, how to encode domain logic alongside LLMs, and which hybrid or neurosymbolic approaches reduce hallucinations while preserving flexibility. You will hear how to design a reliability stack for real users, detect out-of-distribution inputs, and choose evaluation signals that reflect outcomes rather than accuracy alone.
Learn how product teams layer formal logic on top of generative models, decide what to hard-code versus learn from data, and enforce business policies across agents, tools, and knowledge graphs. You will also hear how to run safe experiments, track prompt and model changes, prevent regressions before they reach customers, and plan for compute and infrastructure at scale with metrics like completion rate, CSAT, retention, and cost per resolution.
Stay Updated: Craig Smith on X: https://x.com/craigss Eye on A.I. on X: https://x.com/EyeOn_AI
By Craig S. Smith
