
This episode is sponsored by AGNTCY. Unlock agents at scale with an open Internet of Agents.
Visit https://agntcy.org/ and add your support.
Why should AI that learns from the messy real world still obey strict logic, and what does it take to make that reliability hold up in production? In this episode of Eye on AI, host Craig Smith sits down with Sunita Sarawagi to unpack how large-scale learning can be combined with explicit rules and constraints so models stay trustworthy. We cover where ingesting messy real-world data fails without structure, how to encode domain logic alongside LLMs, and which hybrid or neurosymbolic approaches reduce hallucinations while preserving flexibility. You will hear how to design a reliability stack for real users, detect out-of-distribution inputs, and choose evaluation signals that reflect outcomes rather than accuracy alone.
Learn how product teams layer formal logic on top of generative models, decide what to hard-code versus learn from data, and enforce business policies across agents, tools, and knowledge graphs. You will also hear how to run safe experiments, track prompt and model changes, prevent regressions before they reach customers, and plan for compute and infrastructure at scale with metrics like completion rate, CSAT, retention, and cost per resolution.
Stay Updated: Craig Smith on X: https://x.com/craigss | Eye on A.I. on X: https://x.com/EyeOn_AI
By Craig S. Smith · 4.7 (5555 ratings)
