December 29, 2025

Beyond the Chatbot: Practical Frameworks for Agentic Capabilities in SaaS

53 minutes

Summary
In this episode, product and engineering leader Preeti Shukla explores how and when to add agentic capabilities to SaaS platforms. She digs into the operational realities that AI agents must meet inside multi-tenant software: latency, cost control, data privacy, tenant isolation, RBAC, and auditability. Preeti outlines practical frameworks for selecting models and providers, deciding when to self-host, and routing capabilities across frontier and cheaper models. She discusses graduated autonomy, starting with internal adoption and low-risk use cases before moving to customer-facing features, and explains why many successful deployments keep a human in the loop. She also covers evaluation and observability as core engineering disciplines - layered evals, golden datasets, LLM-as-a-judge, path/behavior monitoring, and runtime vs. offline checks - to achieve reliability in nondeterministic systems.

Announcements

Hello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systems.
When ML teams try to run complex workflows through traditional orchestration tools, they hit walls. Cash App discovered this with their fraud detection models - they needed flexible compute, isolated environments, and seamless data exchange between workflows, but their existing tools couldn't deliver. That's why Cash App relies on Prefect. Now their ML workflows run on whatever infrastructure each model needs across Google Cloud, AWS, and Databricks. Custom packages stay isolated. Model outputs flow seamlessly between workflows. Companies like Whoop and 1Password also trust Prefect for their critical workflows. But Prefect didn't stop there. They just launched FastMCP - production-ready infrastructure for AI tools. You get Prefect's orchestration plus instant OAuth, serverless scaling, and blazing-fast Python execution. Deploy your AI tools once, connect to Claude, Cursor, or any MCP client. No more building auth flows or managing servers. Prefect orchestrates your ML pipeline. FastMCP handles your AI tool infrastructure. See what Prefect and FastMCP can do for your AI workflows at aiengineeringpodcast.com/prefect today.
Unlock the full potential of your AI workloads with a seamless and composable data infrastructure. Bruin is an open source framework that streamlines integration from the command line, allowing you to focus on what matters most - building intelligent systems. Write Python code for your business logic, and let Bruin handle the heavy lifting of data movement, lineage tracking, data quality monitoring, and governance enforcement. With native support for ML/AI workloads, Bruin empowers data teams to deliver faster, more reliable, and scalable AI solutions. Harness Bruin's connectors for hundreds of platforms, including popular machine learning frameworks like TensorFlow and PyTorch. Build end-to-end AI workflows that integrate seamlessly with your existing tech stack. Join the ranks of forward-thinking organizations that are revolutionizing their data engineering with Bruin. Get started today at aiengineeringpodcast.com/bruin, and for dbt Cloud customers, enjoy a $1,000 credit to migrate to Bruin Cloud.
Your host is Tobias Macey, and today I'm interviewing Preeti Shukla about the process for identifying whether and how to add agentic capabilities to your SaaS.

Interview

Introduction
How did you get involved in machine learning?
Can you start by describing how a SaaS context changes the requirements around the business and technical considerations of an AI agent?
Software-as-a-service is a very broad category that includes everything from simple website builders to complex data platforms. How does the scale and complexity of the service change the equation for ROI potential of agentic elements?
- How does it change the implementation and validation complexity?
One of the biggest challenges with introducing generative AI and LLMs in a business use case is the unpredictable cost associated with it. What are some of the strategies that you have found effective in estimating, monitoring, and controlling costs to avoid being upside-down on the ROI equation?
Another challenge of operationalizing an agentic workload is the risk of confident mistakes. What are the tactics that you recommend for building confidence in agent capabilities while mitigating potential harms?
- A corollary to the unpredictability of agent architectures is that they have a large number of variables. What are the evaluation strategies or toolchains that you find most useful to maintain confidence as the system evolves?
SaaS platforms benefit from unit economics at scale and often rely on multi-tenant architectures. What are the security controls and identity/attribution mechanisms that are critical for allowing agents to operate across tenant boundaries?
What are the most interesting, innovative, or unexpected ways that you have seen SaaS products adopt agentic patterns?
What are the most interesting, unexpected, or challenging lessons that you have learned while working on bringing agentic workflows to SaaS products?
When is an agent the wrong choice?
What are your predictions for the role of agents in the future of SaaS products?

Contact Info

Parting Question

From your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?

Links

SaaS == Software as a Service
Multi-Tenancy
Few-shot Learning
LLM as a Judge
RAG == Retrieval Augmented Generation
MCP == Model Context Protocol
Lovable

The intro and outro music is from Hitman's Lovesong feat. Paola Graziano by The Freak Fandango Orchestra/CC BY-SA 3.0


By Tobias Macey
