Summary
In this episode Craig McLuckie, co-creator of Kubernetes and founder/CEO of Stacklok, talks about how to improve security and reliability for AI agents using curated, optimized deployments of the Model Context Protocol (MCP). Craig explains why MCP is emerging as the API layer for AI‑native applications, how to balance short‑term productivity with long‑term platform thinking, and why great tools plus frontier models still drive the best outcomes. He digs into common adoption pitfalls (tool pollution, insecure NPX installs, scattered credentials), the necessity of continuous evals for stochastic systems, and the shift from “what the agent can access” to “what the agent knows.” Craig also shares how ToolHive approaches secure runtimes, a virtual MCP gateway with semantic search, orchestration and transactional semantics, a registry for organizational tooling, and a console for self‑service—along with pragmatic patterns for auth, policy, and observability.
Announcements
- Hello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systems.
- When ML teams try to run complex workflows through traditional orchestration tools, they hit walls. Cash App discovered this with their fraud detection models - they needed flexible compute, isolated environments, and seamless data exchange between workflows, but their existing tools couldn't deliver. That's why Cash App relies on Prefect. Now their ML workflows run on whatever infrastructure each model needs across Google Cloud, AWS, and Databricks. Custom packages stay isolated. Model outputs flow seamlessly between workflows. Companies like Whoop and 1Password also trust Prefect for their critical workflows. But Prefect didn't stop there. They just launched FastMCP - production-ready infrastructure for AI tools. You get Prefect's orchestration plus instant OAuth, serverless scaling, and blazing-fast Python execution. Deploy your AI tools once, connect to Claude, Cursor, or any MCP client. No more building auth flows or managing servers. Prefect orchestrates your ML pipeline. FastMCP handles your AI tool infrastructure. See what Prefect and FastMCP can do for your AI workflows at aiengineeringpodcast.com/prefect today.
- Unlock the full potential of your AI workloads with a seamless and composable data infrastructure. Bruin is an open source framework that streamlines integration from the command line, allowing you to focus on what matters most - building intelligent systems. Write Python code for your business logic, and let Bruin handle the heavy lifting of data movement, lineage tracking, data quality monitoring, and governance enforcement. With native support for ML/AI workloads, Bruin empowers data teams to deliver faster, more reliable, and scalable AI solutions. Harness Bruin's connectors for hundreds of platforms, including popular machine learning frameworks like TensorFlow and PyTorch. Build end-to-end AI workflows that integrate seamlessly with your existing tech stack. Join the ranks of forward-thinking organizations that are revolutionizing their data engineering with Bruin. Get started today at aiengineeringpodcast.com/bruin, and for dbt Cloud customers, enjoy a $1,000 credit to migrate to Bruin Cloud.
- Your host is Tobias Macey, and today I'm interviewing Craig McLuckie about improving the security of your AI agents through curated and optimized MCP deployment.
Interview
- Introduction
- How did you get involved in machine learning?
- MCP saw huge growth in attention and adoption over the course of this year. What are the stumbling blocks that teams run into when going to production with MCP servers?
- How do improperly managed MCP servers contribute to security problems in an agent-driven software development workflow?
- What are some of the problematic practices or shortcuts that you are seeing teams implement when running MCP services for their developers?
- What are the benefits of a curated and opinionated MCP service as shared infrastructure for an engineering team?
- You are building ToolHive as a system for managing and securing MCP services as a platform component. What are the strategic benefits of starting with that as the foundation for your company?
- There are several services for managing MCP server deployment and access control. What are the unique elements of ToolHive that make it worth adopting?
- For software-focused agentic AI, the command-line-based approach of Claude Code and similar tools opens the door to an effectively unbounded set of tools. What are the benefits of MCP over arbitrary CLI execution in that context?
- What are the most interesting, innovative, or unexpected ways that you have seen ToolHive/MCP used?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on ToolHive?
- When is ToolHive the wrong choice?
- What do you have planned for the future of ToolHive/Stacklok?
Contact Info
Parting Question
- From your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?
Links
- Stacklok
- MCP == Model Context Protocol
- Kubernetes
- CNCF == Cloud Native Computing Foundation
- SDLC == Software Development Life Cycle
- The Bitter Lesson
- TLA+
- Jepsen Tests
- ToolHive
- API Gateway
- Glean
The intro and outro music is from Hitman's Lovesong feat. Paola Graziano by The Freak Fandango Orchestra/CC BY-SA 3.0