Pop Goes the Stack

AI Red Teaming in Practice: Scores, guardrails, auto-remediation



AI in production isn’t just another feature to ship. It’s a non-deterministic system that can be socially engineered, fuzzed, and pushed into failure states you won’t find with traditional testing. Recorded live in Las Vegas at F5’s AppWorld 2026, this episode of Pop Goes the Stack brings Joel Moses together with Jimmy White, F5’s VP of AI Security (via the CalypsoAI acquisition), for a practical look at what AI red teaming actually is and how it works when the attacker is an agent.

Jimmy reframes genAI security as a permutation problem: if there are countless prompt combinations that could unlock sensitive data or trigger unsafe actions, you need genAI-powered red team agents to explore those paths at scale. The discussion covers custom intents, agentic “fingerprints” that reveal not just what was compromised but how it happened, and why that “how” is the key to building protections you can trust.

You’ll also hear how scoring and reporting translate into guardrails, how auto-remediation can be validated with positive and negative test cases before a human publishes changes, and why relying on models to internalize safety isn’t a realistic plan. The conversation closes on agentic AI risk, where tools and permissions matter more than the model’s reasoning, and introduces “thought injection” as a way to redirect unsafe actions without breaking the agent loop.

If you’re building AI apps, deploying MCP-connected systems, or worrying about agents becoming tomorrow’s service accounts, this episode gives you a sharper playbook for testing, governance, and resilience.


Pop Goes the Stack, by F5