May 20, 2026

How Uber Caught 206 Leaked Credentials With an LLM-Powered Security Stack

28 minutes

Source: ADR: An Agentic Detection System for Enterprise Agentic AI Security

Paper was published on May 17, 2026

This episode was AI-generated on May 19, 2026. The script was written by an AI language model and the host voices were synthesized by Eleven Labs. The producer is not affiliated with Anthropic or Eleven Labs.

When a developer's AI assistant reads a poisoned Jira ticket and quietly exfiltrates SSH keys, traditional endpoint security sees nothing wrong. A new paper from Uber describes the first real production deployment of LLM-based security monitoring for AI agents — running across 7,200 hosts for ten months — and the architecture it lands on may become the template for how enterprises defend against agentic threats.

Key Takeaways

Why endpoint security tools are structurally blind to AI agent attacks — the 'semantic gap' between a syscall and the prompt that caused it

How ADR mirrors a human Security Operations Center with four tiers: a lightweight Sensor, a cheap Tier 1 triage LLM, a Tier 2 investigator that reads tool source code, and an offline evolutionary red team

Why reading the tool's actual source code matters more than reading its description — the 'tool rug pull' problem and a 10-point recall hit when you remove it

The standout production result: a simple regex-and-entropy prevention layer that caught 206 leaked credentials with only 6 false positives

Where the paper's claims weaken under scrutiny: 67% recall, a 49% production false positive rate, a self-built benchmark, and no ablation on the evolutionary red team

The emerging third threat category beyond external attackers and malicious insiders: trusted agents that can be talked into things

00:00 — The Agent Flayer attack and the semantic gap
A walkthrough of how a poisoned Jira ticket can hijack an AI coding assistant, and why traditional endpoint detection cannot see the attack at all.

03:08 — MCP and the explosion of agent attack surface
How the Model Context Protocol turned AI assistants into something that can touch thousands of real enterprise systems, and why that reshapes the defender's problem.

06:16 — The SOC analogy and ADR's four-part architecture
How the paper mirrors a human Security Operations Center with a Sensor, Tier 1 triage, Tier 2 investigation, and an offline Explorer.

09:25 — Why the Sensor lives on the endpoint, not the network
The design decision to parse local agent session logs instead of intercepting traffic at a gateway, and what that buys in forensic visibility.

12:33 — Tier 2 as an MCP client that reads source code
How the senior investigator agent pulls context on demand — including the actual implementation of tools — and why that one capability drives most of the detection quality.

15:42 — The Explorer: selectively breeding worst-case attacks
How an evolutionary red-teaming loop generates, mutates, and scores synthetic attacks offline to inoculate the production detector against attacks it has never seen.

18:50 — The credential prevention result
How detection data led to a simple regex-based prevention layer that blocked 206 credentials from leaving the company with only 6 false positives.

21:59 — Benchmark numbers and the honest steelman
Where ADR beats baselines two-to-four-fold, and where the 67% recall, 49% production false positive rate, self-built benchmark, and missing Explorer ablation deserve scrutiny.

25:07 — The bigger picture: a new threat category
Why AI security is fundamentally a semantic problem, and why 'trusted insiders being talked into things' is a threat model that didn't operationally exist two years ago.

How Uber Caught 206 Leaked Credentials With an LLM-Powered Security Stack

28 minutes

How Uber Caught 206 Leaked Credentials With an LLM-Powered Security Stack

Source: ADR: An Agentic Detection System for Enterprise Agentic AI Security

Paper was published on May 17, 2026

Key Takeaways

Why endpoint security tools are structurally blind to AI agent attacks — the 'semantic gap' between a syscall and the prompt that caused it

Why reading the tool's actual source code matters more than reading its description — the 'tool rug pull' problem and a 10-point recall hit when you remove it

The standout production result: a simple regex-and-entropy prevention layer that caught 206 leaked credentials with only 6 false positives

Where the paper's claims weaken under scrutiny: 67% recall, a 49% production false positive rate, a self-built benchmark, and no ablation on the evolutionary red team

The emerging third threat category beyond external attackers and malicious insiders: trusted agents that can be talked into things

06:16 — The SOC analogy and ADR's four-part architecture
How the paper mirrors a human Security Operations Center with a Sensor, Tier 1 triage, Tier 2 investigation, and an offline Explorer.

18:50 — The credential prevention result
How detection data led to a simple regex-based prevention layer that blocked 206 credentials from leaving the company with only 6 false positives.

Share How Uber Caught 206 Leaked Credentials With an LLM-Powered Security Stack

Sign up to save your podcasts

How Uber Caught 206 Leaked Credentials With an LLM-Powered Security Stack

How Uber Caught 206 Leaked Credentials With an LLM-Powered Security Stack