AI Security Podcast

Inside the $29.5 Million DARPA AI Cyber Challenge: How Autonomous Agents Find & Patch Vulns



What does it take to build a fully autonomous AI system that can find, verify, and patch vulnerabilities in open-source software? Michael Brown, Principal Security Engineer at Trail of Bits, joins us to go behind the scenes of the 3-year DARPA AI Cyber Challenge (AICC), where his team's agent, "Buttercup," won second place.

Michael, a self-proclaimed "AI skeptic," shares his surprise at how capable LLMs were at generating high-quality patches. However, he also shares the most critical lesson from the competition: "AI was actually the commodity." The real differentiator wasn't the AI model itself but the "best of both worlds" approach: robust engineering, intelligent scaffolding, and using "AI where it's useful and conventional stuff where it's useful."

This is a great listen for any engineering or security team building AI solutions. We cover the multi-agent architecture of Buttercup, the real-world costs, and the open-source future of this technology.


Questions asked:

(00:00) Introduction: The DARPA AI Hacking Challenge
(03:00) Who is Michael Brown? (Trail of Bits AI/ML Research)
(04:00) What is the DARPA AI Cyber Challenge (AICC)?
(04:45) Why did the AICC take 3 years to run?
(07:00) The AICC Finals: Trail of Bits takes 2nd place
(07:45) The AICC Goal: Autonomously find AND patch open source
(10:45) Competition Rules: No "virtual patching"
(11:40) AICC Scoring: Finding vs. Patching
(14:00) The competition was fully autonomous
(14:40) The 3-month sprint to build Buttercup v1
(15:45) The origin of the name "Buttercup" (The Princess Bride)
(17:40) The original (and scrapped) concept for Buttercup
(20:15) The critical difference: Finding vs. Verifying a vulnerability
(26:30) LLMs were allowed, but were they the key?
(28:10) Choosing LLMs: Using OpenAI for patching, Anthropic for fuzzing
(30:30) What was the biggest surprise? (An AI skeptic is blown away)
(32:45) Why the latest models weren't always better
(35:30) The #1 lesson: The importance of high-quality engineering
(39:10) Scaffolding vs. AI: What really won the competition?
(40:30) Key Insight: AI was the commodity, engineering was the differentiator
(41:40) The "Best of Both Worlds" approach (AI + conventional tools)
(43:20) Pro Tip: Don't ask AI to "boil the ocean"
(45:00) Buttercup's multi-agent architecture (Engineer, Security, QA)
(47:30) Can you use Buttercup for your enterprise? (The $100k+ cost)
(48:50) Buttercup is open source and runs on a laptop
(51:30) The future of Buttercup: Connecting to OSS-Fuzz
(52:45) How Buttercup compares to commercial tools (RunSybil, XBOW)
(53:50) How the 1st place team (Team Atlanta) won
(56:20) Where to find Michael Brown & Buttercup


Resources discussed during the interview:

  • Trail of Bits
  • Buttercup (Open Source Project)
  • DARPA AI Cyber Challenge (AICC)
  • Movie: The Princess Bride

AI Security Podcast by Kaizenteq Team


4.9

8 ratings

