A verbatim reading of ExploitGym, a large-scale benchmark of eight hundred and ninety-eight real-world vulnerabilities across userspace programs, Google's V8 JavaScript engine, and the Linux kernel, designed to measure whether AI agents can turn a proof-of-vulnerability into a working exploit that achieves unauthorized code execution. The authors find that frontier models can already exploit a non-trivial fraction of these targets, with the strongest configurations producing working exploits for over a hundred and fifty instances, and that standard mitigations reduce but do not eliminate agent success. The paper establishes exploitation as an under-evaluated, dual-use capability and argues that autonomous exploit development by AI agents is no longer hypothetical.