Agent Mode AI

What vendor "successful pilot" references do not tell procurement



Episode 10 of Agent Mode AI. Abby and Avery walk through AM-140, the claim that vendor "successful pilot" references transfer to scaled production at roughly the McKinsey twenty-three percent rate, and that the gap is operational rather than capability-driven. The anchor data point is the McKinsey State of AI 2025 survey, published November 2025 with a sample size of one thousand four hundred ninety-one. Three documented walk-backs bound what reference language can credibly imply: the Klarna seven-hundred-agent reversal reported by Bloomberg on the eighth of May 2025, the Salesforce Agentforce two-hundred-customer reality through Q1 2026, and the GitHub Copilot token-counting bug acknowledged in April 2026. The structural failure-mode evidence is the CRMArena-Pro thirty-five percent multi-step reliability result and the EchoLeak CVE cross-agent class. The episode ends with six pre-pilot questions the procurement committee can use to close the gap.
Sources cited:
- McKinsey State of AI 2025, published November 2025, n=1,491
- Bloomberg report on Klarna, 8 May 2025
- The Information report on Salesforce Agentforce, April 2025
- GitHub Copilot changelog, 18 April 2026
- CRMArena-Pro paper, Salesforce AI Research, August 2025
- Carnegie Mellon TheAgentCompany academic benchmark
- EchoLeak CVE-2025-32711, disclosed August 2025
Claims tracked:
- AM-140 — Vendor pilot reference to procuring-enterprise scaled production transfer rate — agentmodeai.com/holding/?claim=AM-140
- AM-030 — McKinsey 23% from IT-leader perspective — agentmodeai.com/holding/?claim=AM-030
- AM-128 — MIT 95% pilot-failure claim — agentmodeai.com/holding/?claim=AM-128
Newsletter and the full Holding-up ledger: agentmodeai.com

By Agent Mode AI