AI Stress Test

From nuclear testing moratoriums to AI safety thresholds


Over 33 years ago, the US conducted its final nuclear test, transitioning to a science-based Stockpile Stewardship Program that maintains civilization-ending arsenals through simulation alone - without ever detonating another weapon.
New AI safety research has found frontier models achieving 70% performance on complex software engineering tasks and demonstrating potential for weaponised capabilities - prompting developers to deploy unprecedented safety controls, not because the models have definitively crossed danger thresholds, but because developers cannot rule out that they have.
Whilst nuclear warheads and AI systems are very different - one governed by unchanging physics, the other improving from under 10% to 70% on such tasks in under two years - there are interesting parallels in governing powerful technologies through computational assessment rather than direct testing, once the risks of empirical validation become unacceptable.
Will your organisation wait for evidence of actual harm before implementing enhanced AI controls, or will it trigger safeguards when capability thresholds are reached?
Profiled research:
International AI Safety Report 2025: First Key Update: https://doi.org/10.48550/arXiv.2510.13653;
Detecting and Reducing Scheming in AI Models: https://openai.com/index/detecting-and-reducing-scheming-in-ai-models/;
LLM Jailbreak Detection for (Almost) Free: https://doi.org/10.48550/arXiv.2509.14558;
InvThink: Towards AI Safety via Inverse Reasoning: https://doi.org/10.48550/arXiv.2510.01569;
AI Red-Teaming Design: Threat Models and Tools: https://cset.georgetown.edu/article/ai-red-teaming-design-threat-models-and-tools/

By Geoff Ferres