
Reference: Gallifant, J. & Bitterman, D.S. (2025). Humanity’s Next Medical Exam: Preparing to Evaluate Superhuman Systems. NEJM AI, 2(11). DOI: 10.1056/AIe2501008
When an AI scores 100% on a medical exam but can't navigate a hospital ward, is it really a doctor?
Today, we break down a new editorial from NEJM AI by Gallifant and Bitterman. We explore the transition from "recall" to "reasoning" and why the future of AI safety lies in "Interactive Interrogation" and high-fidelity sandboxes.
The models are becoming superhuman. It’s time our tests caught up.
Further recommended listening: https://www.youtube.com/watch?v=yQLOicn2vPU
#ai in medicine
Music generated by Mubert https://mubert.com/render
By Stephen Auger