This Tuesday morning, a major new benchmark reveals that leading AI models are still failing high-risk conversations even as their safety improves. Both GlobeNewswire and GeekWire report on the new mPACT benchmark from AI safety company mpathic, which evaluates how models like Claude, ChatGPT, and Gemini handle sensitive topics such as suicide risk, eating disorders, and misinformation. While the models generally avoid harmful responses, they consistently fall short of clinical adequacy, particularly in recognizing