Eleos AI

Why model self-reports are insufficient—and why we studied them anyway by Robert Long

Read the post here.

Intro

In April and May 2025, Eleos AI Research conducted a limited welfare evaluation of Anthropic’s Claude Opus 4 before its release. We explored Claude’s expressions about consciousness, well-being, and preferences, using automated single-turn interviews and extended manual conversations. A summary of our findings appears in section 5.3 of the Claude 4 System Card.

We conducted this evaluation despite being acutely aware that we cannot “just ask” a large language model (LLM) whether it is conscious, suffering, or satisfied. It is highly unlikely that such answers arise from genuine introspection. Accordingly, we do not take Claude Opus 4’s responses at face value, as one might when talking to a human.

Nevertheless, we believe that model expressions about AI welfare are important and worth investigating, provided they are handled with caution. This post explains why, and then discusses the patterns we observed in our interviews.
