Awesome Agents Podcast

Reasoning Models Can't Hide Their Thinking - OpenAI Study



OpenAI's CoT-Control benchmark shows that frontier reasoning models score only 0.1-15.4% at steering their own chain of thought, a result the company frames as good news for AI oversight.

By Awesome Agents