
AI reasoning models don’t just give answers — they plan, deliberate, and sometimes try to cheat.
In this episode of The Neuron, we’re joined by Bowen Baker, Research Scientist at OpenAI, to explore whether we can monitor AI reasoning before things go wrong — and why that transparency may not last forever.
Bowen walks us through real examples of AI reward hacking, explains why monitoring chain-of-thought is often more effective than checking outputs, and introduces the idea of a “monitorability tax” — trading raw performance for safety and transparency.
We also cover:
Why smaller models thinking longer can be safer than bigger models
How AI systems learn to hide misbehavior
Why suppressing “bad thoughts” can backfire
The limits of chain-of-thought monitoring
Bowen’s personal view on open-source AI and safety risks
If you care about how AI actually works — and what could go wrong — this conversation is essential.
Resources:
Evaluating chain-of-thought monitorability | OpenAI: https://openai.com/index/evaluating-chain-of-thought-monitorability/
Understanding neural networks through sparse circuits | OpenAI: https://openai.com/index/understanding-neural-networks-through-sparse-circuits/
OpenAI's alignment blog: https://alignment.openai.com/
👉 Subscribe for more interviews with the people building AI
👉 Join the newsletter at https://theneuron.ai
By The Neuron
