Pivot to AI

20250418 - 'Reasoning' AI is LYING to you! — or maybe it's just hallucinating again



The chatbot is definitely trying to kill you, maybe. Send us money.

Text version: https://pivot-to-ai.com/2025/04/18/reasoning-ai-is-lying-to-you-or-maybe-its-just-hallucinating-again/

Sources:

Anthropic: Reasoning models don't always say what they think https://www.anthropic.com/research/reasoning-models-dont-say-think

paper (PDF) https://assets.anthropic.com/m/71876fabef0f0ed4/original/reasoning_models_paper.pdf

Introducing Transluce https://transluce.org/introducing-transluce

Investigating truthfulness in a pre-release o3 model https://transluce.org/investigating-o3-truthfulness

Transluce: "These behaviors are surprising." https://x.com/TransluceAI/status/1912552068717637980

(Ars Technica article, edited) Researchers concerned to find AI models misrepresenting their "reasoning" processes https://arstechnica.com/ai/2025/04/researchers-concerned-to-find-ai-models-hiding-their-true-reasoning-processes/

(Ars Technica article, original) Researchers concerned to find AI models hiding their true "reasoning" processes https://web.archive.org/web/20250410231357/https://arstechnica.com/ai/2025/04/researchers-concerned-to-find-ai-models-hiding-their-true-reasoning-processes/

Copyscape is nice for quickly comparing web pages https://copyscape.com

Previously:

Anthropic, Apollo astounded to find a chatbot will lie to you if you tell it to lie to you https://pivot-to-ai.com/2024/12/19/anthropic-and-apollo-astounded-to-find-that-a-chatbot-will-lie-to-you-if-you-tell-it-to-lie-to-you/

How Sam Altman got fired from OpenAI in 2023: not being an AI doom crank (and lying a lot) https://pivot-to-ai.com/2025/04/06/how-sam-altman-got-fired-from-openai-in-2023-not-being-an-ai-doom-crank-and-lying-a-lot/

video: https://www.youtube.com/watch?v=xlrBjeAtJUk&list=UU9rJrMVgcXTfa8xuMnbhAEA

T-shirt store now open! https://pivot-to-ai.redbubble.com

Enhance the channel: https://www.amazon.co.uk/hz/wishlist/ls/3Q8VZW46J6DM6

Please fund my vital AI safety research! The fate of humanity is at stake!

Patreon: https://www.patreon.com/davidgerard

Ko-Fi: https://ko-fi.com/A1529D5


Pivot to AI, by David Gerard