Pretrained

Can we really trust reasoning



Pierce and Richard cover the news that dropped over the holiday break: getting breaking news into chatbots, OpenAI's "code red" over Google's Gemini 3, benchmarking how reliably chain of thought can be used to introspect model behavior, and a review of Claude Skills.

Further reading:
- https://www.wired.com/story/us-invaded-venezuela-and-captured-nicolas-maduro-chatgpt-disagrees
- https://fortune.com/2025/12/02/sam-altman-declares-code-red-google-gemini-ceo-sundar-pichai/
- https://openai.com/index/evaluating-chain-of-thought-monitorability/
- https://platform.claude.com/docs/en/agents-and-tools/agent-skills/overview


Pretrained, by Pierce Freeman & Richard Diehl Martinez