


Podcast episode for Claude Opus 4.5: Model Card, Alignment and Safety.
* 00:00:00 - Introduction
* 00:01:50 - Claude Opus 4.5 Basic Facts
* 00:03:26 - Claude Opus 4.5 Is The Best Model For Many But Not All Use Cases
* 00:06:02 - Misaligned?
* 00:09:39 - Section 3: Safeguards and Harmlessness
* 00:11:46 - Section 4: Honesty
* 00:13:27 - 5: Agentic Safety
* 00:21:01 - Section 6: Alignment Overview
* 00:29:55 - Alignment Investigations
* 00:30:35 - Sycophancy Course Correction Is Lacking
* 00:31:52 - Deception
* 00:34:29 - Ruling Out Encoded Content In Chain Of Thought
* 00:37:19 - Sandbagging
* 00:38:10 - Evaluation Awareness
* 00:42:18 - Reward Hacking
* 00:43:59 - Subversion Strategy
* 00:45:30 - 6.13: UK AISI External Testing
* 00:45:39 - 6.14: Model Welfare
* 00:46:33 - 7: RSP Evaluations
* 00:48:12 - CBRN
* 00:56:36 - Autonomy
* 01:04:27 - Cyber
* 01:10:32 - The Whisperers Love The Vibes
The Don’t Worry About the Vase Podcast is a listener-supported podcast. To receive new posts and support the cost of creation, consider becoming a free or paid subscriber.
https://open.substack.com/pub/thezvi/p/claude-opus-45-model-card-alignment?r=67y1h&utm_campaign=post&utm_medium=web&showWelcomeOnShare=false
By Podcast for Zvi's blog, Don't Worry About the Vase Podcast
Rating: 4.5 (66 ratings)
