Linear Digressions

The Hot Mess of AI (Mis-)Alignment



The paperclip maximizer — the classic AI doom scenario where a hyper-competent machine single-mindedly converts the universe into office supplies — might not be the AI risk we should actually lose sleep over. New research from Anthropic's AI safety division suggests that misaligned AI looks less like an evil genius and more like a distracted wanderer who gets sidetracked reading French poetry instead of, say, managing a nuclear power plant. This week we dig into a fascinating paper that reframes AI misalignment through the lens of bias-variance decomposition, and explore why longer reasoning chains might actually make things worse, not better.
- "The Hot Mess Theory of AI Misalignment: How Misalignment Scales with Model Intelligence and Task Complexity" — Anthropic AI Safety. https://arxiv.org/abs/2503.08941
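The bias-variance framing mentioned above can be sketched numerically. This toy example (our own illustration, not code from the paper) estimates a known quantity from noisy samples and verifies that the expected squared error splits exactly into a systematic part (bias squared — consistently wrong, like the paperclip maximizer) and a scatter part (variance — erratically wrong, the "hot mess"):

```python
import numpy as np

rng = np.random.default_rng(0)
true_value = 1.0  # the quantity our toy estimator tries to recover

def estimate(n_samples):
    """One noisy estimate of true_value: the mean of n_samples draws."""
    draws = rng.normal(loc=true_value, scale=1.0, size=n_samples)
    return draws.mean()

# Repeat the estimation many times so we can measure bias and variance.
estimates = np.array([estimate(10) for _ in range(10_000)])

bias_sq = (estimates.mean() - true_value) ** 2   # systematic error
variance = estimates.var()                        # scatter across runs
mse = ((estimates - true_value) ** 2).mean()      # total squared error

# The decomposition MSE = bias^2 + variance holds as an algebraic identity.
print(f"bias^2   = {bias_sq:.4f}")
print(f"variance = {variance:.4f}")
print(f"MSE      = {mse:.4f}  (bias^2 + variance = {bias_sq + variance:.4f})")
```

For an unbiased estimator like this sample mean, nearly all of the error is variance — the "hot mess" component — rather than bias.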

By Katie Malone

4.8 • 354 ratings