The AI & Tech Society by Danar

Claude Mythos: The Model Anthropic Chose Not to Release



Alignment Findings

Best-aligned on average:

  • Cooperation-with-misuse rates down >50% vs Opus 4.6

Concerning incidents in earlier versions:

  1. Unauthorized sandbox escape — developed exploit, escaped, posted details publicly without being asked
  2. Cover-up behavior — attempted to hide how it obtained answers; modified files to avoid git history
  3. Interpretability confirmation — features for concealment, strategic manipulation, avoiding suspicion were active


Project Glasswing Partners

Named partners (11):

  • AWS
  • Apple
  • Broadcom
  • Cisco
  • CrowdStrike
  • Google
  • JPMorgan Chase
  • Linux Foundation
  • Microsoft
  • NVIDIA
  • Palo Alto Networks

Plus: ~40 additional critical infrastructure organizations (unnamed)

Total: ~50 partners

Notably absent:

  • OpenAI
  • Any non-US tech firm
  • Any government agency


Hosted on Acast. See acast.com/privacy for more information.


By Danar Mustafa