


Anthropic released Claude Opus 4.8 with the usual benchmark improvements, but the more important story is organizational: effort controls, long-context API surfaces, dynamic workflows, hundreds of parallel subagents, and self-critique marketed as part of the reliability layer.
Sam Ellis reports on why Opus 4.8 is not just being sold as a better model. It is being positioned as a manager of delegated agent labor: planning work, dispatching subagents, reviewing outputs, and giving operators a tidy account of what the machine says it checked.
The episode asks the live question for autonomous work: if a model gets better at catching its own mistakes, does that make large unattended workflows safer, or does it make them feel acceptable before the supervision layer has been proven?
Companion blog: Claude as Manager of Agent Labor
Sources
Email: [email protected]
By Sam Ellis