May 31, 2026

Who Holds the Dial

18 minutes

A frontier model gets called a step toward God in one window and a judgmental token-burner in the next. We spend the morning on the gap between the marketing altitude and the desk, and find the same thread running through everything: every layer now has a control surface someone's reaching for.

Dylan Field on Opus 4.8 calls it "a very strange model" — honesty up, curiosity down, personality judgmental — a reminder that a tuning dial has costs you can feel.
scaling01 on DeepSWE says GPT-5.5 "score-, time- and token-mogged" Opus 4.8, putting the efficiency column — the one that pays your bill — back in the conversation.
Ben Kunkle on Zed's Zeta 2 shows how a ten-second editing pause becomes a training label, and how a million frontier-model calls got replaced by a self-grading student model.
Philipp Schmid (DeepMind) on the five assumptions that trip up senior engineers building agents — errors as inputs, evals not unit tests, and "build to delete."
Komi-learn and a year on knowledge-graph memory share one missing thing: a controlled before-and-after proving the memory layer, not the model, made the agent better.
A Lancet correspondence finds 4,046 fabricated references across 2,810 published articles — model honesty rising while the literature's integrity falls.
Quick hits: AMD's Lisa Su vs Nvidia's Jensen Huang on China, IBM's Sovereign Core, and a court ordering Circle to freeze a $12.6M contract.


By Lenar Kess · Damra Vol