Braid

Cheaper From Both Ends


Listen Later

A Chinese lab cut the price of a frontier-class coding model to a fraction of Opus, Nvidia tried to own every layer from the laptop to the data center, and one developer ran the new Gemma 4 on a decade-old Xeon. The cost of running intelligence got attacked from both ends on the same morning — and the question underneath all of it is who gets to set that cost.

  • MiniMax M3 claims parity with Opus 4.7 at roughly twelve cents per million input tokens versus five dollars — but the weights are promised in about ten days, so "open-weights" is still a countdown.
  • Nvidia's DGX Station puts a GB300 chip and up to 748GB of memory on a desktop, enough to run a one-trillion-parameter model locally; the RTX Spark chip pushes the same idea into laptops, while the Vera CPUs — with Anthropic, OpenAI, and SpaceX as early customers — signal a move off x86.
  • A 10-year-old Xeon is all you need: cafkafk runs a 26B mixture-of-experts model at reading speed on a 2016 CPU with no GPU, arguing mainstream tools hide the performance levers.
  • Cosmos 3 is Nvidia's open physical-AI world model, backed by a Cosmos Coalition with Runway as a founding member.
  • Cadence and Nvidia claim a "Level 5" autonomous chip-verification agent that turns months into a day — a large autonomy claim in a domain where mistakes ship in silicon.
  • Anthropic will let the EU's ENISA join Project Glasswing for access to a model called Mythos, even as a Wirescreen analysis documents 500+ PLA attempts to procure Nvidia chips and governments from India and the UAE to France move to own their compute.
...more
View all episodesView all episodes
Download on the App Store

BraidBy Lenar Kess · Damra Vol