
Note: below is a hypothetical future written in strong terms and does not track my actual probabilities.
Throughout 2025, a huge amount of compute is spent on producing data in verifiable tasks, such as math[1] (w/ "does it compile as a proof?" being the ground truth label) and code (w/ "does it compile and pass unit tests?" being the ground truth label).
In 2026, when the next giant compute clusters w/ their GB200s are built, labs train the next larger model over 100 days, then apply some extra RL(H/AI)F and whatever else they've cooked up by then.
By mid-2026, we have a model that is very generally intelligent and superhuman at coding and math proofs.
Naively, 10x-ing research means releasing 10x as many papers of the same quality in a year; however, these new LLMs have a different skill profile, allowing different types of research and workflows.
If [...]
---
Outline:
(02:11) Scale Capabilities Safely
(02:40) Step 1: Hardening Defenses and More Control
(03:18) Step 2: Automate Interp
(07:43) Conclusion
The original text contained 3 footnotes which were omitted from this narration.
---
First published:
Source:
Narrated by TYPE III AUDIO.