
Sometimes, people talk about transformers as having "world models" as a result of being trained to predict text data on the internet. But what does this even mean? In this episode, I talk with Adam Shai and Paul Riechers about their work applying computational mechanics, a sub-field of physics studying how to predict random processes, to neural networks.
Patreon: https://www.patreon.com/axrpodcast
Ko-fi: https://ko-fi.com/axrpodcast
The transcript: https://axrp.net/episode/2024/09/29/episode-36-adam-shai-paul-riechers-computational-mechanics.html
Topics we discuss, and timestamps:
0:00:42 - What computational mechanics is
0:29:49 - Computational mechanics vs other approaches
0:36:16 - What world models are
0:48:41 - Fractals
0:57:43 - How the fractals are formed
1:09:55 - Scaling computational mechanics for transformers
1:21:52 - How Adam and Paul found computational mechanics
1:36:16 - Computational mechanics for AI safety
1:46:05 - Following Adam and Paul's research
Simplex AI Safety: https://www.simplexaisafety.com/
Research we discuss:
Transformers represent belief state geometry in their residual stream: https://arxiv.org/abs/2405.15943
Transformers represent belief state geometry in their residual stream [LessWrong post]: https://www.lesswrong.com/posts/gTZ2SxesbHckJ3CkF/transformers-represent-belief-state-geometry-in-their
Why Would Belief-States Have A Fractal Structure, And Why Would That Matter For Interpretability? An Explainer: https://www.lesswrong.com/posts/mBw7nc4ipdyeeEpWs/why-would-belief-states-have-a-fractal-structure-and-why
Episode art by Hamish Doodles: hamishdoodles.com
By Daniel Filan
