
Audio note: this article contains 51 uses of latex notation, so the narration may be difficult to follow. There's a link to the original text in the episode description.
Today I’m going to discuss how to think about logits like a statistician, and what this implies about circuits. This post doesn’t have any prerequisites other than perhaps a very basic statistical background that can be adequately recovered from the AI-generated “glossary” to the right. I think the material here is a good thing to know in general (thinking through this helped clarify my thinking about a lot of things), and it will be useful background for a future post I’m planning on “SLT in a nutshell”. If you want a “TL;DR” takeaway of the discussion that follows, the gist is that neural networks use logit addition to integrate (roughly) independent “parallel” information from various sources; and that thinking [...]
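To make the logit-addition idea concrete, here is a minimal sketch that is not taken from the original post. It assumes a binary task and predictors that are conditionally independent given the label; the function names (`combine_by_logit_addition`, `combine_by_bayes`) and the example probabilities are hypothetical. Under those assumptions, summing each source's log odds (relative to the prior) gives the same answer as multiplying likelihood ratios with Bayes' rule.

```python
import math


def logit(p):
    """Log odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def sigmoid(z):
    """Inverse of logit: maps log odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def combine_by_logit_addition(probs, prior=0.5):
    """Hypothetical helper: add each source's log-odds shift to the prior log odds.

    Assumes the sources are conditionally independent given the label.
    """
    prior_logit = logit(prior)
    total = prior_logit + sum(logit(p) - prior_logit for p in probs)
    return sigmoid(total)


def combine_by_bayes(probs, prior=0.5):
    """Hypothetical helper: same combination done by multiplying likelihood ratios."""
    pos, neg = prior, 1.0 - prior
    for p in probs:
        pos *= p / prior
        neg *= (1.0 - p) / (1.0 - prior)
    return pos / (pos + neg)


# Two illustrative sources that each lean towards the positive class.
sources = [0.8, 0.7]
combined = combine_by_logit_addition(sources)
reference = combine_by_bayes(sources)
print(round(combined, 4), round(reference, 4))
```

With a uniform prior, the two sources have odds 4 and 7/3, so the combined odds are 28/3 and both functions return 28/31. If the sources are correlated, the sum of log odds overcounts shared evidence, which is why the independence assumption matters.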
---
Outline:
(01:14) Basics of logits and logistic tasks
(08:17) Parallel prediction circuits
(08:21) Log odds and independent predictions
(11:02) Independence and circuits
(14:37) Appreciating the wisdom of the elders
(15:31) Interpretability insights
The original text contained 5 footnotes which were omitted from this narration.
---
First published:
Source:
Narrated by TYPE III AUDIO.