
Audio note: this article contains 51 uses of LaTeX notation, so the narration may be difficult to follow. There's a link to the original text in the episode description.
Today I’m going to discuss how to think about logits like a statistician, and what this implies about circuits. This post doesn’t have any prerequisites other than perhaps a very basic statistical background that can be adequately recovered from the AI-generated “glossary” to the right. I think the material here is a good thing to know in general (thinking through this helped clarify my thinking about a lot of things), and it will be useful background for a future post I’m planning on “SLT in a nutshell”. If you want a “TL;DR” takeaway of the discussion that follows, the gist is that neural networks use logit addition to integrate (roughly) independent “parallel” information from various sources; and that thinking [...]
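To make the “logit addition” takeaway concrete, here is a minimal Python sketch of the standard statistical identity behind it. It is not taken from the post. The prior, the two binary “sources”, and all function names are hypothetical, and the sources are assumed to be conditionally independent given the label. Under that assumption, adding each source's log-odds shift to the prior log-odds reproduces the exact Bayesian posterior.

```python
import math

def logit(p):
    """Log odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))

def sigmoid(z):
    """Inverse of logit: maps log odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical toy setup (an assumption for illustration only).
PRIOR = 0.3  # P(y = 1)
# P(x_i = 1 | y) for two sources, assumed conditionally independent given y.
LIKELIHOODS = [
    {1: 0.9, 0: 0.2},
    {1: 0.6, 0: 0.5},
]

def likelihood(i, x, y):
    """P(x_i = x | y) for binary x."""
    p = LIKELIHOODS[i][y]
    return p if x == 1 else 1.0 - p

def exact_posterior(xs):
    """P(y = 1 | all sources), computed directly from the joint distribution."""
    num = PRIOR
    den = 1.0 - PRIOR
    for i, x in enumerate(xs):
        num *= likelihood(i, x, 1)
        den *= likelihood(i, x, 0)
    return num / (num + den)

def single_source_posterior(i, x):
    """P(y = 1 | x_i = x), using source i alone."""
    num = PRIOR * likelihood(i, x, 1)
    den = (1.0 - PRIOR) * likelihood(i, x, 0)
    return num / (num + den)

def logit_addition_posterior(xs):
    """Start at the prior log odds and add each source's log-odds shift."""
    z = logit(PRIOR)
    for i, x in enumerate(xs):
        z += logit(single_source_posterior(i, x)) - logit(PRIOR)
    return sigmoid(z)

if __name__ == "__main__":
    for xs in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        print(xs, exact_posterior(xs), logit_addition_posterior(xs))
```

The two columns printed for each input agree up to floating-point error. If the sources were correlated given the label, the summed log odds would double-count shared evidence and the agreement would break, which is why the “(roughly) independent” qualifier matters.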
---
Outline:
(01:14) Basics of logits and logistic tasks
(08:17) Parallel prediction circuits
(08:21) Log odds and independent predictions
(11:02) Independence and circuits
(14:37) Appreciating the wisdom of the elders
(15:31) Interpretability insights
The original text contained 5 footnotes which were omitted from this narration.
---
First published:
Source:
Narrated by TYPE III AUDIO.
By LessWrong
