


In our new video, we talk about research on interpreting InceptionV1, a convolutional neural network. By looking at the weights, researchers have been able to understand the function of neurons and channels inside the network and uncover visual processing algorithms. The work on InceptionV1 is early but landmark mechanistic interpretability research, and it works well as an introduction to the field. We also go into the rationale and goals of the field and mention some more recent research near the end. Our main source materials are the Circuits thread in the Distill journal and an article on feature visualization. The author of the script is Arthur Frost. I have included the script below, although I recommend watching the video, since the script was written with accompanying moving visuals in mind.
Intro
In 2018, researchers trained an AI to find out if people were at [...]
---
Outline:
(00:56) Intro
(07:16) Visualisation by Optimisation
(11:09) Circuits
(15:27) Polysemanticity
(17:00) Closing thoughts, and the past few years of interpretability
---
First published:
Source:
Narrated by TYPE III AUDIO.
By LessWrong
