
In episode 40 of The Gradient Podcast, Andrey Kurenkov speaks to Catherine Olsson and Nelson Elhage.
Catherine and Nelson are both members of technical staff at Anthropic, an AI safety and research company working to build reliable, interpretable, and steerable AI systems. Their focus is on interpretability, and this interview covers several of their recent works.
Follow The Gradient on Twitter
Outline:
(00:00) Intro
(01:10) Catherine’s Path into AI
(03:25) Nelson’s Path into AI
(05:23) Overview of Anthropic
(08:21) Mechanistic Interpretability
(15:15) Transformer Circuits
(21:30) Toy Transformer
(27:25) Induction Heads
(31:00) In-Context Learning
(35:10) Evidence for Induction Heads Enabling In-Context Learning
(39:30) What’s Next
(43:10) Replicating Results
(46:00) Outro
Links:
Anthropic
Zoom In: An Introduction to Circuits
Mechanistic Interpretability, Variables, and the Importance of Interpretable Bases
A Mathematical Framework for Transformer Circuits
In-context Learning and Induction Heads
PySvelte