August 26, 2022

Catherine Olsson and Nelson Elhage: Anthropic, Understanding Transformers

47 minutes

In episode 40 of The Gradient Podcast, Andrey Kurenkov speaks to Catherine Olsson and Nelson Elhage.

Catherine and Nelson are both members of technical staff at Anthropic, an AI safety and research company working to build reliable, interpretable, and steerable AI systems. Their focus is interpretability, and this interview covers several of their recent works.
Follow The Gradient on Twitter

Outline:

(00:00) Intro
(01:10) Catherine’s Path into AI
(03:25) Nelson’s Path into AI
(05:23) Overview of Anthropic
(08:21) Mechanistic Interpretability
(15:15) Transformer Circuits
(21:30) Toy Transformer
(27:25) Induction Heads
(31:00) In-Context Learning
(35:10) Evidence for Induction Heads Enabling In-Context Learning
(39:30) What’s Next
(43:10) Replicating Results
(46:00) Outro

Links:

Anthropic

Zoom In: An Introduction to Circuits

Mechanistic Interpretability, Variables, and the Importance of Interpretable Bases

A Mathematical Framework for Transformer Circuits

In-context Learning and Induction Heads

PySvelte

Get full access to The Gradient at thegradientpub.substack.com/subscribe

The Gradient: Perspectives on AI

By Daniel Bashir
