The Gradient: Perspectives on AI

Hattie Zhou: Lottery Tickets and Algorithmic Reasoning in LLMs



In episode 60 of The Gradient Podcast, Daniel Bashir speaks to Hattie Zhou.

Hattie is a PhD student at the Université de Montréal and Mila. Her research focuses on understanding how and why neural networks work, based on the belief that the performance of modern neural networks exceeds our understanding and that building more capable and trustworthy models requires bridging this gap. Prior to Mila, she spent time as a data scientist at Uber and did research with Uber AI Labs.

Have suggestions for future podcast guests (or other feedback)? Let us know here!

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS

Follow The Gradient on Twitter

Outline:

* (00:00) Intro

* (01:55) Hattie’s Origin Story, Uber AI Labs, empirical theory and other sorts of research

* (10:00) Intro to the Lottery Ticket Hypothesis & Deconstructing Lottery Tickets

* (14:30) Lottery tickets as lucky initialization

* (17:00) Types of masking and the “masking is training” claim

* (24:00) Type-0 masks and weight evolution over long training trajectories

* (27:00) Can you identify good masks or training trajectories a priori?

* (29:00) The role of signs in neural net initialization

* (35:27) The Supermask

* (41:00) Masks to probe pretrained models and model steerability

* (47:40) Fortuitous Forgetting in Connectionist Networks

* (54:00) Relationships to other work (double descent, grokking, etc.)

* (1:01:00) The iterative training process in fortuitous forgetting, scale and value of exploring alternatives

* (1:03:35) In-Context Learning and Teaching Algorithmic Reasoning

* (1:09:00) Learning + algorithmic reasoning, prompting strategy

* (1:13:50) What’s happening with in-context learning?

* (1:14:00) Induction heads

* (1:17:00) ICL and gradient descent

* (1:22:00) Algorithmic prompting vs discovery

* (1:24:45) Future directions for algorithmic prompting

* (1:26:30) Interesting work from NeurIPS 2022

* (1:28:20) Hattie’s perspective on scientific questions people pay attention to, underrated problems

* (1:34:30) Hattie’s perspective on ML publishing culture

* (1:42:12) Outro

Links:

* Hattie’s homepage and Twitter

* Papers

  * Deconstructing Lottery Tickets: Zeros, Signs, and the Supermask

  * Fortuitous Forgetting in Connectionist Networks

  * Teaching Algorithmic Reasoning via In-context Learning



Get full access to The Gradient at thegradientpub.substack.com/subscribe

The Gradient: Perspectives on AI
By Daniel Bashir
