The Cartesian Cafe

Greg Yang | Large N Limits: Random Matrices & Neural Networks


Greg Yang is a mathematician and AI researcher at Microsoft Research who, for the past several years, has done incredibly original theoretical work on understanding large artificial neural networks. Greg received his bachelor's degree in mathematics from Harvard University in 2018 and, while there, won the Hoopes Prize for best undergraduate thesis. He also received an Honorable Mention for the Morgan Prize for Outstanding Research in Mathematics by an Undergraduate Student in 2018 and was an invited speaker at the International Congress of Chinese Mathematicians in 2019.

In this episode, we get a sample of Greg's work, which goes under the name "Tensor Programs" and currently spans five highly technical papers. The route chosen to compress Tensor Programs into the scope of a conversational video is to place its main concepts under the umbrella of one larger, central, and time-tested idea: that of taking a large N limit. This occurs most famously in the Law of Large Numbers and the Central Limit Theorem, which then play a fundamental role in the branch of mathematics known as Random Matrix Theory (RMT). We review this foundational material and then show how Tensor Programs (TP) generalizes this classical work, offering new proofs of RMT results. We conclude with the applications of Tensor Programs to a (rare!) rigorous theory of neural networks.
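As a rough numerical illustration of the large N limit described above (a sketch, not something taken from the episode), the following Python snippet averages N fair ±1 coin flips. The function names, trial count, and seed are illustrative assumptions. By the Law of Large Numbers the spread of the sample mean should shrink as N grows, while by the Central Limit Theorem the spread rescaled by sqrt(N) should stay close to 1, the standard deviation of a single flip.

```python
import math
import random
import statistics


def sample_mean(n, rng):
    """Mean of n independent fair +/-1 coin flips (mean 0, variance 1)."""
    return sum(rng.choice((-1, 1)) for _ in range(n)) / n


def spread_of_means(n, trials=2000, seed=0):
    """Standard deviation of the sample mean over repeated independent trials.

    `trials` and `seed` are arbitrary illustrative choices.
    """
    rng = random.Random(seed)
    means = [sample_mean(n, rng) for _ in range(trials)]
    return statistics.pstdev(means)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        s = spread_of_means(n)
        # LLN: s shrinks toward 0; CLT: sqrt(n) * s stays near 1.
        print(f"N={n:5d}  spread={s:.4f}  sqrt(N)*spread={math.sqrt(n) * s:.4f}")
```

The raw spread column is the Law of Large Numbers at work, and the rescaled column is the quantity whose limiting distribution the Central Limit Theorem describes.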

Patreon: https://www.patreon.com/timothynguyen

Part I. Introduction

  • 00:00:00 : Biography
  • 00:02:45 : Harvard hiatus 1: Becoming a DJ
  • 00:07:40 : I really want to make AGI happen (back in 2012)
  • 00:09:09 : Impressions of Harvard math
  • 00:17:33 : Harvard hiatus 2: Math autodidact
  • 00:22:05 : Friendship with Shing-Tung Yau
  • 00:24:06 : Landing a job at Microsoft Research: Two Fields Medalists are all you need
  • 00:26:13 : Technical intro: The Big Picture
  • 00:28:12 : Whiteboard outline

Part II. Classical Probability Theory

  • 00:37:03 : Law of Large Numbers
  • 00:45:23 : Tensor Programs Preview
  • 00:47:26 : Central Limit Theorem
  • 00:56:55 : Proof of CLT: Moment method
  • 1:00:20 : Moment method explicit computations

Part III. Random Matrix Theory

  • 1:12:46 : Setup
  • 1:16:55 : Moment method for RMT
  • 1:21:21 : Wigner semicircle law

Part IV. Tensor Programs

  • 1:31:03 : Segue using RMT
  • 1:44:22 : TP punchline for RMT
  • 1:46:22 : The Master Theorem (the key result of TP)
  • 1:55:04 : Corollary: Reproof of RMT results
  • 1:56:52 : General definition of a tensor program

Part V. Neural Networks and Machine Learning

  • 2:09:05 : Feed forward neural network (3 layers) example
  • 2:19:16 : Neural network Gaussian Process
  • 2:23:59 : Many distinct large N limits for neural networks
  • 2:27:24 : abc parametrizations (Note: "a" is absorbed into "c" here): variance and learning rate scalings
  • 2:36:54 : Geometry of space of abc parametrizations
  • 2:39:41 : Kernel regime
  • 2:41:32 : Neural tangent kernel
  • 2:43:35 : (No) feature learning
  • 2:48:42 : Maximal feature learning
  • 2:52:33 : Current problems with deep learning
  • 2:55:02 : Hyperparameter transfer (muP)
  • 3:00:31 : Wrap up

Further Reading:

Tensor Programs I, II, III, IV, V by Greg Yang and coauthors.

Twitter: @iamtimnguyen

Webpage: http://www.timothynguyen.org


The Cartesian Cafe, by Timothy Nguyen
