Sommerfeld Theory Colloquium (ASC)

The Underlying Scaling Laws and Universal Statistical Structure of Complex Datasets



We study universal traits that emerge in both real-world complex datasets and artificially generated ones. Our approach is to analogize data to a physical system and employ tools from statistical physics and Random Matrix Theory (RMT) to reveal its underlying structure. We focus on the feature-feature covariance matrix, analyzing both its local and global eigenvalue statistics. Our main observations are: (i) the power-law scalings that the bulk of its eigenvalues exhibit are vastly different for uncorrelated random data compared to real-world data, (ii) this scaling behavior can be completely recovered by introducing long-range correlations in a simple way to the synthetic data, (iii) both generated and real-world datasets lie in the same universality class from the RMT perspective, as chaotic rather than integrable systems, (iv) the expected RMT statistical behavior already manifests for empirical covariance matrices at dataset sizes significantly smaller than those conventionally used for real-world training, and can be related to the number of samples required to approximate the population power-law scaling behavior, (v) the Shannon entropy is correlated with local RMT structure and eigenvalue scaling, is substantially smaller in strongly correlated datasets than in uncorrelated synthetic data, and requires fewer samples to reach the distribution entropy. These findings can have numerous implications for the characterization of the complexity of datasets, including differentiating synthetically generated from natural data, quantifying noise, developing better data pruning methods and classifying effective learning models utilizing these scaling laws.
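The kind of diagnostics the abstract refers to can be illustrated with a small numerical sketch. It is not the speakers' code: the dataset sizes, the population spectrum i^(-1.5), the rank window used for the fit, the use of the mean consecutive-spacing ratio as the local statistic, and the definition of entropy over normalized eigenvalues are all assumptions made here for illustration. For orientation, the mean spacing ratio is known to be about 0.53 for the Gaussian Orthogonal Ensemble (chaotic) and about 0.39 for independent Poisson levels (integrable).

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_features = 4000, 400  # hypothetical sizes, chosen to run quickly


def covariance_eigenvalues(X):
    """Eigenvalues (descending) of the empirical feature-feature covariance."""
    Xc = X - X.mean(axis=0, keepdims=True)
    cov = Xc.T @ Xc / X.shape[0]
    return np.linalg.eigvalsh(cov)[::-1]


def power_law_exponent(eigs, lo=5, hi=50):
    """Fit lambda_i ~ i^(-alpha) over ranks lo..hi (window is an assumption)."""
    ranks = np.arange(lo, hi + 1)
    slope, _ = np.polyfit(np.log(ranks), np.log(eigs[lo - 1:hi]), 1)
    return -slope


def mean_spacing_ratio(levels):
    """Mean of min(s_i, s_{i+1}) / max(s_i, s_{i+1}) over consecutive spacings."""
    s = np.diff(np.sort(levels))
    return float(np.mean(np.minimum(s[:-1], s[1:]) / np.maximum(s[:-1], s[1:])))


def spectral_entropy(eigs):
    """Shannon entropy of the normalized eigenvalues (assumed definition)."""
    p = eigs / eigs.sum()
    return float(-(p * np.log(p)).sum())


# Uncorrelated synthetic data: i.i.d. Gaussian features.
X_uncorr = rng.standard_normal((n_samples, n_features))

# Correlated synthetic data (one simple construction, assumed here):
# impose a power-law population spectrum, then rotate so features mix.
spectrum = np.arange(1, n_features + 1, dtype=float) ** -1.5
Q, _ = np.linalg.qr(rng.standard_normal((n_features, n_features)))
X_corr = (rng.standard_normal((n_samples, n_features)) * np.sqrt(spectrum)) @ Q.T

eigs_uncorr = covariance_eigenvalues(X_uncorr)
eigs_corr = covariance_eigenvalues(X_corr)

alpha_uncorr = power_law_exponent(eigs_uncorr)
alpha_corr = power_law_exponent(eigs_corr)

# Local statistics: covariance eigenvalues vs. independent random levels.
r_wishart = mean_spacing_ratio(eigs_uncorr)
r_poisson = mean_spacing_ratio(rng.uniform(size=n_features))

entropy_uncorr = spectral_entropy(eigs_uncorr)
entropy_corr = spectral_entropy(eigs_corr)

print(f"alpha: uncorrelated={alpha_uncorr:.2f}, correlated={alpha_corr:.2f}")
print(f"<r>: covariance={r_wishart:.2f}, independent levels={r_poisson:.2f}")
print(f"entropy: uncorrelated={entropy_uncorr:.2f}, correlated={entropy_corr:.2f}")
```

The spacing ratio is used here because it needs no unfolding of the spectrum; a fuller analysis would also examine the unfolded level-spacing distribution and repeat the measurements as a function of the number of samples.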

Sommerfeld Theory Colloquium (ASC), by Michael Haack


