Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02

Statistical Issues in Machine Learning


Listen Later

Recursive partitioning methods from machine learning are being widely applied in many scientific fields such as, e.g., genetics and bioinformatics. The present work is concerned with the two main problems that arise in recursive partitioning, instability and biased variable selection, from a statistical point of view. With respect to the first issue, instability, the entire scope of methods from standard classification trees over robustified classification trees and ensemble methods such as TWIX, bagging and random forests is covered in this work. While ensemble methods prove to be much more stable than single trees, they also loose most of their interpretability. Therefore an adaptive cutpoint selection scheme is suggested with which a TWIX ensemble reduces to a single tree if the partition is sufficiently stable. With respect to the second issue, variable selection bias, the statistical sources of this artifact in single trees and a new form of bias inherent in ensemble methods based on bootstrap samples are investigated. For single trees, one unbiased split selection criterion is evaluated and another one newly introduced here. Based on the results for single trees and further findings on the effects of bootstrap sampling on association measures, it is shown that, in addition to using an unbiased split selection criterion, subsampling instead of bootstrap sampling should be employed in ensemble methods to be able to reliably compare the variable importance scores of predictor variables of different types. The statistical properties and the null hypothesis of a test for the random forest variable importance are critically investigated. Finally, a new, conditional importance measure is suggested that allows for a fair comparison in the case of correlated predictor variables and better reflects the null hypothesis of interest.
...more
View all episodesView all episodes
Download on the App Store

Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02By Ludwig-Maximilians-Universität München

  • 5
  • 5
  • 5
  • 5
  • 5

5

1 ratings


More shows like Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02

View all
Medizinische Fakultät - Digitale Hochschulschriften der LMU - Teil 06/19 by Ludwig-Maximilians-Universität München

Medizinische Fakultät - Digitale Hochschulschriften der LMU - Teil 06/19

0 Listeners

LMU Kapitalgesellschaftsrecht by Dr. jur. Timo Fest, LL.M. (Pennsylvania)

LMU Kapitalgesellschaftsrecht

0 Listeners

LMU Grundkurs Strafrecht I (L-Z) WS 2014/15 by Prof. Dr. jur. Helmut Satzger

LMU Grundkurs Strafrecht I (L-Z) WS 2014/15

0 Listeners

MCMP – Metaphysics and Philosophy of Language by MCMP Team

MCMP – Metaphysics and Philosophy of Language

2 Listeners

Hegel lectures by Robert Brandom, LMU Munich by Robert Brandom, Axel Hutter

Hegel lectures by Robert Brandom, LMU Munich

6 Listeners

John Lennox - Hat die Wissenschaft Gott begraben? by Professor John C. Lennox, University of Oxford

John Lennox - Hat die Wissenschaft Gott begraben?

4 Listeners

LMU Rechtsphilosophie by Prof. Dr. jur. Dr. jur. h.c. mult. Bernd Schünemann

LMU Rechtsphilosophie

0 Listeners

MCMP – Philosophy of Mathematics by MCMP Team

MCMP – Philosophy of Mathematics

2 Listeners

LMU Wiederholung und Vertiefung zum Schuldrecht - Lehrstuhl für Bürgerliches Recht by Professor Dr. Stephan Lorenz

LMU Wiederholung und Vertiefung zum Schuldrecht - Lehrstuhl für Bürgerliches Recht

0 Listeners

Epistemology and Philosophy of Science: Prof. Dr. Stephan Hartmann – HD by Ludwig-Maximilians-Universität München

Epistemology and Philosophy of Science: Prof. Dr. Stephan Hartmann – HD

1 Listeners