Mathematik, Informatik und Statistik - Open Access LMU - Teil 03/03

Categorical variables with many categories are preferentially selected in model selection procedures for multivariable regression models on bootstrap samples



To perform model selection in the context of multivariable regression, automated variable selection procedures such as backward elimination are commonly employed. However, these procedures are known to be highly unstable. Their stability can be investigated using bootstrap-based procedures: the idea is to perform model selection successively on a large number of bootstrap samples and to examine the resulting models, for instance in terms of which predictor variables they include. However, such bootstrap-based procedures are known from the literature to yield misleading results in some cases. In this paper we aim to thoroughly investigate one particularly important facet of these problems. More precisely, we assess the behaviour of regression models (with an automated variable selection procedure based on the likelihood ratio test) fitted on bootstrap samples drawn with replacement and on subsamples drawn without replacement, with respect to the number and type of included predictor variables. Our study includes both extensive simulations and a real data example from the NHANES study. The results indicate that models derived from bootstrap samples include more predictor variables than models fitted on the original samples, and that categorical predictor variables with many categories are preferentially selected over categorical predictor variables with fewer categories and over metric predictor variables. We conclude that using bootstrap samples to select variables for multivariable regression models may lead to overly complex models with a preferential selection of categorical predictor variables with many categories. We suggest using subsamples instead of bootstrap samples to avoid these drawbacks.
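The two resampling schemes contrasted in the abstract can be sketched in a few lines of Python. This is only an illustration of drawing with versus without replacement, not the authors' simulation code: the sample size `n`, the subsample size `m` (set here to about 0.632 n, the expected share of distinct observations in a bootstrap sample), and the helper names `bootstrap_indices` and `subsample_indices` are all assumptions made for the example. No model fitting or variable selection is performed.

```python
import random


def bootstrap_indices(n, rng):
    """Draw n indices WITH replacement (a bootstrap sample); duplicates are expected."""
    return [rng.randrange(n) for _ in range(n)]


def subsample_indices(n, m, rng):
    """Draw m distinct indices WITHOUT replacement (a subsample of size m <= n)."""
    return rng.sample(range(n), m)


rng = random.Random(0)   # fixed seed so the sketch is reproducible
n = 1000                 # hypothetical size of the original sample
m = round(0.632 * n)     # assumed subsample size, not taken from the abstract

boot = bootstrap_indices(n, rng)
sub = subsample_indices(n, m, rng)

# Share of distinct original observations in each resample.
boot_unique_fraction = len(set(boot)) / n
sub_unique_fraction = len(set(sub)) / m

print(f"bootstrap: {len(boot)} draws, distinct share = {boot_unique_fraction:.3f}")
print(f"subsample: {len(sub)} draws, distinct share = {sub_unique_fraction:.3f}")
```

A bootstrap sample repeats some observations and omits others, whereas a subsample contains each selected observation exactly once. In a real stability analysis, the variable selection procedure would be run on each resample in place of the two `print` calls.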

By Ludwig-Maximilians-Universität München

