Mathematik, Informatik und Statistik - Open Access LMU - Teil 03/03

Categorical variables with many categories are preferentially selected in model selection procedures for multivariable regression models on bootstrap samples



Automated variable selection procedures such as backward elimination are commonly used for model selection in multivariable regression. However, these procedures are known to be highly unstable. Their stability can be investigated with bootstrap-based procedures: model selection is performed successively on a large number of bootstrap samples, and the resulting models are examined, for instance with respect to the inclusion of specific predictor variables. The literature shows, however, that such bootstrap-based procedures can yield misleading results in some cases. In this paper we thoroughly investigate one particularly important facet of these problems. More precisely, we assess the behaviour of regression models, with an automated variable selection procedure based on the likelihood ratio test, fitted on bootstrap samples drawn with replacement and on subsamples drawn without replacement, with respect to the number and type of included predictor variables. Our study includes both extensive simulations and a real data example from the NHANES study. The results indicate that models derived from bootstrap samples include more predictor variables than models fitted on the original samples, and that categorical predictor variables with many categories are preferentially selected over categorical predictor variables with fewer categories and over metric predictor variables. We conclude that using bootstrap samples to select variables for multivariable regression models may lead to overly complex models with a preferential selection of categorical predictor variables with many categories. We suggest using subsamples instead of bootstrap samples to avoid these drawbacks.
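The two resampling schemes compared in the abstract, and the idea of counting how often a variable is selected across resampled data sets, can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names, the `select_variables` callback (a stand-in for any selection procedure, such as backward elimination with a likelihood ratio test), and the subsample fraction of 0.632 are all assumptions made for this example and are not taken from the abstract.

```python
import random
from collections import Counter


def bootstrap_sample(data, rng):
    """Draw len(data) observations WITH replacement (duplicates are possible)."""
    n = len(data)
    return [data[rng.randrange(n)] for _ in range(n)]


def subsample(data, rng, fraction=0.632):
    """Draw round(fraction * n) observations WITHOUT replacement.

    The default fraction is an assumption for this sketch; the abstract
    does not state which subsample size was used.
    """
    m = max(1, round(fraction * len(data)))
    return rng.sample(data, m)


def inclusion_frequencies(data, select_variables, draw, n_rep, rng):
    """Share of resampled data sets in which each variable is selected.

    select_variables: hypothetical callback that takes a data set and
        returns an iterable of selected variable names.
    draw: bootstrap_sample or subsample.
    """
    counts = Counter()
    for _ in range(n_rep):
        counts.update(set(select_variables(draw(data, rng))))
    return {var: c / n_rep for var, c in counts.items()}


if __name__ == "__main__":
    rng = random.Random(1)
    data = list(range(200))  # placeholder observations (row indices)

    boot = bootstrap_sample(data, rng)
    sub = subsample(data, rng)
    # A bootstrap sample repeats some observations; a subsample never does.
    print("bootstrap: size", len(boot), "distinct", len(set(boot)))
    print("subsample: size", len(sub), "distinct", len(set(sub)))
```

In practice `select_variables` would wrap a real model-fitting routine; passing `bootstrap_sample` or `subsample` as `draw` then allows the inclusion frequencies under the two schemes to be compared with the same selection procedure.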

By Ludwig-Maximilians-Universität München

