Variable selection for logistic regression using a prediction-focused information criterion

Cited by: 83
Authors
Claeskens, Gerda
Croux, Christophe
Van Kerckhoven, Johan
Affiliations
[1] Katholieke Univ Leuven, ORSTAT, B-3000 Louvain, Belgium
[2] Katholieke Univ Leuven, Ctr Stat, B-3000 Louvain, Belgium
Keywords
error rate; focused information criterion; forward selection; logistic regression; model selection; risk measures
DOI
10.1111/j.1541-0420.2006.00567.x
Chinese Library Classification number
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
In biostatistical practice, it is common to use information criteria as a guide for model selection. We propose new versions of the focused information criterion (FIC) for variable selection in logistic regression. The FIC gives, depending on the quantity to be estimated, possibly different sets of selected variables. The standard version of the FIC measures the mean squared error of the estimator of the quantity of interest in the selected model. In this article, we propose more general versions of the FIC, allowing other risk measures such as the one based on L_p error. When prediction of an event is important, as is often the case in medical applications, we construct an FIC using the error rate as a natural risk measure. The advantages of using an information criterion that depends on both the quantity of interest and the selected risk measure are illustrated by means of a simulation study and an application to a study on diabetic retinopathy.
Pages: 972-979
Page count: 8