Equivalence of improvement in area under ROC curve and linear discriminant analysis coefficient under assumption of normality

Cited by: 29
Authors
Demler, Olga V. [1]
Pencina, Michael J. [1]
D'Agostino, Ralph B. [2]
Affiliations
[1] Boston Univ, Dept Biostat, Harvard Clin Res Inst, Boston, MA 02118 USA
[2] Boston Univ, Dept Math & Stat, Boston, MA 02215 USA
Keywords
linear discriminant analysis; risk prediction model; AUC; ROC; logistic regression; BREAST-CANCER; RISK; PREDICTION;
DOI
10.1002/sim.4196
Chinese Library Classification (CLC) number
Q [Biological Sciences];
Subject classification code
07 ; 0710 ; 09 ;
Abstract
In this paper we investigate the addition of new variables to an existing risk prediction model and the subsequent impact on discrimination quantified by the area under the receiver operating characteristics curve (AUC of ROC). Based on practical experience, concerns have emerged that the significance of association of the variable under study with the outcome in the risk model does not correspond to the significance of the change in AUC: that is, often the variable is significant, but the change in AUC is not. This paper demonstrates that under the assumption of multivariate normality and employing linear discriminant analysis (LDA) to construct the risk prediction tool, statistical significance of the new predictor(s) is equivalent to the statistical significance of the increase in AUC. Under these assumptions the result extends asymptotically to logistic regression. We further show that equality of variance-covariance matrices of predictors within cases and non-cases is not necessary when LDA is used. However, our practical example from the Framingham Heart Study data suggests that the finding might be sensitive to the assumption of normality. Copyright (C) 2011 John Wiley & Sons, Ltd.
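The link between the LDA coefficient and the AUC described in the abstract can be illustrated at the population level with a small sketch. It is not the authors' code. It assumes two predictors that are bivariate normal with a common covariance matrix in cases and non-cases, and it uses the standard result that the AUC of the LDA score is then Φ(Δ/√2), where Δ² is the squared Mahalanobis distance between the group means. All function and parameter names (`compare_models`, `d1`, `s12`, and so on) are hypothetical and chosen only for illustration.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def lda_coefficients(d1, d2, s11, s12, s22):
    """LDA direction a = Sigma^{-1} * delta for a 2x2 covariance matrix.

    d1, d2 are the mean differences (cases minus non-cases);
    s11, s12, s22 are the entries of the assumed common covariance matrix.
    """
    det = s11 * s22 - s12 * s12
    a1 = (s22 * d1 - s12 * d2) / det
    a2 = (s11 * d2 - s12 * d1) / det
    return a1, a2


def auc_from_mahalanobis(delta_sq):
    """AUC of the LDA score under normality with equal covariances."""
    return normal_cdf(math.sqrt(delta_sq / 2.0))


def compare_models(d1, d2, s11, s12, s22):
    """Compare the one-predictor model with the model that adds predictor 2."""
    a1, a2 = lda_coefficients(d1, d2, s11, s12, s22)
    delta_sq_old = d1 * d1 / s11        # Mahalanobis distance, predictor 1 only
    delta_sq_new = a1 * d1 + a2 * d2    # delta' Sigma^{-1} delta, both predictors
    return {
        "coef_new": a2,
        "auc_old": auc_from_mahalanobis(delta_sq_old),
        "auc_new": auc_from_mahalanobis(delta_sq_new),
    }


# New predictor carries no information beyond predictor 1:
# its LDA coefficient is zero and the AUC does not change.
no_gain = compare_models(d1=1.0, d2=0.5, s11=1.0, s12=0.5, s22=1.0)

# New predictor is independent of predictor 1 and separates the groups:
# its LDA coefficient is nonzero and the AUC increases.
gain = compare_models(d1=1.0, d2=1.0, s11=1.0, s12=0.0, s22=1.0)
```

In this setting the population AUC rises exactly when the LDA coefficient of the added predictor is nonzero. The sketch works with population parameters only and does not reproduce the paper's significance tests or its Framingham example.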
Pages: 1410-1418
Number of pages: 9