Overestimation of the receiver operating characteristic curve for logistic regression

Cited by: 53
Authors
Copas, JB [1]
Corbett, P [1]
Affiliations
[1] Univ Warwick, Dept Stat, Coventry CV4 7AL, W Midlands, England
Keywords
logistic regression; ROC; screening score; shrinkage
DOI
10.1093/biomet/89.2.315
Chinese Library Classification
Q [Biological sciences]
Subject classification codes
07; 0710; 09
Abstract
Logistic regression is often used to find a linear combination of covariates which best discriminates between two groups or populations. The ROC, receiver operating characteristic, curve is a good way of assessing the performance of the resulting score, but using the same data both to fit the score and to calculate its ROC leads to an over-optimistic estimate of the performance which the score would give if it were to be validated on a sample of future cases. The paper studies the extent of this overestimation, and suggests a shrinkage correction for the ROC curve itself and for the area under the curve. The correction is consistent with Efron's formula for the bias in the error rate of a binary prediction rule. Two medical examples are discussed.
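The overestimation described in the abstract can be illustrated with a small simulation. What follows is a minimal sketch in Python and not the paper's shrinkage correction: it fits a logistic score on a small sample, then compares the area under the ROC curve computed on the fitting data with the area computed on independent cases. The sample sizes, the number of covariates, the true coefficients, the gradient-ascent fit and all function names are assumptions chosen for illustration.

```python
import math
import random


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def draw_sample(rng, n, true_beta):
    """Draw n cases from a logistic model; redraw until both groups occur."""
    while True:
        X = [[rng.gauss(0.0, 1.0) for _ in true_beta] for _ in range(n)]
        y = [int(rng.random() < sigmoid(sum(b * v for b, v in zip(true_beta, x))))
             for x in X]
        if 0 < sum(y) < n:
            return X, y


def fit_logistic(X, y, steps=300, lr=0.5):
    """Fit intercept and slopes by gradient ascent on the mean log-likelihood."""
    n, p = len(X), len(X[0])
    beta = [0.0] * (p + 1)
    for _ in range(steps):
        grad = [0.0] * (p + 1)
        for x, yi in zip(X, y):
            resid = yi - sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], x)))
            grad[0] += resid
            for j, v in enumerate(x):
                grad[j + 1] += resid * v
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


def score(beta, X):
    """Linear score; the intercept is dropped because it does not affect ranking."""
    return [sum(b * v for b, v in zip(beta[1:], x)) for x in X]


def auc(scores, y):
    """Area under the ROC curve as the share of concordant pairs (ties count half)."""
    pos = [s for s, yi in zip(scores, y) if yi == 1]
    neg = [s for s, yi in zip(scores, y) if yi == 0]
    wins = sum(1.0 if sp > sn else 0.5 if sp == sn else 0.0
               for sp in pos for sn in neg)
    return wins / (len(pos) * len(neg))


def simulate_optimism(seed=0, reps=20, n_fit=50, n_new=400, n_covariates=8):
    """Return (mean apparent AUC, mean validation AUC) over repeated samples."""
    rng = random.Random(seed)
    # Hypothetical truth: only the first covariate discriminates between groups.
    true_beta = [1.0] + [0.0] * (n_covariates - 1)
    apparent = validated = 0.0
    for _ in range(reps):
        X_fit, y_fit = draw_sample(rng, n_fit, true_beta)
        X_new, y_new = draw_sample(rng, n_new, true_beta)
        beta = fit_logistic(X_fit, y_fit)
        apparent += auc(score(beta, X_fit), y_fit)   # same data for fit and ROC
        validated += auc(score(beta, X_new), y_new)  # independent future cases
    return apparent / reps, validated / reps


mean_apparent, mean_validated = simulate_optimism()
print(f"apparent AUC:   {mean_apparent:.3f}")
print(f"validation AUC: {mean_validated:.3f}")
```

Under these assumptions the apparent area should come out larger than the validation area, and the gap should widen as `n_fit` shrinks or `n_covariates` grows. A correction of the kind the abstract proposes would aim to adjust the apparent curve toward the validation one without needing the second sample.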
Pages: 315-331
Number of pages: 17