Is the area under an ROC curve a valid measure of the performance of a screening or diagnostic test?

Cited by: 48
Authors
Wald, N. J. [1]
Bestwick, J. P. [1]
Affiliations
[1] Barts & London Queen Marys Sch Med & Dent, Wolfson Inst Prevent Med, London EC1M 6BQ, England
Keywords
ROC curve; AUC; screening test; diagnostic test;
DOI
10.1177/0969141313517497
Chinese Library Classification
R1 [Preventive medicine, hygiene];
Subject classification codes
1004 ; 120402 ;
Abstract
Objectives: The area under a receiver operating characteristic (ROC) curve (the AUC) is used as a measure of the performance of a screening or diagnostic test. We here assess the validity of the AUC. Methods: Assuming the test results follow Gaussian distributions in affected and unaffected individuals, standard mathematical formulae were used to describe the relationship between the detection rate (DR) (or sensitivity) and the false-positive rate (FPR) of a test with the AUC. These formulae were used to calculate the screening performance (DR for a given FPR, or FPR for a given DR) for different AUC values according to different standard deviations of the test result in affected and unaffected individuals. Results: The DR for a given FPR is strongly dependent on relative differences in the standard deviation of the test variable in affected and unaffected individuals. Consequently, two tests with the same AUC can have a different DR for the same FPR. For example, an AUC of 0.75 has a DR of 24% for a 5% FPR if the standard deviations are the same in affected and unaffected individuals, but 39% for the same 5% FPR if the standard deviation in affected individuals is 1.5 times that in unaffected individuals. Conclusion: The AUC is an unreliable measure of screening performance because in practice the standard deviation of a screening or diagnostic test in affected and unaffected individuals can differ. The problem is avoided by not using AUC at all, and instead specifying DRs for given FPRs or FPRs for given DRs.
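The abstract's worked example can be checked with a short calculation. The sketch below is not the authors' code. It assumes the standard binormal model, in which test results are Gaussian in both groups, higher values indicate disease, and AUC = Φ(d / sqrt(σ_u² + σ_a²)), where d is the difference in means. The function name `detection_rate` and the parameter `sd_ratio` are illustrative names, not taken from the paper.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard Gaussian, mean 0 and SD 1


def detection_rate(auc, fpr, sd_ratio=1.0):
    """Detection rate (sensitivity) at a given false-positive rate.

    Assumes a binormal model (an assumption of this sketch):
      unaffected ~ N(0, 1), affected ~ N(d, sd_ratio**2),
    with higher test values in affected individuals.
    sd_ratio is SD(affected) / SD(unaffected).
    """
    # Mean separation d implied by the AUC: AUC = Phi(d / sqrt(1 + sd_ratio^2))
    d = _STD_NORMAL.inv_cdf(auc) * sqrt(1.0 + sd_ratio ** 2)
    # Cut-off that leaves the requested FPR in the unaffected upper tail
    cutoff = _STD_NORMAL.inv_cdf(1.0 - fpr)
    # Proportion of affected individuals above that cut-off
    return 1.0 - _STD_NORMAL.cdf((cutoff - d) / sd_ratio)


if __name__ == "__main__":
    for ratio in (1.0, 1.5):
        dr = detection_rate(auc=0.75, fpr=0.05, sd_ratio=ratio)
        print(f"AUC 0.75, FPR 5%, SD ratio {ratio}: DR = {dr:.1%}")
```

Under these assumptions the two calls reproduce the figures quoted in the abstract: about 24% with equal standard deviations and about 39% with a ratio of 1.5, both for the same AUC of 0.75.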
Pages: 51-56
Number of pages: 6