The average receiver operating characteristic curve in multireader multicase imaging studies

Cited by: 32
Authors
Chen, W. [1]
Samuelson, F. W. [1]
Affiliation
[1] US FDA, Div Imaging & Appl Math, Off Sci & Engn Labs, Ctr Devices & Radiol Hlth, Silver Spring, MD 20993 USA
Keywords
MAXIMUM-LIKELIHOOD-ESTIMATION; ROC CURVES; COMPUTER; DIAGNOSIS; SYSTEMS
DOI
10.1259/bjr.20140016
Chinese Library Classification
R8 [Special Medicine]; R445 [Diagnostic Imaging]
Subject classification codes
1002; 100207; 1009
Abstract
Objective: In multireader, multicase (MRMC) receiver operating characteristic (ROC) studies for evaluating medical imaging systems, the area under the ROC curve (AUC) is often used as a summary metric. Owing to the limitations of AUC, plotting the average ROC curve to accompany the rigorous statistical inference on AUC is recommended. The objective of this article is to investigate methods for generating the average ROC curve from ROC curves of individual readers.
Methods: We present both a non-parametric method and a parametric method for averaging ROC curves that produce a ROC curve, the area under which is equal to the average AUC of individual readers (a property we call area preserving). We use hypothetical examples, simulated data and a real-world imaging data set to illustrate these methods and their properties.
Results: We show that our proposed methods are area preserving. We also show that the method of averaging the ROC parameters, either the conventional bi-normal parameters (a, b) or the proper bi-normal parameters (c, d_a), is generally not area preserving and may produce a ROC curve that is intuitively not an average of multiple curves.
Conclusion: Our proposed methods are useful for making plots of average ROC curves in MRMC studies as a companion to the rigorous statistical inference on the AUC end point. The software implementing these methods is freely available from the authors.
Advances in knowledge: Methods for generating the average ROC curve in MRMC ROC studies are formally investigated. The area-preserving criterion we defined is useful to evaluate such methods.
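
The area-preserving property described in the abstract can be illustrated with a minimal sketch. This is not the authors' software, and the averaging scheme may differ from the non-parametric method the article proposes: it uses vertical averaging (mean TPF at each FPF), chosen here only because it is a simple scheme whose area equals the mean reader AUC by linearity of integration. The reader curves and the bi-normal parameter values are hypothetical. The bi-normal AUC uses the standard formula AUC = Phi(a / sqrt(1 + b^2)).

```python
import math
import numpy as np

def phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def binormal_auc(a, b):
    """AUC of a conventional bi-normal ROC curve with parameters (a, b)."""
    return phi(a / math.sqrt(1.0 + b * b))

def trapezoid_auc(fpf, tpf):
    """Area under a piecewise-linear ROC curve by the trapezoidal rule."""
    fpf = np.asarray(fpf, dtype=float)
    tpf = np.asarray(tpf, dtype=float)
    return float(np.sum(np.diff(fpf) * (tpf[1:] + tpf[:-1]) / 2.0))

# Hypothetical piecewise-linear ROC curves for two readers (FPF, TPF).
# FPF values are strictly increasing so that np.interp is well defined.
readers = [
    (np.array([0.0, 0.1, 0.4, 1.0]), np.array([0.0, 0.6, 0.9, 1.0])),
    (np.array([0.0, 0.2, 0.5, 1.0]), np.array([0.0, 0.3, 0.7, 1.0])),
]

# Common FPF grid containing every reader's breakpoints, so interpolation
# onto the grid reproduces each piecewise-linear curve exactly.
grid = np.unique(np.concatenate([fpf for fpf, _ in readers]))

# Vertical averaging: mean TPF across readers at each grid FPF.
avg_tpf = np.mean([np.interp(grid, fpf, tpf) for fpf, tpf in readers], axis=0)

mean_auc = float(np.mean([trapezoid_auc(fpf, tpf) for fpf, tpf in readers]))
auc_of_avg_curve = trapezoid_auc(grid, avg_tpf)
print("mean of reader AUCs:     ", mean_auc)
print("AUC of the averaged curve:", auc_of_avg_curve)

# Contrast: averaging bi-normal parameters (a, b) across readers.
# Hypothetical parameter values for two readers.
params = [(0.5, 1.0), (2.5, 0.4)]
mean_binormal_auc = float(np.mean([binormal_auc(a, b) for a, b in params]))
a_bar = float(np.mean([a for a, _ in params]))
b_bar = float(np.mean([b for _, b in params]))
auc_of_avg_params = binormal_auc(a_bar, b_bar)
print("mean of bi-normal AUCs:       ", mean_binormal_auc)
print("AUC from averaged (a, b):     ", auc_of_avg_params)
```

In this sketch the first pair of printed values agree to floating-point precision, while the second pair differ, which is consistent with the abstract's statement that averaging ROC parameters is generally not area preserving.
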
Pages: 8