Assessing operating characteristics of CAD algorithms in the absence of a gold standard

Cited by: 4
Authors
Choudhury, Kingshuk Roy [1]
Paik, David S. [2]
Yi, Chin A. [3]
Napel, Sandy [2]
Roos, Justus [2]
Rubin, Geoffrey D. [2]
Affiliations
[1] Natl Univ Ireland Univ Coll Cork, Dept Stat, Cork, Ireland
[2] Stanford Med Sch, Dept Radiol, Stanford, CA 94305 USA
[3] Sungkyunkwan Univ, Samsung Med Ctr, Sch Med, Suwon 440746, South Korea
Funding
Science Foundation Ireland; U.S. National Institutes of Health
Keywords
bootstrapping; diagnostic radiography; lung; maximum likelihood estimation; medical image processing; sampling methods; sensitivity analysis; COMPUTER-AIDED DETECTION; IMAGE-DATABASE-CONSORTIUM; PULMONARY NODULES; LUNG NODULES; 2ND READER; MAXIMUM-LIKELIHOOD; CT SCANS; PERFORMANCE; LIDC; SENSITIVITY;
DOI
10.1118/1.3352687
Chinese Library Classification number
R8 [Special Medicine]; R445 [Diagnostic Imaging]
Discipline classification code
1002; 100207; 1009
Abstract
Methods: A binomial model for multiple reader detections using different diagnostic protocols was constructed, assuming conditional independence of readings given true lesion status. Operating characteristics of all protocols were estimated by maximum likelihood latent class analysis (LCA). Reader panel and LCA based estimates were compared using data simulated from the binomial model for a range of operating characteristics. LCA was applied to 36 thin section thoracic computed tomography data sets from the Lung Image Database Consortium (LIDC): free search markings of four radiologists were compared to markings from four different CAD assisted radiologists. For real data, bootstrap-based resampling methods, which accommodate dependence in reader detections, are proposed to test hypotheses of differences between detection protocols.
Results: In simulation studies, reader panel based sensitivity estimates had an average relative bias (ARB) of -23% to -27%, significantly larger in magnitude (p-value < 0.0001) than that of LCA (ARB -2% to -6%). Specificity was well estimated by both reader panel (ARB -0.6% to -0.5%) and LCA (ARB 1.4% to 0.5%). Among the 1145 LIDC lesion candidates considered, the LCA estimated sensitivity of reference readers (55%) was significantly lower (p-value 0.006) than that of CAD assisted readers (68%). The average number of false positives per patient for reference readers (0.95) was not significantly lower (p-value 0.28) than for CAD assisted readers (1.27).
Conclusions: Whereas a gold standard based on a consensus of readers may substantially bias sensitivity estimates, LCA may be a significantly more accurate and consistent means for evaluating diagnostic accuracy.
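The model described in the abstract can be illustrated with a minimal sketch. It assumes a two-class latent variable (lesion or no lesion) with reader detections that are independent given that status, and it fits the model with a plain EM algorithm, which is one common route to the maximum likelihood estimate and not necessarily the authors' fitting procedure. The function names `simulate_readings` and `fit_lca`, the prevalence of 0.3, and the specificity of 0.95 are hypothetical choices made for illustration; the sensitivities of 0.55 and 0.68 merely echo the figures reported above and do not reproduce the study.

```python
import numpy as np

def simulate_readings(n, prevalence, sens, spec, rng):
    """Draw binary detections per reader, independent given true lesion status."""
    sens, spec = np.asarray(sens), np.asarray(spec)
    truth = rng.random(n) < prevalence
    # Detection probability is the sensitivity for lesions, 1 - specificity otherwise.
    p_detect = np.where(truth[:, None], sens, 1.0 - spec)
    return (rng.random(p_detect.shape) < p_detect).astype(int), truth

def fit_lca(y, n_iter=500, tol=1e-8):
    """EM fit of a two-class latent class model (assumed form, not the paper's code)."""
    y = np.asarray(y, dtype=float)
    eps = 1e-6
    # Initialise posterior lesion weights from a majority vote so that
    # class 1 corresponds to "lesion" (avoids label switching).
    w = np.clip((y.mean(axis=1) >= 0.5).astype(float), 0.05, 0.95)
    for _ in range(n_iter):
        # M-step: weighted proportions give prevalence, sensitivity, specificity.
        prev = np.clip(w.mean(), eps, 1 - eps)
        sens = np.clip((w[:, None] * y).sum(0) / w.sum(), eps, 1 - eps)
        spec = np.clip(((1 - w)[:, None] * (1 - y)).sum(0) / (1 - w).sum(), eps, 1 - eps)
        # E-step: posterior probability that each candidate is a true lesion.
        log_pos = np.log(prev) + (y * np.log(sens) + (1 - y) * np.log(1 - sens)).sum(1)
        log_neg = np.log(1 - prev) + ((1 - y) * np.log(spec) + y * np.log(1 - spec)).sum(1)
        w_new = 1.0 / (1.0 + np.exp(log_neg - log_pos))
        converged = np.max(np.abs(w_new - w)) < tol
        w = w_new
        if converged:
            break
    return prev, sens, spec

# Illustrative run: four unassisted and four CAD-assisted readers (assumed settings).
rng = np.random.default_rng(0)
readings, _ = simulate_readings(5000, 0.3, [0.55] * 4 + [0.68] * 4, [0.95] * 8, rng)
prev_hat, sens_hat, spec_hat = fit_lca(readings)
print("estimated prevalence:", round(float(prev_hat), 3))
print("estimated sensitivities:", np.round(sens_hat, 3))
print("estimated specificities:", np.round(spec_hat, 3))
```

Because the sketch never looks at the simulated truth when fitting, it shows how operating characteristics can be estimated without a gold standard. It does not address dependence between reader detections, which the abstract handles through bootstrap-based resampling on the real data.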
Pages: 1788-1795
Number of pages: 8