Forming a reference standard from LIDC data: Impact of reader agreement on reported CAD performance

Cited by: 10
Authors
Ochs, Robert [1 ,3 ]
Kim, Hyun J. [2 ,3 ]
Angel, Erin [1 ,3 ]
Panknin, Christoph [3 ,4 ]
McNitt-Gray, Michael [1 ,3 ]
Brown, Matthew [1 ,3 ]
Affiliations
[1] Univ Calif Los Angeles, Dept Biomed Phys, Los Angeles, CA 90024 USA
[2] Univ Calif Los Angeles, Dept Biostat, Los Angeles, CA USA
[3] Univ Calif Los Angeles, Dept Radiol, Los Angeles, CA USA
[4] Siemens Med Solut, Forchheim, Germany
Source
MEDICAL IMAGING 2007: COMPUTER-AIDED DIAGNOSIS, PTS 1 AND 2 | 2007 / Vol. 6514
Keywords
computer-aided detection (CAD); evaluation and validation; X-ray CT; nodule detection; LIDC; reference standard;
DOI
10.1117/12.707916
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
The Lung Image Database Consortium (LIDC) has provided a publicly available collection of CT images with nodule markings from four radiologists. The LIDC protocol does not require radiologists to reach a consensus during the reading process. As a result, the level of reader agreement varies for each potential nodule, and there is no explicit reference standard for nodules. The purpose of this work was to investigate how the required level of reader agreement affects the resulting reference standard and, in turn, the reported CAD performance. Ninety series were downloaded from the LIDC database. Four different reference standards were created from the markings of the LIDC radiologists, reflecting four different levels of reader agreement. All series were analyzed with a research CAD system, and its performance was measured against each of the four standards. Between the standards with the lowest (any 1 of 4 readers) and highest (all 4 readers) required level of reader agreement, the number of nodules >= 3 mm decreased 48% (from 174 to 90) and CAD sensitivity for nodules >= 3 mm increased from 0.70 +/- 0.34 to 0.79 +/- 0.35. Between the same reference standards, the number of nodules < 3 mm decreased 84% (from 483 to 75) and CAD sensitivity for nodules < 3 mm increased from 0.30 +/- 0.29 to 0.51 +/- 0.45. This research illustrates the importance of stating the method used to form the reference standard, since the method influences both the number of nodules and the reported CAD performance.
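
The idea of forming a reference standard by reader-agreement level can be illustrated with a minimal sketch. It is not the authors' implementation: the `Candidate` record, the toy data, and the pooled (per-nodule) sensitivity are assumptions made for illustration, and the two intermediate agreement levels (at least 2 and at least 3 readers) are assumed, since the abstract names only the lowest and highest levels. The abstract reports sensitivity as mean +/- standard deviation, which this sketch does not reproduce. The last function only computes the relative decreases implied by the nodule counts quoted in the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record for one potential nodule (not an LIDC schema)."""
    nodule_id: str
    readers_marking: int   # how many of the 4 readers marked it (1..4)
    detected_by_cad: bool  # whether a CAD mark matched it (assumed known)


def reference_standard(candidates, min_readers):
    """Keep candidates marked by at least `min_readers` readers."""
    return [c for c in candidates if c.readers_marking >= min_readers]


def sensitivity(standard):
    """Pooled sensitivity: detected nodules / nodules in the standard."""
    if not standard:
        return None  # undefined when the standard is empty
    return sum(c.detected_by_cad for c in standard) / len(standard)


def percent_decrease(before, after):
    """Relative decrease from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before


# Toy data, invented for illustration only.
candidates = [
    Candidate("a", 4, True),
    Candidate("b", 3, True),
    Candidate("c", 2, False),
    Candidate("d", 1, False),
    Candidate("e", 1, True),
]

for min_readers in (1, 2, 3, 4):
    standard = reference_standard(candidates, min_readers)
    print(min_readers, len(standard), sensitivity(standard))

# Nodule counts quoted in the abstract (lowest vs. highest agreement level).
print(round(percent_decrease(174, 90), 1))  # nodules >= 3 mm
print(round(percent_decrease(483, 75), 1))  # nodules < 3 mm
```

In the toy data, raising the agreement threshold shrinks the standard and changes the measured sensitivity, which is the qualitative effect the abstract describes.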
Pages: 6