Stand-Alone Artificial Intelligence for Breast Cancer Detection in Mammography: Comparison With 101 Radiologists

Cited by: 345
Authors
Rodriguez-Ruiz, Alejandro [1]
Lang, Kristina [2]
Gubern-Merida, Albert [3]
Broeders, Mireille [4, 5]
Gennaro, Gisella [6]
Clauser, Paola [7]
Helbich, Thomas H. [7]
Chevalier, Margarita [8]
Tan, Tao [3]
Mertelmeier, Thomas [9]
Wallis, Matthew G. [10, 11]
Andersson, Ingvar [12]
Zackrisson, Sophia [13]
Mann, Ritse M. [1]
Sechopoulos, Ioannis [1, 5]
Affiliations
[1] Radboud Univ Med, Dept Radiol & Nucl Med, Med Ctr, Geert Grootepl 10,Post 766, NL-6525 GA Nijmegen, Netherlands
[2] Swiss Fed Inst Technol, Inst Biomed Engn, Zurich, Switzerland
[3] ScreenPoint Med BV, Nijmegen, Netherlands
[4] Radboud Univ Nijmegen, Dept Hlth Evidence, Med Ctr, Nijmegen, Netherlands
[5] Dutch Expert Ctr Screening LRCB, Nijmegen, Netherlands
[6] IRCCS, Veneto Inst Oncol IOV, Padua, Italy
[7] Med Univ Vienna, Dept Biomed Imaging & Image Guided Therapy, Div Mol & Gender Imaging, Vienna, Austria
[8] Univ Complutense Madrid, Fac Med, Radiol Dept, Med Phys Grp, Madrid, Spain
[9] Siemens Healthcare GmbH, Diagnost Imaging Xray Prod Technol & Concepts, Forchheim, Germany
[10] Cambridge Univ Hosp NHS Fdn Trust, Cambridge Breast Unit, Cambridge Biomed Campus, Cambridge, England
[11] Cambridge Univ Hosp NHS Fdn Trust, NIHR Biomed Res Unit, Cambridge Biomed Campus, Cambridge, England
[12] Skane Univ Hosp, Unilabs Breast Ctr, Malmo, Sweden
[13] Lund Univ, Skane Univ Hosp, Dept Translat Med, Diagnost Radiol, Malmo, Sweden
Source
JNCI-JOURNAL OF THE NATIONAL CANCER INSTITUTE | 2019 / Vol. 111 / Issue 09
Keywords
COMPUTER-AIDED DETECTION; SCREENING MAMMOGRAPHY; DIAGNOSTIC-ACCURACY; DIGITAL MAMMOGRAPHY; TOMOSYNTHESIS; PERFORMANCE; IMAGE; HYPOTHESIS; GUIDELINES; PROGNOSIS;
DOI
10.1093/jnci/djy222
Chinese Library Classification (CLC) number
R73 [Oncology];
Discipline classification code
100214;
Abstract
Background: Artificial intelligence (AI) systems performing at radiologist-like levels in the evaluation of digital mammography (DM) would improve breast cancer screening accuracy and efficiency. We aimed to compare the stand-alone performance of an AI system to that of radiologists in detecting breast cancer in DM. Methods: Nine multi-reader, multi-case study datasets previously used for different research purposes in seven countries were collected. Each dataset consisted of DM exams acquired with systems from four different vendors, multiple radiologists' assessments per exam, and ground truth verified by histopathological analysis or follow-up, yielding a total of 2652 exams (653 malignant) and interpretations by 101 radiologists (28,296 independent interpretations). An AI system analyzed these exams, yielding for each a level of suspicion that cancer was present, on a scale from 1 to 10. The detection performance of the radiologists and the AI system was compared using a noninferiority null hypothesis at a margin of 0.05. Results: The performance of the AI system was statistically noninferior to that of the average of the 101 radiologists. The AI system had an area under the receiver operating characteristic (ROC) curve (AUC) of 0.840 (95% confidence interval [CI] = 0.820 to 0.860), and the average AUC of the radiologists was 0.814 (95% CI = 0.787 to 0.841) (difference 95% CI = -0.003 to 0.055). The AI system had an AUC higher than that of 61.4% of the radiologists. Conclusions: The evaluated AI system achieved a cancer detection accuracy comparable to that of an average breast radiologist in this retrospective setting. Although promising, the performance and impact of such a system in a screening setting need further investigation.
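Note on the noninferiority result: the abstract does not spell out the decision rule, so the following is only a minimal sketch of the conventional criterion, under which the new method is declared noninferior when the lower bound of the confidence interval for the difference (AI minus radiologists) lies above the negative margin. The function name `is_noninferior` is hypothetical, and the direction of the difference is an assumption inferred from the reported AUC values.

```python
def is_noninferior(diff_ci_lower: float, margin: float) -> bool:
    """Conventional criterion (assumed here, not taken from the paper):
    noninferior when the lower CI bound of (new - reference) exceeds -margin."""
    return diff_ci_lower > -margin


# Values reported in the abstract
auc_ai = 0.840
auc_radiologists = 0.814
diff_ci = (-0.003, 0.055)  # 95% CI of the AUC difference, assumed AI - radiologists
margin = 0.05              # noninferiority margin

# Point estimate of the difference, rounded to the precision used in the abstract
point_difference = round(auc_ai - auc_radiologists, 3)

# Noninferiority: the whole interval must sit above -margin
noninferior = is_noninferior(diff_ci[0], margin)

# Superiority would additionally require the whole interval to sit above 0
superior = diff_ci[0] > 0

print(point_difference, noninferior, superior)
```

With the reported figures, the lower bound of -0.003 is above -0.05, which is consistent with the stated noninferiority, while the interval still includes 0, so these numbers alone do not show superiority.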
Pages: 916-922
Page count: 7