Improving breast cancer diagnostics with deep learning for MRI

Cited by: 79
Authors
Witowski, Jan [1, 2]
Heacock, Laura [1]
Reig, Beatriu [1]
Kang, Stella K. [1, 3]
Lewin, Alana [1]
Pysarenko, Kristine [1]
Patel, Shalin [1]
Samreen, Naziya [1]
Rudnicki, Wojciech [4]
Luczynska, Elzbieta [4]
Popiela, Tadeusz [5]
Moy, Linda [1, 2, 6, 7]
Geras, Krzysztof J. [1, 2, 6, 7, 8, 9]
Affiliations
[1] New York Univ, Dept Radiol, Grossman Sch Med, New York, NY 10016 USA
[2] New York Univ, Ctr Adv Innovat & Res, New York, NY 10016 USA
[3] New York Univ, Dept Populat Hlth, Grossman Sch Med, New York, NY 10016 USA
[4] Jagiellonian Univ, Electradiol Dept, Med Coll, PL-31126 Krakow, Poland
[5] Jagiellonian Univ, Chair Radiol, Med Coll, PL-31501 Krakow, Poland
[6] New York Univ, Vilcek Inst Grad Biomed Sci, Grossman Sch Med, New York, NY 10016 USA
[7] New York Univ Langone Hlth, Perlmutter Canc Ctr, New York, NY 10016 USA
[8] New York Univ, Ctr Data Sci, New York, NY 10011 USA
[9] New York Univ, Courant Inst Math Sci, Dept Comp Sci, New York, NY 10012 USA
Funding
National Institutes of Health (USA);
Keywords
MULTIREADER; ACCURACY; CURVES; SYSTEM; RISK;
DOI
10.1126/scitranslmed.abo4802
Chinese Library Classification number
Q2 [Cell Biology];
Discipline classification code
071009 ; 090102 ;
Abstract
Dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) has a high sensitivity in detecting breast cancer but often leads to unnecessary biopsies and patient workup. We used a deep learning (DL) system to improve the overall accuracy of breast cancer diagnosis and personalize management of patients undergoing DCE-MRI. On the internal test set (n = 3936 exams), our system achieved an area under the receiver operating characteristic curve (AUROC) of 0.92 (95% CI: 0.92 to 0.93). In a retrospective reader study, there was no statistically significant difference (P = 0.19) between five board-certified breast radiologists and the DL system (mean ΔAUROC, +0.04 in favor of the DL system). Radiologists' performance improved when their predictions were averaged with DL's predictions [mean ΔAUPRC (area under the precision-recall curve), +0.07]. We demonstrated the generalizability of the DL system using multiple datasets from Poland and the United States. An additional reader study on a Polish dataset showed that the DL system was as robust to distribution shift as radiologists. In subgroup analysis, we observed consistent results across different cancer subtypes and patient demographics. Using decision curve analysis, we showed that the DL system can reduce unnecessary biopsies in the range of clinically relevant risk thresholds. This would lead to avoiding biopsies yielding benign results in up to 20% of all patients with BI-RADS category 4 lesions. Last, we performed an error analysis, investigating situations where DL predictions were mostly incorrect. This exploratory work creates a foundation for deployment and prospective analysis of DL-based models for breast MRI.
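The abstract names two techniques without spelling them out: averaging reader and DL predictions, and decision curve analysis. The sketch below is a minimal illustration of both, using the standard net benefit definition, TP/N - (FP/N) * pt / (1 - pt), at a risk threshold pt. It is not the authors' implementation. The function names, the toy labels and probabilities, and the assumption that reader and model scores are both probabilities on a 0 to 1 scale are hypothetical and do not come from the paper.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of biopsying every exam with predicted risk >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(y_true)
    # Count true and false positives under the "biopsy if risk >= threshold" rule.
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the reference strategy that biopsies every patient."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


def average_predictions(reader_prob, model_prob):
    """Hybrid score: unweighted mean of reader and model probabilities.

    Assumes both inputs are already on a common 0 to 1 scale.
    """
    return [(r + m) / 2.0 for r, m in zip(reader_prob, model_prob)]


# Toy data, invented for illustration only (1 = malignant, 0 = benign).
y_true = [1, 1, 0, 0, 0, 1, 0, 0]
model_prob = [0.9, 0.7, 0.2, 0.05, 0.4, 0.6, 0.1, 0.3]
reader_prob = [0.8, 0.5, 0.35, 0.1, 0.3, 0.7, 0.2, 0.45]
hybrid_prob = average_predictions(reader_prob, model_prob)

for pt in (0.1, 0.25, 0.5):
    print(
        f"threshold={pt:.2f}",
        f"model={net_benefit(y_true, model_prob, pt):.3f}",
        f"hybrid={net_benefit(y_true, hybrid_prob, pt):.3f}",
        f"biopsy_all={net_benefit_treat_all(y_true, pt):.3f}",
    )
```

A strategy is preferable at a given threshold when its net benefit exceeds both the biopsy-all reference and the biopsy-none reference, whose net benefit is 0 by definition. Plotting these values across a range of thresholds gives the decision curve.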
Pages: 13