Low-Shot Deep Learning of Diabetic Retinopathy With Potential Applications to Address Artificial Intelligence Bias in Retinal Diagnostics and Rare Ophthalmic Diseases

Cited by: 54
Authors
Burlina, Philippe [1, 2, 3]
Paul, William [1]
Mathew, Philip [1]
Joshi, Neil [1]
Pacheco, Katia D. [4]
Bressler, Neil M. [2]
Affiliations
[1] Johns Hopkins Univ, Appl Phys Lab, Baltimore, MD 21218 USA
[2] Johns Hopkins Univ, Sch Med, Wilmer Eye Inst, Retina Div, Baltimore, MD 21205 USA
[3] Johns Hopkins Univ, Dept Comp Sci, Malone Ctr Engn Healthcare, Baltimore, MD USA
[4] Brazilian Ctr Vision Eye Hosp, Dept Ophthalmol, Retina Div, Brasilia, DF, Brazil
Keywords
MACULAR DEGENERATION; ALGORITHM;
DOI
10.1001/jamaophthalmol.2020.3269
Chinese Library Classification (CLC) number
R77 [Ophthalmology];
Discipline classification code
100212;
Abstract
IMPORTANCE Recent studies have demonstrated the successful application of artificial intelligence (AI) for automated retinal disease diagnostics but have not addressed a fundamental challenge for deep learning systems: the current need for large, criterion standard-annotated retinal data sets for training. Low-shot learning algorithms, which aim to learn from a relatively small number of training samples, may be beneficial in clinical situations involving rare retinal diseases or when addressing potential bias resulting from training data that may not adequately represent certain groups, such as individuals older than 85 years.
OBJECTIVE To evaluate whether low-shot deep learning methods are beneficial when using small training data sets for automated retinal diagnostics.
DESIGN, SETTING, AND PARTICIPANTS This cross-sectional study, conducted from July 1, 2019, to June 21, 2020, compared different diabetic retinopathy classification algorithms, traditional and low-shot, for 2-class designations (diabetic retinopathy warranting referral vs not warranting referral). The public domain EyePACS data set was used, which originally included 88 692 fundi from 44 346 individuals. Statistical analysis was performed from February 1 to June 21, 2020.
MAIN OUTCOMES AND MEASURES The performance (95% CIs) of the various AI algorithms was measured via receiver operating characteristic curves and their area under the curve (AUC), precision-recall curves, accuracy, and F1 score, evaluated for different training data sizes, ranging from 5120 to 10 samples per class.
RESULTS When trained with sufficiently large data sets (5120 samples per class), deep learning algorithms yielded comparable performance: an AUC of 0.8330 (95% CI, 0.8140-0.8520) for a traditional approach (eg, fine-tuned ResNet) compared with an AUC of 0.8348 (95% CI, 0.8159-0.8537) for a low-shot method using self-supervised Deep InfoMax (the authors' method, denoted DIM). However, when far fewer training images were available (n = 160), the AUC of the traditional deep learning approach decreased to 0.6585 (95% CI, 0.6332-0.6838), and it was outperformed by a low-shot method using self-supervision, with an AUC of 0.7467 (95% CI, 0.7239-0.7695). At very low shots (n = 10), the traditional approach had performance close to chance, with an AUC of 0.5178 (95% CI, 0.4909-0.5447), compared with the best low-shot method (AUC, 0.5778 [95% CI, 0.5512-0.6044]).
CONCLUSIONS AND RELEVANCE These findings suggest the potential benefits of using low-shot methods for AI retinal diagnostics when a limited number of annotated training retinal images are available (eg, with rare ophthalmic diseases or when addressing potential AI bias).
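As an illustration of the outcome measures named in the abstract, the following Python sketch computes an AUC from classifier scores and attaches a 95% CI. The abstract does not say how its confidence intervals were obtained; the Hanley-McNeil normal approximation used here is an assumption, and the function names and the halving schedule of training sizes are hypothetical (the abstract states only the endpoints 5120 and 10 and the intermediate size 160).

```python
import math

# Assumed schedule of training samples per class: repeated halving from 5120
# down to 10. This is consistent with the sizes quoted in the abstract
# (5120, 160, 10) but is not confirmed by it.
TRAIN_SIZES_PER_CLASS = [5120 // 2**k for k in range(10)]


def auc_rank(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def hanley_mcneil_ci(auc, n_pos, n_neg, z=1.96):
    """Approximate CI for an AUC using the Hanley-McNeil standard error.

    z=1.96 gives a 95% interval under a normal approximation. The bounds are
    clipped to [0, 1]. This is an illustrative choice of method only.
    """
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc * auc / (1.0 + auc)
    variance = (
        auc * (1.0 - auc)
        + (n_pos - 1) * (q1 - auc * auc)
        + (n_neg - 1) * (q2 - auc * auc)
    ) / (n_pos * n_neg)
    se = math.sqrt(variance)
    return max(0.0, auc - z * se), min(1.0, auc + z * se)


if __name__ == "__main__":
    # Toy scores for referable (positive) and non-referable (negative) eyes.
    referable = [0.9, 0.8, 0.4]
    non_referable = [0.7, 0.3, 0.2]
    auc = auc_rank(referable, non_referable)
    low, high = hanley_mcneil_ci(auc, len(referable), len(non_referable))
    print(f"AUC = {auc:.4f}, 95% CI = ({low:.4f}, {high:.4f})")
    print("Assumed training sizes per class:", TRAIN_SIZES_PER_CLASS)
```

The pairwise AUC is quadratic in the number of samples, which is acceptable for a small illustration; a sorted-rank implementation or a bootstrap interval would be a reasonable alternative for an evaluation set of realistic size.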
Pages: 1070-1077
Number of pages: 8