Strong versus Weak Data Labeling for Artificial Intelligence Algorithms in the Measurement of Geographic Atrophy

Cited by: 1
Authors
Domalpally, Amitha [1, 2, 5]
Slater, Robert [1]
Linderman, Rachel E. [1, 2]
Balaji, Rohit
Bogost, Jacob [1]
Voland, Rick [2]
Pak, Jeong
Blodi, Barbara A. [1]
Channa, Roomasa [2]
Fong, Donald [3]
Chew, Emily Y. [4]
Affiliations
[1] Univ Wisconsin, Dept Ophthalmol & Visual Sci, A EYE Res Unit, Madison, WI USA
[2] Univ Wisconsin, Wisconsin Reading Ctr, Dept Ophthalmol & Visual Sci, Madison, WI USA
[3] Annexon Biosci, Brisbane, CA USA
[4] NEI, Div Epidemiol & Clin Applicat, NIH, Bethesda, MD USA
[5] 301 S Westfield Rd,Suite 200, Madison, WI 53717 USA
Source
OPHTHALMOLOGY SCIENCE | 2024 / Volume 4 / Issue 05
Funding
National Institutes of Health (USA);
Keywords
Artificial intelligence; Geographic atrophy; Dry AMD; Data labeling; EYE DISEASE; QUANTIFICATION; GROWTH;
DOI
10.1016/j.xops.2024.100477
Chinese Library Classification number
R77 [Ophthalmology];
Discipline classification code
100212;
Abstract
Purpose: To gain an understanding of data labeling requirements to train deep learning models for measurement of geographic atrophy (GA) with fundus autofluorescence (FAF) images.
Design: Evaluation of artificial intelligence (AI) algorithms.
Subjects: The Age-Related Eye Disease Study 2 (AREDS2) images were used for training and cross-validation, and GA clinical trial images were used for testing.
Methods: Training data consisted of 2 sets of FAF images: one with area measurements only and no indication of GA location (Weakly labeled) and the other with GA segmentation masks (Strongly labeled).
Main Outcome Measures: Bland-Altman plots and scatter plots were used to compare GA area measurement between ground truth and AI measurements. The Dice coefficient was used to compare accuracy of segmentation of the Strong model.
Results: In the cross-validation AREDS2 data set (n = 601), the mean (standard deviation [SD]) area of GA measured by human grader, Weakly labeled AI model, and Strongly labeled AI model was 6.65 (6.3) mm², 6.83 (6.29) mm², and 6.58 (6.24) mm², respectively. The mean difference between ground truth and AI was 0.18 mm² (95% confidence interval [CI], -7.57 to 7.92) for the Weakly labeled model and -0.07 mm² (95% CI, -1.61 to 1.47) for the Strongly labeled model. With GlaxoSmithKline testing data (n = 156), the mean (SD) GA area was 9.79 (5.6) mm², 8.82 (4.61) mm², and 9.55 (5.66) mm² for human grader, Strongly labeled AI model, and Weakly labeled AI model, respectively. The mean difference between ground truth and AI for the 2 models was -0.97 mm² (95% CI, -4.36 to 2.41) and -0.24 mm² (95% CI, -4.98 to 4.49), respectively. The Dice coefficient was 0.99 for intergrader agreement, 0.89 for the cross-validation data, and 0.92 for the testing data.
Conclusions: Deep learning models can achieve reasonable accuracy even with Weakly labeled data. Training methods that integrate large volumes of Weakly labeled images with a small number of Strongly labeled images offer a promising solution to overcome the burden of cost and time for data labeling.
© 2024 by the American Academy of Ophthalmology. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-
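The two agreement measures named in the abstract can be illustrated with a minimal Python sketch. It assumes the standard textbook definitions (Dice = 2|A ∩ B| / (|A| + |B|) for binary masks; Bland-Altman mean difference with limits of agreement at ±1.96 SD of the differences) and is not the authors' code. The function names, the `mm_per_pixel` scale parameter, and the toy inputs are hypothetical and chosen only for illustration.

```python
import numpy as np


def dice_coefficient(mask_a, mask_b):
    """Dice overlap between two binary masks of the same shape."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return float(2.0 * np.logical_and(a, b).sum() / total)


def mask_area_mm2(mask, mm_per_pixel):
    """Area of a binary mask in mm^2, given a hypothetical pixel size in mm."""
    return float(np.asarray(mask, dtype=bool).sum() * mm_per_pixel ** 2)


def bland_altman(reference, predicted):
    """Mean difference (predicted - reference) and 95% limits of agreement."""
    diff = np.asarray(predicted, dtype=float) - np.asarray(reference, dtype=float)
    mean_diff = float(diff.mean())
    sd = float(diff.std(ddof=1))  # sample standard deviation of the differences
    return mean_diff, mean_diff - 1.96 * sd, mean_diff + 1.96 * sd


if __name__ == "__main__":
    # Toy 2x2 masks standing in for grader and model segmentations.
    grader = [[1, 1], [0, 0]]
    model = [[1, 0], [0, 0]]
    print("Dice:", dice_coefficient(grader, model))

    # Toy area measurements in mm^2 (illustrative values, not study data).
    print("Bland-Altman:", bland_altman([1.0, 2.0, 3.0], [2.0, 3.0, 4.0]))
```

A Weakly labeled setup in the sense of the abstract would only need the scalar area per image, whereas the Dice comparison requires the full segmentation mask, which is why Dice is reported for the Strong model only.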
Pages: 11