Strong versus Weak Data Labeling for Artificial Intelligence Algorithms in the Measurement of Geographic Atrophy

Cited by: 1
Authors
Domalpally, Amitha [1, 2, 5]
Slater, Robert [1]
Linderman, Rachel E. [1, 2]
Balaji, Rohit
Bogost, Jacob [1 ]
Voland, Rick [2 ]
Pak, Jeong
Blodi, Barbara A. [1 ]
Channa, Roomasa [2 ]
Fong, Donald [3 ]
Chew, Emily Y. [4 ]
Affiliations
[1] Univ Wisconsin, Dept Ophthalmol & Visual Sci, A EYE Res Unit, Madison, WI USA
[2] Univ Wisconsin, Wisconsin Reading Ctr, Dept Ophthalmol & Visual Sci, Madison, WI USA
[3] Annexon Biosci, Brisbane, CA USA
[4] NEI, Div Epidemiol & Clin Applicat, NIH, Bethesda, MD USA
[5] 301 S Westfield Rd,Suite 200, Madison, WI 53717 USA
Source
OPHTHALMOLOGY SCIENCE | 2024 / Vol. 4 / Issue 05
Funding
National Institutes of Health (NIH), USA;
Keywords
Artificial intelligence; Geographic atrophy; Dry AMD; Data labeling; EYE DISEASE; QUANTIFICATION; GROWTH;
DOI
10.1016/j.xops.2024.100477
Chinese Library Classification number
R77 [Ophthalmology];
Discipline classification code
100212;
Abstract
Purpose: To gain an understanding of data labeling requirements to train deep learning models for measurement of geographic atrophy (GA) with fundus autofluorescence (FAF) images.
Design: Evaluation of artificial intelligence (AI) algorithms.
Subjects: The Age-Related Eye Disease Study 2 (AREDS2) images were used for training and cross-validation, and GA clinical trial images were used for testing.
Methods: Training data consisted of 2 sets of FAF images: 1 with area measurements only and no indication of GA location (Weakly labeled) and the other with GA segmentation masks (Strongly labeled).
Main Outcome Measures: Bland-Altman plots and scatter plots were used to compare GA area measurement between ground truth and AI measurements. The Dice coefficient was used to compare accuracy of segmentation of the Strong model.
Results: In the cross-validation AREDS2 data set (n = 601), the mean (standard deviation [SD]) area of GA measured by human grader, Weakly labeled AI model, and Strongly labeled AI model was 6.65 (6.3) mm², 6.83 (6.29) mm², and 6.58 (6.24) mm², respectively. The mean difference between ground truth and AI was 0.18 mm² (95% confidence interval [CI], -7.57 to 7.92) for the Weakly labeled model and -0.07 mm² (95% CI, -1.61 to 1.47) for the Strongly labeled model. With GlaxoSmithKline testing data (n = 156), the mean (SD) GA area was 9.79 (5.6) mm², 8.82 (4.61) mm², and 9.55 (5.66) mm² for human grader, Strongly labeled AI model, and Weakly labeled AI model, respectively. The mean difference between ground truth and AI for the 2 models was -0.97 mm² (95% CI, -4.36 to 2.41) and -0.24 mm² (95% CI, -4.98 to 4.49), respectively. The Dice coefficient was 0.99 for intergrader agreement, 0.89 for the cross-validation data, and 0.92 for the testing data.
Conclusions: Deep learning models can achieve reasonable accuracy even with Weakly labeled data. Training methods that integrate large volumes of Weakly labeled images with a small number of Strongly labeled images offer a promising solution to overcome the burden of cost and time for data labeling.
© 2024 by the American Academy of Ophthalmology. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-
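The two outcome measures named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy masks, and the area values below are hypothetical, and the agreement interval uses the conventional Bland-Altman 95% limits of agreement (mean difference ± 1.96 SD of the differences). The abstract does not state how its reported intervals were computed.

```python
from statistics import mean, stdev


def dice_coefficient(mask_a, mask_b):
    """Dice = 2 * |A and B| / (|A| + |B|) for two binary masks (nested lists of 0/1)."""
    flat_a = [bool(p) for row in mask_a for p in row]
    flat_b = [bool(p) for row in mask_b for p in row]
    if len(flat_a) != len(flat_b):
        raise ValueError("masks must have the same shape")
    overlap = sum(a and b for a, b in zip(flat_a, flat_b))
    total = sum(flat_a) + sum(flat_b)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * overlap / total


def bland_altman(reference, predicted):
    """Return (mean difference, lower limit, upper limit) for paired measurements."""
    diffs = [p - r for r, p in zip(reference, predicted)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # sample SD of the paired differences
    return bias, bias - spread, bias + spread


# Hypothetical 3x3 segmentation masks (1 = pixel labeled as GA).
grader_mask = [[0, 1, 1], [0, 1, 1], [0, 0, 0]]
model_mask = [[0, 1, 1], [0, 1, 0], [0, 0, 0]]
dice = dice_coefficient(grader_mask, model_mask)

# Hypothetical GA areas in mm^2: human grader vs. AI model.
grader_area = [5.0, 6.0, 7.0, 8.0]
model_area = [5.5, 5.5, 7.5, 7.5]
bias, lower, upper = bland_altman(grader_area, model_area)

print(f"Dice: {dice:.3f}")
print(f"Mean difference: {bias:.2f} mm^2, limits: {lower:.2f} to {upper:.2f}")
```

The Dice function needs pixel-level masks, so it applies only where segmentations exist (the Strongly labeled setting), whereas the Bland-Altman comparison needs only paired area values and therefore works for both labeling schemes.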
Pages: 11