Semi-supervised training using cooperative labeling of weakly annotated data for nodule detection in chest CT

Cited by: 4
Authors
Maynord, Michael [1, 2]
Farhangi, M. Mehdi [2 ]
Fermuller, Cornelia [3 ]
Aloimonos, Yiannis [1 ]
Levine, Gary [4 ]
Petrick, Nicholas [2 ]
Sahiner, Berkman [2 ]
Pezeshk, Aria [2, 5]
Affiliations
[1] Univ Maryland, Dept Comp Sci, Iribe Ctr Comp Sci & Engn, College Pk, MD 20742 USA
[2] FDA, Div Imaging Diagnost & Software Reliabil DIDSR, OSEL, CDRH, Silver Spring, MD 20993 USA
[3] Univ Maryland, Inst Adv Comp Studies, Iribe Ctr Comp Sci & Engn, College Pk, MD 20742 USA
[4] FDA, Div Radiol Imaging Devices & Elect Prod, CDRH, Silver Spring, MD USA
[5] Plato Syst, San Mateo, CA USA
Keywords
computer-aided detection; pulmonary nodules; semi-supervised learning; false-positive reduction; lung nodules; automatic detection; images
DOI
10.1002/mp.16219
Chinese Library Classification (CLC) number
R8 [Special medicine]; R445 [Diagnostic imaging]
Discipline classification code
1002; 100207; 1009
Abstract
Purpose: Machine learning algorithms are best trained with large quantities of accurately annotated samples. While natural scene images can often be labeled relatively cheaply and at large scale, obtaining accurate annotations for medical images is both time-consuming and expensive. In this study, we propose a cooperative labeling method that allows us to use weakly annotated medical imaging data to train a machine learning algorithm. Most clinically produced data are weakly annotated: they are produced for use by humans rather than machines and lack information that machine learning depends upon. This approach therefore allows us to incorporate a wider range of clinical data and thereby increase the training set size.
Methods: Our pseudo-labeling method consists of multiple stages. In the first stage, a previously established network is trained using a limited number of samples with high-quality expert-produced annotations. This network is used to generate annotations for a separate, larger dataset that contains only weakly annotated scans. In the second stage, we cross-check the two types of annotations against each other to obtain higher-fidelity annotations. In the third stage, we extract training data from the weakly annotated scans and combine it with the fully annotated data, producing a larger training dataset. We use this larger dataset to develop a computer-aided detection (CADe) system for nodule detection in chest CT.
Results: We evaluated the proposed approach by presenting the network with different numbers of expert-annotated scans in training and then testing the CADe system on an independent expert-annotated dataset. We demonstrate that when the availability of expert annotations is severely limited, the inclusion of weakly labeled data leads to a 5% improvement in the competitive performance metric (CPM), defined as the average of sensitivities at different false-positive rates.
Conclusions: Our proposed approach can effectively merge a weakly annotated dataset with a small, well-annotated dataset for algorithm training. This approach can help enlarge limited training data by leveraging the large amount of weakly labeled data typically generated in clinical image interpretation.
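The cross-checking stage and the CPM described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say what form the weak annotations take or how the cross-check is performed, so the sketch assumes the weak annotations are approximate nodule locations and that a network detection is kept when it lies within a distance tolerance of one. The function names `cross_check` and `cpm`, the `max_dist_mm` tolerance, and the seven false-positive rates in `FP_RATES` (the operating points commonly used for CPM in nodule-detection benchmarks) are all assumptions made for illustration.

```python
import numpy as np

# Assumed operating points (false positives per scan); the abstract does not list them.
FP_RATES = (0.125, 0.25, 0.5, 1.0, 2.0, 4.0, 8.0)


def cross_check(detections, weak_points, max_dist_mm=10.0):
    """Keep network detections that lie near a weak annotation (hypothetical rule).

    detections: array-like of shape (n, 3), candidate centers in mm.
    weak_points: array-like of shape (m, 3), approximate annotated locations in mm.
    Returns the detections within max_dist_mm of at least one weak point.
    """
    detections = np.asarray(detections, dtype=float).reshape(-1, 3)
    weak_points = np.asarray(weak_points, dtype=float).reshape(-1, 3)
    if len(detections) == 0 or len(weak_points) == 0:
        return detections[:0]
    # Pairwise Euclidean distances, shape (n, m).
    dists = np.linalg.norm(detections[:, None, :] - weak_points[None, :, :], axis=2)
    return detections[dists.min(axis=1) <= max_dist_mm]


def cpm(fps_per_scan, sensitivity, fp_rates=FP_RATES):
    """Average the FROC sensitivity over the chosen false-positive rates.

    fps_per_scan must be increasing; sensitivities are linearly interpolated
    between the supplied FROC points.
    """
    sens_at_rates = np.interp(fp_rates, fps_per_scan, sensitivity)
    return float(sens_at_rates.mean())


if __name__ == "__main__":
    kept = cross_check([[0, 0, 0], [50, 50, 50]], [[3, 4, 0]])
    print(kept)  # detections confirmed by a nearby weak annotation
    print(cpm(FP_RATES, [0.5, 0.6, 0.7, 0.8, 0.85, 0.9, 0.95]))
```

Detections that survive the cross-check would then serve as pseudo-labels to be combined with the expert-annotated training data. A real pipeline would also need a rule for weak annotations that no detection matches, which the abstract does not describe.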
Pages: 4255-4268
Page count: 14