How Many Private Data Are Needed for Deep Learning in Lung Nodule Detection on CT Scans? A Retrospective Multicenter Study

Cited by: 8
Authors
Son, Jeong Woo [1]
Hong, Ji Young [2]
Kim, Yoon [1,3]
Kim, Woo Jin [4]
Shin, Dae-Yong [5]
Choi, Hyun-Soo [1,3]
Bak, So Hyeon [6,7]
Moon, Kyoung Min [8]
Affiliations
[1] ZIOVISION, Chunchon 24341, South Korea
[2] Hallym Univ, Med Ctr, Chuncheon Sacred Heart Hosp, Dept Med,Div Pulm & Crit Care Med, Chunchon 24253, South Korea
[3] Kangwon Natl Univ, Coll IT, Dept Comp Sci & Engn, Chunchon 24341, South Korea
[4] Kangwon Natl Univ, Dept Internal Med, Chunchon 24341, South Korea
[5] Kangwon Natl Univ, KNU Ind Cooperat Fdn, Chunchon 24341, South Korea
[6] Univ Ulsan, Coll Med, Asan Med Ctr, Dept Radiol, Seoul 05505, South Korea
[7] Univ Ulsan, Coll Med, Asan Med Ctr, Res Inst Radiol, Seoul 05505, South Korea
[8] Univ Ulsan, Coll Med, Gangneung Asan Hosp, Dept Pulm Allergy & Crit Care Med, Kangnung 25440, South Korea
Funding
National Research Foundation, Singapore;
Keywords
lung nodule; radiologist; deep learning; computed tomography; nodule detection; publicly available data; transfer learning; COMPUTER-AIDED DETECTION; PULMONARY NODULES; ALGORITHMS; IMAGES;
DOI
10.3390/cancers14133174
Chinese Library Classification (CLC) number
R73 [Oncology];
Subject classification code
100214;
Abstract
Simple Summary: The early detection of lung nodules is important for patient treatment and follow-up. Many researchers are investigating deep-learning-based lung nodule detection to ease the burden that nodule detection places on radiologists. The purpose of this paper is to provide guidelines for collecting lung nodule data to facilitate research. We collected chest computed tomography scans reviewed by radiologists at three hospitals. In addition, several experiments were conducted using the large-scale open dataset LUNA16. The experiments showed the value of using the collected data compared with using LUNA16. We also demonstrated the effectiveness of transfer learning from weights pre-trained on LUNA16. Finally, our study provides information on the amount of lung nodule data that must be collected to stabilize lung nodule detection performance.

Early detection of lung nodules is essential for preventing lung cancer. However, the number of radiologists who can diagnose lung nodules is limited, and considerable effort and time are required. To address this problem, researchers are investigating the automation of lung nodule detection with deep learning. However, deep learning requires large amounts of data, which can be difficult to collect. Therefore, data collection should be optimized to facilitate experiments at the beginning of lung nodule detection studies. We collected chest computed tomography scans from 515 patients with lung nodules at three hospitals, together with high-quality lung nodule annotations reviewed by radiologists. We conducted several experiments using the collected datasets and publicly available data from LUNA16. The object detection model YOLOX was used in the lung nodule detection experiments. Training the model with the collected data gave similar or better performance than training with the larger LUNA16 dataset. We also show that transfer learning from weights pre-trained on open data is very useful when it is difficult to collect large amounts of data. Otherwise, good performance can be expected once data from more than 100 patients are available. This study offers valuable insights for guiding data collection in future lung nodule studies.
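The question of how many patients are needed is typically studied by training on progressively larger subsets of the data. The following is a minimal sketch of one way to build such subsets at the patient level, not the authors' protocol: the function name `nested_patient_subsets`, the subset sizes, the seed, and the ID format are all hypothetical, and only the figures of 515 patients and the 100-patient threshold come from the abstract.

```python
import random
from typing import Dict, List, Sequence


def nested_patient_subsets(
    patient_ids: Sequence[str],
    sizes: Sequence[int],
    seed: int = 0,
) -> Dict[int, List[str]]:
    """Return one training subset per requested size.

    The subsets are nested: every smaller subset is contained in every
    larger one, so results at different sizes differ only in the added
    patients. Sampling is done per patient, not per scan or slice, so
    that one patient's images never end up split across subsets.
    """
    if len(set(patient_ids)) != len(patient_ids):
        raise ValueError("patient_ids must be unique")
    if any(size <= 0 or size > len(patient_ids) for size in sizes):
        raise ValueError("each size must be between 1 and len(patient_ids)")

    shuffled = list(patient_ids)
    random.Random(seed).shuffle(shuffled)  # reproducible shuffle
    return {size: shuffled[:size] for size in sorted(sizes)}


# Hypothetical usage: 515 patient IDs and illustrative subset sizes
# (the sizes below are assumptions chosen around the 100-patient mark).
all_patients = [f"patient_{i:03d}" for i in range(515)]
subsets = nested_patient_subsets(all_patients, sizes=[25, 50, 100, 200])
```

In a real study, a fixed test set would be held out before subsampling, and a detector would be trained once per subset, with and without pre-trained weights, to compare the resulting performance curves.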
Pages: 19