Computer-aided diagnosis of lung cancer: the effect of training data sets on classification accuracy of lung nodules

Cited by: 38
Authors
Gong, Jing [1]
Liu, Ji-Yu [2]
Sun, Xi-Wen [2]
Zheng, Bin [3]
Nie, Sheng-Dong [1]
Affiliations
[1] Univ Shanghai Sci & Technol, Sch Med Instrument & Food Engn, 516 Jun Gong Rd, Shanghai 200093, Peoples R China
[2] Shanghai Pulm Hosp, Radiol Dept, 507 Zheng Min Rd, Shanghai 200433, Peoples R China
[3] Univ Oklahoma, Sch Elect & Comp Engn, Norman, OK 73019 USA
Funding
Natural Science Foundation of Shanghai; National Natural Science Foundation of China;
Keywords
computer-aided diagnosis; CADx; lung cancer; early stage; advanced stage; PULMONARY NODULES; CT; BENIGN; PERFORMANCE;
DOI
10.1088/1361-6560/aaa610
Chinese Library Classification number
R318 [Biomedical Engineering];
Discipline classification code
0831 ;
Abstract
This study aims to develop a computer-aided diagnosis (CADx) scheme for classifying lung nodules as malignant or benign, and to assess whether CADx performance differs between nodules associated with early and advanced stage lung cancer. The study involves 243 biopsy-confirmed pulmonary nodules: 76 benign, 81 stage I malignant and 86 stage III malignant. The cases are separated into three data sets involving (1) all nodules, (2) benign and stage I malignant nodules, and (3) benign and stage III malignant nodules. A CADx scheme is applied to segment lung nodules depicted on computed tomography images, and 66 3D image features are initially computed. Then three machine learning models, namely a support vector machine, a naive Bayes classifier and linear discriminant analysis, are separately trained and tested using the three data sets and a leave-one-case-out cross-validation method embedded with a Relief-F feature selection algorithm. When the three data sets are used separately to train and test the three classifiers, the average areas under the receiver operating characteristic curves (AUC) are 0.94, 0.90 and 0.99, respectively. When the classifiers trained on the data set with all nodules are used, the average AUC values are 0.88 and 0.99 for detecting early and advanced stage nodules, respectively. AUC values computed from the three classifiers trained on the same data set are consistent, with no statistically significant difference (p > 0.05). This study demonstrates (1) the feasibility of applying a CADx scheme to accurately distinguish between benign and malignant lung nodules, and (2) a positive trend between CADx performance and cancer progression stage. Thus, to increase CADx performance in detecting subtle and early cancer, training data sets should include more diverse early stage cancer cases.
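The evaluation loop described in the abstract, where feature selection is embedded inside leave-one-case-out cross-validation and the pooled held-out scores are summarized by AUC, can be sketched roughly as follows. This is a minimal illustration under stated assumptions and not the authors' implementation: the data are synthetic, the feature scorer is a simplified Relief-style weighting rather than the exact Relief-F variant used in the study, only a Gaussian naive Bayes classifier is shown (the study also uses a support vector machine and linear discriminant analysis), and all function names are hypothetical.

```python
import math
import random

def relief_scores(X, y, k=3):
    """Simplified Relief-style weights for a binary problem (assumption: not the
    exact Relief-F of the paper). Features that differ on nearest misses gain
    weight; features that differ on nearest hits lose weight."""
    n, d = len(X), len(X[0])
    span = [(max(r[j] for r in X) - min(r[j] for r in X)) or 1.0 for j in range(d)]
    def diff(a, b, j):
        return abs(a[j] - b[j]) / span[j]
    w = [0.0] * d
    for i in range(n):
        dist = sorted((sum(diff(X[i], X[m], j) for j in range(d)), m)
                      for m in range(n) if m != i)
        hits = [m for _, m in dist if y[m] == y[i]][:k]
        misses = [m for _, m in dist if y[m] != y[i]][:k]
        if not hits or not misses:
            continue
        for j in range(d):
            w[j] -= sum(diff(X[i], X[m], j) for m in hits) / (len(hits) * n)
            w[j] += sum(diff(X[i], X[m], j) for m in misses) / (len(misses) * n)
    return w

def gnb_fit(X, y):
    """Fit a Gaussian naive Bayes model: per-class log prior, means, variances."""
    model = {}
    for c in (0, 1):
        rows = [x for x, t in zip(X, y) if t == c]
        d = len(rows[0])
        mean = [sum(r[j] for r in rows) / len(rows) for j in range(d)]
        var = [sum((r[j] - mean[j]) ** 2 for r in rows) / len(rows) + 1e-6
               for j in range(d)]
        model[c] = (math.log(len(rows) / len(X)), mean, var)
    return model

def gnb_score(model, x):
    """Log-odds of class 1 (malignant) versus class 0 (benign)."""
    def loglik(c):
        prior, mean, var = model[c]
        return prior + sum(-0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
                           for xi, m, v in zip(x, mean, var))
    return loglik(1) - loglik(0)

def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, t in zip(scores, labels) if t == 1]
    neg = [s for s, t in zip(scores, labels) if t == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def loo_auc(X, y, n_keep=2):
    """Leave-one-case-out CV with feature selection redone inside every fold,
    so the held-out case never influences which features are kept."""
    scores = []
    for i in range(len(X)):
        X_train, y_train = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        w = relief_scores(X_train, y_train)
        keep = sorted(range(len(w)), key=lambda j: w[j], reverse=True)[:n_keep]
        model = gnb_fit([[r[j] for j in keep] for r in X_train], y_train)
        scores.append(gnb_score(model, [X[i][j] for j in keep]))
    return auc(scores, y)

def make_toy_data(n_per_class=20, seed=0):
    """Synthetic stand-in for nodule features: 2 informative + 4 noise columns."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            informative = [rng.gauss(2.0 * label, 1.0) for _ in range(2)]
            noise = [rng.gauss(0.0, 1.0) for _ in range(4)]
            X.append(informative + noise)
            y.append(label)
    return X, y

if __name__ == "__main__":
    X, y = make_toy_data()
    print("leave-one-case-out AUC on toy data:", round(loo_auc(X, y), 3))
```

Re-running the feature scorer on each training fold, rather than once on the full data set, is what "embedded" means here: selecting features with the held-out case included would bias the AUC estimate upward.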
Pages: 11
Related papers
31 in total
[1]   Decoding tumour phenotype by noninvasive imaging using a quantitative radiomics approach [J].
Aerts, Hugo J. W. L. ;
Velazquez, Emmanuel Rios ;
Leijenaar, Ralph T. H. ;
Parmar, Chintan ;
Grossmann, Patrick ;
Cavalho, Sara ;
Bussink, Johan ;
Monshouwer, Rene ;
Haibe-Kains, Benjamin ;
Rietveld, Derek ;
Hoebers, Frank ;
Rietbergen, Michelle M. ;
Leemans, C. Rene ;
Dekker, Andre ;
Quackenbush, John ;
Gillies, Robert J. ;
Lambin, Philippe .
NATURE COMMUNICATIONS, 2014, 5
[2]   Applying a new quantitative global breast MRI feature analysis scheme to assess tumor response to chemotherapy [J].
Aghaei, Faranak ;
Tan, Maxine ;
Hollingsworth, Alan B. ;
Zheng, Bin .
JOURNAL OF MAGNETIC RESONANCE IMAGING, 2016, 44 (05) :1099-1106
[3]   After Detection: The Improved Accuracy of Lung Cancer Assessment Using Radiologic Computer-aided Diagnosis [J].
Amir, Guy J. ;
Lehmann, Harold P. .
ACADEMIC RADIOLOGY, 2016, 23 (02) :186-191
[4]   Pulmonary nodules: Estimation of malignancy at thin-section helical CT - Effect of computer-aided diagnosis on performance of radiologists [J].
Awai, K ;
Murao, K ;
Ozawa, A ;
Nakayama, Y ;
Nakaura, T ;
Liu, D ;
Kawanaka, K ;
Funama, Y ;
Morishita, S ;
Yamashita, Y .
RADIOLOGY, 2006, 239 (01) :276-284
[5]   Pulmonary nodules at chest CT: Effect of computer-aided diagnosis on radiologists' detection performance [J].
Awai, K ;
Murao, K ;
Ozawa, A ;
Komi, M ;
Hayakawa, H ;
Hori, S ;
Nishimura, Y .
RADIOLOGY, 2004, 230 (02) :347-352
[6]   SMOTE: Synthetic minority over-sampling technique [J].
Chawla, Nitesh V. ;
Bowyer, Kevin W. ;
Hall, Lawrence O. ;
Kegelmeyer, W. Philip .
2002, American Association for Artificial Intelligence (16)
[7]   Neural Network Ensemble-Based Computer-Aided Diagnosis for Differentiation of Lung Nodules on CT Images Clinical Evaluation [J].
Chen, Hui ;
Xu, Yan ;
Ma, Yujing ;
Ma, Binrong .
ACADEMIC RADIOLOGY, 2010, 17 (05) :595-602
[8]   A Combination of Shape and Texture Features for Classification of Pulmonary Nodules in Lung CT Images [J].
Dhara, Ashis Kumar ;
Mukhopadhyay, Sudipta ;
Dutta, Anirvan ;
Garg, Mandeep ;
Khandelwal, Niranjan .
JOURNAL OF DIGITAL IMAGING, 2016, 29 (04) :466-475
[9]   Computer-aided detection of pulmonary nodules using dynamic self-adaptive template matching and a FLDA classifier [J].
Gong, Jing ;
Liu, Ji-yu ;
Wang, Li-jia ;
Zheng, Bin ;
Nie, Sheng-dong .
PHYSICA MEDICA-EUROPEAN JOURNAL OF MEDICAL PHYSICS, 2016, 32 (12) :1502-1509
[10]  
Gong J, 2014, COMM COM INF SC, V461, P39