Features processing for random forest optimization in lung nodule localization

Cited by: 12
Authors
El-Askary, Nada S. [1]
Salem, Mohammed A.-M. [2, 3]
Roushdy, Mohamed I. [4]
Affiliations
[1] Ain Shams Univ, Comp Sci Dept, Fac Comp & Informat Sci, Cairo, Egypt
[2] Ain Shams Univ, Fac Comp & Informat Sci, Sci Comp Dept, Cairo, Egypt
[3] German Univ Cairo, Fac Media Engn & Technol, Cairo, Egypt
[4] Future Univ Egypt, Fac Comp & Informat Technol, New Cairo, Egypt
Keywords
Lung nodule localization; Computed Tomography; Automatic detection; Random forest; Lung features; Feature processing; IMAGE DATABASE CONSORTIUM; PULMONARY NODULES; CLASSIFICATION;
DOI
10.1016/j.eswa.2021.116489
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Lung nodules can lead to lung cancer, so researchers aim to detect them at an early stage. Machine learning algorithms are used to detect lung nodules quickly and with high accuracy. Random Forest (RF) is an ensemble machine learning algorithm that can be used to classify medical images, recognize different pathologies, and detect deficiencies based on selected input features. The paper proposes a model for early detection and localization of lung nodules in CT images, optimizes the RF, and analyzes the effect of the feature groups on classification accuracy. Features extracted from the CT images were processed to optimize the RF output. In previous work, local features such as Haar features gave better results than region-based features. In the proposed model, after a novel ANDing technique is applied in the preprocessing step, the region-based features give better results and the model accuracy improves. Combining global and local features greatly improves the model's classification results and accuracy. Experiments used 214 cases with a total of 2124 CT slices downloaded from the publicly available LIDC database. After preprocessing with the novel technique, 119 features are calculated and extracted for each pixel in the CT image. The extracted features are then post-processed to refine the learner's input data. Feature dimensionality reduction was applied by dividing the features into 5 feature sets and selecting the best-scoring results. Compared with previous work, the optimized RF increased the true positive rate by 8.66% and decreased the false positive rate by 4.4%, which led to better localization and a 5.47% increase in accuracy. The best results achieved were 96.41%, 95.98%, and 96.20% for sensitivity, specificity, and accuracy respectively, obtained when tuning the RF with 80 trees and an in-bag fraction of 0.04. The RF results were compared with other methods such as KNN, SVM, CNN, and deep learning, and RF gave the best accuracy, as detailed in the paper's discussion section.
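As a reading aid for the reported metrics, the following is a minimal sketch of how sensitivity (true positive rate), specificity, false positive rate, and accuracy are conventionally derived from confusion-matrix counts. The `ConfusionCounts` class, the function names, and the example counts are hypothetical and chosen only for illustration; they are not taken from the paper and do not reproduce its results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Per-pixel classification counts (hypothetical container, not from the paper)."""
    tp: int  # nodule pixels classified as nodule
    fp: int  # non-nodule pixels classified as nodule
    tn: int  # non-nodule pixels classified as non-nodule
    fn: int  # nodule pixels classified as non-nodule


def sensitivity(c: ConfusionCounts) -> float:
    """True positive rate: TP / (TP + FN)."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """True negative rate: TN / (TN + FP)."""
    return c.tn / (c.tn + c.fp)


def false_positive_rate(c: ConfusionCounts) -> float:
    """FP / (FP + TN), which equals 1 - specificity."""
    return c.fp / (c.fp + c.tn)


def accuracy(c: ConfusionCounts) -> float:
    """Fraction of all samples classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


# Illustrative counts only; these are NOT the paper's data.
example = ConfusionCounts(tp=90, fp=5, tn=95, fn=10)
print(f"sensitivity={sensitivity(example):.4f}")
print(f"specificity={specificity(example):.4f}")
print(f"false positive rate={false_positive_rate(example):.4f}")
print(f"accuracy={accuracy(example):.4f}")
```

Under these definitions, a drop in the false positive rate is the same change as a rise in specificity, which is why the abstract can report either quantity.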
Pages: 9