Machine Learning Based Framework for Lung Cancer Detection and Image Feature Extraction Using VGG16 with PCA on CT-Scan Images

Cited by: 0
Authors
Amit Singh [1]
Rakesh Kumar Dwivedi [1]
Rajul Rastogi [2]
Affiliations
[1] College of Computing Sciences and Information Technology, Teerthanker Mahaveer University, UP, Moradabad
[2] Department of Radiodiagnosis, TMMCRC, Teerthanker Mahaveer University, UP, Moradabad
Keywords
Ensemble learning; Feature extraction; Lung cancer; PCA; SMOTE; VGG16
DOI
10.1007/s42979-024-03414-y
Abstract
Lung cancer has one of the highest mortality rates worldwide among both men and women. This calls for new early-detection approaches that enable more accurate diagnosis and treatment. In this study, we aim to improve lung cancer diagnosis by combining ensemble learning with image analysis. Specifically, we propose an ensemble model that detects lung cancer from CT-scan images. The VGG16 model performs feature extraction. VGG16 is preferred because its small 3 × 3 kernels capture fine image detail, which gives state-of-the-art performance on transfer learning tasks. Because the features produced by VGG16 are high-dimensional, we reduce them to a lower-dimensional representation to simplify processing. We do this with Principal Component Analysis (PCA), a linear-algebraic technique that reduces the dimensionality of a feature set efficiently. Ensemble learning then increases predictive accuracy by combining different classification algorithms; the proposed work combines Logistic Regression (LR), Gaussian Naive Bayes (GNB), and Random Forest (RF) classifiers. In summary, VGG16 extracts the features, PCA removes correlated features and reduces dimensionality, and the ensemble yields a more accurate and robust lung cancer detection system. Results show that our ensemble model outperforms other models, with an accuracy of 97.8% in determining whether lung cancer is present. The proposed model improves accuracy by 1.3% over the existing model.
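The dimensionality-reduction step described above can be illustrated with a minimal sketch. It assumes short plain-Python lists stand in for the VGG16 feature vectors and recovers only the first principal component by power iteration. The toy numbers, the function names, and the choice of power iteration are illustrative assumptions, not the authors' implementation; a real pipeline would typically rely on a numerical library and keep many components.

```python
import math

def center(rows):
    """Subtract the per-feature mean from every row."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def covariance(centered):
    """Sample covariance matrix of rows that are already centred."""
    n = len(centered)
    d = len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def leading_component(cov, iterations=200):
    """Approximate the top eigenvector of cov by power iteration."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d  # fixed start vector keeps the result deterministic
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def project(centered, component):
    """Project each centred row onto a single component (one score per row)."""
    return [sum(x * c for x, c in zip(row, component)) for row in centered]

# Hypothetical stand-ins for extracted feature vectors (4 samples, 3 features).
features = [[1.0, 2.1, 0.5], [2.0, 3.9, 0.4], [3.0, 6.2, 0.6], [4.0, 7.8, 0.5]]
centered = center(features)
pc1 = leading_component(covariance(centered))
scores = project(centered, pc1)  # 3 correlated features reduced to 1 per sample
```

Here the first two toy features are strongly correlated, so a single component captures most of the variance; this is the sense in which PCA removes correlated features while reducing dimensionality.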
Beyond this comparison, the work makes several contributions: (i) VGG16 serves as the base model for feature extraction, because its small receptive fields and pre-trained features make it effective for transfer learning; (ii) PCA is applied on top of the VGG16 features to simplify the model and make it more efficient; (iii) ensemble modelling is applied to improve on the classification accuracy of the base classifiers. In conclusion, this study contributes to medical diagnostics by demonstrating the potential of integrating VGG16, PCA, and ensemble learning to build a highly accurate lung cancer detection model. Such a diagnostic tool could help improve automated lung cancer diagnostic systems and, in turn, patient outcomes. © The Author(s), under exclusive licence to Springer Nature Singapore Pte Ltd. 2024.
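The abstract does not state how the outputs of the three base classifiers are combined. As one possible reading, the sketch below assumes hard (majority) voting over binary labels. The function name and the prediction lists are hypothetical and purely illustrative; they are not results from the paper.

```python
from collections import Counter

def majority_vote(predictions_per_model):
    """Combine per-model label lists into one label per sample by majority vote."""
    combined = []
    for sample_votes in zip(*predictions_per_model):
        label, _count = Counter(sample_votes).most_common(1)[0]
        combined.append(label)
    return combined

# Hypothetical outputs of the three base classifiers on five scans
# (1 = cancer present, 0 = absent); illustrative values only.
lr_pred = [1, 0, 1, 1, 0]   # logistic regression
gnb_pred = [1, 1, 0, 1, 0]  # Gaussian naive Bayes
rf_pred = [0, 0, 1, 1, 0]   # random forest

ensemble_pred = majority_vote([lr_pred, gnb_pred, rf_pred])
print(ensemble_pred)  # [1, 0, 1, 1, 0]
```

With three voters and two classes a tie cannot occur, so each sample receives the label chosen by at least two of the classifiers. Soft voting over predicted probabilities, or stacking, would be alternative designs.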