EARLY DIAGNOSIS AND CLASSIFICATION OF LUNG CANCER DRIVEN BY MULTI-FEATURE DATA: A COMPARISON AND OPTIMIZATION OF THREE MACHINE LEARNING METHODS

Cited by: 0
Authors
Lan, Qi [1, 2]
Wang, Renfeng [1 ]
Fan, Hong [3 ]
Affiliations
[1] Fudan Univ, Zhongshan Hosp, Xiamen Branch, Dept Thorac Surg, Xiamen 361000, Peoples R China
[2] Xiamen Univ, Sch Informat, Dept Software Engn, Xiamen 361102, Peoples R China
[3] Fudan Univ, Zhongshan Hosp, Dept Thorac Surg, Shanghai, Peoples R China
Keywords
Lung cancer prediction; machine learning; Random Forest; decision tree; support vector machine; early diagnosis; multi-feature data; CLINICAL-FEATURES; GENERAL-PRACTICE; SYMPTOMS;
DOI
10.1142/S0219519424400797
Chinese Library Classification (CLC) number
Q6 [Biophysics];
Discipline classification code
071011 ;
Abstract
The objective of this study was to construct a prediction model for early diagnosis and classification of lung cancer based on multi-dimensional clinical data. Three advanced machine learning models - Random Forest, Decision Tree, and Support Vector Machine (SVM) - were employed to predict lung cancer using features such as age, chronic disease history, and clinical symptoms. The models were optimized through various strategies to improve predictive performance. The results demonstrated that all three models achieved high accuracy and sensitivity in lung cancer prediction. Furthermore, a detailed analysis of feature importance identified key factors such as age and chronic disease history that significantly influenced prediction outcomes. Among the models, the SVM exhibited particularly strong performance, providing robust support for accurate lung cancer prediction. Future work will focus on integrating multimodal data and optimizing model architecture and hyperparameters to further enhance the predictive accuracy and clinical utility of the model, thereby contributing to the early diagnosis and treatment of lung cancer.
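The abstract reports accuracy and sensitivity as its evaluation measures. As a minimal sketch of how these two metrics are commonly computed for a binary classifier, assuming labels coded as 1 (lung cancer) and 0 (no lung cancer): the function name and the example labels below are hypothetical illustrations, not code or data from the study.

```python
from typing import Sequence, Tuple


def accuracy_and_sensitivity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (accuracy, sensitivity) for binary labels, where 1 is the positive class."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    # Count the confusion-matrix cells needed for the two metrics.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    if tp + fn == 0:
        raise ValueError("sensitivity is undefined without positive cases")

    accuracy = (tp + tn) / len(y_true)   # share of all cases classified correctly
    sensitivity = tp / (tp + fn)         # share of true cancer cases that were detected
    return accuracy, sensitivity


# Hypothetical labels for illustration only (not study data).
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1, 0, 1]
acc, sens = accuracy_and_sensitivity(y_true, y_pred)
print(f"accuracy={acc:.2f}, sensitivity={sens:.2f}")
```

Sensitivity matters for early diagnosis because a false negative (a missed cancer case) lowers it directly, whereas accuracy alone can look high on data dominated by negative cases.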
Pages: 21