Data Mining Approach to Classify Cases of Lung Cancer

Cited by: 4
Authors
Vieira, Eduarda [1]
Ferreira, Diana [2]
Neto, Cristiana [2]
Abelha, Antonio [2]
Machado, Jose [2]
Affiliations
[1] Univ Minho, Dept Informat, Braga, Portugal
[2] Univ Minho, Algoritmi Res Ctr, Campus Gualtar, Braga, Portugal
Source
TRENDS AND APPLICATIONS IN INFORMATION SYSTEMS AND TECHNOLOGIES, VOL 1 | 2021, Vol. 1365
Keywords
Healthcare; Lung cancer; Data mining; Classification; CRISP-DM
DOI
10.1007/978-3-030-72657-7_49
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
According to the World Cancer Research Fund, a leading authority on cancer prevention research, lung cancer is the most commonly occurring cancer in men and the third most commonly occurring cancer in women, and its 5-year relative survival rate is notably low. Smoking is the major risk factor for lung cancer, and the associated symptoms include cough, fatigue, shortness of breath, chest pain, weight loss, and loss of appetite. This study aims to build a data mining classification model that predicts whether or not a patient has lung cancer based on crucial features such as the above-mentioned symptoms. Following the CRISP-DM methodology and using the RapidMiner software, different models were built with different scenarios, algorithms, sampling methods, and data approaches. The best data mining model, which used the Artificial Neural Network algorithm, achieved an accuracy of 93%, a sensitivity of 96%, a specificity of 90%, and a precision of 91%.
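The four metrics reported in the abstract all derive from a binary confusion matrix. The sketch below shows the standard definitions, assuming "lung cancer" is the positive class. The counts are hypothetical: they were picked only because they round to the reported percentages, and they are not the paper's actual confusion matrix. The class name `ConfusionMatrix` and the variable `example` are likewise illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Counts for a binary classifier; positive class = lung cancer."""
    tp: int  # cancer cases predicted as cancer
    fn: int  # cancer cases predicted as no cancer
    tn: int  # non-cancer cases predicted as no cancer
    fp: int  # non-cancer cases predicted as cancer

    def accuracy(self) -> float:
        # Share of all cases classified correctly.
        return (self.tp + self.tn) / (self.tp + self.fn + self.tn + self.fp)

    def sensitivity(self) -> float:
        # Share of actual positives that were detected (recall).
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Share of actual negatives that were correctly rejected.
        return self.tn / (self.tn + self.fp)

    def precision(self) -> float:
        # Share of positive predictions that were correct.
        return self.tp / (self.tp + self.fp)


# Hypothetical counts (NOT taken from the paper), chosen for illustration.
example = ConfusionMatrix(tp=96, fn=4, tn=90, fp=10)

for name in ("accuracy", "sensitivity", "specificity", "precision"):
    print(f"{name}: {getattr(example, name)():.2f}")
```

A high sensitivity paired with a somewhat lower specificity, as reported here, means the model misses few true cancer cases at the cost of more false alarms, which is usually the preferred trade-off for a screening aid.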
Pages: 511-521
Number of pages: 11