A k-NN method for lung cancer prognosis with the use of a genetic algorithm for feature selection

Cited by: 149
Authors
Maleki N. [1]
Zeinali Y. [2]
Niaki S.T.A. [2]
Affiliations
[1] Department of Industrial Engineering, Faculty of Engineering, University of Tehran
[2] Department of Industrial Engineering, Sharif University of Technology, Tehran
Source
Expert Systems with Applications | 2021 / Vol. 164
Keywords
Cancer staging diagnosis; Data mining; Feature selection; Genetic algorithm; k-NN technique; Lung cancer
DOI
10.1016/j.eswa.2020.113981
Abstract
Lung cancer is one of the most common diseases worldwide, and early detection is the main way to improve patients' chances of survival. This paper uses a k-nearest-neighbors (k-NN) classifier to diagnose the stage of a patient's disease. A genetic algorithm performs feature selection, which reduces the dimensionality of the dataset and speeds up the classifier. To improve accuracy, the best value of k is determined experimentally. Applied to a lung cancer database, the proposed approach reaches 100% accuracy. This suggests that applying data mining techniques to clinical information can efficiently support lung cancer staging diagnosis. © 2020 Elsevier Ltd
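The pipeline described in the abstract (a genetic algorithm choosing a feature subset, k-NN classifying on that subset, and k picked by trial) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the toy data, the helper names (`knn_predict`, `loo_accuracy`, `ga_select`, `make_toy_data`), the GA operators (truncation selection, one-point crossover, bit-flip mutation), the per-feature penalty of 0.01, and the leave-one-out evaluation are all hypothetical choices that the abstract does not specify.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k, mask):
    """Majority vote of the k nearest training rows, using only features where mask is 1."""
    cols = [j for j, bit in enumerate(mask) if bit]
    nearest = sorted(
        (sum((row[j] - x[j]) ** 2 for j in cols), label)
        for row, label in zip(train_X, train_y)
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def loo_accuracy(X, y, k, mask):
    """Leave-one-out accuracy of k-NN restricted to the masked-in features."""
    if not any(mask):
        return 0.0  # an empty feature subset cannot classify anything
    hits = 0
    for i in range(len(X)):
        hits += knn_predict(X[:i] + X[i + 1:], y[:i] + y[i + 1:], X[i], k, mask) == y[i]
    return hits / len(X)


def ga_select(X, y, k, pop_size=10, generations=10, p_mut=0.1, seed=0):
    """Toy genetic algorithm over bit masks (assumes at least two features)."""
    rng = random.Random(seed)
    n = len(X[0])
    cache = {}

    def fitness(mask):
        # Assumed fitness: accuracy minus a small penalty per selected feature.
        key = tuple(mask)
        if key not in cache:
            cache[key] = loo_accuracy(X, y, k, mask) - 0.01 * sum(mask)
        return cache[key]

    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]  # truncation selection keeps the best half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)  # one-point crossover
            child = a[:cut] + b[cut:]
            child = [1 - bit if rng.random() < p_mut else bit for bit in child]
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)


def make_toy_data(n_samples=40, n_noise=4, seed=1):
    """Synthetic two-class data: two informative features followed by noise features."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        informative = [label * 3.0 + rng.gauss(0, 0.5), -label * 3.0 + rng.gauss(0, 0.5)]
        X.append(informative + [rng.gauss(0, 3.0) for _ in range(n_noise)])
        y.append(label)
    return X, y


# Try a few odd values of k and keep the best (accuracy, k, mask) combination.
X, y = make_toy_data()
results = []
for k in (1, 3, 5):
    mask = ga_select(X, y, k)
    results.append((loo_accuracy(X, y, k, mask), k, mask))
best_acc, best_k, best_mask = max(results)
print(f"k={best_k}, selected features={best_mask}, LOO accuracy={best_acc:.2f}")
```

The sketch scores a mask on the same data the genetic algorithm searches over, so the reported accuracy is optimistic. A real study would hold out a separate test set before feature selection.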