A novel K-nearest neighbor classifier for lung cancer disease diagnosis

Cited by: 0
Authors
Sachdeva, Ravi Kumar [1]
Bathla, Priyanka [2]
Rani, Pooja [3]
Lamba, Rohit [4]
Ghantasala, G. S. Pradeep [5]
Nassar, Ibrahim F. [6]
Affiliations
[1] Chitkara University Institute of Engineering and Technology, Chitkara University, Punjab, Rajpura, India
[2] Chandigarh University, Punjab, Gharuan, Mohali, India
[3] MMICTBM, Maharishi Markandeshwar (Deemed to be University), Haryana, Mullana, Ambala, India
[4] Department of Electronics and Communication Engineering, MMEC, Maharishi Markandeshwar (Deemed to be University), Haryana, Mullana, Ambala, India
[5] Department of Computer Science and Engineering, Alliance College of Engineering and Design, Alliance University, Bengaluru, India
[6] Faculty of Specific Education, Ain Shams University, 365 Ramsis Street, Abassia, Cairo, Egypt
Keywords
K-nearest neighbor - Logistic regression - Lung cancer - Machine learning - Naive Bayes - Nearest neighbor - Pearson correlation - Pearson correlation weighted KNN - Random forest - Support vector machine
DOI
10.1007/s00521-024-10235-w
Abstract
One of the world's deadliest diseases is lung cancer. Based on a few features, machine learning techniques can help in the diagnosis of lung cancer. The performance of several classifiers: support vector machine (SVM), logistic regression (LR), Naïve Bayes (NB), random forest (RF), and K-nearest neighbor (KNN), was evaluated by the authors using the dataset available on Kaggle to create a systematic approach for the diagnosis of lung cancer disease based on readily observable signs and historical medical data without the requirement of CT scan images. The authors have proposed a novel approach for classification called Pearson correlation weighted KNN (PCWKNN), which is a modified version of KNN and uses Pearson correlation coefficient values to determine weights in a weighted KNN. The performance of the classifiers was evaluated using the hold-out validation method. SVM, LR, and RF were 96.77% accurate. NB obtained 95.16% accuracy. KNN achieved 91.93% accuracy. PCWKNN outperformed the employed classifiers and obtained an accuracy of 98.39%. Addressing the imperative for improved model generalization, the researchers utilized PCWKNN on an alternative, more extensive lung cancer dataset and subsequently broadened its application to diverse diseases, including the brain stroke dataset. The encouraging outcomes underscore PCWKNN's resilience and adaptability, suggesting its viability for real-world implementation. © The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2024.
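The abstract says only that PCWKNN uses Pearson correlation coefficient values to determine the weights in a weighted KNN; it does not give the exact formulation. The following is a minimal sketch of one plausible reading, not the authors' implementation. It assumes the weights are per-feature, equal to the absolute Pearson correlation between each feature and a numeric (for example 0/1) class label, and that they scale a Euclidean distance before a majority vote. The function names `pearson_feature_weights` and `pcwknn_predict` and the default `k=3` are hypothetical.

```python
import numpy as np


def pearson_feature_weights(X, y):
    """Absolute Pearson correlation of each feature column with the label.

    Assumption: labels are numeric (e.g. 0/1) so a correlation is defined.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xc = X - X.mean(axis=0)  # center each feature
    yc = y - y.mean()        # center the label
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    # Zero-variance features (or labels) get weight 0 instead of NaN.
    r = np.divide(Xc.T @ yc, denom, out=np.zeros(X.shape[1]), where=denom > 0)
    return np.abs(r)


def pcwknn_predict(X_train, y_train, X_test, k=3):
    """Majority-vote KNN with a feature-weighted Euclidean distance.

    Assumption: the Pearson weights multiply the squared per-feature
    differences; the paper may define the weighting differently.
    """
    X_train = np.asarray(X_train, dtype=float)
    X_test = np.asarray(X_test, dtype=float)
    y_train = np.asarray(y_train)
    w = pearson_feature_weights(X_train, y_train)
    preds = []
    for x in X_test:
        d = np.sqrt((((X_train - x) ** 2) * w).sum(axis=1))
        nearest = np.argsort(d, kind="stable")[:k]
        labels, counts = np.unique(y_train[nearest], return_counts=True)
        preds.append(labels[np.argmax(counts)])
    return np.array(preds)


# Toy data: feature 0 tracks the label, feature 1 is uncorrelated noise.
X_train = [[0, 5], [0, -5], [1, 5], [1, -5]]
y_train = [0, 0, 1, 1]
weights = pearson_feature_weights(X_train, y_train)
predictions = pcwknn_predict(X_train, y_train, [[0.9, -5], [0.1, 5]], k=3)
```

Under this reading, a feature with no linear relationship to the label contributes nothing to the distance, so neighbors are chosen on the informative features alone. In practice the features would also need comparable scales before the distances are computed.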
Pages: 22403 - 22416
Page count: 13