A Robust Chronic Kidney Disease Classifier Using Machine Learning

Cited by: 25
Authors
Swain, Debabrata [1 ]
Mehta, Utsav [1 ]
Bhatt, Ayush [1 ]
Patel, Hardeep [1 ]
Patel, Kevin [1 ]
Mehta, Devanshu [1 ]
Acharya, Biswaranjan [2 ]
Gerogiannis, Vassilis C. [3 ]
Kanavos, Andreas [4 ]
Manika, Stella [5 ]
Affiliations
[1] Pandit Deendayal Energy Univ, Comp Sci & Engn Dept, Gandhinagar 382007, India
[2] Marwadi Univ, Dept Comp Engn AI, Rajkot 360003, India
[3] Univ Thessaly, Dept Digital Syst, Larisa 41500, Greece
[4] Ionian Univ, Dept Informat, Corfu 49100, Greece
[5] Univ Thessaly, Dept Planning & Reg Dev, Volos 38334, Greece
Keywords
chronic kidney disease; data balancing; hyperparameter tuning; machine learning; SMOTE; supervised learning
DOI
10.3390/electronics12010212
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Clinical support systems suffer from high variance in the prognosis of chronic disorders. This uncertainty is one of the principal causes of death among the large populations worldwide that suffer from fatal diseases such as chronic kidney disease (CKD). For this reason, the diagnosis of this disease is of great concern for healthcare systems, and machine learning can serve as an effective tool to reduce the randomness in clinical decision making. Conventional methods for detecting chronic kidney disease are not always accurate because they depend heavily on several sets of biological attributes. Machine learning is the process of training a machine on a large collection of historical data for the purpose of intelligent classification. This work aims to develop a machine-learning model that uses a publicly available dataset to forecast the occurrence of chronic kidney disease. A set of data preprocessing steps was performed on this dataset in order to construct a generic model: appropriate imputation of missing data points, balancing of the data using the SMOTE algorithm, and scaling of the features. A statistical technique, the chi-squared test, is used to extract the smallest adequate set of features that are highly correlated with the output. For model training, a stack of supervised-learning techniques is used to develop a robust machine-learning model. Of all the applied learning techniques, support vector machine (SVM) and random forest (RF) achieved the lowest false-negative rates and the highest test accuracies, 99.33% and 98.67%, respectively. However, SVM achieved better results than RF when validated with 10-fold cross-validation.
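The SMOTE balancing step named in the abstract can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function name `smote`, its parameters, and the toy two-feature data are all hypothetical and chosen only for illustration. The sketch shows the core idea of SMOTE, which is to create each synthetic minority sample by interpolating between a minority sample and one of its k nearest minority neighbours.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Each synthetic point lies on the segment joining a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data (not from the paper): 6 majority rows, 3 minority rows.
majority = [[1.0, 1.2], [0.9, 1.0], [1.1, 0.8], [1.3, 1.1], [0.8, 0.9], [1.0, 1.0]]
minority = [[3.0, 3.2], [3.4, 2.9], [2.8, 3.1]]

# Generate enough synthetic rows to bring the minority class up to parity.
new_rows = smote(minority, n_new=len(majority) - len(minority), k=2)
balanced_minority = minority + new_rows
print(len(majority), len(balanced_minority))
```

After the call, both classes contain the same number of rows. In practice, oversampling of this kind is usually applied to the training split only, so that synthetic points do not leak into the test data; whether the paper does so is not stated in the abstract.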
Pages: 13
Related papers
32 in total
  • [1] Audu A., 2021, ASIAN J PROBAB STAT, V15, P235, DOI 10.9734/ajpas/2021/v15i430377
  • [2] Biau G, 2012, J MACH LEARN RES, V13, P1063
  • [3] Cahyani N., 2020, Journal of Telecommunication, Electronic and Computer Engineering (JTEC), V12, P25
  • [4] Application of an Improved CHI Feature Selection Algorithm
    Cai, Liang-jing
    Lv, Shu
    Shi, Kai-bo
    [J]. DISCRETE DYNAMICS IN NATURE AND SOCIETY, 2021, 2021
  • [5] Centers for Disease Control and Prevention, 2021, CHRON KIDN DIS US 20
  • [6] SMOTE: Synthetic minority over-sampling technique
    Chawla, Nitesh V.
    Bowyer, Kevin W.
    Hall, Lawrence O.
    Kegelmeyer, W. Philip
    [J]. 2002, American Association for Artificial Intelligence (16)
  • [7] Prediction of Chronic Kidney Disease-A Machine Learning Perspective
    Chittora, Pankaj
    Chaurasia, Sandeep
    Chakrabarti, Prasun
    Kumawat, Gaurav
    Chakrabarti, Tulika
    Leonowicz, Zbigniew
    Jasinski, Michal
    Jasinski, Lukasz
    Gono, Radomir
    Jasinska, Elzbieta
    Bolshev, Vadim
    [J]. IEEE ACCESS, 2021, 9 : 17312 - 17334
  • [8] Darapureddy N., 2021, Int J Eng Adv Technol, V8, P215
  • [9] Das D., 2019, Int J Comput Sci Eng, V7, P548, DOI 10.26438/ijcse/v7i4.548558
  • [10] Hyperparameter Tuning for Machine Learning Algorithms Used for Arabic Sentiment Analysis
    Elgeldawi, Enas
    Sayed, Awny
    Galal, Ahmed R.
    Zaki, Alaa M.
    [J]. INFORMATICS-BASEL, 2021, 8 (04):