Machine learning models for chronic kidney disease diagnosis and prediction

Cited by: 5
Authors
Rahman, Md. Mustafizur [1 ,2 ]
Al-Amin, Md. [1 ]
Hossain, Jahangir [2 ]
Affiliations
[1] Jashore Univ Sci & Technol, Dept Elect & Elect Engn, Jashore 7408, Bangladesh
[2] Univ Technol Sydney, Sch Elect & Data Engn, 15 Broadway, Sydney, NSW 2007, Australia
Keywords
Chronic kidney disease; Ensemble machine learning; Borderline-SMOTE; MICE; Prediction; SMOTE
DOI
10.1016/j.bspc.2023.105368
Chinese Library Classification (CLC) number
R318 [Biomedical engineering]
Discipline classification code
0831
Abstract
Background and objective: Chronic kidney disease (CKD) is a severe health problem that affects people all over the world, particularly in South Asia, so diagnosis and treatment are needed as early as possible. The main goal of this study is to detect the presence or absence of CKD using a variety of features obtained from a few medical tests. Methods: This paper evaluates eight ensemble learning methods for diagnosing CKD on datasets from the UCI Machine Learning Repository. The datasets were prepared by imputing missing values with the MICE imputation method and by handling class imbalance with the borderline SVMSMOTE method, in order to improve classifier performance. In addition, recursive feature elimination (RFE) and the Boruta method were used to find the most significant features and reduce computation time, while hyperparameter tuning was used to raise classifier performance and obtain optimal solutions. Results: In selecting the most significant features, RFE outperformed Boruta and retained only 50% of the total features. Various performance metrics were used to identify the most competent classifier for detecting CKD. LightGBM outperformed state-of-the-art and other ensemble methods, with the lowest computation time and the highest accuracy. Based on the experimental findings, the proposed method achieved the highest average scores: 99.75% accuracy, 99.40% precision, 99.41% recall, 99.61% F-measure and 99.57% AUC-ROC. Moreover, the proposed method raises the average detection rate by 5.64%, 1%, 2.04%, 8.63%, 1.99%, 2.84%, 2.42% and 4.76%, respectively, compared with eight different approaches evaluated on the same dataset. Conclusion: Experiments show that the suggested method can identify CKD more precisely than the most recent methods.
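As an illustration of the metrics the abstract reports (accuracy, precision, recall and F-measure), the following is a minimal sketch in plain Python. It is not the authors' code: the names `ConfusionCounts`, `count_outcomes` and `classification_metrics` are hypothetical, the label vectors are toy data invented for the example, and AUC-ROC is omitted because it needs predicted scores rather than hard labels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for a binary classifier (hypothetical helper type)."""
    tp: int  # CKD cases predicted as CKD
    fp: int  # non-CKD cases predicted as CKD
    tn: int  # non-CKD cases predicted as non-CKD
    fn: int  # CKD cases predicted as non-CKD


def count_outcomes(y_true, y_pred, positive=1):
    """Tally the four confusion-matrix cells from paired labels."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def classification_metrics(c):
    """Return accuracy, precision, recall and F-measure as fractions."""
    total = c.tp + c.fp + c.tn + c.fn
    accuracy = (c.tp + c.tn) / total if total else 0.0
    precision = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    recall = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Toy labels (assumed, not from the paper): 1 = CKD, 0 = not CKD.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1, 1, 0]

counts = count_outcomes(y_true, y_pred)
metrics = classification_metrics(counts)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

On an imbalanced dataset, precision and recall can diverge sharply from accuracy, which is presumably why the abstract reports all of them alongside the oversampling step.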
Pages: 17