Classification and Detection of Chronic Kidney Disease (CKD) Using Machine Learning Algorithms

Cited by: 3
Authors
Abuomar, O. [1]
Sogbe, P. [1]
Affiliations
[1] Lewis Univ, Dept Engn Comp & Math Sci, Romeoville, IL 60446 USA
Source
INTERNATIONAL CONFERENCE ON ELECTRICAL, COMPUTER AND ENERGY TECHNOLOGIES (ICECET 2021) | 2021
Keywords
CKD; SOM; Neural Network; Classification; Clustering; Knowledge Discovery
DOI
10.1109/ICECET52533.2021.9698666
Chinese Library Classification number
TP301 [Theory, Methods]
Subject classification code
081202
Abstract
The severity of chronic kidney disease (CKD) is increasing worldwide, especially in low-income countries, where testing and prediction may depend on several medical comorbidities. Such testing can require advanced equipment and expertise that might not be available during regular medical examinations. To quickly predict the severity of CKD from the smallest subset of easily available blood biochemical features, at certain threshold values, during medical examinations, we developed and compared several machine learning classification and clustering models as well as a neural network approach. The project used the clinical and blood biochemical results from a chronic kidney disease data set from Kaggle.com, consisting of 400 samples and 24 features used in determining chronic kidney disease. Seven predictive and classification models were established: Random Forest Classifier, K-Nearest Neighbors, Logistic Regression, XGBoost, Decision Tree, Neural Networks, and Naive Bayes. Three clustering algorithms were also established: fuzzy c-means (FCM), hard k-means, and self-organizing maps (SOMs). The accuracy, sensitivity (recall), precision, F1 score, and 10-fold cross-validation results of all models were evaluated, and the outcomes were ranked and analyzed. The decision tree (DT) produced the highest classification accuracy, 98.8%, with a 10-fold cross-validation value of 95.75%. SOMs produced the best clustering results, ahead of k-means and FCM, with an accuracy of 85.0% and an F1 score of 84.2%. In addition, minimum threshold values that could trigger CKD in patients were determined. The neural network model predicted CKD with 85.0% accuracy, 83.0% precision, 84.3% recall, an F1 score of 83.8%, and a 10-fold cross-validation accuracy of 84.5%. Lastly, we used SOMs to obtain four unique CKD clusters, each grouping patients by the common features that caused chronic kidney disease.
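The abstract reports accuracy, precision, recall (sensitivity), F1 score, and 10-fold cross-validation for every model. As a minimal sketch of how those quantities are commonly computed for a binary CKD / not-CKD label, the Python below uses only the standard library. The function names `binary_metrics` and `k_fold_indices` are hypothetical and do not come from the paper, the toy labels are made up for illustration, and the contiguous, unshuffled fold split is an assumption; the authors' actual tooling and splitting strategy are not stated in this record.

```python
from typing import Dict, List, Sequence, Tuple


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall (sensitivity) and F1 for labels in {0, 1}."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def k_fold_indices(n_samples: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k contiguous (train, test) index folds.

    Assumption: no shuffling or stratification, to keep the sketch short.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    base, extra = divmod(n_samples, k)
    folds = []
    start = 0
    for i in range(k):
        size = base + (1 if i < extra else 0)  # spread the remainder evenly
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


# Toy labels (illustrative only, not data from the paper): 1 = CKD, 0 = not CKD.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)

# With 400 samples, as in the data set described, each fold holds out 40.
folds = k_fold_indices(400, k=10)
```

A 10-fold cross-validation score is then the mean of a chosen metric, typically accuracy, over the ten held-out folds. In practice the samples would usually be shuffled, and often stratified by class, before splitting.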
Pages: 18-25
Number of pages: 8