eDiaPredict: An Ensemble-based Framework for Diabetes Prediction

Cited by: 36
Authors
Singh, Ashima [1]
Dhillon, Arwinder [1]
Kumar, Neeraj [1]
Hossain, M. Shamim [2, 3]
Muhammad, Ghulam [4]
Kumar, Manoj [5]
Affiliations
[1] Thapar Univ, Comp Sci & Engn Dept, Patiala, Punjab, India
[2] King Saud Univ, Res Chair Pervas & Mobile Comp, Riyadh 11543, Saudi Arabia
[3] King Saud Univ, Dept Software Engn, Coll Comp & Informat Sci, Riyadh 11543, Saudi Arabia
[4] King Saud Univ, Coll Comp & Informat Sci, Dept Comp Engn, Riyadh, Saudi Arabia
[5] SMVD Univ, Katra, India
Keywords
Diabetes prediction; ensembled models; XGBoost; decision tree; random forest
DOI
10.1145/3415155
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Modern medical systems incorporate computational intelligence into healthcare. Machine learning techniques are applied to predict the onset and recurrence of disease and to identify biomarkers for survivability analysis based on a patient's health conditions. Early prediction of diseases such as diabetes is essential because the number of diabetic patients across all age groups is increasing rapidly. Identifying the underlying causes of diabetes at an early stage has become a challenging task for medical practitioners. The continuously growing volume of diabetic patient data calls for efficient machine learning algorithms that learn from trends in the underlying data and recognize critical conditions in patients. This article proposes an ensemble-based framework named eDiaPredict. It combines several machine learning algorithms (XGBoost, Random Forest, Support Vector Machine, Neural Network, and Decision Tree) to predict diabetes status among patients. The performance of eDiaPredict is evaluated using accuracy, sensitivity, specificity, Gini index, precision, area under the curve, area under the convex hull, minimum error rate, and minimum weighted coefficient. The effectiveness of the proposed approach is demonstrated on the PIMA Indian diabetes dataset, on which it achieves an accuracy of 95%.
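The abstract does not state how the base models' outputs are combined, so the following is only a minimal sketch of the general idea, assuming a simple majority vote over five binary classifiers. The function names (`majority_vote`, `evaluate`) and the toy labels are hypothetical and made up for illustration; they are not taken from the paper or from the PIMA dataset. The sketch also computes four of the label-based metrics named above (accuracy, sensitivity, specificity, precision).

```python
def majority_vote(predictions_per_model):
    """Combine per-model 0/1 predictions by majority vote (assumed scheme)."""
    n_models = len(predictions_per_model)
    combined = []
    for votes in zip(*predictions_per_model):
        # Predict 1 when more than half of the models vote 1.
        combined.append(1 if 2 * sum(votes) > n_models else 0)
    return combined


def evaluate(y_true, y_pred):
    """Return accuracy, sensitivity, specificity, and precision for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Guard against empty classes in tiny examples.
        return num / den if den else 0.0

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),  # true positive rate
        "specificity": ratio(tn, tn + fp),  # true negative rate
        "precision": ratio(tp, tp + fp),
    }


# Toy labels and predictions, invented purely for illustration.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
model_predictions = [
    [1, 1, 0, 0, 0, 1, 1, 0],  # stand-in for one base model
    [1, 0, 1, 0, 0, 0, 1, 0],
    [1, 1, 1, 0, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 1, 1, 0],
    [1, 1, 0, 1, 0, 0, 0, 0],
]

ensemble_prediction = majority_vote(model_predictions)
ensemble_metrics = evaluate(y_true, ensemble_prediction)
single_model_metrics = evaluate(y_true, model_predictions[0])

print("ensemble:", ensemble_metrics)
print("first base model:", single_model_metrics)
```

In this toy case each base model makes at least one mistake, yet the vote recovers every label, which illustrates why an ensemble can outperform its members when their errors do not coincide. Real results depend on the data and on the combination rule actually used.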
Pages: 26