Towards a Stacking Ensemble Model for Predicting Diabetes Mellitus using Combination of Machine Learning Techniques

Cited by: 0
Authors
Alzubaidi, Abdulaziz A. [1]
Halawani, Sami M. [1]
Jarrah, Mutasem [1]
Affiliations
[1] King Abdulaziz Univ, Fac Comp & Informat Technol, Jeddah, Saudi Arabia
Keywords
DM; Diabetes Mellitus; Stacking; Ensemble learning; Machine Learning; Random Forest (RF); Logistic Regression (LR); Extreme Gradient Boosting model (XGBoost);
DOI
10.14569/IJACSA.2023.0141236
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Diabetes Mellitus (DM) is a chronic disease affecting the world's population. It causes long-term complications such as kidney failure, blindness, and heart disease, reducing quality of life. Diagnosing diabetes mellitus at an early stage is challenging and is a critical decision for medical experts, as a delayed diagnosis complicates control of the disease's progression. This research therefore aims to develop a novel stacking ensemble model that predicts diabetes mellitus using a combination of machine learning models: an ensemble of classifiers, such as Random Forest (RF) and Logistic Regression (LR), serves as the base learners, and the Extreme Gradient Boosting model (XGBoost) serves as the meta-learner. The results indicate that the proposed stacking model predicts diabetes mellitus with 83% accuracy on the Pima dataset and 97% on the DPD dataset. In conclusion, the proposed model can be used to build a diagnostic application for diabetes mellitus, and we recommend testing it on a larger and more diverse dataset to obtain more accurate results.
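The stacking arrangement described in the abstract (RF and LR as base learners feeding a boosted-tree meta-learner) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes scikit-learn is available, uses synthetic data from `make_classification` in place of the Pima or DPD datasets, and swaps in scikit-learn's `GradientBoostingClassifier` as a stand-in for XGBoost so that the example has no extra dependency. All hyperparameters are illustrative assumptions, and the printed accuracy says nothing about the paper's reported results.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic binary-classification data (assumption: stands in for a
# diabetes dataset; 768 rows and 8 features are arbitrary choices).
X, y = make_classification(
    n_samples=768, n_features=8, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Base learners: Random Forest and Logistic Regression, as named in the abstract.
# LR is wrapped in a scaling pipeline, a common choice but an assumption here.
base_learners = [
    ("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
    ("lr", make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
]

# Meta-learner: gradient-boosted trees as a stand-in for XGBoost.
meta_learner = GradientBoostingClassifier(random_state=0)

# The meta-learner is trained on out-of-fold class probabilities
# produced by the base learners (5-fold cross-validation).
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=meta_learner,
    cv=5,
    stack_method="predict_proba",
)
stack.fit(X_train, y_train)

accuracy = stack.score(X_test, y_test)
print(f"held-out accuracy on synthetic data: {accuracy:.3f}")
```

Training the meta-learner on out-of-fold predictions rather than on in-sample predictions keeps it from simply learning to trust base learners that have memorized the training set. A scikit-learn-compatible XGBoost classifier could presumably be passed as `final_estimator` instead of the stand-in.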
Pages: 348-358
Number of pages: 11
Related papers
36 in total
[1]  
Andoh T, 2016, HANDBOOK OF HORMONES: COMPARATIVE ENDOCRINOLOGY FOR BASIC AND CLINICAL RESEARCH, P157, DOI 10.1016/B978-0-12-801028-0.00148-3
[2]  
[Anonymous], 1985, WHO TECH REP SER, P1
[3]   The impact of chronic diseases - The partner's perspective [J].
Baanders, Arianne N. ;
Heijmans, Monique J. W. M. .
FAMILY & COMMUNITY HEALTH, 2007, 30 (04) :305-317
[4]  
Berrar D., 2019, Encyclopedia of Bioinformatics and Computational Biology, V13, P542, DOI 10.1016/B978-0-12-809633-8.20349-X
[5]   Random forests [J].
Breiman, L .
MACHINE LEARNING, 2001, 45 (01) :5-32
[6]   Distinguishing between type 1 and type 2 diabetes [J].
Butler, Alexandra E. ;
Misselbrook, David .
BMJ-BRITISH MEDICAL JOURNAL, 2020, 370
[7]   Melanoma Detection Using XGB Classifier Combined with Feature Extraction and K-Means SMOTE Techniques [J].
Chang, Chih-Chi ;
Li, Yu-Zhen ;
Wu, Hui-Ching ;
Tseng, Ming-Hseng .
DIAGNOSTICS, 2022, 12 (07)
[8]   XGBoost: A Scalable Tree Boosting System [J].
Chen, Tianqi ;
Guestrin, Carlos .
KDD'16: PROCEEDINGS OF THE 22ND ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2016, :785-794
[9]   Transcriptome meta-analysis of peripheral lymphomononuclear cells indicates that gestational diabetes is closer to type 1 diabetes than to type 2 diabetes mellitus [J].
Collares, C. V. A. ;
Evangelista, A. F. ;
Xavier, D. J. ;
Takahashi, P. ;
Almeida, R. ;
Macedo, C. ;
Manoel-Caetano, F. ;
Foss, M. C. ;
Foss-Freitas, M. C. ;
Rassi, D. M. ;
Sakamoto-Hojo, E. T. ;
Passos, G. A. ;
Donadi, E. A. .
MOLECULAR BIOLOGY REPORTS, 2013, 40 (09) :5351-5358
[10]   Ensemble methods in machine learning [J].
Dietterich, TG .
MULTIPLE CLASSIFIER SYSTEMS, 2000, 1857 :1-15