Performance Analysis of Data Mining Classification Techniques to Predict Diabetes

Cited by: 118
Authors
Perveen, Sajida [1 ]
Shahbaz, Muhammad [1 ]
Guergachi, Aziz [2 ]
Keshavjee, Karim [3 ]
Affiliations
[1] Univ Engn & Technol, Dept Comp Sci & Engn, Lahore, Pakistan
[2] Ryerson Univ, Ted Rogers Sch Informat Technol Management, Toronto, ON, Canada
[3] Univ Victoria, Sch Hlth Informat, Victoria, BC, Canada
Source
4TH SYMPOSIUM ON DATA MINING APPLICATIONS (SDMA2016) | 2016 / Vol. 82
Keywords
Diabetes Mellitus; Ensemble method; Base Learner; Bagging; Adaboost and Decision tree;
DOI
10.1016/j.procs.2016.04.016
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Subject classification code
0812 ;
Abstract
Diabetes mellitus is one of the major health challenges worldwide. Its prevalence is increasing at a fast pace, damaging the human, economic, and social fabric. Prevention and prediction of diabetes mellitus are gaining increasing interest in the healthcare community. Several clinical decision support systems have been proposed that incorporate data mining techniques for predicting diabetes and its course of progression; however, these conventional systems are typically based either on a single classifier or on a plain combination of classifiers. Recently, extensive efforts have been made to improve the accuracy of such systems using ensemble classifiers. This study applies the AdaBoost and bagging ensemble techniques, using the J48 (C4.5) decision tree as a base learner, alongside the standalone J48 decision tree, to classify patients with diabetes mellitus on the basis of diabetes risk factors. The classification is performed across three ordinal adult groups in the Canadian Primary Care Sentinel Surveillance Network. Experimental results show that the overall performance of the AdaBoost ensemble method is better than that of bagging as well as the standalone J48 decision tree. (C) 2016 Published by Elsevier B.V.
Pages: 115-121
Page count: 7