Tree-based classifier ensembles for early detection method of diabetes: an exploratory study

Times cited: 33
Authors
Tama, Bayu Adhi [1 ,2 ]
Rhee, Kyung-Hyune [1 ]
Affiliations
[1] Pukyong Natl Univ, IT Convergence & Applicat Engn, 48513 Daeyon Campus,45 Yongso Ro, Busan, South Korea
[2] Univ Sriwijaya, Fac Comp Sci, Jln Raya Palembang Prabumulih Km,32 Ogan Ilir, Sumatera Selatan, Indonesia
Funding
National Research Foundation, Singapore;
Keywords
Diabetes mellitus; Classifier ensembles; Benchmark; Early detection method; MULTIPLE COMPARISONS; TESTS;
DOI
10.1007/s10462-017-9565-3
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Diabetes is a lifestyle-driven disease that has become a critical health issue worldwide. In this paper, we conduct an exploratory study of early detection of diabetes mellitus using various ensemble learning techniques. Eight tree-based machine learning algorithms, i.e., classification and regression tree, decision tree (C4.5), reduced error pruning tree, random tree, naive Bayes tree, functional tree, best-first decision tree, and logistic model tree, are employed as base classifiers in five different ensembles, i.e., bagging, boosting, random subspace, DECORATE, and rotation forest. The performance of the ensembles and base classifiers is thoroughly benchmarked on three real-world datasets in terms of the area under the receiver operating characteristic curve (AUC). Finally, we assess the performance differences among the classifiers using several statistical significance tests. We contribute to the existing literature an extensive benchmark of tree-based classifier ensembles for early detection of diabetes.
Pages: 355-370
Number of pages: 16