Considering diversity and accuracy simultaneously for ensemble pruning

Cited by: 65
Authors
Dai, Qun [1 ]
Ye, Rui [1 ]
Liu, Zhuan [1 ]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Coll Comp Sci & Technol, Nanjing 211106, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Ensemble selection; Greedy ensemble pruning (GEP) algorithm; Simultaneous diversity & accuracy (SDAcc); Diversity-focused-two (DFTwo); Accuracy-reinforcement (AccRein); SELECTION; CLASSIFIERS; ALGORITHM; MACHINE; ERROR;
DOI
10.1016/j.asoc.2017.04.058
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Diversity among individual classifiers is widely recognized as a key factor in successful ensemble selection, while the ultimate goal of ensemble pruning is to improve its predictive accuracy. Diversity and accuracy are two important properties of an ensemble. Existing ensemble pruning methods always consider diversity and accuracy separately. However, the two are closely interrelated and should be considered simultaneously. Accordingly, three new measures, i.e., Simultaneous Diversity & Accuracy, Diversity-Focused-Two and Accuracy-Reinforcement, are developed for pruning the ensemble with a greedy algorithm. The motivation for Simultaneous Diversity & Accuracy is to consider the difference between the subensemble and the candidate classifier and, at the same time, the accuracy of both of them. With Simultaneous Diversity & Accuracy, difficult samples are not given up, which further improves the generalization performance of the ensemble. The design of Diversity-Focused-Two stems from the recognition that ensemble diversity attaches more importance to the differences among the classifiers in an ensemble. Finally, Accuracy-Reinforcement reinforces the concern for ensemble accuracy. Extensive experiments verify the effectiveness and efficiency of the three proposed pruning measures. This work finds that, by considering diversity and accuracy simultaneously during ensemble pruning, a well-performing selective ensemble with superior generalization capability can be acquired, which is the scientific contribution of this paper. (C) 2017 Elsevier B.V. All rights reserved.
Pages: 75-91
Page count: 17
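
To make the pruning idea described in the abstract concrete, below is a minimal Python sketch of greedy ensemble pruning with a combined diversity/accuracy selection criterion. The scoring rule (a diversity term rewarding candidates that are correct where the current subensemble struggles, plus an averaged accuracy term over candidate and subensemble) is an illustrative stand-in for the paper's SDAcc, DFTwo and AccRein measures, not the authors' exact formulas; the correctness-matrix input and the function name greedy_prune are likewise assumptions made for this sketch.

import numpy as np

def greedy_prune(correct, k):
    """Greedily grow a subensemble of size k from a pool of classifiers.

    correct: (n_classifiers, n_samples) 0/1 matrix; correct[i, j] == 1
             iff classifier i predicts validation sample j correctly.
    Returns the indices of the selected classifiers.
    """
    n = correct.shape[0]
    # Seed with the single most accurate classifier in the pool.
    selected = [int(np.argmax(correct.mean(axis=1)))]
    while len(selected) < k:
        # Per-sample fraction of the current subensemble that is correct.
        sub = correct[selected].mean(axis=0)
        best_i, best_score = -1, -np.inf
        for i in range(n):
            if i in selected:
                continue
            cand = correct[i]
            # Diversity: reward candidates that succeed where the
            # subensemble struggles, so difficult samples are not given up.
            diversity = float(np.mean(cand * (1.0 - sub)))
            # Accuracy of both the candidate and the subensemble,
            # considered at the same time rather than separately.
            accuracy = 0.5 * (float(cand.mean()) + float(sub.mean()))
            score = diversity + accuracy
            if score > best_score:
                best_i, best_score = i, score
        selected.append(best_i)
    return selected

# Example: prune a synthetic pool of 10 classifiers down to 3 on 100 samples.
rng = np.random.default_rng(0)
pool = (rng.random((10, 100)) < 0.7).astype(int)  # synthetic correctness flags
print(greedy_prune(pool, 3))

In the paper itself, selection is driven by the dedicated SDAcc/DFTwo/AccRein measures rather than this simple additive score; the sketch only mirrors the overall greedy loop the abstract describes.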