Towards an optimally pruned classifier ensemble

Cited by: 8
Authors
Bhardwaj, Manju [1 ]
Bhatnagar, Vasudha [2 ]
Affiliations
[1] Univ Delhi, Maitreyi Coll, Delhi 110007, India
[2] Univ Delhi, Dept Comp Sci, Delhi 110007, India
Keywords
Optimal ensemble; Ensemble pruning; Brute force search; Hill climbing; Diversity; Prediction; Selection
DOI
10.1007/s13042-014-0303-8
CLC Classification
TP18 [Artificial Intelligence Theory]
Subject Classification
081104; 0812; 0835; 1405
Abstract
Ensemble pruning is an important area of research in multiple classifier systems. Reducing ensemble size by selecting diverse and accurate classifiers from a given pool is a popular strategy for improving ensemble performance. In this paper, we present the Accu-Prune (AP) algorithm, a majority-voting ensemble method that uses accuracy ordering and a reduced-error-pruning approach to identify an optimal ensemble from a given pool of classifiers. At each step, the ensemble is extended by adding two lower-accuracy classifiers, implicitly adding diversity to the ensemble. The proposed approach closely mimics the results of a Brute Force (BF) search for the optimal ensemble while reducing the search space drastically. We propose that the quality of an ensemble is determined by two factors: size and accuracy. Ideally, smaller ensembles are preferable, quality-wise, to larger ensembles of the same accuracy. Based on this notion, we design a deficit function that quantifies the quality differential between two arbitrary ensembles by examining their difference in performance and size. Experiments were carried out on 25 UCI datasets, and the AP algorithm was compared with BF search and other pruning algorithms. The deficit function is used to compare AP with BF search and with a well-known pruning algorithm, EPIC. The relevant statistical tests reveal that the generalization capability of the AP algorithm is better than forward search and backward elimination, comparable to BF search, and slightly inferior to EPIC. Since EPIC ensembles are significantly larger, the quality differential between AP and EPIC ensembles is not significant. Thus, for limited-memory applications with tolerance for a small amount of error, AP ensembles may be more appropriate.
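The search strategy described in the abstract — order the pool by individual accuracy, then grow the candidate ensemble two classifiers at a time while tracking the best majority-voting ensemble seen so far — can be sketched as follows. This is a minimal illustration based only on the abstract's description, not the authors' exact algorithm: classifiers are represented by their precomputed predictions on a pruning set, and the function names (`accu_prune`, `majority_vote`) and the tie-breaking rule (prefer the smaller ensemble on equal accuracy) are assumptions.

```python
# Hedged sketch of an Accu-Prune-style search: classifiers are represented
# by their prediction vectors on a held-out pruning set. Names and the
# tie-breaking rule are illustrative assumptions, not the paper's spec.
from collections import Counter

def accuracy(preds, labels):
    """Fraction of predictions matching the true labels."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

def majority_vote(members, n):
    """Plurality vote of the member prediction vectors at each position."""
    return [Counter(m[i] for m in members).most_common(1)[0][0]
            for i in range(n)]

def accu_prune(pool_preds, labels):
    # Accuracy ordering: sort the pool by individual accuracy, best first.
    ordered = sorted(pool_preds, key=lambda p: accuracy(p, labels),
                     reverse=True)
    # Start from the single most accurate classifier.
    best_members = ordered[:1]
    best_acc = accuracy(majority_vote(best_members, len(labels)), labels)
    # Extend by two lower-accuracy classifiers at a time (ensemble stays
    # odd-sized, so majority voting avoids two-class ties).
    k = 1
    while k + 2 <= len(ordered):
        k += 2
        cand = ordered[:k]
        acc = accuracy(majority_vote(cand, len(labels)), labels)
        # Keep the more accurate ensemble; on ties, keep the smaller one
        # (assumed rule, consistent with the size/accuracy quality notion).
        if acc > best_acc:
            best_acc, best_members = acc, cand
    return best_members, best_acc
```

Evaluating only the nested, accuracy-ordered prefixes of odd size means the search visits O(n) candidate ensembles rather than the 2^n subsets a brute-force search would examine.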
Pages: 699–718 (20 pages)