Using boosting to prune bagging ensembles

Cited by: 91
Authors
Martinez-Munoz, Gonzalo [1]
Suarez, Alberto [1]
Affiliations
[1] Univ Autonoma Madrid, Escuela Politecn Super, E-28049 Madrid, Spain
Keywords
machine learning; decision trees; bagging; boosting; ensembles; ensemble pruning
DOI
10.1016/j.patrec.2006.06.018
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Boosting is used to determine the order in which classifiers are aggregated in a bagging ensemble. Early stopping in the aggregation of the classifiers in the ordered bagging ensemble allows the identification of subensembles that require less memory for storage, classify faster and can improve the generalization accuracy of the original bagging ensemble. In all the classification problems investigated, pruned ensembles with 20% of the original classifiers show statistically significant improvements over bagging. In problems where boosting is superior to bagging, these improvements are not sufficient to reach the accuracy of the corresponding boosting ensembles. However, ensemble pruning preserves the performance of bagging in noisy classification tasks, where boosting often has larger generalization errors. Therefore, pruned bagging should generally be preferred to complete bagging and, if no information about the level of noise is available, it is a robust alternative to AdaBoost. (c) 2006 Elsevier B.V. All rights reserved.
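The ordering-and-early-stopping idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's exact procedure: the function names `boosting_order` and `prune`, the AdaBoost.M1-style weight update, and the rule of resetting to uniform weights when the best remaining classifier has weighted error of 0 or at least 0.5 are all assumptions made for illustration. The input is assumed to be a table of training-set predictions from an already trained bagging ensemble.

```python
def boosting_order(predictions, labels):
    """Return classifier indices in a boosting-guided aggregation order.

    predictions[t][i] is the label that bagged classifier t assigns to
    training example i; labels[i] is the true label (assumed input layout).
    """
    n = len(labels)
    uniform = [1.0 / n] * n
    weights = list(uniform)
    remaining = list(range(len(predictions)))
    order = []

    def weighted_error(t):
        # Total weight of the training examples that classifier t gets wrong.
        return sum(w for w, p, y in zip(weights, predictions[t], labels) if p != y)

    while remaining:
        best = min(remaining, key=weighted_error)
        err = weighted_error(best)
        if err >= 0.5:
            # Assumption: no remaining classifier beats chance on the current
            # weights, so restart from uniform weights and select again.
            weights = list(uniform)
            best = min(remaining, key=weighted_error)
            err = weighted_error(best)
        order.append(best)
        remaining.remove(best)
        if 0.0 < err < 0.5:
            # AdaBoost.M1-style update: shrink the weight of correctly
            # classified examples, then renormalize.
            beta = err / (1.0 - err)
            weights = [w * beta if p == y else w
                       for w, p, y in zip(weights, predictions[best], labels)]
            total = sum(weights)
            weights = [w / total for w in weights]
        else:
            # Assumption: a degenerate error (0 or >= 0.5) resets the weights.
            weights = list(uniform)
    return order


def prune(order, fraction=0.2):
    """Early stopping: keep the first `fraction` of the ordered classifiers."""
    keep = max(1, int(round(fraction * len(order))))
    return order[:keep]
```

Calling `prune(boosting_order(predictions, labels))` yields the indices of a subensemble with roughly 20% of the original classifiers, the size reported in the abstract; the selected classifiers would then be combined by the same voting scheme as the full bagging ensemble. Because each selection emphasizes examples the previously chosen classifiers misclassify, near-duplicate classifiers tend to be pushed toward the end of the order.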
Pages: 156-165
Page count: 10