Rotation forest: A new classifier ensemble method

Cited by: 1403
Authors
Rodriguez, Juan J.
Kuncheva, Ludmila I.
Affiliations
[1] Univ Burgos, Escuela Politecn Super, Burgos 09006, Spain
[2] Univ Wales, Sch Informat, Bangor LL57 1UT, Gwynedd, Wales
[3] Univ Valladolid, Dept Informat, Escuela Tecn Super Ingn Informat, E-47011 Valladolid, Spain
Keywords
classifier ensembles; AdaBoost; bagging; random forest; feature extraction; PCA; kappa-error diagrams
DOI
10.1109/TPAMI.2006.211
CLC Number
TP18 [Theory of Artificial Intelligence]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
We propose a method for generating classifier ensembles based on feature extraction. To create the training data for a base classifier, the feature set is randomly split into K subsets (K is a parameter of the algorithm) and Principal Component Analysis (PCA) is applied to each subset. All principal components are retained in order to preserve the variability information in the data. Thus, K axis rotations take place to form the new features for a base classifier. The idea of the rotation approach is to encourage individual accuracy and diversity within the ensemble simultaneously. Diversity is promoted through the feature extraction for each base classifier. Decision trees were chosen here because they are sensitive to rotation of the feature axes, hence the name "forest." Accuracy is sought by keeping all principal components and also by using the whole data set to train each base classifier. Using WEKA, we examined the Rotation Forest ensemble on a random selection of 33 benchmark data sets from the UCI repository and compared it with Bagging, AdaBoost, and Random Forest. The results were favorable to Rotation Forest and prompted an investigation into the diversity-accuracy landscape of the ensemble models. Diversity-error diagrams revealed that Rotation Forest ensembles construct individual classifiers which are more accurate than those in AdaBoost and Random Forest, and more diverse than those in Bagging, sometimes more accurate as well.
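To make the construction concrete, below is a minimal Python sketch of the procedure the abstract describes: a random disjoint split of the features into K subsets, PCA on each subset with all components retained, assembly of a block rotation matrix, and one decision tree trained on the full rotated training set. It assumes NumPy, scikit-learn, and integer-coded class labels; the names RotationForestSketch and n_feature_subsets are illustrative, and the sketch omits refinements of the published algorithm, so it should be read as an outline rather than the authors' implementation.

# Minimal sketch of the Rotation Forest idea from the abstract (assumed
# names; not the authors' reference implementation).
import numpy as np
from sklearn.decomposition import PCA
from sklearn.tree import DecisionTreeClassifier

class RotationForestSketch:
    def __init__(self, n_trees=10, n_feature_subsets=3, random_state=0):
        self.n_trees = n_trees
        self.K = n_feature_subsets  # K, the algorithm's parameter
        self.rng = np.random.RandomState(random_state)
        self.rotations_ = []
        self.trees_ = []

    def _rotation_matrix(self, X):
        # Random disjoint feature subsets -> per-subset PCA -> block rotation.
        n_features = X.shape[1]
        subsets = np.array_split(self.rng.permutation(n_features), self.K)
        R = np.zeros((n_features, n_features))
        for subset in subsets:
            # Keep ALL principal components to preserve the variability
            # information (assumes n_samples >= len(subset)).
            pca = PCA(n_components=len(subset)).fit(X[:, subset])
            R[np.ix_(subset, subset)] = pca.components_.T
        return R

    def fit(self, X, y):
        for _ in range(self.n_trees):
            R = self._rotation_matrix(X)
            # The whole (rotated) data set trains each base classifier.
            tree = DecisionTreeClassifier(
                random_state=self.rng.randint(2**31 - 1)).fit(X @ R, y)
            self.rotations_.append(R)
            self.trees_.append(tree)
        return self

    def predict(self, X):
        # Plain majority vote over the rotated trees; assumes integer labels.
        votes = np.stack([t.predict(X @ R)
                          for R, t in zip(self.rotations_, self.trees_)])
        return np.apply_along_axis(
            lambda v: np.bincount(v.astype(int)).argmax(), 0, votes)

The only tunable pieces here are K and the number of trees; the published algorithm adds further randomization around each PCA step, which this outline deliberately skips.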
Pages: 1619-1630
Number of pages: 12
References
47 entries in total
[1] Allwein, E.L.; Schapire, R.E.; Singer, Y. Reducing multiclass to binary: A unifying approach for margin classifiers [J]. JOURNAL OF MACHINE LEARNING RESEARCH, 2001, 1(2):113-141
[2] [Anonymous], MACHINE LEARNING
[3] Banfield, R.E., 2004, P 5 INT WORKSH MULT
[4] Bauer, E.; Kohavi, R. An empirical comparison of voting classification algorithms: Bagging, boosting, and variants [J]. MACHINE LEARNING, 1999, 36(1-2):105-139
[5] Blake, C.L., 1998, UCI repository of machine learning databases
[6] Breiman, L. Random forests [J]. MACHINE LEARNING, 2001, 45(1):5-32
[8] Breiman, L., 1998, ANN STAT, V26, P801
[9] Dietterich, T.G. Ensemble methods in machine learning [J]. MULTIPLE CLASSIFIER SYSTEMS, 2000, 1857:1-15
[10] Fern, C.E., 2003, P 20 INT C MACH LEARN, P186