A Novel Classifier Ensemble Method Based on Subspace Enhancement for High-Dimensional Data Classification

Cited by: 23
Authors
Xu, Yuhong [1 ]
Yu, Zhiwen [1 ]
Cao, Wenming [1 ]
Chen, C. L. Philip [1 ]
Affiliation
[1] South China Univ Technol, Sch Comp Sci & Engn, Guangzhou 510006, Guangdong, Peoples R China
Keywords
Ensemble learning; feature transformation; subspace enhancement; high-dimensional data; classification; GENE-EXPRESSION; FEATURE-SELECTION; FACE RECOGNITION; VARIABLE SELECTION; ROTATION FOREST; NEURAL-NETWORKS; REGRESSION; LDA; ILLUMINATION; DIVERSITY
DOI
10.1109/TKDE.2021.3087517
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
High-dimensional data with few samples seriously degrades the performance of classifiers. By combining classifiers, ensemble learning obtains higher accuracy and more robust predictions. However, existing classifier ensemble methods suffer from several limitations: 1) ensembles built on the sample space are affected by noisy and redundant features; 2) constructing sample subspaces from small-size data describes the sample space insufficiently; 3) ensembles built on random feature subspaces lose information, which degrades the performance of the classifiers; 4) most ensemble methods operate directly on the original feature space, which is ill-suited to high-dimensional data with redundant and noisy features. To overcome these limitations, a new classifier ensemble method based on subspace enhancement (CESE) is proposed for high-dimensional data classification. First, a superior subspace enhancement scheme (SSE) is designed to perform feature selection and transformation on high-dimensional data and then generate multiple superior feature subspaces that are both diverse and discriminative, which enhances the representational ability of the features. Second, a mixed space enhancement process (MSE) is developed based on multiscale rotation reconstruction and the various subspace-enhanced features produced by SSE. MSE fuses these features effectively to obtain more diverse features. Furthermore, to improve the capability of the method, various feature combination strategies are designed for the enhanced features from both SSE and MSE. Comparative results on 33 high-dimensional data sets indicate that CESE outperforms a range of mainstream ensemble methods.
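For orientation, the kind of baseline the abstract criticizes in limitation 3, an ensemble over random feature subspaces, can be sketched as follows. This is a minimal illustration of the generic random subspace idea and not the CESE, SSE, or MSE procedure, whose details are not given in this record. The nearest-centroid base classifier, all function names, the parameter values, and the toy data are assumptions chosen for brevity.

```python
import numpy as np

def fit_centroids(X, y):
    """Nearest-centroid base classifier (an assumed stand-in): one mean vector per class."""
    classes = np.unique(y)
    centroids = np.stack([X[y == c].mean(axis=0) for c in classes])
    return classes, centroids

def predict_centroids(model, X):
    """Assign each sample to the class whose centroid is closest (squared Euclidean)."""
    classes, centroids = model
    dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[dists.argmin(axis=1)]

def fit_random_subspace_ensemble(X, y, n_members=15, subspace_size=20, seed=0):
    """Train each member on a random subset of the features (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    members = []
    for _ in range(n_members):
        idx = rng.choice(X.shape[1], size=subspace_size, replace=False)
        members.append((idx, fit_centroids(X[:, idx], y)))
    return members

def predict_ensemble(members, X):
    """Combine member predictions by majority vote."""
    votes = np.stack([predict_centroids(model, X[:, idx]) for idx, model in members])
    preds = []
    for sample_votes in votes.T:  # one column of votes per sample
        values, counts = np.unique(sample_votes, return_counts=True)
        preds.append(values[counts.argmax()])
    return np.array(preds)

# Toy data (assumed): 40 samples, 200 features, two deliberately well-separated classes.
rng = np.random.default_rng(42)
y = np.repeat([0, 1], 20)
X = rng.normal(size=(40, 200))
X[y == 1] += 5.0

members = fit_random_subspace_ensemble(X, y)
pred = predict_ensemble(members, X)
print("training accuracy:", (pred == y).mean())
```

Each member here sees only 20 of the 200 features, so whatever the remaining 180 features carry is invisible to that member. On real high-dimensional data with few informative features, this is the information loss the abstract refers to.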
Pages: 16-30
Page count: 15