Classifier Ensemble Based on Multiview Optimization for High-Dimensional Imbalanced Data Classification

Cited by: 15
Authors
Xu, Yuhong [1]
Yu, Zhiwen [1]
Chen, C. L. Philip [1]
Affiliations
[1] South China Univ Technol, Sch Comp Sci & Engn, Guangzhou 510006, Peoples R China
Keywords
Feature extraction; Optimization; Costs; Learning systems; Diversity reception; Data mining; Convolutional neural networks; Class imbalanced data; classification; ensemble learning; high-dimensional data; subview optimization; DATA-SETS; FEATURE-SELECTION; SMOTE; PERFORMANCE; PREDICTION; DIVERSITY; MACHINE; IMPROVE;
DOI
10.1109/TNNLS.2022.3177695
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
High-dimensional class-imbalanced data seriously degrade the performance of classification algorithms. Because of the large number of redundant or invalid features and the class imbalance, it is difficult to construct an optimal classifier for such data. Classifier ensembles have attracted intensive attention because they can achieve better performance than an individual classifier. In this work, we propose multiview optimization (MVO) to learn more effective and robust features from high-dimensional imbalanced data, based on which an accurate and robust ensemble system is designed. Specifically, an optimized subview generation (OSG) step in MVO is first proposed to generate multiple optimized subviews from different scenarios, which strengthens the classification ability of the features and increases the diversity of ensemble members simultaneously. Second, a new evaluation criterion that considers the distribution of data in each optimized subview is developed, based on which a selective ensemble of optimized subviews (SEOS) is designed to perform the subview selective ensemble. Finally, an oversampling approach is executed on the optimized view to obtain a new class-rebalanced subset for the classifier. Experimental results on 25 high-dimensional class-imbalanced datasets indicate that the proposed method outperforms other mainstream classifier ensemble methods.
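The three stages named in the abstract (generate subviews, select a subset of them, rebalance before training) can be illustrated with a rough sketch. This is not the authors' MVO, OSG, or SEOS procedure, whose details are not given in this record. It assumes random feature subsets in place of optimized subviews, a hypothetical centroid-separation score in place of the paper's evaluation criterion, plain random oversampling, and a k-nearest-neighbour base classifier. All function names are invented for illustration.

```python
import random
from collections import Counter

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def project(X, view):
    """Keep only the feature columns listed in `view`."""
    return [[row[j] for j in view] for row in X]

def make_subviews(n_features, n_views, view_size, rng):
    """Assumption: random feature subsets stand in for optimized subviews."""
    return [sorted(rng.sample(range(n_features), view_size))
            for _ in range(n_views)]

def centroids(X, y):
    """Per-class mean vector."""
    sums, counts = {}, Counter(y)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, v in enumerate(row):
            acc[j] += v
    return {c: [s / counts[c] for s in acc] for c, acc in sums.items()}

def view_score(X, y):
    """Hypothetical criterion (binary labels only): distance between the
    two class centroids divided by the mean within-class scatter."""
    cents = centroids(X, y)
    a, b = sorted(cents)[:2]
    within = sum(sq_dist(r, cents[l]) for r, l in zip(X, y)) / len(X)
    return sq_dist(cents[a], cents[b]) / (within + 1e-12)

def random_oversample(X, y, rng):
    """Duplicate random samples of smaller classes until all classes match
    the largest one. A simple stand-in for the paper's oversampling step."""
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for c, n in counts.items():
        pool = [r for r, l in zip(X, y) if l == c]
        for _ in range(target - n):
            X_out.append(rng.choice(pool))
            y_out.append(c)
    return X_out, y_out

def fit_ensemble(X, y, n_views=30, view_size=2, keep=3, seed=0):
    """Score every subview, keep the best `keep`, rebalance each one."""
    rng = random.Random(seed)
    views = make_subviews(len(X[0]), n_views, view_size, rng)
    ranked = sorted(views, key=lambda v: view_score(project(X, v), y),
                    reverse=True)
    members = []
    for view in ranked[:keep]:
        Xb, yb = random_oversample(project(X, view), y, rng)
        members.append((view, Xb, yb))
    return members

def knn_label(X, y, query, k=3):
    """Majority label among the k nearest training points."""
    nearest = sorted(zip(X, y), key=lambda p: sq_dist(p[0], query))[:k]
    return Counter(l for _, l in nearest).most_common(1)[0][0]

def predict(members, row):
    """Majority vote over the per-subview k-NN predictions."""
    votes = Counter(knn_label(Xb, yb, [row[j] for j in view])
                    for view, Xb, yb in members)
    return votes.most_common(1)[0][0]
```

Calling `fit_ensemble(X, y)` on a list of feature rows and binary labels returns the selected subviews with their rebalanced training sets, and `predict(members, row)` classifies a new row. Swapping in a different scoring rule or oversampler only requires replacing `view_score` or `random_oversample`.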
Pages: 870-883
Number of pages: 14
Related papers
50 items in total
  • [21] An Efficient Extraction-based Bagging Ensemble for High-dimensional data classification
    Huang, Hsiao-Yun
    Li, Yen-Chieh
    6TH INTERNATIONAL CONFERENCE ON SOFT COMPUTING AND INTELLIGENT SYSTEMS, AND THE 13TH INTERNATIONAL SYMPOSIUM ON ADVANCED INTELLIGENT SYSTEMS, 2012, : 1557 - 1560
  • [22] Research of Medical High-dimensional Imbalanced Data Classification-Ensemble Feature Selection Algorithm with Random Forest
    Zhu, Min
    Su, Bo
    Ning, Gangmin
    2017 INTERNATIONAL CONFERENCE ON SMART GRID AND ELECTRICAL AUTOMATION (ICSGEA), 2017, : 273 - 277
  • [23] A novel ensemble method for high-dimensional genomic data classification
    Espichan, Alexandra
    Villanueva, Edwin
    PROCEEDINGS 2018 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM), 2018, : 2229 - 2236
  • [24] Ensemble of penalized logistic models for classification of high-dimensional data
    Ijaz, Musarrat
    Asghar, Zahid
    Gul, Asma
    COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION, 2021, 50 (07) : 2072 - 2088
  • [25] Classification of High-Dimensional Data with Ensemble of Logistic Regression Models
    Lim, Noha
    Ahn, Hongshik
    Moon, Hojin
    Chen, James J.
    JOURNAL OF BIOPHARMACEUTICAL STATISTICS, 2010, 20 (01) : 160 - 171
  • [26] Multinomial naive Bayesian classifier with generalized Dirichlet priors for high-dimensional imbalanced data
    Wong, Tzu-Tsung
    Tsai, Hsing-Chen
    KNOWLEDGE-BASED SYSTEMS, 2021, 228
  • [27] New hard-thresholding rules based on data splitting in high-dimensional imbalanced classification
    Mojiri, Arezou
    Khalili, Abbas
    Hamadani, Ali Zeinal
    ELECTRONIC JOURNAL OF STATISTICS, 2022, 16 (01) : 814 - 861
  • [28] High-dimensional imbalanced biomedical data classification based on P-AdaBoost-PAUC algorithm
    Li, Xiao
    Li, Kewen
    JOURNAL OF SUPERCOMPUTING, 2022, 78 (14) : 16581 - 16604
  • [29] Research On Classification Method Of High-Dimensional Class-Imbalanced Data Sets Based On SVM
    Zhang, Chunkai
    Guo, Jianwei
    Lu, Junru
    2017 IEEE SECOND INTERNATIONAL CONFERENCE ON DATA SCIENCE IN CYBERSPACE (DSC), 2017, : 60 - 67
  • [30] High-dimensional imbalanced biomedical data classification based on P-AdaBoost-PAUC algorithm
    Xiao Li
    Kewen Li
    The Journal of Supercomputing, 2022, 78 : 16581 - 16604