Leveraging ensemble pruning for imbalanced data classification

Cited by: 4
Authors
Krawczyk, Bartosz [1]
Wozniak, Michal [2]
Affiliations
[1] Virginia Commonwealth Univ, Dept Comp Sci, Med Coll Virginia Campus, Richmond, VA 23284 USA
[2] Wroclaw Univ Sci & Technol, Dept Syst & Comp Networks, Wroclaw, Poland
Source
2018 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC) | 2018
Keywords
machine learning; imbalanced data; ensemble learning; ensemble pruning; CLASSIFIERS; PERFORMANCE; DIVERSITY;
DOI
10.1109/SMC.2018.00084
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The effectiveness of machine learning algorithms depends on the quality of the supplied training data. Problems embedded in the nature of the data can result in incorrect classification models; in particular, imbalanced data distribution is among the most significant learning difficulties that can affect classifiers. When one of the classes has many more instances than the other, the learning process becomes biased towards it. Therefore, methods for alleviating the impact of skewed distributions are highly sought after. Ensemble learning has emerged as one of the leading paradigms for imbalanced data. Creating an efficient pool of classifiers is not a trivial task, and one needs to carefully select which classifiers should be combined to obtain the best predictive power. In this paper, we propose a compound ensemble pruning algorithm for imbalanced data. It aims to retain classifiers that offer the best performance on both minority and majority classes and display a high level of diversity. The remaining learners are discarded from the pool. This is achieved by means of a multi-criteria evolutionary algorithm. An extensive experimental study shows that our proposal is able to create smaller ensembles than the state-of-the-art methods, while offering improved robustness to imbalanced class distributions.
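The abstract names three pruning criteria (minority-class performance, majority-class performance, and diversity) but not how they are computed. The following is a minimal sketch of scoring sub-ensembles on three such criteria and keeping the non-dominated ones. It is not the authors' algorithm: the majority-vote combiner, the pairwise-disagreement diversity measure, the toy data, and all function names are assumptions made for illustration, and exhaustive enumeration stands in for the multi-criteria evolutionary search.

```python
from itertools import combinations


def majority_vote(predictions, subset):
    """Combine the 0/1 predictions of the selected classifiers.

    Ties are resolved in favour of class 0 (assumed to be the majority class).
    """
    n_samples = len(predictions[0])
    votes = [sum(predictions[i][j] for i in subset) for j in range(n_samples)]
    return [1 if 2 * v > len(subset) else 0 for v in votes]


def criteria(predictions, labels, subset):
    """Return (minority recall, majority recall, mean pairwise disagreement)."""
    voted = majority_vote(predictions, subset)
    minority = [j for j, y in enumerate(labels) if y == 1]
    majority = [j for j, y in enumerate(labels) if y == 0]
    tpr = sum(voted[j] == 1 for j in minority) / len(minority)
    tnr = sum(voted[j] == 0 for j in majority) / len(majority)
    pairs = list(combinations(subset, 2))
    if not pairs:
        return (tpr, tnr, 0.0)
    # Disagreement of a pair: fraction of samples on which the two differ.
    diversity = sum(
        sum(a != b for a, b in zip(predictions[p], predictions[q])) / len(labels)
        for p, q in pairs
    ) / len(pairs)
    return (tpr, tnr, diversity)


def dominates(a, b):
    """True if a is at least as good as b everywhere and better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(predictions, labels, min_size=2):
    """Score every sub-ensemble and keep the non-dominated ones.

    Exhaustive search is only feasible for tiny pools; it replaces the
    evolutionary search here so that the example stays deterministic.
    """
    pool = range(len(predictions))
    scored = {
        subset: criteria(predictions, labels, subset)
        for size in range(min_size, len(predictions) + 1)
        for subset in combinations(pool, size)
    }
    return {
        s: c for s, c in scored.items()
        if not any(dominates(other, c) for other in scored.values())
    }


# Hypothetical toy data: class 1 is the minority (2 of 8 samples).
labels = [1, 1, 0, 0, 0, 0, 0, 0]
predictions = [
    [1, 0, 0, 0, 0, 0, 0, 0],  # classifier 0
    [0, 1, 0, 0, 0, 0, 0, 0],  # classifier 1
    [1, 1, 1, 0, 0, 0, 0, 0],  # classifier 2
    [0, 0, 0, 0, 0, 0, 0, 0],  # classifier 3: always predicts the majority
]
front = pareto_front(predictions, labels)
for subset, scores in sorted(front.items()):
    print(subset, scores)
```

On this toy pool, the sub-ensemble of classifiers 0, 1, and 2 classifies every sample correctly while leaving out the classifier that always predicts the majority class, which illustrates how pruning can shrink a pool without losing minority-class recall.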
Pages: 439-444
Page count: 6