Evolutionary under-sampling based bagging ensemble method for imbalanced data classification

Cited by: 52
Authors
Sun, Bo [1 ,2 ]
Chen, Haiyan [1 ,2 ]
Wang, Jiandong [1 ]
Xie, Hua [2 ]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Coll Comp Sci & Technol, Nanjing 210016, Jiangsu, Peoples R China
[2] Nanjing Univ Aeronaut & Astronaut, Natl Key Lab ATFM, Nanjing 211106, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
class imbalanced problem; under-sampling; bagging; evolutionary under-sampling; ensemble learning; machine learning; data mining; SUPPORT VECTOR MACHINES; DATA-SETS; SMOTE; CLASSIFIERS; STRATEGIES;
DOI
10.1007/s11704-016-5306-z
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
In class-imbalanced learning, traditional machine learning algorithms that optimize overall accuracy tend to achieve poor classification performance, especially for the minority class, which is the class we are most interested in. Many effective approaches have been proposed to solve this problem. Among them, bagging ensemble methods that integrate under-sampling techniques have demonstrated better performance than several alternatives, including bagging ensemble methods integrated with over-sampling techniques and cost-sensitive methods. Although these under-sampling techniques promote diversity among the generated base classifiers through random partitioning or sampling of the majority class, they take no measures to ensure the classification performance of each individual base classifier, which limits the achievable ensemble performance. On the other hand, evolutionary under-sampling (EUS), a novel under-sampling technique, has been successfully applied to search for the best majority class subset for training a well-performing nearest neighbor classifier. Inspired by EUS, in this paper we introduce it into the under-sampling bagging framework and propose an EUS-based bagging ensemble method (EUS-Bag), designing a new fitness function that considers three factors to make EUS better suited to the framework. With this fitness function, EUS-Bag can generate a set of accurate and diverse base classifiers. To verify the effectiveness of EUS-Bag, we conduct a series of comparison experiments on 22 two-class imbalanced classification problems. Experimental results measured using recall, geometric mean, and AUC all demonstrate its superior performance.
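For orientation, the baseline that the abstract builds on (bagging over randomly under-sampled majority subsets) can be sketched as follows. This is a minimal illustration under stated assumptions, not the EUS-Bag method itself: the abstract does not specify the evolutionary search or the three-factor fitness function, so random sampling stands in for subset selection. All function names, the 1-nearest-neighbor base classifier, the tie-breaking rule, and the toy data are hypothetical choices made for this example. The last function computes recall and geometric mean, two of the metrics the abstract mentions, using the common definition of G-mean as the square root of recall times specificity.

```python
import math
import random


def balanced_bags(majority, minority, n_bags, rng):
    """Build n_bags training sets: all minority points (label 1) plus an
    equal-sized random subset of the majority points (label 0)."""
    bags = []
    for _ in range(n_bags):
        subset = rng.sample(majority, len(minority))
        bags.append([(x, 0) for x in subset] + [(x, 1) for x in minority])
    return bags


def predict_1nn(bag, x):
    """Base classifier: return the label of the nearest point in one bag."""
    nearest = min(bag, key=lambda item: math.dist(item[0], x))
    return nearest[1]


def predict_ensemble(bags, x):
    """Majority vote over the base classifiers.
    Assumption: ties are resolved in favor of the minority class (1)."""
    votes = sum(predict_1nn(bag, x) for bag in bags)
    return 1 if 2 * votes >= len(bags) else 0


def recall_and_gmean(y_true, y_pred):
    """Minority-class recall and G-mean = sqrt(recall * specificity).
    Assumes y_true contains at least one sample of each class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    pos = sum(y_true)
    neg = len(y_true) - pos
    recall = tp / pos
    specificity = tn / neg
    return recall, math.sqrt(recall * specificity)


# Toy, well-separated 2-D data: 40 majority points near the origin,
# 5 minority points near (5, 5).
rng = random.Random(0)
majority = [((i % 5) * 0.1, (i // 5) * 0.1) for i in range(40)]
minority = [(5.0 + 0.1 * i, 5.0) for i in range(5)]
bags = balanced_bags(majority, minority, n_bags=7, rng=rng)

print(predict_ensemble(bags, (0.2, 0.2)))
print(predict_ensemble(bags, (5.0, 5.0)))
print(recall_and_gmean([1, 1, 0, 0, 0, 0], [1, 0, 0, 0, 0, 1]))
```

In a scheme like the one the abstract describes, the call to `rng.sample` is the step that an evolutionary search would replace, choosing each majority subset by a fitness score rather than at random.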
Pages: 331-350
Page count: 20
Related papers
50 in total
  • [31] Framework for the Classification of Imbalanced Structured Data Using Under-sampling and Convolutional Neural Network
    Yoon Sang Lee
    Chulhwan Chris Bang
    Information Systems Frontiers, 2022, 24 : 1795 - 1809
  • [32] A multi-manifold learning based instance weighting and under-sampling for imbalanced data classification problems
    Tayyebe Feizi
    Mohammad Hossein Moattar
    Hamid Tabatabaee
    Journal of Big Data, 10
  • [33] A novel two-phase clustering-based under-sampling method for imbalanced classification problems
    Farshidvard, A.
    Hooshmand, F.
    MirHassani, S. A.
    EXPERT SYSTEMS WITH APPLICATIONS, 2023, 213
  • [34] A New Hybrid Under-sampling Approach to Imbalanced Classification Problems
    Peng, Chun-Yang
    Park, You-Jin
    APPLIED ARTIFICIAL INTELLIGENCE, 2022, 36 (01)
  • [35] A multi-manifold learning based instance weighting and under-sampling for imbalanced data classification problems
    Feizi, Tayyebe
    Moattar, Mohammad Hossein
    Tabatabaee, Hamid
    JOURNAL OF BIG DATA, 2023, 10 (01)
  • [36] Framework for the Classification of Imbalanced Structured Data Using Under-sampling and Convolutional Neural Network
    Lee, Yoon Sang
    Bang, Chulhwan Chris
    INFORMATION SYSTEMS FRONTIERS, 2022, 24 (06) : 1795 - 1809
  • [37] A Cluster-Based Under-Sampling Algorithm for Class-Imbalanced Data
    Guzman-Ponce, A.
    Valdovinos, R. M.
    Sanchez, J. S.
    HYBRID ARTIFICIAL INTELLIGENT SYSTEMS, HAIS 2020, 2020, 12344 : 299 - 311
  • [38] Automatic incident detection algorithm based on under-sampling for imbalanced traffic data
    Li, Miao-hua
    Chen, Shu-yan
    Lao, Ye-chun
    GREEN BUILDING, ENVIRONMENT, ENERGY AND CIVIL ENGINEERING, 2017, : 145 - 150
  • [39] Imbalanced Data Classification Method Based on Ensemble Learning
    Xiang, Yu
    Xie, Yongping
    COMMUNICATIONS, SIGNAL PROCESSING, AND SYSTEMS, CSPS 2018, VOL III: SYSTEMS, 2020, 517 : 18 - 24
  • [40] An Evolutionary Sampling Approach for Classification with Imbalanced Data
    Fernandes, Everlandio R. Q.
    de Carvalho, Andre C. P. L. F.
    Coelho, Andre L. V.
    2015 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2015,