Evolutionary under-sampling based bagging ensemble method for imbalanced data classification

Cited by: 52
Authors
Sun, Bo [1, 2]
Chen, Haiyan [1, 2]
Wang, Jiandong [1 ]
Xie, Hua [2 ]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Coll Comp Sci & Technol, Nanjing 210016, Jiangsu, Peoples R China
[2] Nanjing Univ Aeronaut & Astronaut, Natl Key Lab ATFM, Nanjing 211106, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
class imbalanced problem; under-sampling; bagging; evolutionary under-sampling; ensemble learning; machine learning; data mining; SUPPORT VECTOR MACHINES; DATA-SETS; SMOTE; CLASSIFIERS; STRATEGIES;
DOI
10.1007/s11704-016-5306-z
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
In class-imbalanced learning, traditional machine learning algorithms that optimize overall accuracy tend to achieve poor classification performance, especially for the minority class, which is the class in which we are most interested. Many effective approaches have been proposed to address this problem. Among them, bagging ensemble methods that integrate under-sampling techniques have demonstrated better performance than several alternatives, including bagging ensemble methods integrated with over-sampling techniques and cost-sensitive methods. Although these under-sampling techniques promote diversity among the generated base classifiers through random partitioning or sampling of the majority class, they take no measures to ensure the classification performance of each individual classifier, which limits the achievable ensemble performance. Evolutionary under-sampling (EUS), a novel under-sampling technique, has been successfully applied to search for the best majority class subset for training a well-performing nearest neighbor classifier. Inspired by EUS, in this paper we introduce it into the under-sampling bagging framework and propose an EUS-based bagging ensemble method (EUS-Bag), designing a new fitness function that considers three factors to make EUS better suited to the framework. With this fitness function, EUS-Bag can generate a set of accurate and diverse base classifiers. To verify the effectiveness of EUS-Bag, we conduct a series of comparison experiments on 22 two-class imbalanced classification problems. Experimental results measured using recall, geometric mean, and AUC all demonstrate its superior performance.
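The under-sampling bagging framework that the abstract builds on can be illustrated with a minimal sketch. This is not the authors' EUS-Bag: the abstract does not specify the fitness function or the evolutionary search, so the sketch below uses plain random under-sampling of the majority class instead. The function names, the 0/1 label convention (1 = minority), the 1-nearest-neighbor base classifier, and the majority-vote rule are all assumptions made for illustration. It also shows how recall and the geometric mean, two of the measures named in the abstract, are commonly computed.

```python
import numpy as np


def one_nn_predict(X_train, y_train, X_test):
    """Predict labels with a 1-nearest-neighbor rule (squared Euclidean distance)."""
    dists = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    return y_train[dists.argmin(axis=1)]


def under_sampling_bagging(X, y, X_test, n_estimators=11, seed=0):
    """Random under-sampling bagging (illustrative stand-in, not EUS-Bag).

    Assumes labels are 0 (majority) and 1 (minority) and that the
    minority class is no larger than the majority class.
    """
    rng = np.random.default_rng(seed)
    minority = np.flatnonzero(y == 1)
    majority = np.flatnonzero(y == 0)
    votes = np.zeros(len(X_test))
    for _ in range(n_estimators):
        # Each base classifier sees all minority samples plus an
        # equally sized random subset of the majority class.
        subset = rng.choice(majority, size=len(minority), replace=False)
        idx = np.concatenate([minority, subset])
        votes += one_nn_predict(X[idx], y[idx], X_test)
    # Majority vote over the base classifiers.
    return (2 * votes > n_estimators).astype(int)


def recall_and_gmean(y_true, y_pred):
    """Minority-class recall and the geometric mean of recall and specificity.

    Assumes both classes occur in y_true, so neither denominator is zero.
    """
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return recall, np.sqrt(recall * specificity)
```

Because every base classifier is trained on a balanced subset, the minority class is not swamped. An evolutionary variant would replace the `rng.choice` step with a search over majority subsets guided by a fitness function.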
Pages: 331-350
Page count: 20
Related papers
50 in total
  • [21] A binary PSO-based ensemble under-sampling model for rebalancing imbalanced training data
    Li, Jinyan
    Wu, Yaoyang
    Fong, Simon
    Tallon-Ballesteros, Antonio J.
    Yang, Xin-she
    Mohammed, Sabah
    Wu, Feng
    JOURNAL OF SUPERCOMPUTING, 2022, 78 (05): 7428 - 7463
  • [23] KA-Ensemble: towards imbalanced image classification ensembling under-sampling and over-sampling
    Ding, Hao
    Wei, Bin
    Gu, Zhaorui
    Yu, Zhibin
    Zheng, Haiyong
    Zheng, Bing
    Li, Juan
    MULTIMEDIA TOOLS AND APPLICATIONS, 2020, 79 (21-22) : 14871 - 14888
  • [25] CUSBoost: Cluster-based Under-sampling with Boosting for Imbalanced Classification
    Rayhan, Farshid
    Ahmed, Sajid
    Mahbub, Asif
    Jani, Md. Rafsan
    Shatabda, Swakkhar
    Farid, Dewan Md.
    2017 2ND INTERNATIONAL CONFERENCE ON COMPUTATIONAL SYSTEMS AND INFORMATION TECHNOLOGY FOR SUSTAINABLE SOLUTION (CSITSS-2017), 2017: 70 - 75
  • [26] Bagging of Xgboost Classifiers with Random Under-sampling and Tomek Link for Noisy Label-imbalanced Data
    Luo Ruisen
    Dian Songyi
    Wang Chen
    Cheng Peng
    Tang Zuodong
    Yu YanMei
    Wang Shixiong
    3RD INTERNATIONAL CONFERENCE ON AUTOMATION, CONTROL AND ROBOTICS ENGINEERING (CACRE 2018), 2018, 428
  • [27] An Under-sampling Method Based on Fuzzy Logic for Large Imbalanced Dataset
    Wong, Ginny Y.
    Leung, Frank H. F.
    Ling, Sai-Ho
    2014 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ-IEEE), 2014: 1248 - 1252
  • [28] Cluster-based under-sampling approaches for imbalanced data distributions
    Yen, Show-Jane
    Lee, Yue-Shi
    EXPERT SYSTEMS WITH APPLICATIONS, 2009, 36 (03) : 5718 - 5727
  • [29] A Noise-Filtered Under-Sampling Scheme for Imbalanced Classification
    Kang, Qi
    Chen, XiaoShuang
    Li, Sisi
    Zhou, MengChu
    IEEE TRANSACTIONS ON CYBERNETICS, 2017, 47 (12) : 4263 - 4274
  • [30] A Selective Under-Sampling (SUS) Method For Imbalanced Regression
    Aleksic, Jovana
    Garcia-Remesal, Miguel
    JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2025, 82 : 111 - 136