Adaptive Ensemble Undersampling-Boost: A novel learning framework for imbalanced data

Cited by: 47
Authors
Lu, Wei [1]
Li, Zhe [1]
Chu, Jinghui [1]
Affiliations
[1] Tianjin Univ, Sch Elect Informat Engn, Tianjin 300072, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Classification; Imbalanced data sets; Real Adaboost; Voting algorithm; Adaptive decision boundary; Ensemble Undersampling; CLASSIFICATION; ALGORITHMS; SUPPORT;
DOI
10.1016/j.jss.2017.07.006
Chinese Library Classification (CLC) number
TP31 [Computer software];
Discipline classification code
081202 ; 0835 ;
Abstract
As one of the most challenging and attractive problems in the pattern recognition and machine intelligence field, imbalanced classification has received a large amount of research attention for many years. In binary classification tasks, one class usually tends to be underrepresented when it consists of far fewer patterns than the other class, which results in undesirable classification results, especially for the minority class. Several techniques, including resampling, boosting and cost-sensitive methods, have been proposed to alleviate this problem. Recently, some ensemble methods that focus on combining individual techniques to obtain better performance have been observed to present better classification performance on the minority class. In this paper, we propose a novel ensemble framework called Adaptive Ensemble Undersampling-Boost for imbalanced learning. Our proposal combines the Ensemble of Undersampling (EUS) technique, Real Adaboost, cost-sensitive weight modification, and an adaptive boundary decision strategy to build a hybrid algorithm. The superiority of our method over other state-of-the-art ensemble methods is demonstrated by experiments on 18 real-world data sets with various data distributions and different imbalance ratios. Given the experimental results and further analysis, our proposal is proven to be a promising alternative that can be applied to various imbalanced classification domains. © 2017 Elsevier Inc. All rights reserved.
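The abstract names the ingredients of the framework but not how they fit together. The following is a minimal illustrative sketch, not the authors' algorithm: it covers only two of the ingredients, an ensemble of undersampled subsets and a decision threshold tuned after training. It deliberately omits Real Adaboost and the cost-sensitive weight modification. Each base learner is a simple one-feature stump with real-valued outputs (a stand-in chosen for brevity), and the threshold is picked by maximizing the G-mean on the training data, which is an assumption about what an "adaptive" boundary could look like. All function names and the toy data are hypothetical.

```python
import math
import random


def undersample(X, y, rng):
    """Keep every minority sample (label 1) plus an equally sized random majority sample (label 0)."""
    minority = [i for i, label in enumerate(y) if label == 1]
    majority = [i for i, label in enumerate(y) if label == 0]
    chosen = minority + rng.sample(majority, len(minority))
    return [X[i] for i in chosen], [y[i] for i in chosen]


def fit_stump(X, y):
    """Fit a one-feature stump; each side outputs its minority-class fraction as a real-valued score."""
    best = None
    for t in sorted(set(X)):
        above = [label for x, label in zip(X, y) if x >= t]
        below = [label for x, label in zip(X, y) if x < t]
        # Errors made if each side predicted its own most frequent label.
        errors = sum(min(sum(side), len(side) - sum(side)) for side in (above, below))
        if best is None or errors < best[0]:
            p_above = sum(above) / len(above)  # never empty: t is one of the samples
            p_below = sum(below) / len(below) if below else 0.5
            best = (errors, t, p_above, p_below)
    return best[1:]


def ensemble_score(stumps, x):
    """Average the real-valued outputs of all stumps for one sample."""
    return sum(pa if x >= t else pb for t, pa, pb in stumps) / len(stumps)


def g_mean(y_true, y_pred):
    """Geometric mean of the true-positive rate and the true-negative rate."""
    tp = sum(1 for a, b in zip(y_true, y_pred) if a == 1 and b == 1)
    tn = sum(1 for a, b in zip(y_true, y_pred) if a == 0 and b == 0)
    pos = sum(y_true)
    neg = len(y_true) - pos
    return math.sqrt((tp / pos) * (tn / neg))


def fit(X, y, n_subsets=5, seed=0):
    """Train one stump per balanced subset, then tune the decision threshold."""
    rng = random.Random(seed)
    stumps = [fit_stump(*undersample(X, y, rng)) for _ in range(n_subsets)]
    scores = [ensemble_score(stumps, x) for x in X]
    # Assumed stand-in for an adaptive boundary: the cut-off with the best training G-mean.
    threshold = max(sorted(set(scores)),
                    key=lambda c: g_mean(y, [int(s >= c) for s in scores]))
    return stumps, threshold


def predict(model, x):
    """Label a sample as minority (1) if its ensemble score reaches the tuned threshold."""
    stumps, threshold = model
    return int(ensemble_score(stumps, x) >= threshold)


# Hypothetical toy data: 100 majority points in [0, 10) and 5 minority points at 12..16.
X = [i / 10 for i in range(100)] + [12.0 + i for i in range(5)]
y = [0] * 100 + [1] * 5
model = fit(X, y)
```

Calling `predict(model, x)` on a new value returns 1 for the minority class and 0 for the majority class. Because every subset is balanced, no single learner is dominated by the majority class, and the threshold step then shifts the final boundary instead of fixing it at 0.5.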
Pages: 272-282
Page count: 11
Related papers
50 in total
[21]   Using Graph-Based Ensemble Learning to Classify Imbalanced Data [J].
Qin, Anyong ;
Shang, Zhaowei ;
Tian, Jinyu ;
Zhang, Taiping ;
Wang, Yulong ;
Tang, Yuan Yan .
2017 3RD IEEE INTERNATIONAL CONFERENCE ON CYBERNETICS (CYBCONF), 2017, :265-270
[22]   A synthetic neighborhood generation based ensemble learning for the imbalanced data classification [J].
Zhi Chen ;
Tao Lin ;
Xin Xia ;
Hongyan Xu ;
Sha Ding .
Applied Intelligence, 2018, 48 :2441-2457
[23]   A Heterogeneous AdaBoost Ensemble Based Extreme Learning Machines for Imbalanced Data [J].
Abuassba, Adnan Omer ;
Zhang, Dezheng ;
Luo, Xiong .
INTERNATIONAL JOURNAL OF COGNITIVE INFORMATICS AND NATURAL INTELLIGENCE, 2019, 13 (03) :19-35
[24]   Imbalanced Learning of Fault Data Combined with Cloud Model and Ensemble Classification [J].
Ma S. ;
Zhao R. ;
Wu Y. .
Zhendong Ceshi Yu Zhenduan/Journal of Vibration, Measurement and Diagnosis, 2023, 43 (06) :1114-1120 and 1243
[25]   Ensemble learning based predictive modelling on a highly imbalanced multiclass data [J].
Vasti, Manka ;
Dev, Amita .
JOURNAL OF INFORMATION & OPTIMIZATION SCIENCES, 2024, 45 (08) :2141-2164
[26]   A Genetic-Based Ensemble Learning Applied to Imbalanced Data Classification [J].
Klikowski, Jakub ;
Ksieniewicz, Pawel ;
Wozniak, Michal .
INTELLIGENT DATA ENGINEERING AND AUTOMATED LEARNING (IDEAL 2019), PT II, 2019, 11872 :340-352
[27]   A Robust Enhanced Ensemble Learning Method for Breast Cancer Data Diagnosis on Imbalanced Data [J].
Wang, Zhenzhen ;
Xie, Junde ;
Zhang, Jia .
IEEE ACCESS, 2024, 12 :189776-189788
[28]   Tree-based space partition and merging ensemble learning framework for imbalanced problems [J].
Zhu, Zonghai ;
Wang, Zhe ;
Li, Dongdong ;
Du, Wenli .
INFORMATION SCIENCES, 2019, 503 :1-22
[29]   A Framework of Online Learning with Imbalanced Streaming Data [J].
Yan, Yan ;
Yang, Tianbao ;
Yang, Yi ;
Chen, Jianhui .
THIRTY-FIRST AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, :2817-2823
[30]   An Ensemble Learning Algorithm Based on Density Peaks Clustering and Fitness for Imbalanced Data [J].
Xu, Hui ;
Liu, Qicheng .
IEEE ACCESS, 2022, 10 :116120-116128