A survey on imbalanced learning: latest research, applications and future directions

Cited by: 101
Authors
Chen, Wuxing [1, 2]
Yang, Kaixiang [3 ]
Yu, Zhiwen [3 ]
Shi, Yifan [4 ]
Chen, C. L. Philip [3 ]
Affiliations
[1] South China Univ Technol, Sch Future Technol, Guangzhou 511442, Guangdong, Peoples R China
[2] Peng Cheng Lab, Shenzhen 518066, Guangdong, Peoples R China
[3] South China Univ Technol, Sch Comp Sci & Engn, Guangzhou 510006, Guangdong, Peoples R China
[4] Huaqiao Univ, Coll Engn, Quanzhou 362021, Fujian, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Imbalanced learning; Ensemble learning; Multiclass imbalanced learning; Machine learning; Imbalance regression; Long-tailed learning; DATA CLASSIFICATION; ENSEMBLE; SMOTE; DATASETS; REGRESSION; ALGORITHM; SELECTION;
DOI
10.1007/s10462-024-10759-6
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Imbalanced learning is one of the most formidable challenges in data mining and machine learning. Despite continuous research progress over the past decades, learning from data with an imbalanced class distribution remains a compelling research area. Imbalanced class distributions commonly limit the practical utility of machine learning models, and even deep learning models, in real applications. Numerous recent studies have made substantial progress in imbalanced learning, deepening our understanding of its nature while uncovering new challenges. Given the field's rapid evolution, this paper summarizes recent breakthroughs in imbalanced learning through an in-depth review of existing strategies for addressing the problem. Unlike most surveys, which primarily address classification tasks in machine learning, we also examine techniques for regression tasks and aspects of deep long-tail learning. Furthermore, we explore real-world applications of imbalanced learning, covering a broad spectrum of research applications from management science to engineering. Finally, we discuss newly emerging issues and challenges in imbalanced learning that require further exploration.
Pages: 51