Multi-class WHMBoost: An ensemble algorithm for multi-class imbalanced data

Cited by: 2
Authors
Zhao, Jiakun [1]
Jin, Ju [1]
Zhang, Yibo [1]
Zhang, Ruifeng [1]
Chen, Si [1]
Affiliations
[1] Xi An Jiao Tong Univ, Sch Software Engn, Xian, Shaanxi, Peoples R China
Keywords
multi-class; imbalanced data; ensemble method; random balance based on average size; CLASSIFICATION;
DOI
10.3233/IDA-215874
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Imbalanced data is widespread in real-world applications, and ignoring the imbalance when training machine learning models degrades their performance. Many methods have been proposed to handle imbalanced data, but most of them focus on two-class classification tasks; learning from multi-class imbalanced data sets remains an open problem. In this paper, we propose multi-class WHMBoost, an ensemble method for classifying multi-class imbalanced data sets that extends our earlier WHMBoost. Instead of the data processing algorithm used in WHMBoost, it uses random balance based on average size to balance the data distribution. The weak classifiers in the boosting algorithm are a support vector machine and a decision tree classifier, which participate in training with given weights so that their strengths complement each other. On 18 multi-class imbalanced data sets, we compared multi-class WHMBoost with state-of-the-art ensemble algorithms using MAUC, MG-mean and MMCC as evaluation criteria. The results show that it has clear advantages over these algorithms and can deal effectively with multi-class imbalanced data sets.
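As a rough illustration of two ideas named in the abstract, the sketch below resamples every class to the average class size and computes MG-mean, taken here as the geometric mean of per-class recall. It is a simplified stand-in, not the authors' random balance procedure, whose details are not given in this record. The helper names `balance_to_average_size` and `mg_mean`, the plain random under- and oversampling, and the toy data are all assumptions made for illustration.

```python
import math
import random
from collections import Counter, defaultdict


def balance_to_average_size(X, y, seed=0):
    """Resample each class to the mean class size (hypothetical helper).

    Assumption: classes larger than the mean are randomly undersampled
    without replacement, and smaller classes are randomly oversampled
    with replacement. The paper's actual procedure may differ.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for features, label in zip(X, y):
        by_class[label].append(features)
    target = round(len(y) / len(by_class))  # average class size
    X_bal, y_bal = [], []
    for label, rows in by_class.items():
        if len(rows) >= target:
            chosen = rng.sample(rows, target)
        else:
            chosen = rows + rng.choices(rows, k=target - len(rows))
        X_bal.extend(chosen)
        y_bal.extend([label] * target)
    return X_bal, y_bal


def mg_mean(y_true, y_pred):
    """Geometric mean of per-class recall over the classes in y_true."""
    support = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    recalls = [hits[c] / support[c] for c in support]
    return math.prod(recalls) ** (1.0 / len(recalls))


if __name__ == "__main__":
    # Toy data: class sizes 8, 3 and 1, so the average size is 4.
    X = [[i] for i in range(12)]
    y = [0] * 8 + [1] * 3 + [2] * 1
    X_bal, y_bal = balance_to_average_size(X, y)
    print(Counter(y_bal))
    print(mg_mean([0, 0, 1, 1, 2, 2], [0, 0, 1, 0, 2, 2]))
```

Because MG-mean multiplies the per-class recalls, a single class with zero recall drives the score to zero, which is why it is a common criterion when minority classes matter.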
Pages: 599-614
Page count: 16