A survey of multi-class imbalanced data classification methods

被引:4
|
作者
Han, Meng [1 ]
Li, Ang [1 ]
Gao, Zhihui [1 ]
Mu, Dongliang [1 ]
Liu, Shujuan [1 ]
机构
[1] North Minzu Univ, Sch Comp Sci & Engn, Yinchuan, Ningxia, Peoples R China
关键词
Classification; multi-class imbalance data; data preprocessing method; algorithm-level classification method; EXTREME LEARNING-MACHINE; SELECTION; ALGORITHM; CNN;
D O I
10.3233/JIFS-221902
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
In reality, the data generated in many fields are often imbalanced, such as fraud detection, network intrusion detection and disease diagnosis. The class with fewer instances in the data is called the minority class, and the minority class in some applications contains the significant information. So far, many classification methods and strategies for binary imbalanced data have been proposed, but there are still many problems and challenges in multi-class imbalanced data that need to be solved urgently. The classification methods for multi-class imbalanced data are analyzed and summarized in terms of data preprocessing methods and algorithm-level classification methods, and the performance of the algorithms using the same dataset is compared separately. In the data preprocessing methods, the methods of oversampling, under-sampling, hybrid sampling and feature selection are mainly introduced. Algorithm-level classification methods are comprehensively introduced in four aspects: ensemble learning, neural network, support vector machine and multi-class decomposition technique. At the same time, all data preprocessing methods and algorithm-level classification methods are analyzed in detail in terms of the techniques used, comparison algorithms, pros and cons, respectively. Moreover, the evaluation metrics commonly used for multi-class imbalanced data classification methods are described comprehensively. Finally, the future directions of multi-class imbalanced data classification are given.
引用
收藏
页码:2471 / 2501
页数:31
相关论文
共 50 条
  • [1] Survey on Highly Imbalanced Multi-class Data
    Hamid, Hakim Abdul
    Yusoff, Marina
    Mohamed, Azlinah
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2022, 13 (06) : 211 - 229
  • [2] Multi-class imbalanced big data classification on Spark
    Sleeman, William C.
    Krawczyk, Bartosz
    KNOWLEDGE-BASED SYSTEMS, 2021, 212
  • [3] A Novel Double Ensemble Algorithm for the Classification of Multi-Class Imbalanced Hyperspectral Data
    Quan, Daying
    Feng, Wei
    Dauphin, Gabriel
    Wang, Xiaofeng
    Huang, Wenjiang
    Xing, Mengdao
    REMOTE SENSING, 2022, 14 (15)
  • [4] Multi-class WHMBoost: An ensemble algorithm for multi-class imbalanced data
    Zhao, Jiakun
    Jin, Ju
    Zhang, Yibo
    Zhang, Ruifeng
    Chen, Si
    INTELLIGENT DATA ANALYSIS, 2022, 26 (03) : 599 - 614
  • [5] An adaptive multi-class imbalanced classification framework based on ensemble methods and deep network
    Jiang, Xuezheng
    Wang, Junyi
    Meng, Qinggang
    Saada, Mohamad
    Cai, Haibin
    NEURAL COMPUTING & APPLICATIONS, 2023, 35 (15) : 11141 - 11159
  • [6] Parameter-free classification in multi-class imbalanced data sets
    Cerf, Loic
    Gay, Dominique
    Selmaoui-Folcher, Nazha
    Cremilleux, Bruno
    Boulicaut, Jean-Francois
    DATA & KNOWLEDGE ENGINEERING, 2013, 87 : 109 - 129
  • [7] An Effective Ensemble Method for Multi-class Classification and Regression for Imbalanced Data
    Alam, Tahira
    Ahmed, Chowdhury Farhan
    Zahin, Sabit Anwar
    Khan, Muhammad Asif Hossain
    Islam, Maliha Tashfia
    ADVANCES IN DATA MINING: APPLICATIONS AND THEORETICAL ASPECTS (ICDM 2018), 2018, 10933 : 59 - 74
  • [8] An Effective Recursive Technique for Multi-Class Classification and Regression for Imbalanced Data
    Alam, Tahira
    Ahmed, Chowdhury Farhan
    Zahin, Sabit Anwar
    Khan, Muhammad Asif Hossain
    Islam, Maliha Tashfia
    IEEE ACCESS, 2019, 7 : 127615 - 127630
  • [9] GDHS: An efficient hybrid sampling method for multi-class imbalanced data classification
    Yan, Yuanting
    Lv, Yan
    Han, Shuangyue
    Yu, Chengjin
    Zhou, Peng
    Neurocomputing, 2025, 637
  • [10] Evolutionary Mahalanobis Distance-Based Oversampling for Multi-Class Imbalanced Data Classification
    Yao, Leehter
    Lin, Tung-Bin
    SENSORS, 2021, 21 (19)