A survey of multi-class imbalanced data classification methods

Cited by: 4
Authors
Han, Meng [1]
Li, Ang [1]
Gao, Zhihui [1]
Mu, Dongliang [1]
Liu, Shujuan [1]
Affiliation
[1] North Minzu Univ, Sch Comp Sci & Engn, Yinchuan, Ningxia, Peoples R China
Keywords
Classification; multi-class imbalance data; data preprocessing method; algorithm-level classification method; extreme learning machine; selection; algorithm; CNN
DOI
10.3233/JIFS-221902
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In practice, data generated in many fields, such as fraud detection, network intrusion detection, and disease diagnosis, are often imbalanced. The class with fewer instances is called the minority class, and in some applications it carries the important information. Many classification methods and strategies have been proposed for binary imbalanced data, but multi-class imbalanced data still poses many problems and challenges that urgently need to be solved. This survey analyzes and summarizes classification methods for multi-class imbalanced data in two groups, data preprocessing methods and algorithm-level classification methods, and compares the performance of the algorithms in each group separately on the same dataset. For data preprocessing, it mainly covers oversampling, under-sampling, hybrid sampling, and feature selection. Algorithm-level classification methods are covered in four areas: ensemble learning, neural networks, support vector machines, and multi-class decomposition techniques. Every method in both groups is analyzed in detail in terms of the techniques used, the algorithms it is compared against, and its pros and cons. The evaluation metrics commonly used for multi-class imbalanced data classification are also described comprehensively. Finally, future directions for multi-class imbalanced data classification are given.
Pages: 2471-2501
Number of pages: 31