MKC-SMOTE: A Novel Synthetic Oversampling Method for Multi-Class Imbalanced Data Classification

Authors
Wang, Jiao [1 ,2 ]
Awang, Norhashidah [1 ]
Affiliations
[1] Univ Sains Malaysia, Sch Math Sci, George Town 11800, Malaysia
[2] Puer Univ, Sch Math & Stat, Puer 665000, Peoples R China
Source
IEEE ACCESS | 2024 / Vol. 12
Keywords
Multi-class imbalanced dataset; classification; SMOTE algorithm; synthetic minority; oversampling; DATA-SETS;
DOI
10.1109/ACCESS.2024.3521120
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The learning of multi-class imbalance problems presents greater challenges and has fewer research results compared to binary imbalance problems. Resampling techniques are widely employed to address data imbalance problems. However, the majority of existing resampling methods are designed specifically for binary imbalance datasets and demonstrate significant limitations when applied to multi-class imbalance datasets. Therefore, this study introduces the MKC-SMOTE algorithm, a novel and effective method specifically tailored for multi-class imbalanced datasets. During the pre-processing phase, the algorithm takes into account the distribution of all classes and employs the k-nearest neighbors (kNN) algorithm to identify appropriate original samples for synthesizing minority class samples. It then utilizes an enhanced SMOTE algorithm for interpolation. In the post-processing phase, potentially misleading synthesized samples are eliminated by the undersampling technique. Consequently, the MKC-SMOTE algorithm generates high-quality minority class samples by strategically exploring the distributional regions of the classes. Extensive experiments were conducted on 21 real-world datasets, comparing the MKC-SMOTE algorithm with six imbalance problem handling methods and two classifiers. The results demonstrate that the MKC-SMOTE algorithm significantly enhances the classification performance of multi-class imbalanced datasets and outperforms several popular and state-of-the-art oversampling methods.
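The three stages named in the abstract (kNN-based seed selection over all classes, SMOTE-style interpolation, and removal of misleading synthetic samples) can be illustrated with a minimal sketch. This is not the authors' MKC-SMOTE implementation: the abstract does not give the selection rule, the "enhanced" interpolation, or the undersampling criterion, so the majority-vote seed rule, the plain linear interpolation, and the nearest-original-sample filter below are all assumptions chosen for illustration. The names `nearest` and `oversample_class` are hypothetical.

```python
import math
import random


def nearest(points, q, k, skip=None):
    """Indices of the k points closest to q (Euclidean), optionally skipping one index."""
    order = sorted(
        (i for i in range(len(points)) if i != skip),
        key=lambda i: math.dist(points[i], q),
    )
    return order[:k]


def oversample_class(X, y, target, n_new, k=3, seed=0):
    """Return up to n_new synthetic points for class `target` (illustrative only)."""
    rng = random.Random(seed)
    members = [i for i, label in enumerate(y) if label == target]

    # Stage 1 (assumed rule): keep a sample as a seed only if at least half of
    # its k nearest neighbours, taken over ALL classes, share its label.
    seeds = []
    for i in members:
        neighbours = nearest(X, X[i], k, skip=i)
        same = sum(1 for j in neighbours if y[j] == target)
        if 2 * same >= len(neighbours):
            seeds.append(i)
    if len(seeds) < 2:
        return []  # not enough safe seeds to interpolate between

    # Stage 2: plain SMOTE interpolation between a seed and one of its
    # nearest fellow seeds (the paper's enhanced variant is not reproduced).
    seed_points = [X[i] for i in seeds]
    synthetic = []
    for _ in range(n_new):
        a = rng.randrange(len(seed_points))
        b = rng.choice(nearest(seed_points, seed_points[a], k, skip=a))
        u = rng.random()
        xa, xb = seed_points[a], seed_points[b]
        synthetic.append([p + u * (q - p) for p, q in zip(xa, xb)])

    # Stage 3 (assumed rule): discard synthetic points whose nearest original
    # sample belongs to a different class.
    return [p for p in synthetic if y[nearest(X, p, 1)[0]] == target]
```

For a multi-class dataset, one would call `oversample_class` once per minority class with its own `n_new`. Because stage 1 looks at neighbours from every class, a minority sample buried inside another class's region is never used as an interpolation endpoint.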
Pages: 196929-196938
Page count: 10