MKC-SMOTE: A Novel Synthetic Oversampling Method for Multi-Class Imbalanced Data Classification

Authors
Wang, Jiao [1 ,2 ]
Awang, Norhashidah [1 ]
Affiliations
[1] Univ Sains Malaysia, Sch Math Sci, George Town 11800, Malaysia
[2] Puer Univ, Sch Math & Stat, Puer 665000, Peoples R China
Source
IEEE ACCESS | 2024 / Vol. 12
Keywords
Multi-class imbalanced dataset; classification; SMOTE algorithm; synthetic minority; oversampling; DATA-SETS;
DOI
10.1109/ACCESS.2024.3521120
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
The learning of multi-class imbalance problems presents greater challenges and has fewer research results compared to binary imbalance problems. Resampling techniques are widely employed to address data imbalance problems. However, the majority of existing resampling methods are designed specifically for binary imbalance datasets and demonstrate significant limitations when applied to multi-class imbalance datasets. Therefore, this study introduces the MKC-SMOTE algorithm, a novel and effective method specifically tailored for multi-class imbalanced datasets. During the pre-processing phase, the algorithm takes into account the distribution of all classes and employs the k-nearest neighbors (kNN) algorithm to identify appropriate original samples for synthesizing minority class samples. It then utilizes an enhanced SMOTE algorithm for interpolation. In the post-processing phase, potentially misleading synthesized samples are eliminated by the undersampling technique. Consequently, the MKC-SMOTE algorithm generates high-quality minority class samples by strategically exploring the distributional regions of the classes. Extensive experiments were conducted on 21 real-world datasets, comparing the MKC-SMOTE algorithm with six imbalance problem handling methods and two classifiers. The results demonstrate that the MKC-SMOTE algorithm significantly enhances the classification performance of multi-class imbalanced datasets and outperforms several popular and state-of-the-art oversampling methods.
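The three stages named in the abstract (kNN-based seed selection, SMOTE-style interpolation, and post-hoc removal of misleading synthetic samples) can be illustrated with a minimal sketch. This is not the authors' MKC-SMOTE: the record gives no algorithmic details, so the code below uses plain SMOTE interpolation, and the function names, the 0.5 neighbourhood-purity threshold, and the default k are all assumptions made for illustration only.

```python
import numpy as np


def nearest(X, Q, k, exclude=None):
    """Indices of the k rows of X closest (Euclidean) to each row of Q.

    `exclude[i]` optionally names one row of X to ignore for query i,
    which is used to stop a point from being its own neighbour.
    """
    d = np.linalg.norm(Q[:, None, :] - X[None, :, :], axis=2)
    if exclude is not None:
        d[np.arange(len(Q)), exclude] = np.inf
    return np.argsort(d, axis=1)[:, :k]


def oversample_class(X, y, target, n_new, k=5, seed=0):
    """Illustrative three-stage oversampling for one minority class.

    Hypothetical helper, not the published algorithm.
    """
    rng = np.random.default_rng(seed)
    idx = np.flatnonzero(y == target)

    # Stage 1 (assumed rule): keep minority points whose k nearest
    # neighbours, taken over ALL classes, are mostly the same class.
    nn = nearest(X, X[idx], k, exclude=idx)
    purity = (y[nn] == target).mean(axis=1)
    seeds = idx[purity >= 0.5]
    if len(seeds) < 2:  # fall back to every minority point
        seeds = idx
    if len(seeds) < 2:
        raise ValueError("need at least two samples of the target class")
    S = X[seeds]

    # Stage 2: standard SMOTE interpolation between a seed and one of
    # its nearest same-class seeds: x_new = x + u * (x_nn - x), u in [0, 1).
    ks = min(k, len(S) - 1)
    nn_s = nearest(S, S, ks, exclude=np.arange(len(S)))
    base = rng.integers(0, len(S), size=n_new)
    mate = nn_s[base, rng.integers(0, ks, size=n_new)]
    gap = rng.random((n_new, 1))
    synth = S[base] + gap * (S[mate] - S[base])

    # Stage 3 (assumed rule): drop synthetic points whose nearest
    # ORIGINAL neighbours are mostly from other classes.
    nn_o = nearest(X, synth, k)
    keep = (y[nn_o] == target).mean(axis=1) >= 0.5
    return synth[keep]
```

In a multi-class setting such a routine would be called once per minority class, with `n_new` chosen to close the gap to the largest class. The number of rows returned may be smaller than `n_new` because of the final filtering step.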
Pages: 196929-196938
Number of pages: 10