Perturbation-based oversampling technique for imbalanced classification problems

Cited: 1
Authors
Zhang, Jianjun [1 ,2 ]
Wang, Ting [3 ]
Ng, Wing W. Y. [1 ,2 ]
Pedrycz, Witold [4 ]
Affiliations
[1] South China Univ Technol, Sch Comp Sci & Engn, Guangdong Prov Key Lab Computat Intelligence & Cy, Guangzhou 510006, Peoples R China
[2] Brain & Affect Cognit Res Ctr, Pazhou Lab, Guangzhou 510335, Peoples R China
[3] South China Univ Technol, Guangzhou Peoples Hosp 1, Sch Med, Dept Radiol, Guangzhou, Guangdong, Peoples R China
[4] Univ Alberta, Dept Elect & Comp Engn, Edmonton, AB, Canada
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation
Keywords
Class imbalance; Learning; Oversampling; Perturbation; PERFORMANCE; CHALLENGES; ALGORITHMS; SMOTE;
DOI
10.1007/s13042-022-01662-z
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
We present a simple yet effective idea, perturbation-based oversampling (POS), for tackling imbalanced classification problems. The method perturbs each feature of a given minority instance to generate a new instance. The originality and advantage of POS lie in a hyperparameter p that controls the variance of the perturbation, providing the flexibility to adapt the algorithm to data with different characteristics. Experimental results obtained with five types of classifiers and 11 performance metrics on 103 imbalanced datasets show that POS offers results comparable to or better than those of 11 reference methods across multiple performance metrics. An important finding of this work is that a simple perturbation-based oversampling method can yield better classification results than many advanced oversampling methods by controlling the variance of the input perturbation. This suggests that comparisons with simple oversampling methods, e.g., POS, should be conducted when designing new oversampling approaches.
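The core idea in the abstract (perturb each feature of a minority instance, with the perturbation variance controlled by a hyperparameter p) can be sketched as follows. This is a minimal illustrative sketch, not the authors' exact algorithm: it assumes Gaussian noise scaled by p times the per-feature standard deviation of the minority class, and the function name `perturbation_oversample` is hypothetical.

```python
import numpy as np

def perturbation_oversample(X_min, n_new, p, seed=None):
    """Illustrative sketch of perturbation-based oversampling.

    X_min : (n, d) array of minority-class instances
    n_new : number of synthetic instances to generate
    p     : hyperparameter controlling the perturbation scale
            (assumed here to multiply the per-feature std of X_min)
    """
    rng = np.random.default_rng(seed)
    # Per-feature spread of the minority class (assumption: noise
    # scale is proportional to this spread).
    std = X_min.std(axis=0)
    # Pick base minority instances to perturb, with replacement.
    idx = rng.integers(0, len(X_min), size=n_new)
    # Add zero-mean Gaussian noise whose scale is controlled by p;
    # p = 0 reproduces exact copies (plain random oversampling).
    noise = rng.normal(0.0, 1.0, size=(n_new, X_min.shape[1])) * (p * std)
    return X_min[idx] + noise
```

With p = 0 the sketch degenerates to random oversampling with replacement, while larger p spreads the synthetic instances further around the original minority points.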
Pages: 773-787
Page count: 15
Published in: International Journal of Machine Learning and Cybernetics, 2023, 14: 773-787