Oversampling Method for Imbalanced Data Using Credible Counterfactual

Authors
Gao, Feng [1]
Song, Mei [1]
Zhu, Yi [1]
Affiliation
[1] School of Computer Science and Technology, Jiangsu Normal University, Xuzhou, Jiangsu 221000, China
Keywords
Classification (of information); Information use; Support vector machines
DOI
10.3778/j.issn.1002-8331.2211-0413
Abstract
This paper proposes a counterfactual (CF)-based oversampling method for imbalanced datasets that additionally removes non-credible synthetic samples. It aims to address the inability of traditional sampling methods to make full use of the information in the dataset. The core idea is to synthesize new samples from the original instance features of the dataset. Compared with traditional interpolation-based oversampling, the method can more fully mine the decision-boundary information in the data, providing more useful information to the classifier and improving classification performance. Extensive comparative experiments were carried out on 9 imbalanced KEEL and UCI datasets with 5 classifiers (SVM, DT, Logistic, RF, AdaBoost) and 4 traditional oversampling methods (SMOTE, B1-SMOTE, B2-SMOTE, ADASYN). The results show that the algorithm achieves higher AUC, F1, and G-mean values and can effectively address the class imbalance problem. © 2024 Editorial Department of Scientia Agricultura Sinica. All rights reserved.
Pages: 165-171
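The abstract gives only the outline of the method: build synthetic minority samples from original instance features, then discard the ones that are not credible. The following is a minimal sketch of that general idea, not the authors' algorithm. The function name `cf_oversample`, the nearest-majority pairing, the feature-swap rule, and the k-nearest-neighbour credibility test with its `k` and `min_frac` parameters are all assumptions made for illustration.

```python
import numpy as np


def cf_oversample(X, y, minority=1, n_swap=1, k=3, min_frac=0.5):
    """Hypothetical counterfactual-style oversampler (illustrative only).

    For each minority instance, take its nearest majority instance and
    copy over the n_swap minority feature values that differ most, so the
    synthetic sample is built only from original feature values. A sample
    is kept as "credible" (an assumed criterion) when at least min_frac of
    its k nearest original neighbours belong to the minority class.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    X_min = X[y == minority]
    X_maj = X[y != minority]

    kept = []
    for x in X_min:
        # Nearest majority instance to this minority instance.
        m = X_maj[np.argmin(np.linalg.norm(X_maj - x, axis=1))]

        # Features with the largest difference are swapped in from x.
        diff = np.abs(x - m)
        swap = np.argsort(-diff, kind="stable")[:n_swap]
        s = m.copy()
        s[swap] = x[swap]

        # Assumed credibility filter: vote among k nearest original points.
        nearest = np.argsort(np.linalg.norm(X - s, axis=1), kind="stable")[:k]
        if np.mean(y[nearest] == minority) >= min_frac:
            kept.append(s)

    # Reshape so that an empty result still has the right number of columns.
    X_syn = np.array(kept, dtype=float).reshape(-1, X.shape[1])
    y_syn = np.full(len(X_syn), minority, dtype=y.dtype)
    return np.vstack([X, X_syn]), np.concatenate([y, y_syn])


if __name__ == "__main__":
    # Toy data: six majority points (label 0), three minority points (label 1).
    X_demo = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1], [1, 2],
              [4, 4], [4, 5], [5, 4]]
    y_demo = [0, 0, 0, 0, 0, 0, 1, 1, 1]
    X_new, y_new = cf_oversample(X_demo, y_demo)
    print(len(X_new) - len(X_demo), "credible synthetic samples added")
```

This sketch produces at most one synthetic sample per minority instance, so it does not by itself balance the classes. Unlike SMOTE-style interpolation, every feature value in a synthetic sample is taken unchanged from an original instance.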