Oversampling Method for Imbalanced Data Using Credible Counterfactual

Cited by: 0
Authors
Gao, Feng [1]
Song, Mei [1]
Zhu, Yi [1]
Affiliations
[1] School of Computer Science and Technology, Jiangsu Normal University, Xuzhou, Jiangsu 221000, China
Keywords
Classification (of information) - Information use - Support vector machines
DOI
10.3778/j.issn.1002-8331.2211-0413
Abstract
An oversampling method for imbalanced data sets based on counterfactuals (CF) is proposed; it additionally removes non-credible synthetic samples. The method aims to address the inability of traditional sampling methods to make full use of the information in the data set. Its core idea is to synthesize new samples from the original instance features of the dataset. Compared with traditional interpolation-based oversampling, it can fully mine the decision-boundary information in the data, providing more useful information to the classifier and improving classification performance. Extensive comparative experiments were carried out on 9 imbalanced datasets from KEEL and UCI, using 5 classifiers (SVM, DT, Logistic, RF, AdaBoost) and 4 traditional oversampling methods (SMOTE, B1-SMOTE, B2-SMOTE, ADASYN). The results show that the algorithm achieves higher AUC, F1, and G-mean values and can effectively address the class imbalance problem. © 2024 Editorial Department of Scientia Agricultura Sinica. All rights reserved.
Pages: 165 - 171
Related papers
50 in total
  • [41] Oversampling Methods Combined Clustering and Data Cleaning for Imbalanced Network Data
    Yang, Yang
    Zhao, Qian
    Ruan, Linna
    Gao, Zhipeng
    Huo, Yonghua
    Qiu, Xuesong
    INTELLIGENT AUTOMATION AND SOFT COMPUTING, 2020, 26 (05) : 1139 - 1155
  • [42] An Improved Oversampling Method for imbalanced Data-SMOTE Based on Canopy and K-means
    Guo, Chaoyou
    Ma, Yankun
    Xu, Zhe
    Cao, Mengmeng
    Yao, Qian
    2019 CHINESE AUTOMATION CONGRESS (CAC2019), 2019, : 1467 - 1469
  • [44] Anomaly detection and oversampling approach for classifying imbalanced data using CLUBS technique in IoT healthcare data
    Subha, S.
    Sathiaseelan, J. G. R.
    INTERNATIONAL JOURNAL OF INTELLIGENT ENGINEERING INFORMATICS, 2023, 11 (03) : 255 - 271
  • [45] An oversampling method for imbalanced data based on spatial distribution of minority samples SD-KMSMOTE
    Yang, Wensheng
    Pan, Chengsheng
    Zhang, Yanyan
    SCIENTIFIC REPORTS, 2022, 12 (01)
  • [46] Synthetic protein sequence oversampling method for classification and remote homology detection in imbalanced protein data
    Beigi, Majid M.
    Zell, Andreas
    BIOINFORMATICS RESEARCH AND DEVELOPMENT, PROCEEDINGS, 2007, 4414 : 263 - +
  • [47] Radial-Based oversampling for noisy imbalanced data classification
    Koziarski, Michal
    Krawczyk, Bartosz
    Wozniak, Michal
    NEUROCOMPUTING, 2019, 343 : 19 - 33
  • [48] Radial-Based Oversampling for Multiclass Imbalanced Data Classification
    Krawczyk, Bartosz
    Koziarski, Michal
    Wozniak, Michal
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2020, 31 (08) : 2818 - 2831
  • [49] An improved and random synthetic minority oversampling technique for imbalanced data
    Wei, Guoliang
    Mu, Weimeng
    Song, Yan
    Dou, Jun
    KNOWLEDGE-BASED SYSTEMS, 2022, 248
  • [50] Self-adaptive oversampling method based on the complexity of minority data in imbalanced datasets classification
    Tao, Xinmin
    Guo, Xinyue
    Zheng, Yujia
    Zhang, Xiaohan
    Chen, Zhiyu
    KNOWLEDGE-BASED SYSTEMS, 2023, 277