Oversampling Method for Imbalanced Data Using Credible Counterfactual

Cited by: 0
Authors
Gao, Feng [1]
Song, Mei [1]
Zhu, Yi [1]
Affiliation
[1] School of Computer Science and Technology, Jiangsu Normal University, Xuzhou, Jiangsu 221000, China
Keywords
Classification (of information) - Information use - Support vector machines
DOI
10.3778/j.issn.1002-8331.2211-0413
Abstract
A new oversampling method for imbalanced datasets based on counterfactuals (CF) is proposed. It also removes non-credible synthetic samples, and it aims to address the inability of traditional sampling methods to make full use of the information in a dataset. Its core idea is to synthesize new samples from the original instance features of the dataset. Compared with traditional interpolation-based oversampling, it can fully mine the decision-boundary information in the data, providing more useful information to the classifier and improving classification performance. Extensive comparative experiments were carried out on 9 imbalanced datasets from KEEL and UCI, using 5 classifiers (SVM, DT, Logistic, RF, AdaBoost) and 4 traditional oversampling methods (SMOTE, B1-SMOTE, B2-SMOTE, ADASYN). The results show that the algorithm achieves higher AUC, F1, and G-mean values and can effectively address the class imbalance problem. © 2024 Editorial Department of Scientia Agricultura Sinica. All rights reserved.
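The abstract reports F1 and G-mean alongside AUC. As a minimal sketch of how the first two are usually computed from a binary confusion matrix, assuming the standard definitions (F1 as the harmonic mean of precision and recall, G-mean as the geometric mean of sensitivity and specificity) and treating the minority class as positive, the following Python may help. The function names and the toy labels are hypothetical illustrations, not taken from the paper.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn), treating `positive` as the minority class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def g_mean(tp, fp, tn, fn):
    """G-mean = sqrt(sensitivity * specificity)."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


# Made-up toy labels: 2 minority (1) and 8 majority (0) instances.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
print(f"accuracy={accuracy:.3f}  F1={f1_score(tp, fp, fn):.3f}  "
      f"G-mean={g_mean(tp, fp, tn, fn):.3f}")
```

On this toy data the accuracy is 0.8 while F1 is 0.5, which illustrates why imbalanced-classification studies tend to report minority-sensitive metrics rather than plain accuracy.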
Pages: 165-171
Related papers
  • [1] Counterfactual-based minority oversampling for imbalanced classification
    Wang, Shu
    Luo, Hao
    Huang, Shanshan
    Li, Qingsong
    Liu, Li
    Su, Guoxin
    Liu, Ming
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2023, 122
  • [2] Generative Oversampling Method (GenOMe) for Imbalanced Data on Apnea Detection using ECG Data
    Sanabila, H. R.
    Kusuma, Ilham
    Jatmiko, Wisnu
    2016 INTERNATIONAL CONFERENCE ON ADVANCED COMPUTER SCIENCE AND INFORMATION SYSTEMS (ICACSIS), 2016, : 572 - 577
  • [3] Oversampling for Imbalanced Data Classification Using Adversarial Network
    Lee, Sang-Kwang
    Hong, Seung-Jin
    Yang, Seong-Il
    2018 INTERNATIONAL CONFERENCE ON INFORMATION AND COMMUNICATION TECHNOLOGY CONVERGENCE (ICTC), 2018, : 1255 - 1257
  • [4] OVERSAMPLING METHOD FOR IMBALANCED CLASSIFICATION
    Zheng, Zhuoyuan
    Cai, Yunpeng
    Li, Ye
    COMPUTING AND INFORMATICS, 2015, 34 (05) : 1017 - 1037
  • [5] Imbalanced Data Mining Using Oversampling and Cellular GEP Ensemble
    Jedrzejowicz, Joanna
    Jedrzejowicz, Piotr
    COMPUTATIONAL COLLECTIVE INTELLIGENCE (ICCCI 2021), 2021, 12876 : 360 - 372
  • [6] Oversampling imbalanced data in the string space
    Castellanos, Francisco J.
    Valero-Mas, Jose J.
    Calvo-Zaragoza, Jorge
    Rico-Juan, Juan R.
    PATTERN RECOGNITION LETTERS, 2018, 103 : 32 - 38
  • [7] Oversampling techniques for imbalanced data in regression
    Belhaouari, Samir Brahim
    Islam, Ashhadul
    Kassoul, Khelil
    Al-Fuqaha, Ala
    Bouzerdoum, Abdesselam
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 252
  • [8] Adaptive Oversampling for Imbalanced Data Classification
    Ertekin, Seyda
    INFORMATION SCIENCES AND SYSTEMS 2013, 2013, 264 : 261 - 269
  • [9] A three-way decision ensemble method for imbalanced data oversampling
    Yan, Yuan Ting
    Wu, Zeng Bao
    Du, Xiu Quan
    Chen, Jie
    Zhao, Shu
    Zhang, Yan Ping
    INTERNATIONAL JOURNAL OF APPROXIMATE REASONING, 2019, 107 (1-16) : 1 - 16
  • [10] A new boundary-degree-based oversampling method for imbalanced data
    Yueqi Chen
    Witold Pedrycz
    Jie Yang
    Applied Intelligence, 2023, 53 : 26518 - 26541