Boosting the oversampling methods based on differential evolution strategies for imbalanced learning

Cited: 17
Authors
Korkmaz, Sedat [1 ]
Sahman, Mehmet Akif [2 ]
Cinar, Ahmet Cevahir [3 ]
Kaya, Ersin [1 ]
Institutions
[1] Konya Tech Univ, Fac Engn & Nat Sci, Dept Comp Engn, Konya, Turkey
[2] Selcuk Univ, Fac Technol, Dept Elect & Elect Engn, Konya, Turkey
[3] Selcuk Univ, Fac Technol, Dept Comp Engn, Konya, Turkey
Keywords
Imbalanced datasets; Differential evolution; Oversampling; Imbalanced learning; Class imbalance; Differential evolution strategies; PREPROCESSING METHOD; GLOBAL OPTIMIZATION; SOFTWARE TOOL; SMOTE; CLASSIFICATION; ALGORITHMS; KEEL
DOI
10.1016/j.asoc.2021.107787
Chinese Library Classification (CLC): TP18 [Artificial Intelligence Theory]
Discipline codes: 081104; 0812; 0835; 1405
Abstract
The class imbalance problem is a challenging problem in the data mining area. To overcome the low classification performance associated with imbalanced datasets, sampling strategies are used to balance the data. Oversampling is a technique that increases the number of minority class samples in various proportions. In this work, 16 different differential evolution (DE) strategies are used to oversample imbalanced datasets for better classification. The main aim of this work is to determine the best strategy in terms of the Area Under the receiver operating characteristic (ROC) Curve (AUC) and Geometric Mean (G-Mean) metrics. 44 imbalanced datasets are used in the experiments. Support Vector Machines (SVM), k-Nearest Neighbor (kNN), and Decision Tree (DT) are used as classifiers. The best results are produced by the 6th DEBOHID strategy (DSt6), the 1st DEBOHID strategy (DSt1), and the 3rd DEBOHID strategy (DSt3) with the kNN, DT, and SVM classifiers, respectively. The obtained results outperform 9 state-of-the-art oversampling methods in terms of the AUC and G-Mean metrics. (C) 2021 Elsevier B.V. All rights reserved.
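A minimal sketch of the general idea behind DE-based oversampling may help. The abstract does not specify the authors' exact DEBOHID procedure, so the code below is an illustrative assumption, not their implementation: it applies the classic DE/rand/1 mutation, x_new = x_r1 + F * (x_r2 - x_r3), to three distinct minority-class samples to synthesize a new minority sample, and also computes the G-Mean metric mentioned in the abstract. The function names (`de_rand1_oversample`, `g_mean`) and the scale factor `f` are hypothetical choices for this sketch.

```python
import numpy as np

def de_rand1_oversample(minority, n_new, f=0.8, rng=None):
    """Synthesize minority samples with a DE/rand/1-style mutation.

    Illustrative sketch only (NOT the authors' DEBOHID algorithm):
    each synthetic sample is x_r1 + f * (x_r2 - x_r3), where r1, r2, r3
    index three distinct minority-class samples. Requires >= 3 samples.
    """
    rng = np.random.default_rng(rng)
    minority = np.asarray(minority, dtype=float)
    out = np.empty((n_new, minority.shape[1]))
    for i in range(n_new):
        r1, r2, r3 = rng.choice(len(minority), size=3, replace=False)
        out[i] = minority[r1] + f * (minority[r2] - minority[r3])
    return out

def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity (TPR) and specificity (TNR)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tpr = np.mean(y_pred[y_true == 1] == 1)  # true positive rate
    tnr = np.mean(y_pred[y_true == 0] == 0)  # true negative rate
    return float(np.sqrt(tpr * tnr))

# Usage: oversample a toy 2-D minority class until it has 10 samples.
minority = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.5], [0.5, 2.0]])
synthetic = de_rand1_oversample(minority, n_new=6, f=0.8, rng=0)
balanced_minority = np.vstack([minority, synthetic])
```

Because the mutation uses difference vectors between existing minority samples, the synthetic points follow the local geometry of the minority class rather than lying on straight segments between neighbors as in SMOTE-style interpolation.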
Pages: 19
Related Papers (50 total)
  • [31] A review of boosting methods for imbalanced data classification
    Qiujie Li
    Yaobin Mao
    Pattern Analysis and Applications, 2014, 17 : 679 - 693
  • [32] Multiple Kernel Learning With Minority Oversampling for Classifying Imbalanced Data
    Wang, Ling
    Wang, Hongqiao
    Fu, Guangyuan
    IEEE ACCESS, 2021, 9 : 565 - 580
  • [33] On the Role of Cost-Sensitive Learning in Imbalanced Data Oversampling
    Krawczyk, Bartosz
    Wozniak, Michal
    COMPUTATIONAL SCIENCE - ICCS 2019, PT III, 2019, 11538 : 180 - 191
  • [34] Model-Based Oversampling for Imbalanced Sequence Classification
    Gong, Zhichen
    Chen, Huanhuan
    CIKM'16: PROCEEDINGS OF THE 2016 ACM CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, 2016, : 1009 - 1018
  • [35] Gaussian Distribution Based Oversampling for Imbalanced Data Classification
    Xie, Yuxi
    Qiu, Min
    Zhang, Haibo
    Peng, Lizhi
    Chen, Zhenxiang
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2022, 34 (02) : 667 - 679
  • [36] A Differential Evolution-Based Method for Class-Imbalanced Cost-Sensitive Learning
    Qiu, Chen
    Jiang, Liangxiao
    Kong, Ganggang
    2015 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2015,
  • [37] Learning Adaptive Differential Evolution by Natural Evolution Strategies
    Zhang, Haotian
    Sun, Jianyong
    Tan, Kay Chen
    Xu, Zongben
    IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE, 2023, 7 (03): : 872 - 886
  • [38] Counterfactual-based minority oversampling for imbalanced classification
    Wang, Shu
    Luo, Hao
    Huang, Shanshan
    Li, Qingsong
    Liu, Li
    Su, Guoxin
    Liu, Ming
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2023, 122
  • [39] Explainability of SMOTE Based Oversampling for Imbalanced Dataset Problems
    Patil, Aum
    Framewala, Aman
    Kazi, Faruk
    2020 3RD INTERNATIONAL CONFERENCE ON INFORMATION AND COMPUTER TECHNOLOGIES (ICICT 2020), 2020, : 41 - 45
  • [40] An oversampling framework for imbalanced classification based on Laplacian eigenmaps
    Ye, Xiucai
    Li, Hongmin
    Imakura, Akira
    Sakurai, Tetsuya
    NEUROCOMPUTING, 2020, 399 : 107 - 116