Boosting the oversampling methods based on differential evolution strategies for imbalanced learning

Cited by: 17
Authors
Korkmaz, Sedat [1]
Sahman, Mehmet Akif [2]
Cinar, Ahmet Cevahir [3]
Kaya, Ersin [1]
Affiliations
[1] Konya Tech Univ, Fac Engn & Nat Sci, Dept Comp Engn, Konya, Turkey
[2] Selcuk Univ, Fac Technol, Dept Elect & Elect Engn, Konya, Turkey
[3] Selcuk Univ, Fac Technol, Dept Comp Engn, Konya, Turkey
Keywords
Imbalanced datasets; Differential evolution; Oversampling; Imbalanced learning; Class imbalance; Differential evolution strategies; PREPROCESSING METHOD; GLOBAL OPTIMIZATION; SOFTWARE TOOL; SMOTE; CLASSIFICATION; ALGORITHMS; KEEL
DOI
10.1016/j.asoc.2021.107787
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The class imbalance problem is a challenging problem in data mining. To overcome the low classification performance associated with imbalanced datasets, sampling strategies are used to balance the datasets. Oversampling is a technique that increases the number of minority class samples in various proportions. In this work, 16 different differential evolution (DE) strategies are used to oversample imbalanced datasets for better classification. The main aim of this work is to determine the best strategy in terms of the Area Under the receiver operating characteristic (ROC) Curve (AUC) and Geometric Mean (G-Mean) metrics. 44 imbalanced datasets are used in the experiments. Support Vector Machines (SVM), k-Nearest Neighbor (kNN), and Decision Tree (DT) are used as classifiers. The best results are produced by the 6th Debohid Strategy (DSt6), 1st Debohid Strategy (DSt1), and 3rd Debohid Strategy (DSt3) with the kNN, DT, and SVM classifiers, respectively. The obtained results outperform 9 state-of-the-art oversampling methods in terms of the AUC and G-Mean metrics. (C) 2021 Elsevier B.V. All rights reserved.
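As a rough illustration of the idea in the abstract, the sketch below generates synthetic minority samples with a classic DE/rand/1 mutation followed by binomial crossover, and computes G-Mean from predictions. The abstract does not describe the 16 strategies themselves, so the choice of DE/rand/1/bin, the function names `de_rand_1_oversample` and `g_mean`, and the default values of `F` and `CR` are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def de_rand_1_oversample(minority, n_new, F=0.5, CR=0.9, seed=0):
    """Create n_new synthetic minority samples with a DE/rand/1/bin-style step.

    F (scale factor) and CR (crossover rate) are illustrative defaults,
    not values taken from the paper.
    """
    rng = np.random.default_rng(seed)
    minority = np.asarray(minority, dtype=float)
    n, d = minority.shape
    if n < 4:
        raise ValueError("need at least 4 minority samples")
    synthetic = np.empty((n_new, d))
    for i in range(n_new):
        # Pick a target and three other samples, all distinct.
        target, r1, r2, r3 = rng.choice(n, size=4, replace=False)
        # Mutation: base vector plus a scaled difference vector.
        mutant = minority[r1] + F * (minority[r2] - minority[r3])
        # Binomial crossover between the target and the mutant.
        mask = rng.random(d) < CR
        mask[rng.integers(d)] = True  # keep at least one mutant feature
        synthetic[i] = np.where(mask, mutant, minority[target])
    return synthetic


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity (TPR) and specificity (TNR)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    pos = y_true == positive
    neg = ~pos
    tpr = np.mean(y_pred[pos] == positive)
    tnr = np.mean(y_pred[neg] != positive)
    return float(np.sqrt(tpr * tnr))
```

In use, the synthetic rows would be appended to the training set with the minority label before fitting a classifier such as kNN, DT, or SVM, and G-Mean (alongside AUC) would be computed on held-out data.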
Pages: 19
Related papers
50 in total
  • [1] WOTBoost: Weighted Oversampling Technique in Boosting for imbalanced learning
    Zhang, Wenhao
    Ramezani, Ramin
    Naeim, Arash
    2019 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2019, : 2523 - 2531
  • [2] Boosting imbalanced data learning with Wiener process oversampling
    Qian Li
    Gang Li
    Wenjia Niu
    Yanan Cao
    Liang Chang
    Jianlong Tan
    Li Guo
    Frontiers of Computer Science, 2017, 11 : 836 - 851
  • [3] Boosting imbalanced data learning with Wiener process oversampling
    Li, Qian
    Li, Gang
    Niu, Wenjia
    Cao, Yanan
    Chang, Liang
    Tan, Jianlong
    Guo, Li
    FRONTIERS OF COMPUTER SCIENCE, 2017, 11 (05) : 836 - 851
  • [4] DEBOHID: A differential evolution based oversampling approach for highly imbalanced datasets
    Kaya, Ersin
    Korkmaz, Sedat
    Sahman, Mehmet Akif
    Cinar, Ahmet Cevahir
    EXPERT SYSTEMS WITH APPLICATIONS, 2021, 169
  • [5] A new oversampling approach based differential evolution on the safe set for highly imbalanced datasets
    Zhang, Jiaoni
    Li, Yanying
    Zhang, Baoshuang
    Wang, Xialin
    Gong, Huanhuan
    EXPERT SYSTEMS WITH APPLICATIONS, 2023, 234
  • [6] Radial-based oversampling based on differential evolution for imbalanced data
    Jun Chen
    Meng Xia
    Zhijie Wang
    Applied Intelligence, 2025, 55 (7)
  • [7] Oversampling Algorithm based on Reinforcement Learning in Imbalanced Problems
    Zhou, Ying
    Shu, Jiangang
    Zhong, Xiaoxiong
    Huang, Xingsen
    Luo, Chenguang
    Ai, Jianwen
    2020 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM), 2020
  • [8] Imbalanced Learning with Oversampling based on Classification Contribution Degree
    Jiang, Zhenhao
    Yang, Jie
    Liu, Yan
    ADVANCED THEORY AND SIMULATIONS, 2021, 4 (05)
  • [9] Oversampling boosting for classification of imbalanced software defect data
    Li, Guangling
    Wang, Shihai
    PROCEEDINGS OF THE 35TH CHINESE CONTROL CONFERENCE 2016, 2016, : 4149 - 4154
  • [10] Improving Imbalanced Dataset Classification Using Oversampling and Gradient Boosting
    Cahyana, Nurheri
    Khomsah, Siti
    Aribowo, Agus Sasmito
    2019 5TH INTERNATIONAL CONFERENCE ON SCIENCE IN INFORMATION TECHNOLOGY (ICSITECH): EMBRACING INDUSTRY 4.0 - TOWARDS INNOVATION IN CYBER PHYSICAL SYSTEM, 2019, : 217 - 222