An ensemble oversampling method for imbalanced classification with prior knowledge via generative adversarial network

Cited by: 7
Authors
Zhang, Yulin [1]
Liu, Yuchen [2]
Wang, Yan [2]
Yang, Jie [2]
Affiliations
[1] Shandong Univ Sci & Technol, Coll Math & Syst Sci, Qingdao 266590, Shandong, Peoples R China
[2] Dalian Univ Technol, Sch Math Sci, Dalian 116024, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Imbalanced data; Oversampling; Generative adversarial network; Bagging; SMOTE; MODEL; GAN
DOI
10.1016/j.chemolab.2023.104775
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
A growing number of real-world applications involve class-imbalanced classification with severely skewed class distributions, which calls for new algorithms that can learn from imbalanced datasets. This paper proposes G-GAN, a novel oversampling method for numerical imbalanced data built on the GAN framework. In this method, a Gaussian distribution is estimated from the minority samples to supply prior knowledge of the minority class to the latent space of the GAN. To increase the randomness of the generated samples, the noise is drawn with a mixed strategy: some of the generator's noise vectors follow the Gaussian distribution and the others follow a random distribution. G-GAN is then trained, following the idea of Bagging, to generate dispersed positive samples, which could avoid overfitting. G-GAN differs from other work in that the GAN does not generate minority samples directly; instead, the distribution information of the minority samples is added to the latent space of the GAN, and minority samples are then generated. Compared with 11 commonly used oversampling methods, G-GAN obtains promising results in terms of G-mean, AUC, F-measure and ROC using three classifiers on 11 benchmark imbalanced datasets. Furthermore, G-GAN is also validated with the AUC metric on a real imbalanced Diabetes dataset. The results of the two numerical experiments demonstrate that G-GAN has great potential for imbalanced classification.
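The mixed noise strategy described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the `gaussian_fraction` mixing ratio, the small `ridge` term added to the covariance, and the reading of "random distribution" as uniform on [-1, 1] are all assumptions made here for illustration. The latent dimension is also assumed to equal the number of features. Training the GAN itself and the Bagging step are not shown.

```python
import numpy as np


def fit_minority_gaussian(X_min, ridge=1e-6):
    """Estimate the mean and covariance of the minority-class samples.

    `ridge` is a hypothetical stabiliser (not from the abstract) that keeps
    the covariance positive definite when there are few minority samples.
    """
    X_min = np.asarray(X_min, dtype=float)
    mean = X_min.mean(axis=0)
    cov = np.atleast_2d(np.cov(X_min, rowvar=False))
    cov = cov + ridge * np.eye(X_min.shape[1])
    return mean, cov


def sample_mixed_noise(mean, cov, n_samples, gaussian_fraction=0.5, rng=None):
    """Draw latent noise: part from the fitted Gaussian, part uniform.

    The uniform range [-1, 1] and the mixing ratio are assumptions.
    """
    rng = np.random.default_rng(rng)
    n_gauss = int(round(gaussian_fraction * n_samples))
    n_unif = n_samples - n_gauss
    z_gauss = rng.multivariate_normal(mean, cov, size=n_gauss)
    z_unif = rng.uniform(-1.0, 1.0, size=(n_unif, mean.shape[0]))
    z = np.vstack([z_gauss, z_unif])
    # Shuffle rows so the two noise types are interleaved in a batch.
    return z[rng.permutation(n_samples)]


if __name__ == "__main__":
    demo_rng = np.random.default_rng(0)
    X_minority = demo_rng.normal(loc=2.0, scale=0.5, size=(30, 4))  # toy data
    mu, sigma = fit_minority_gaussian(X_minority)
    noise = sample_mixed_noise(mu, sigma, n_samples=8, rng=1)
    print(noise.shape)
```

In a full pipeline the rows of `noise` would be fed to the generator in place of purely random latent vectors, so the generator starts from points that already carry the minority-class distribution.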
Pages: 9
Related papers
50 in total
  • [21] Classification of Imbalanced Dataset using Generative Adversarial Nets
    Ozmen, Emirhan
    Cogun, Fuat
    Altiparmak, Fatih
    2020 28TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2020
  • [22] Ensemble Deep Learning Classification Method Based on Generative Adversarial Networks
    Shen, Haoyuan
    Lin, Chenglong
    Ma, Yizhong
    Xie, En
    2024 16TH INTERNATIONAL CONFERENCE ON COMPUTER AND AUTOMATION ENGINEERING, ICCAE 2024, 2024 : 46 - 53
  • [23] Binary imbalanced data classification based on diversity oversampling by generative models
    Zhai, Junhai
    Qi, Jiaxing
    Shen, Chu
    INFORMATION SCIENCES, 2022, 585 : 313 - 343
  • [24] Fault diagnosis of wind turbines with generative adversarial network-based oversampling method
    Yang, Shuai
    Zhou, Yifei
    Chen, Xu
    Deng, Chunyan
    Li, Chuan
    MEASUREMENT SCIENCE AND TECHNOLOGY, 2023, 34 (04)
  • [25] Imbalanced Fault Classification of Bearing via Wasserstein Generative Adversarial Networks with Gradient Penalty
    Han, Baokun
    Jia, Sixiang
    Liu, Guifang
    Wang, Jinrui
    SHOCK AND VIBRATION, 2020, 2020
  • [26] Imbalanced corporate bond default modeling using generative adversarial networks oversampling techniques
    Yao X.
    Li K.
    Yu L.
    Xitong Gongcheng Lilun yu Shijian/System Engineering Theory and Practice, 2022, 42 (10) : 2617 - 2634
  • [27] Cantonese porcelain classification and image synthesis by ensemble learning and generative adversarial network
    Chen, Steven Szu-Chi
    Cui, Hui
    Du, Ming-han
    Fu, Tie-ming
    Sun, Xiao-hong
    Ji, Yi
    Duh, Henry
    FRONTIERS OF INFORMATION TECHNOLOGY & ELECTRONIC ENGINEERING, 2019, 20 (12) : 1632 - 1643
  • [28] Cantonese porcelain classification and image synthesis by ensemble learning and generative adversarial network
    Steven Szu-Chi Chen
    Hui Cui
    Ming-han Du
    Tie-ming Fu
    Xiao-hong Sun
    Yi Ji
    Henry Duh
    Frontiers of Information Technology & Electronic Engineering, 2019, 20 : 1632 - 1643
  • [29] An improved generative adversarial network to oversample imbalanced datasets
    Pan, Tingting
    Pedrycz, Witold
    Yang, Jie
    Wang, Jian
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, 132
  • [30] Distribution Enhancement for Imbalanced Data with Generative Adversarial Network
    Chen, Yueqi
    Pedrycz, Witold
    Pan, Tingting
    Wang, Jian
    Yang, Jie
    ADVANCED THEORY AND SIMULATIONS, 2024, 7 (09)