DOSS: Dual Over Sampling Strategy for Imbalanced Data Classification

Times cited: 0
Authors
Wang, Qiushi [1]
Lee, Kee Jin [1]
Hong, Jihoon [1]
Affiliation
[1] ASTAR, Mfg Execut & Control Grp, Singapore Inst Mfg Technol SIMTech, Singapore, Singapore
Source
IECON 2018 - 44TH ANNUAL CONFERENCE OF THE IEEE INDUSTRIAL ELECTRONICS SOCIETY | 2018
Keywords
imbalanced classification; oversampling; cGAN; SMOTE;
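DOI
Not available
Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification code
0808 ; 0809 ;
Abstract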
Imbalanced datasets are often encountered in process monitoring, where data reflecting abnormal events such as machine failures are scarcer than data reflecting normal events. The former is called the minority class and the latter is referred to as the majority class. Classical machine learning algorithms still face challenges on this problem. To improve classification accuracy, oversampling techniques rebalance the dataset by supplementing the minority class with synthetic samples. However, because the latent sample spaces of both classes are broad, the majority class may be under-represented as well. In this paper, we propose a dual oversampling strategy (DOSS) that generates samples for both classes. For the majority class, synthetic samples are generated according to the data distribution, which is approximated by a conditional Generative Adversarial Network (cGAN). For the minority class, the Synthetic Minority Over-sampling Technique (SMOTE) is applied as the oversampling method. The proposed strategy is compared with alternatives in which either only the minority class is oversampled or both classes are oversampled with different strategies. Recall, G-mean and F-measure are used as the metrics. Experimental results on 12 benchmark datasets show improved performance for the proposed strategy. DOSS is further applied to detect the faulty stages of an injection moulding machine, where it achieves better prediction accuracy.
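As a rough illustration of two ingredients named in the abstract, the sketch below shows SMOTE-style interpolation for the minority class and the three evaluation metrics. It is not the authors' implementation. The function names `smote_like_oversample` and `imbalance_metrics`, the neighbour count `k`, and the metric formulas (G-mean as the geometric mean of recall and specificity, F-measure as the F1 score) are assumptions based on common usage, since the abstract does not define them. The cGAN generator used for the majority class is not sketched here.

```python
import numpy as np

def smote_like_oversample(X_min, n_new, k=5, seed=0):
    """SMOTE-style sketch: each synthetic point lies on the segment between
    a minority sample and one of its k nearest minority neighbours."""
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    if n < 2:
        raise ValueError("need at least two minority samples to interpolate")
    k = min(k, n - 1)
    # Pairwise Euclidean distances; a sample is never its own neighbour.
    dist = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)
    neighbours = np.argsort(dist, axis=1)[:, :k]
    base = rng.integers(0, n, size=n_new)                  # seed samples
    mate = neighbours[base, rng.integers(0, k, size=n_new)]  # chosen neighbours
    u = rng.random((n_new, 1))                             # interpolation weights in [0, 1)
    return X_min[base] + u * (X_min[mate] - X_min[base])

def imbalance_metrics(y_true, y_pred, positive=1):
    """Recall, G-mean and F-measure, treating `positive` as the minority label.
    Assumed definitions: G-mean = sqrt(recall * specificity), F-measure = F1."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = np.sum((y_true == positive) & (y_pred == positive))
    fn = np.sum((y_true == positive) & (y_pred != positive))
    fp = np.sum((y_true != positive) & (y_pred == positive))
    tn = np.sum((y_true != positive) & (y_pred != positive))
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    g_mean = float(np.sqrt(recall * specificity))
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {"recall": float(recall), "g_mean": g_mean, "f_measure": float(f_measure)}

# Toy usage with made-up data (not from the paper).
X_minority = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
synthetic = smote_like_oversample(X_minority, n_new=10, k=2)
scores = imbalance_metrics([1, 1, 0, 0, 0, 0], [1, 0, 0, 0, 0, 1])
```

Because every synthetic point is a convex combination of two existing minority samples, the generated data stay inside the region the minority class already spans. That is one plausible reading of why the abstract turns to a learned distribution (cGAN) for the majority class.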
Pages: 5389-5394
Number of pages: 6