Conditional self-attention generative adversarial network with differential evolution algorithm for imbalanced data classification

Cited by: 14
Authors
Niu, Jiawei [1]
Liu, Zhunga [1]
Pan, Quan [1]
Yang, Yanbo [1]
Li, Yang [1]
Affiliations
[1] Northwestern Polytech Univ, Dept Automat, Xian 710072, Peoples R China
Keywords
Classification; Generative adversarial network; Imbalanced data; Optimization; Over-sampling; Neural networks
DOI
10.1016/j.cja.2022.09.014
Chinese Library Classification number
V [Aviation, Aerospace]
Discipline classification code
08; 0825
Abstract
Imbalanced data classification is an important research topic in real-world applications such as fault diagnosis in aircraft manufacturing systems. Over-sampling is often used to address this problem: it generates new samples according to the distances between minority-class data points. However, traditional over-sampling may change the original data distribution, which harms classification performance. In this paper, we propose a new method, Conditional Self-Attention Generative Adversarial Network with Differential Evolution (CSAGAN-DE), for imbalanced data classification. The method aims to improve classification performance on minority data by improving the quality of the generated minority data. In CSAGAN-DE, the minority data are fed into a self-attention generative adversarial network, which approximates the data distribution and creates new data for the minority class. The differential evolution algorithm is then employed to automatically determine how many minority samples to generate in order to achieve satisfactory classification performance. Several experiments are conducted to evaluate CSAGAN-DE. The results show that the new method can improve classification performance compared with other related methods. © 2022 Chinese Society of Aeronautics and Astronautics. Production and hosting by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
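The second stage described in the abstract, choosing how many minority samples to generate with differential evolution, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a per-feature Gaussian sampler as a stand-in for the trained self-attention GAN generator, a 5-nearest-neighbour classifier, and the geometric mean of per-class recall as the fitness. All function names, the toy data, and the parameter values are hypothetical.

```python
import math
import random

def make_data(rng, n_major, n_minor):
    """Toy 2-D data (assumption): label 0 = majority, label 1 = minority."""
    major = [((rng.gauss(0, 1), rng.gauss(0, 1)), 0) for _ in range(n_major)]
    minor = [((rng.gauss(2, 1), rng.gauss(2, 1)), 1) for _ in range(n_minor)]
    return major + minor

def fit_generator(points):
    """Stand-in for the GAN generator: one Gaussian per feature (assumption)."""
    dims = list(zip(*points))
    means = [sum(d) / len(d) for d in dims]
    stds = [math.sqrt(sum((x - m) ** 2 for x in d) / len(d))
            for d, m in zip(dims, means)]
    def generate(n, rng):
        return [tuple(rng.gauss(m, s) for m, s in zip(means, stds))
                for _ in range(n)]
    return generate

def knn_predict(train, point, k=5):
    """Majority vote among the k nearest training points."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], point))[:k]
    return 1 if 2 * sum(label for _, label in nearest) > k else 0

def g_mean(train, valid):
    """Geometric mean of the recall on each class."""
    hits, totals = {0: 0, 1: 0}, {0: 0, 1: 0}
    for point, label in valid:
        totals[label] += 1
        hits[label] += int(knn_predict(train, point) == label)
    return math.sqrt(hits[0] / totals[0] * hits[1] / totals[1])

def fitness(n_new, train, valid, generate):
    """Score of the classifier after adding n_new synthetic minority samples.
    A fixed seed makes the score deterministic for a given n_new."""
    synthetic = [(p, 1) for p in generate(n_new, random.Random(0))]
    return g_mean(train + synthetic, valid)

def differential_evolution(score, low, high, rng,
                           pop_size=6, generations=8, f=0.6):
    """DE/rand/1 on a single integer variable. With one variable the
    crossover step is trivial, so the trial equals the mutant. The
    population is updated in place within each generation."""
    pop = [rng.randint(low, high) for _ in range(pop_size)]
    scores = [score(n) for n in pop]
    for _ in range(generations):
        for i in range(pop_size):
            others = [pop[j] for j in range(pop_size) if j != i]
            a, b, c = rng.sample(others, 3)
            trial = min(high, max(low, round(a + f * (b - c))))
            trial_score = score(trial)
            if trial_score >= scores[i]:  # greedy selection
                pop[i], scores[i] = trial, trial_score
    best = max(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]

rng = random.Random(42)
train = make_data(rng, 100, 8)   # imbalanced training set
valid = make_data(rng, 60, 15)   # held-out set used only for the fitness
generate = fit_generator([p for p, label in train if label == 1])
best_n, best_score = differential_evolution(
    lambda n: fitness(n, train, valid, generate), 0, 100, rng)
print("synthetic minority samples:", best_n, "G-mean:", round(best_score, 3))
```

The search variable is the count of synthetic samples, so the generator is fitted once and only the classifier is re-evaluated for each candidate. In a closer reproduction, the Gaussian sampler would be replaced by the trained generator and the fitness by whatever validation metric the paper specifies.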
Pages: 303-315
Page count: 13