A Local Adaptive Minority Selection and Oversampling Method for Class-Imbalanced Fault Diagnostics in Industrial Systems

Cited by: 61
Authors
Wu, Zhenyu [1]
Lin, Wenfang [2]
Fu, Binghao [1]
Guo, Juchuan [1]
Ji, Yang [2]
Pecht, Michael [3]
Affiliations
[1] Beijing Univ Posts & Telecommun, Engn Res Ctr Informat Network, Minist Educ, Beijing 100876, Peoples R China
[2] Beijing Univ Posts & Telecommun, Key Lab Universal Wireless Commun, Minist Educ, Beijing 100876, Peoples R China
[3] Univ Maryland, Ctr Adv Life Cycle Engn, College Pk, MD 20742 USA
Keywords
Prediction algorithms; Prognostics and health management; Wind turbines; Mathematical model; Task analysis; Machine learning algorithms; Fault diagnosis; Class-imbalance learning; fault diagnostics; machine learning; prognostics and health management (PHM); synthetic oversampling; CLASSIFICATION; MACHINE; SCHEME; SMOTE;
DOI
10.1109/TR.2019.2942049
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Data-driven fault diagnostics of industrial systems suffers from class imbalance, a common challenge for machine learning algorithms because the features of minority-class samples are difficult to learn. Synthetic oversampling methods are commonly used to tackle this problem by generating minority-class samples to balance the majority and minority classes. Two issues largely determine the performance of an oversampling method: how to choose the most appropriate existing minority samples as seeds, and how to synthesize new samples from those seeds effectively. However, many existing oversampling methods do not address both issues at the same time, so they generate new samples inaccurately and ineffectively when dealing with high-dimensional fault samples at different imbalance ratios. This article develops a novel adaptive oversampling technique for industrial fault diagnostics: the expectation maximization (EM)-based local-weighted minority oversampling technique. The method uses a local-weighted minority oversampling strategy to identify hard-to-learn, informative minority fault samples, and an EM-based imputation algorithm to generate fault samples based on the distribution of the minority samples. To validate the developed method, experiments were conducted on two real-world datasets. The results show that, on multiclass imbalanced fault diagnostics at different imbalance ratios, the developed method achieves better F-measure, Matthews correlation coefficient (MCC), and Mean (the average of F-measure and MCC) values than state-of-the-art baseline sampling techniques.
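The three evaluation measures named in the abstract can be illustrated with a minimal sketch. It assumes a one-vs-rest view in which a single fault class is treated as the positive (minority) class; the abstract does not say how the measures are aggregated across classes, so the function names, the choice of positive class, and the zero-denominator convention below are illustrative assumptions rather than the authors' evaluation code.

```python
import math


def confusion_counts(y_true, y_pred, positive):
    """Count TP, FP, FN, TN, treating `positive` as the minority (fault) class."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def f_measure(tp, fp, fn):
    """F-measure (F1): harmonic mean of precision and recall."""
    denom = 2 * tp + fp + fn
    # Assumption: return 0.0 when the score is undefined (no positives at all).
    return 2 * tp / denom if denom else 0.0


def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient, in the range [-1, 1]."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumption: return 0.0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def mean_score(y_true, y_pred, positive):
    """'Mean' as described in the abstract: the average of F-measure and MCC."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    return (f_measure(tp, fp, fn) + mcc(tp, fp, fn, tn)) / 2


if __name__ == "__main__":
    # Hypothetical labels: 1 = fault (minority), 0 = normal (majority).
    y_true = [1, 1, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 0, 0, 0, 0, 0, 0, 1]
    print(mean_score(y_true, y_pred, positive=1))
```

Unlike plain accuracy, both F-measure and MCC penalize a classifier that ignores the minority class, which is why they are common choices for imbalanced fault data.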
Pages: 1195-1206
Number of pages: 12