A New Safe-Level Enabled Borderline-SMOTE for Condition Recognition of Imbalanced Dataset

Cited by: 4
Authors
Chen, Chao [1]
Shen, Wei [1,2]
Yang, Chenhao [1]
Fan, Wei [1]
Liu, Xin [3]
Li, Ying [4]
Affiliations
[1] Jiangsu Univ, Sch Mech Engn, Zhenjiang 212013, Peoples R China
[2] Shanghai Jiao Tong Univ, Sch Mech Engn, Shanghai 200030, Peoples R China
[3] Jilin Univ, Sch Mech & Aerosp Engn, Changchun 130000, Peoples R China
[4] Minist Agr & Rural Affairs, Nanjing Inst Agr Mechanizat, Nanjing 210095, Peoples R China
Keywords
Boundary data; condition recognition; imbalanced classification; light gradient boosting machine (LightGBM); safe-level synthetic minority oversampling technique (SMOTE); synthetic minority oversampling technique; PREDICTION;
DOI
10.1109/TIM.2023.3289545
Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract
Machine learning (ML)-based classification has been successfully applied to industrial monitoring, but its performance is often hindered when the dataset is imbalanced. Specifically, misclassification, a serious degradation of generalization ability, often occurs in the minority class. To address this problem, the borderline synthetic minority oversampling technique (B-SMOTE), which aims to increase the number of minority samples around decision boundaries, has received considerable attention. However, most imbalanced classification techniques under the B-SMOTE framework generate instances using a random weight between 0 and 1, which may reduce the authenticity of the newly generated samples. Herein, a novel oversampling strategy that provides new safety criteria and reassigns the threshold of the weight coefficient is proposed to improve both the authenticity of the generated samples and the classification accuracy. In addition, a light gradient boosting machine (LightGBM) is adopted to build the classification model. Experiments show the effectiveness and superiority of the proposed method in handling imbalanced classification tasks.
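The random-weight interpolation that the abstract mentions can be illustrated with a minimal sketch. This is generic SMOTE-style interpolation, not the authors' algorithm: the function name `interpolate_minority` and the `low`/`high` bounds are hypothetical, and the value `high=0.5` is only an example, since the abstract does not state which threshold the paper assigns.

```python
import numpy as np


def interpolate_minority(x, neighbor, rng, low=0.0, high=1.0):
    """Create one synthetic sample on the segment from x toward neighbor.

    With low=0.0 and high=1.0 this is the usual SMOTE-style rule
    x_new = x + w * (neighbor - x), with w drawn uniformly from [0, 1).
    Narrowing [low, high) is a hypothetical stand-in for restricting the
    weight coefficient; it is not the rule used in the paper.
    """
    weight = rng.uniform(low, high)
    return x + weight * (neighbor - x)


rng = np.random.default_rng(0)
x = np.array([1.0, 2.0])         # a borderline minority sample (made-up values)
neighbor = np.array([3.0, 6.0])  # one of its nearest neighbors (made-up values)

# Weight drawn from the full range [0, 1).
classic = interpolate_minority(x, neighbor, rng)

# Weight capped at an assumed upper bound of 0.5, keeping the new point
# closer to the original minority sample.
restricted = interpolate_minority(x, neighbor, rng, high=0.5)
```

Capping the weight keeps synthetic points nearer to the known minority sample, which is one plausible reading of why a restricted weight range could yield more authentic samples.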
Pages: 10