Bayesian Optimization Cost-Sensitive XGBoost Learning Algorithm for Imbalanced Data in Semiconductor Industry

Cited by: 0
Authors
Shamsudin, Haziqah [1]
Yusof, Umi Kalsom [1]
Kashif, Fizza [1]
Isa, Iza Sazanita [1, 2]
Affiliations
[1] Univ Sains Malaysia, Sch Comp Sci, George Town, Malaysia
[2] Univ Teknol MARA, Coll Engn, Ctr Elect Engn Studies, George Town, Malaysia
Source
JORDAN JOURNAL OF ELECTRICAL ENGINEERING | 2023 / Vol. 9 / No. 04
Keywords
XGBoost learning algorithm; Cost-sensitivity; Imbalanced data; Semiconductor classification; Ensembled model; CLASSIFICATION;
DOI
10.5455/jjee.204-1671971895
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
This paper proposes an improved ensemble learning model based on extreme gradient boosting (XGBoost) combined with a Bayesian optimization cost-sensitive learning algorithm for handling highly imbalanced data from the semiconductor process. The goal is to achieve the highest possible accuracy (recall) for both the pass and fail classes. Most existing models are biased toward the majority class and neglect the minority class. The proposed Bayesian optimization cost-sensitive XGBoost model is configured for a semiconductor dataset. Experimental results on a benchmark semiconductor industry dataset show pass and fail accuracies of 91.46% and 23.08%, respectively. These results indicate that the proposed model is effective for imbalanced cases in semiconductor applications: it maintains performance on the majority class while also classifying the minority class well.
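The two ideas the abstract relies on, per-class accuracy (recall) and cost-sensitive weighting of the minority class, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the label encoding, the function names `per_class_recall` and `weighted_log_loss`, the toy data, and the `fail_weight` parameter are all assumptions made for illustration. In a setup like the one described, a weight of this kind would be among the quantities tuned by Bayesian optimization; here it is simply set by hand.

```python
import math
from typing import Dict, Sequence

# Assumed label encoding (not from the paper): fail is the minority class.
PASS, FAIL = 0, 1


def per_class_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[int, float]:
    """Recall per class: correctly predicted members / true members of that class."""
    recalls = {}
    for label in (PASS, FAIL):
        total = sum(1 for t in y_true if t == label)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        recalls[label] = hits / total if total else float("nan")
    return recalls


def weighted_log_loss(y_true: Sequence[int], p_fail: Sequence[float],
                      fail_weight: float) -> float:
    """Binary log loss in which each fail sample counts `fail_weight` times."""
    eps = 1e-12  # keeps log() away from 0
    loss_sum, weight_sum = 0.0, 0.0
    for t, p in zip(y_true, p_fail):
        p = min(max(p, eps), 1.0 - eps)
        w = fail_weight if t == FAIL else 1.0
        loss_sum += -w * (t * math.log(p) + (1 - t) * math.log(1.0 - p))
        weight_sum += w
    return loss_sum / weight_sum


# Toy data (hypothetical): 8 pass samples, 2 fail samples.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]
recalls = per_class_recall(y_true, y_pred)

# A model that assigns low fail probability everywhere, including to true fails.
p_fail = [0.1] * 8 + [0.2, 0.2]
unweighted = weighted_log_loss(y_true, p_fail, fail_weight=1.0)
# Common heuristic starting point: weight = (number of pass) / (number of fail) = 4.
weighted = weighted_log_loss(y_true, p_fail, fail_weight=4.0)
```

With the larger `fail_weight`, the same under-confident predictions on fail samples incur a higher loss, which is the mechanism that pushes a cost-sensitive learner to pay attention to the minority class.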
Pages: 552-565
Page count: 14