Bayesian Optimization Cost-Sensitive XGBoost Learning Algorithm for Imbalanced Data in Semiconductor Industry

Cited by: 0
Authors
Shamsudin, Haziqah [1]
Yusof, Umi Kalsom [1]
Kashif, Fizza [1]
Isa, Iza Sazanita [1, 2]
Affiliations
[1] Univ Sains Malaysia, Sch Comp Sci, George Town, Malaysia
[2] Univ Teknol MARA, Coll Engn, Ctr Elect Engn Studies, George Town, Malaysia
Source
JORDAN JOURNAL OF ELECTRICAL ENGINEERING | 2023 / Vol. 9 / No. 04
Keywords
XGBoost learning algorithm; Cost-sensitivity; Imbalanced data; Semiconductor classification; Ensembled model; CLASSIFICATION;
DOI
10.5455/jjee.204-1671971895
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
This paper proposes an improved ensemble learning model that combines extreme gradient boosting (XGBoost) with Bayesian optimization and cost-sensitive learning to handle highly imbalanced data in the semiconductor process. The goal is to achieve the highest possible pass and fail accuracy (that is, per-class recall) in classification. Most existing models are biased toward the majority class and neglect the minority class. The proposed Bayesian optimization cost-sensitive XGBoost model is configured for, and applied to, a semiconductor dataset. Experimental results on a benchmark semiconductor industry dataset show pass and fail accuracies of 91.46% and 23.08%, respectively. This confirms that the proposed model is valuable for imbalanced cases in semiconductor applications. Moreover, the investigation shows that the proposed model not only maintains performance on the majority class but also classifies the minority class well.
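The pass and fail accuracies reported in the abstract are per-class recalls, and cost-sensitive learning generally relies on a weight that penalizes minority-class errors more heavily. The following is a minimal sketch of both quantities, not the authors' implementation. The label convention (0 = pass, 1 = fail), the toy labels, and the helper names `per_class_recall` and `imbalance_ratio` are assumptions made for illustration. The negative-to-positive ratio is a common starting value for XGBoost's `scale_pos_weight` parameter; whether the paper uses it, or which values its Bayesian optimization selects, is not stated in this record.

```python
from collections import Counter


def per_class_recall(y_true, y_pred):
    """Return {label: recall}, the fraction of each true class predicted correctly."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


def imbalance_ratio(y_true, positive=1):
    """Ratio of negative to positive samples (assumed cost weight for the minority class)."""
    n_pos = sum(1 for t in y_true if t == positive)
    if n_pos == 0:
        raise ValueError("no positive (minority) samples present")
    return (len(y_true) - n_pos) / n_pos


# Hypothetical toy labels, NOT data from the paper: 0 = pass, 1 = fail.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 7 + [1] + [1, 0]

recalls = per_class_recall(y_true, y_pred)  # pass recall under key 0, fail recall under key 1
weight = imbalance_ratio(y_true)            # candidate cost weight for the fail class

print(recalls, weight)
```

Reporting recall separately for each class, rather than a single overall accuracy, keeps a weak minority-class result from being hidden by the majority class.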
Pages: 552-565
Page count: 14