Semi-supervised support vector regression based on self-training with label uncertainty: An application to virtual metrology in semiconductor manufacturing

Cited by: 79
Authors
Kang, Pilsung [1]
Kim, Dongil [2]
Cho, Sungzoon [3]
Affiliations
[1] Korea Univ, Sch Ind Management Engn, Seoul 02841, South Korea
[2] Korea Inst Ind Technol, Smart Mfg Technol Grp, Cheonan 31056, South Korea
[3] Seoul Natl Univ, Dept Ind Engn, Seoul 08826, South Korea
Funding
National Research Foundation, Singapore;
Keywords
Semi-supervised learning; Support vector regression; Probabilistic local reconstruction; Data generation; Virtual metrology; Semiconductor manufacturing; RECONSTRUCTION; SVM;
DOI
10.1016/j.eswa.2015.12.027
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Dataset sizes continue to grow as data are collected from numerous applications. Because collecting labeled data is expensive and time-consuming, the amount of unlabeled data is increasing. Semi-supervised learning (SSL) has been proposed to improve conventional supervised learning methods by training from both unlabeled and labeled data. In contrast to classification problems, the estimation of labels for unlabeled data presents added uncertainty for regression problems. In this paper, a semi-supervised support vector regression (SS-SVR) method based on self-training is proposed. The proposed method addresses the uncertainty of the estimated labels for unlabeled data. To measure labeling uncertainty, the label distribution of the unlabeled data is estimated with two probabilistic local reconstruction (PLR) models. Then, the training data are generated by oversampling from the unlabeled data and their estimated label distribution. The sampling rate varies with the labeling uncertainty. Finally, expected margin-based pattern selection (EMPS) is employed to reduce training complexity. We verify the proposed method with 30 regression datasets and a real-world problem: virtual metrology (VM) in semiconductor manufacturing. The experimental results show that the proposed method improves the accuracy by 8% compared with conventional supervised SVR, and the training time for the proposed method is 20% shorter than that of the benchmark methods. (C) 2015 Elsevier Ltd. All rights reserved.
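The oversampling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `oversample_with_uncertainty`, the `max_copies` parameter, and the linear rate rule are hypothetical, and the sketch assumes that points with lower estimated label variance are sampled more often (the abstract only states that the rate depends on uncertainty). The per-point label mean and variance stand in for the output of the PLR models, which are not implemented here, as are SVR training and EMPS.

```python
import random


def oversample_with_uncertainty(means, variances, max_copies=5, seed=0):
    """Draw pseudo-labels for unlabeled points from their estimated label distribution.

    means[i], variances[i]: assumed Gaussian label estimate for unlabeled point i
    (a stand-in for the PLR output). Returns a list of (point_index, sampled_label).
    """
    rng = random.Random(seed)
    v_max = max(variances)
    generated = []
    for i, (mu, var) in enumerate(zip(means, variances)):
        # Hypothetical rate rule: 1 copy for the most uncertain point,
        # max_copies for a point with zero variance, linear in between.
        confidence = 1.0 - (var / v_max if v_max > 0 else 0.0)
        n_copies = 1 + round(confidence * (max_copies - 1))
        for _ in range(n_copies):
            # Sample a label from N(mu, var) for each generated copy.
            generated.append((i, rng.gauss(mu, var ** 0.5)))
    return generated


if __name__ == "__main__":
    samples = oversample_with_uncertainty([1.0, 2.0, 3.0], [0.0, 0.5, 1.0])
    counts = [sum(1 for i, _ in samples if i == k) for k in range(3)]
    print(counts)  # copies generated per unlabeled point
```

In a full pipeline, the generated pairs would be appended to the labeled set before retraining the regressor; a different rate rule can be swapped in without changing the rest of the sketch.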
Pages: 85-106
Number of pages: 22
Related papers
50 in total
  • [41] Semi-supervised Discriminant Analysis and Sparse Representation-based self-training for Face Recognition
    Jiang, Jiang
    Gan, Haitao
    Jiang, Liangwei
    Gao, Changxin
    Sang, Nong
    OPTIK, 2014, 125 (09): : 2170 - 2174
  • [42] A Three-Stage Self-Training Framework for Semi-Supervised Semantic Segmentation
    Ke, Rihuan
    Aviles-Rivero, Angelica, I
    Pandey, Saurabh
    Reddy, Saikumar
    Schonlieb, Carola-Bibiane
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31 : 1805 - 1815
  • [43] Pseudo Label Fusion With Uncertainty Estimation for Semi-Supervised Cropping Box Regression
    Pan, Zhiyu
    Cui, Jiahao
    Wang, Kewei
    Wu, Yizheng
    Cao, Zhiguo
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 8157 - 8171
  • [44] STDPboost: A Self-Training Method Based on Density Peaks and Improved Adaboost for Semi-Supervised Classification
    Lin, Xu
    Li, Junnan
    IEEE ACCESS, 2023, 11 : 72974 - 72989
  • [45] A Robust Semi-Supervised Broad Learning System Guided by Ensemble-Based Self-Training
    Guo, Jifeng
    Chen, C. L. Philip
    IEEE TRANSACTIONS ON CYBERNETICS, 2024, 54 (11) : 6410 - 6422
  • [46] Joint Self-Training and Rebalanced Consistency Learning for Semi-Supervised Change Detection
    Zhang, Xueting
    Huang, Xin
    Li, Jiayi
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61
  • [47] Semi-supervised Multitask Learning via Self-training and Maximum Entropy Discrimination
    Chao, Guoqing
    Sun, Shiliang
    NEURAL INFORMATION PROCESSING, ICONIP 2012, PT III, 2012, 7665 : 340 - 347
  • [48] A novel semi-supervised self-training method based on resampling for Twitter fake account identification
    Zeng, Ziming
    Li, Tingting
    Sun, Shouqiang
    Sun, Jingjing
    Yin, Jie
    DATA TECHNOLOGIES AND APPLICATIONS, 2022, 56 (03) : 409 - 428
  • [49] Granulation-based self-training for the semi-supervised classification of remote-sensing images
    Prem Shankar Singh Aydav
    Sonajharia Minz
    Granular Computing, 2020, 5 : 309 - 327
  • [50] Combination of Active Learning and Semi-Supervised Learning under a Self-Training Scheme
    Fazakis, Nikos
    Kanas, Vasileios G.
    Aridas, Christos K.
    Karlos, Stamatis
    Kotsiantis, Sotiris
    ENTROPY, 2019, 21 (10)