Evolutionary optimization based pseudo labeling for semi-supervised soft sensor development of industrial processes

Cited by: 35
Authors
Jin, Huaiping [1, 2]
Li, Zheng [1, 2]
Chen, Xiangguang [3]
Qian, Bin [1, 2]
Yang, Biao [1, 2]
Yang, Jianwen [4]
Affiliations
[1] Kunming Univ Sci & Technol, Fac Informat Engn & Automat, Dept Automat, Kunming 650500, Yunnan, Peoples R China
[2] Kunming Univ Sci & Technol, Yunnan Key Lab Artificial Intelligence, Kunming 650500, Yunnan, Peoples R China
[3] Beijing Inst Technol, Sch Chem & Chem Engn, Dept Chem Engn, Beijing 100081, Peoples R China
[4] Sichuan Haite Yamei Aviat Technol Co Ltd, Chengdu 610041, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Soft sensor; Semi-supervised learning; Pseudo labeling; Evolutionary optimization; Ensemble learning; EXTREME LEARNING-MACHINE; MODELING METHOD; REGRESSION; FRAMEWORK; PREDICTION; MIXTURE; REGULARIZATION; CLASSIFIER; ANALYTICS;
DOI
10.1016/j.ces.2021.116560
Chinese Library Classification number
TQ [Chemical Industry];
Discipline classification code
0817;
Abstract
Data-based soft sensors are widely used in industrial processes for online prediction of difficult-to-measure variables. However, many practical processes are rich in unlabeled data but poor in labeled data, which has become the main bottleneck in developing high-performance data-based soft sensors. To address this issue, two novel semi-supervised soft sensor methods are proposed: the evolutionary optimization based pseudo labeling method (EOPL) and the ensemble EOPL method (EnEOPL). The proposed methods first formulate pseudo labeling of the unlabeled data as an optimization problem in which the labels of the unlabeled data (i.e., the pseudo labels) serve as the decision variables. An evolutionary optimization approach, with Gaussian process regression (GPR) as the base learner, is then used to solve this problem. Next, a new GPR model is built on an enlarged labeled training set that combines the labeled data with the high-confidence pseudo-labeled data. Furthermore, EOPL is extended to EnEOPL within an ensemble learning framework to enhance prediction performance. Two case studies demonstrate that the proposed methods are superior to traditional pseudo-labeling semi-supervised methods. (C) 2021 Elsevier Ltd. All rights reserved.
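The pseudo-labeling idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not state the objective function, the evolutionary algorithm, or the confidence criterion, so the following are all assumptions made for illustration: the fitness (train a GPR on the pseudo-labeled points only and score it on the true labeled points), the simple elitist evolution strategy with Gaussian mutation, the fixed RBF kernel, the toy sine data, and every function name. The high-confidence filtering step is omitted, so all pseudo labels are merged into the enlarged training set.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    """RBF kernel matrix between two 1-D input arrays."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def gpr_predict(x_train, y_train, x_test, noise=1e-2):
    """Posterior mean of a zero-mean GP with fixed hyperparameters."""
    k = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    alpha = np.linalg.solve(k, y_train)
    return rbf_kernel(x_test, x_train) @ alpha

def fitness(pseudo_y, x_unlab, x_lab, y_lab):
    """Assumed objective: fit on pseudo-labeled points, score on true labels."""
    pred = gpr_predict(x_unlab, pseudo_y, x_lab)
    return float(np.mean((pred - y_lab) ** 2))

def evolve_pseudo_labels(x_lab, y_lab, x_unlab, pop_size=20,
                         generations=40, sigma=0.2, seed=0):
    """Elitist evolution strategy; decision variables are the pseudo labels."""
    rng = np.random.default_rng(seed)
    # Seed the population around the supervised GPR predictions.
    init = gpr_predict(x_lab, y_lab, x_unlab)
    pop = init + sigma * rng.standard_normal((pop_size, len(x_unlab)))
    pop[0] = init
    history = []
    for _ in range(generations):
        # Gaussian mutation produces one child per parent.
        children = pop + sigma * rng.standard_normal(pop.shape)
        merged = np.vstack([pop, children])
        scores = np.array([fitness(ind, x_unlab, x_lab, y_lab) for ind in merged])
        # Keep the best pop_size individuals from parents and children.
        order = np.argsort(scores)[:pop_size]
        pop = merged[order]
        history.append(float(scores[order[0]]))
    return pop[0], history

# Toy data (hypothetical): few labeled points, more unlabeled points.
x_lab = np.linspace(0.0, 6.0, 6)
y_lab = np.sin(x_lab)
x_unlab = np.linspace(0.3, 5.7, 15)

best_pseudo_y, history = evolve_pseudo_labels(x_lab, y_lab, x_unlab)

# Final model: GPR on the enlarged set (labeled plus pseudo-labeled data).
x_all = np.concatenate([x_lab, x_unlab])
y_all = np.concatenate([y_lab, best_pseudo_y])
x_query = np.linspace(0.0, 6.0, 50)
y_query = gpr_predict(x_all, y_all, x_query)
```

Because selection keeps the best individuals from parents and children together, the best fitness in `history` never gets worse from one generation to the next. A practical version would also tune the GPR hyperparameters and add a confidence filter before merging the pseudo labels.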
Pages: 16