Augmented Industrial Data-Driven Modeling Under the Curse of Dimensionality

Cited by: 20
Authors
Jiang, Xiaoyu [1, 2]
Kong, Xiangyin [1, 2]
Ge, Zhiqiang [1, 2]
Affiliations
[1] Zhejiang Univ, Coll Control Sci & Engn, State Key Lab Ind Control Technol, Hangzhou 310027, Peoples R China
[2] Peng Cheng Lab, Shenzhen 518000, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Curse of dimensionality; data augmentation; data-driven modeling; industrial processes; machine learning; FAULT-DIAGNOSIS; KERNEL;
DOI
10.1109/JAS.2023.123396
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The curse of dimensionality refers to the problem of increased sparsity and computational complexity when dealing with high-dimensional data. In recent years, the types and variables of industrial data have increased significantly, making data-driven models more challenging to develop. To address this problem, data augmentation technology has been introduced as an effective tool to solve the sparsity problem of high-dimensional industrial data. This paper systematically explores and discusses the necessity, feasibility, and effectiveness of augmented industrial data-driven modeling in the context of the curse of dimensionality and virtual big data. Then, the process of data augmentation modeling is analyzed, and the concept of data boosting augmentation is proposed. Data boosting augmentation involves designing the reliability weight and actual-virtual weight functions, and developing a double weighted partial least squares model to optimize the three stages of data generation, data fusion, and modeling. This approach significantly improves the interpretability, effectiveness, and practicality of data augmentation in industrial modeling. Finally, the proposed method is verified using practical examples of fault diagnosis systems and virtual measurement systems in industry. The results demonstrate the effectiveness of the proposed approach in improving the accuracy and robustness of data-driven models, making them more suitable for real-world industrial applications.
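The sparsity effect named in the first sentence of the abstract can be illustrated with a small numerical sketch. This is not the paper's method; it only shows, on uniformly random synthetic data, that with a fixed sample count the nearest and farthest points from a query become almost equally distant as the dimension grows. The function name `nearest_to_farthest_ratio`, the sample count of 500, and the chosen dimensions are assumptions made for this illustration.

```python
import numpy as np


def nearest_to_farthest_ratio(n_samples, n_dims, seed=0):
    """Ratio of nearest to farthest Euclidean distance from a random query.

    A ratio close to 1 means all samples are about equally far away,
    i.e. the data no longer has meaningful "near" neighbours.
    """
    rng = np.random.default_rng(seed)
    # Synthetic data: points drawn uniformly from the unit hypercube.
    points = rng.uniform(size=(n_samples, n_dims))
    query = rng.uniform(size=n_dims)
    dists = np.linalg.norm(points - query, axis=1)
    return dists.min() / dists.max()


# Same number of samples, increasing number of variables.
ratios = {d: nearest_to_farthest_ratio(500, d) for d in (2, 10, 100, 1000)}
for d, r in ratios.items():
    print(f"dims={d:5d}  nearest/farthest distance ratio={r:.3f}")
```

With the sample count held fixed, the ratio moves toward 1 as the dimension increases, which is one way to see why a fixed number of real samples covers a high-dimensional variable space poorly and why generating additional (virtual) samples is attractive.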
Pages: 1445-1461
Page count: 17