Review and big data perspectives on robust data mining approaches for industrial process modeling with outliers and missing data

Cited by: 212
Authors
Zhu, Jinlin [1, 2]
Ge, Zhiqiang [1 ]
Song, Zhihuan [1 ]
Gao, Furong [2 ]
Affiliations
[1] Zhejiang Univ, Coll Control Sci & Engn, State Key Lab Ind Control Technol, Hangzhou, Zhejiang, Peoples R China
[2] Hong Kong Univ Sci & Technol, Dept Chem & Biol Engn, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Data mining; Robustness; Process modeling; Statistical process monitoring; Big data analytics; PRINCIPAL COMPONENT ANALYSIS; SUPPORT VECTOR REGRESSION; EXTREME LEARNING-MACHINE; PROJECTION-PURSUIT APPROACH; SENSOR NETWORK DESIGN; FAULT-DIAGNOSIS; DATA-DRIVEN; FEATURE-SELECTION; KALMAN FILTER; SOFT SENSOR;
DOI
10.1016/j.arcontrol.2018.09.003
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Industrial process data are usually mixed with missing data and outliers, which can greatly degrade the statistical explanatory ability of traditional data-driven modeling methods. In this sense, more attention should be paid to robust data mining methods so as to investigate stable and reliable modeling prototypes for decision-making. This paper gives a systematic review of various state-of-the-art data preprocessing techniques as well as robust principal component analysis methods for process understanding and monitoring applications. Afterwards, comprehensive robust techniques are discussed for various circumstances with diverse process characteristics. Finally, big data perspectives on potential challenges and opportunities are highlighted for future exploration in the community. (C) 2018 Elsevier Ltd. All rights reserved.
Pages: 107-133
Page count: 27