Industrial PLS model variable selection using moving window variable importance in projection

Cited by: 50
Authors
Lu, Bo [1]
Castillo, Ivan [2]
Chiang, Leo [2]
Edgar, Thomas F. [1]
Affiliations
[1] Univ Texas Austin, McKetta Dept Chem Engn, Austin, TX 78712 USA
[2] Dow Chem Co USA, Analyt Technol Ctr, Freeport, TX 77541 USA
Keywords
PLS regression; Variable selection; Model reduction; Partial least squares; Multivariate statistics; Process monitoring; Soft sensors; Inferential sensors; LEAST-SQUARES REGRESSION; GENETIC ALGORITHM; ELIMINATION; IDENTIFICATION; SENSORS
DOI
10.1016/j.chemolab.2014.03.020
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Soft sensors (or inferential sensors) have been demonstrated to be an effective solution for monitoring quality performance and control applications in the chemical industry. One of the key issues during the development of soft sensor models is the selection of relevant variables from a large array of measurements. A subset of variables that are selected based on first principles and statistical correlations eases the model development process. The resulting model will perform better and will be easier to maintain during the deployment stage. In the current literature, data-driven variable selection methods have been investigated within the context of spectroscopic data and bioinformatics. In these studies, the variable selection methods assume that the inherent correlation in the entire data set remains fixed. This is not the case in common industrial processes. In this paper, existing variable selection methods based on partial least squares (PLS) will first be evaluated. Second, we will present a new approach called moving window variable importance in projection (MW-VIP) to target the selection of correlations present in segments or small clusters. Finally, a set of new evaluation criteria will be presented along with industrial data set modeling results to demonstrate the effectiveness of our proposed approach. (C) 2014 Elsevier B.V. All rights reserved.
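The abstract names MW-VIP but does not spell out the algorithm. As a rough illustration only, the sketch below computes standard VIP scores for a single-response PLS model (fitted with NIPALS, in pure Python) and repeats that calculation over consecutive windows of samples. The function names `pls1_vip` and `moving_window_vip`, the `window` and `step` parameters, and the choice of mean-centering without scaling are assumptions of this sketch, not details taken from the paper.

```python
import math


def pls1_vip(X, y, n_components=2):
    """Standard VIP scores for a single-response PLS model (NIPALS).

    X is a list of rows (n samples by p variables); y is a list of n responses.
    Assumption of this sketch: columns are mean-centered but not scaled.
    """
    n, p = len(X), len(X[0])
    x_means = [sum(row[j] for row in X) / n for j in range(p)]
    E = [[row[j] - x_means[j] for j in range(p)] for row in X]
    y_mean = sum(y) / n
    f = [v - y_mean for v in y]

    weights, explained = [], []
    for _ in range(n_components):
        # Weight vector: normalized covariance of each variable with y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:
            break  # no remaining covariance between X and y
        w = [v / norm for v in w]
        # Scores, X loadings, and the y loading for this component.
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]
        tt = sum(v * v for v in t)
        if tt < 1e-12:
            break
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate X and y before extracting the next component.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        weights.append(w)
        explained.append(q * q * tt)  # y variance explained by this component

    total = sum(explained)
    if total == 0.0:
        return [0.0] * p
    return [
        math.sqrt(p * sum(ss * w[j] ** 2 for ss, w in zip(explained, weights)) / total)
        for j in range(p)
    ]


def moving_window_vip(X, y, window, step, n_components=2):
    """VIP scores per window of consecutive samples (hypothetical windowing scheme)."""
    return [
        pls1_vip(X[s:s + window], y[s:s + window], n_components)
        for s in range(0, len(X) - window + 1, step)
    ]
```

Each inner list returned by `moving_window_vip` holds one VIP score per variable for that window, so a variable that matters only during part of the record can show a high score in some windows and a low score in others. How the paper combines or thresholds the per-window scores is not stated in the abstract and is left out here.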
Pages: 90-109
Page count: 20