Improved variable reduction in partial least squares modelling based on Predictive-Property-Ranked Variables and adaptation of partial least squares complexity

Cited by: 50
Authors
Andries, Jan P. M. [2 ]
Vander Heyden, Yvan [3 ]
Buydens, Lutgarde M. C. [1 ]
Affiliations
[1] Radboud Univ Nijmegen, Inst Mol & Mat, NL-6525 AJ Nijmegen, Netherlands
[2] Univ Profess Educ, Dept Life Sci, Avans Hgsk, NL-4800 RA Breda, Netherlands
[3] Vrije Univ Brussel VIB, Dept Analyt Chem & Pharmaceut Technol, Pharmaceut Res Ctr, B-1090 Brussels, Belgium
Keywords
Variable reduction; PLS1; PPRVR-CAM; UVE-GA-PLS; UVE-iPLS; Wilcoxon signed rank test; NEAR-INFRARED SPECTROSCOPY; RELEVANT SPECTRAL REGIONS; SELECTIVITY RATIO PLOT; MULTIVARIATE CALIBRATION; WAVELENGTH SELECTION; PLS-REGRESSION; GENETIC ALGORITHMS; UNINFORMATIVE VARIABLES; EXPLANATORY VARIABLES; RETENTION PREDICTION;
DOI
10.1016/j.aca.2011.06.037
Chinese Library Classification number
O65 [Analytical Chemistry];
Discipline classification code
070302 ; 081704 ;
Abstract
The calibration performance of partial least squares for one response variable (PLS1) can be improved by elimination of uninformative variables. Many methods are based on so-called predictive variable properties, which are functions of various PLS-model parameters, and which may change during the variable reduction process. In these methods variable reduction is made on the variables ranked in descending order for a given variable property. The methods start with full spectrum modelling. Iteratively, until a specified number of remaining variables is reached, the variable with the smallest property value is eliminated; a new PLS model is calculated, followed by a renewed ranking of the variables. The Stepwise Variable Reduction methods using Predictive-Property-Ranked Variables are denoted as SVR-PPRV. In the existing SVR-PPRV methods the PLS model complexity is kept constant during the variable reduction process. In this study, three new SVR-PPRV methods are proposed, in which a possibility for decreasing the PLS model complexity during the variable reduction process is built in. Therefore, we denote our methods as PPRVR-CAM methods (Predictive-Property-Ranked Variable Reduction with Complexity Adapted Models). The selective and predictive abilities of the new methods are investigated and tested, using the absolute PLS regression coefficients as the predictive property. They were compared with two modifications of existing SVR-PPRV methods (with constant PLS model complexity) and with two reference methods: uninformative variable elimination followed by either a genetic algorithm for PLS (UVE-GA-PLS) or an interval PLS (UVE-iPLS). The performance of the methods is investigated in conjunction with two data sets from near-infrared sources (NIR) and one simulated set. The selective and predictive performances of the variable reduction methods are compared statistically using the Wilcoxon signed rank test.
The three newly developed PPRVR-CAM methods were able to retain significantly smaller numbers of informative variables than the existing SVR-PPRV, UVE-GA-PLS and UVE-iPLS methods without loss of prediction ability. Contrary to UVE-GA-PLS and UVE-iPLS, there is no variability in the number of retained variables in each PPRV(R) method. Renewed variable ranking, after deletion of a variable, followed by remodelling, combined with the possibility to decrease the PLS model complexity, is beneficial. A preferred PPRVR-CAM method is proposed. (C) 2011 Elsevier B.V. All rights reserved.
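The stepwise elimination loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes NumPy, a plain NIPALS PLS1 fit on mean-centred data, and the absolute regression coefficient as the predictive property. The function names `pls1_coefficients` and `stepwise_reduction` and the simulated data are hypothetical. The abstract does not state the rule by which PPRVR-CAM lowers the model complexity, so the sketch keeps the complexity fixed (as in the existing SVR-PPRV methods) and only caps it at the number of remaining variables.

```python
import numpy as np


def pls1_coefficients(X, y, n_components):
    """Return PLS1 regression coefficients (NIPALS, mean-centred data)."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    n_vars = X.shape[1]
    W = np.zeros((n_vars, n_components))  # weight vectors
    P = np.zeros((n_vars, n_components))  # X loadings
    q = np.zeros(n_components)            # y loadings
    for a in range(n_components):
        w = Xc.T @ yc
        w = w / np.linalg.norm(w)
        t = Xc @ w                        # score vector
        tt = t @ t
        p = Xc.T @ t / tt
        q[a] = yc @ t / tt
        Xc = Xc - np.outer(t, p)          # deflate X
        yc = yc - q[a] * t                # deflate y
        W[:, a], P[:, a] = w, p
    # b = W (P^T W)^{-1} q
    return W @ np.linalg.solve(P.T @ W, q)


def stepwise_reduction(X, y, n_components, n_keep):
    """Drop the variable with the smallest |b| one at a time, refitting
    and re-ranking after every deletion (assumed fixed complexity)."""
    active = list(range(X.shape[1]))
    while len(active) > n_keep:
        a = min(n_components, len(active))
        b = pls1_coefficients(X[:, active], y, a)
        del active[int(np.argmin(np.abs(b)))]
    return active


# Hypothetical simulated data: only columns 2 and 7 carry information.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = 3.0 * X[:, 2] - 2.0 * X[:, 7] + 0.1 * rng.normal(size=200)
kept = stepwise_reduction(X, y, n_components=2, n_keep=2)
print(sorted(kept))
```

Because every deletion is followed by a refit, the ranking used for the next deletion always reflects the current model. The number of retained variables is fixed by `n_keep`, which matches the abstract's remark that the PPRV(R) methods show no variability in the number of retained variables.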
Pages: 292-305
Number of pages: 14