Improved variable reduction in partial least squares modelling based on Predictive-Property-Ranked Variables and adaptation of partial least squares complexity

Cited by: 50
Authors
Andries, Jan P. M. [2]
Vander Heyden, Yvan [3]
Buydens, Lutgarde M. C. [1]
Affiliations
[1] Radboud Univ Nijmegen, Inst Mol & Mat, NL-6525 AJ Nijmegen, Netherlands
[2] Univ Profess Educ, Dept Life Sci, Avans Hgsk, NL-4800 RA Breda, Netherlands
[3] Vrije Univ Brussel VIB, Dept Analyt Chem & Pharmaceut Technol, Pharmaceut Res Ctr, B-1090 Brussels, Belgium
Keywords
Variable reduction; PLS1; PPRVR-CAM; UVE-GA-PLS; UVE-iPLS; Wilcoxon signed rank test; NEAR-INFRARED SPECTROSCOPY; RELEVANT SPECTRAL REGIONS; SELECTIVITY RATIO PLOT; MULTIVARIATE CALIBRATION; WAVELENGTH SELECTION; PLS-REGRESSION; GENETIC ALGORITHMS; UNINFORMATIVE VARIABLES; EXPLANATORY VARIABLES; RETENTION PREDICTION;
DOI
10.1016/j.aca.2011.06.037
Chinese Library Classification number
O65 [Analytical Chemistry];
Discipline classification code
070302 ; 081704 ;
Abstract
The calibration performance of partial least squares for one response variable (PLS1) can be improved by eliminating uninformative variables. Many methods are based on so-called predictive variable properties, which are functions of various PLS-model parameters and which may change during the variable reduction process. In these methods, variable reduction is performed on the variables ranked in descending order of a given variable property. The methods start with full-spectrum modelling. Iteratively, until a specified number of remaining variables is reached, the variable with the smallest property value is eliminated and a new PLS model is calculated, followed by a renewed ranking of the variables. The Stepwise Variable Reduction methods using Predictive-Property-Ranked Variables are denoted as SVR-PPRV. In the existing SVR-PPRV methods, the PLS model complexity is kept constant during the variable reduction process. In this study, three new SVR-PPRV methods are proposed, in which a possibility for decreasing the PLS model complexity during the variable reduction process is built in. Therefore we denote our methods as PPRVR-CAM methods (Predictive-Property-Ranked Variable Reduction with Complexity Adapted Models). The selective and predictive abilities of the new methods are investigated and tested, using the absolute PLS regression coefficients as the predictive property. They are compared with two modifications of existing SVR-PPRV methods (with constant PLS model complexity) and with two reference methods: uninformative variable elimination followed by either a genetic algorithm for PLS (UVE-GA-PLS) or an interval PLS (UVE-iPLS). The performance of the methods is investigated on two near-infrared (NIR) data sets and one simulated data set. The selective and predictive performances of the variable reduction methods are compared statistically using the Wilcoxon signed rank test.
The three newly developed PPRVR-CAM methods were able to retain significantly smaller numbers of informative variables than the existing SVR-PPRV, UVE-GA-PLS and UVE-iPLS methods, without loss of prediction ability. In contrast to UVE-GA-PLS and UVE-iPLS, there is no variability in the number of retained variables in each PPRV(R) method. Renewed variable ranking after deletion of a variable, followed by remodelling and combined with the possibility to decrease the PLS model complexity, is beneficial. A preferred PPRVR-CAM method is proposed. (C) 2011 Elsevier B.V. All rights reserved.
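The stepwise elimination loop described in the abstract can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`pls1_coefficients`, `stepwise_reduction`), the NIPALS-style PLS1 fit, and the synthetic data are all hypothetical choices made for this sketch. It shows the constant-complexity variant only, with the number of components capped by the number of remaining variables; the complexity-adaptation rule of the PPRVR-CAM methods is not specified in the abstract and is therefore not reproduced here.

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def pls1_coefficients(X, y, n_components):
    """Regression coefficients of a PLS1 model (NIPALS) for centered X and y."""
    X = [row[:] for row in X]          # work on copies; X and y are deflated below
    y = list(y)
    b = [0.0] * len(X[0])
    loadings, rotations = [], []
    for _ in range(n_components):
        cols = list(zip(*X))
        w = [dot(col, y) for col in cols]            # weight vector, proportional to X'y
        norm = dot(w, w) ** 0.5
        if norm < 1e-12:                             # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [dot(row, w) for row in X]               # scores
        tt = dot(t, t)
        p = [dot(col, t) / tt for col in cols]       # X loadings
        q = dot(y, t) / tt                           # y loading
        # r maps the original (undeflated) X onto t: r_a = w_a - sum_j (p_j'w_a) r_j
        r = w[:]
        for p_j, r_j in zip(loadings, rotations):
            c = dot(p_j, w)
            r = [ri - c * rj for ri, rj in zip(r, r_j)]
        loadings.append(p)
        rotations.append(r)
        b = [bi + q * ri for bi, ri in zip(b, r)]
        X = [[x - ti * pj for x, pj in zip(row, p)] for row, ti in zip(X, t)]
        y = [yi - q * ti for yi, ti in zip(y, t)]
    return b

def stepwise_reduction(X, y, n_keep, n_components):
    """Drop the variable with the smallest |coefficient|, refit, re-rank, repeat."""
    n = len(X)
    means = [sum(col) / n for col in zip(*X)]
    Xc = [[v - m for v, m in zip(row, means)] for row in X]
    y_mean = sum(y) / n
    yc = [v - y_mean for v in y]
    kept = list(range(len(X[0])))
    while len(kept) > n_keep:
        sub = [[row[j] for j in kept] for row in Xc]
        a = min(n_components, len(kept))   # assumption: only cap complexity, no adaptation
        b = pls1_coefficients(sub, yc, a)
        worst = min(range(len(kept)), key=lambda i: abs(b[i]))
        del kept[worst]
    return kept

# Synthetic illustration (not data from the paper): columns 0 and 1 carry the signal.
random.seed(0)
n_samples, n_vars = 200, 8
X = [[random.gauss(0, 1) for _ in range(n_vars)] for _ in range(n_samples)]
y = [3 * row[0] + 2 * row[1] + random.gauss(0, 0.1) for row in X]
kept = stepwise_reduction(X, y, n_keep=2, n_components=2)
print(kept)
```

Ranking by absolute regression coefficient assumes the variables are on comparable scales, as in the synthetic set above; with differently scaled variables, scaling before modelling would be needed for the ranking to be meaningful.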
Pages: 292-305
Number of pages: 14