Generalization of Powered-Partial-Least-Squares

Cited by: 2
Authors
Lavoie, Francis B. [1]
Muteki, Koji [2]
Gosselin, Ryan [1]
Affiliations
[1] Univ Sherbrooke, Fac Engn, Dept Chem & Biotechnol Engn, 2500 Boul Univ, Sherbrooke, PQ J1K 2R1, Canada
[2] Pfizer Worldwide Res & Dev, SPECTech Grp, Eastern Point Rd, Groton, CT 06340 USA
Keywords
PLS; Robust regression; Spectral analysis; Variable selection; VARIABLE SELECTION METHODS; PLS-REGRESSION; WAVELENGTH SELECTION; MULTIVARIATE CALIBRATION; INFRARED-SPECTROSCOPY; GENETIC ALGORITHMS; REDUCTION; TOOL; QUALITY; MODELS;
DOI
10.1016/j.chemolab.2018.05.006
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Indahl originally proposed a variant of Wold's PLS1 algorithm in which all weight coefficients were modified by an exponent coefficient. This led to Powered-PLS (P-PLS). The aim of this paper is to revisit Indahl's P-PLS algorithm in order to obtain a robust and fast regression methodology that yields easy-to-interpret models. We first demonstrate that P-PLS is in fact a regression based on correlation maximization, but constrained by the weight coefficients originally calculated in standard PLS1. From that, we propose a generalization of P-PLS that replaces the power transformation function with beta Cumulative Density Functions (beta-CDFs), leading to our proposed regression methodology, called beta-PLS. With two public datasets, we demonstrate that P-PLS and, even more so, beta-PLS regressions outperform standard PLS1 in terms of cross-validation performance when the number of calibration observations is much lower than the number of variables in X.
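As a rough illustration of the idea summarized in the abstract, the sketch below computes a first PLS1 weight vector, reshapes it with a power function, and then with a beta CDF. It is not the authors' algorithm: the scaling of the weights by their largest absolute value, the sign handling, the renormalization, and all function names (`pls1_weights`, `transform_weights`, `power`, `beta_cdf`) are assumptions made for this example. The one fact it relies on is that the Beta(a, 1) CDF on [0, 1] equals u^a, so a power transformation is a special case of a beta-CDF transformation.

```python
import math


def normalize(w):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(sum(v * v for v in w))
    return [v / norm for v in w]


def pls1_weights(X, y):
    """First PLS1 weight vector: proportional to X^T y on mean-centered data."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    w = [
        sum((X[i][j] - x_mean[j]) * (y[i] - y_mean) for i in range(n))
        for j in range(p)
    ]
    return normalize(w)


def transform_weights(w, f):
    """Assumed scheme: apply f to |w_j| / max|w|, keep the sign, renormalize."""
    w_max = max(abs(v) for v in w)
    return normalize([math.copysign(f(abs(v) / w_max), v) for v in w])


def power(gamma):
    """Power transformation u -> u**gamma on [0, 1]."""
    return lambda u: u ** gamma


def beta_cdf(a, b, steps=2000):
    """Midpoint-rule approximation of the Beta(a, b) CDF on [0, 1]."""
    const = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))

    def cdf(u):
        if u <= 0.0:
            return 0.0
        h = u / steps
        total = 0.0
        for k in range(steps):
            t = (k + 0.5) * h  # midpoint of the k-th sub-interval
            total += t ** (a - 1) * (1.0 - t) ** (b - 1)
        return const * h * total

    return cdf


# Small made-up dataset: 4 observations, 3 variables.
X = [[1, 0, 2], [2, 1, 0], [3, 0, 1], [4, 1, 3]]
y = [1, 2, 3, 4]

w = pls1_weights(X, y)
w_power = transform_weights(w, power(3.0))
w_beta = transform_weights(w, beta_cdf(3.0, 1.0))
```

With an exponent above 1, small weights shrink faster than large ones, so the transformed vector concentrates on the variables most strongly related to y. A beta CDF with two shape parameters allows more flexible reshaping than a single exponent.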
Pages: 1-11
Number of pages: 11