A modification of the ICOMP criterion for estimation of optimum complexity of PCR models

Cited by: 2
Authors
Capron, X
Walczak, B
de Noord, OE
Massart, DL
Affiliations
[1] Vrije Univ Brussels, Inst Pharmaceut, ChemoAC, Dept Pharmaceut & Biomed Anal, B-1090 Brussels, Belgium
[2] Shell Global Solut Int BV, Shell Res & Technol Ctr, NL-1030 BN Amsterdam, Netherlands
Keywords
complexity optimization; cross-validation; information criterion; atypical samples
DOI
10.1002/cem.934
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
The predictive ability of a principal component regression (PCR) bilinear model depends strongly on the number of latent variables selected. A non-optimal complexity is likely to yield unsatisfactory predictions because of high bias or high variance in the regression coefficients. Popular cross-validation methods such as leave-one-out cross-validation (LOOCV) and Monte-Carlo cross-validation (MCCV) do not always retain the proper number of latent variables, especially when atypical samples are present in the data. They are also computationally intensive, particularly for large data sets. In this study, the information complexity criterion ICOMP is modified in order to select the optimal PCR model. The results demonstrate that this information criterion performs at least as well as the cross-validation approaches, and usually outperforms them in terms of model selection and computation time, whether or not atypical samples are present in the data. Copyright (C) 2006 John Wiley & Sons, Ltd.
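To make the complexity-selection problem concrete, the following is a minimal sketch of the LOOCV baseline that the abstract compares against: a PCR model is refitted with each sample left out in turn, for every candidate number of latent variables, and the complexity with the lowest cross-validated error is retained. This is not the authors' implementation, and the modified ICOMP criterion itself is not shown because the abstract does not give its formula. The function names (`pcr_fit`, `pcr_predict`, `loocv_rmse`) and the synthetic three-factor data are assumptions made purely for illustration.

```python
import numpy as np


def pcr_fit(X, y, n_components):
    """Fit PCR: center the data, project X on its leading principal
    components, and regress y on the resulting scores."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    # Principal components come from the SVD of the centered X.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:n_components].T                  # loadings, shape (p, k)
    T = Xc @ V                               # scores, shape (n, k)
    # Scores are orthogonal with T.T @ T = diag(s**2), so least squares
    # on the scores reduces to an element-wise division.
    q = (T.T @ yc) / (s[:n_components] ** 2)
    b = V @ q                                # coefficients in X space
    return x_mean, y_mean, b


def pcr_predict(model, X):
    x_mean, y_mean, b = model
    return (X - x_mean) @ b + y_mean


def loocv_rmse(X, y, max_components):
    """Leave-one-out RMSE for each complexity 1..max_components."""
    n = X.shape[0]
    press = np.zeros(max_components)
    for i in range(n):
        keep = np.arange(n) != i             # drop sample i
        for k in range(1, max_components + 1):
            model = pcr_fit(X[keep], y[keep], k)
            pred = pcr_predict(model, X[i:i + 1])[0]
            press[k - 1] += (y[i] - pred) ** 2
    return np.sqrt(press / n)


# Synthetic data (an assumption for illustration): three latent factors
# with decreasing variance, all of which contribute to y.
rng = np.random.default_rng(0)
scores = rng.normal(size=(40, 3)) * np.array([3.0, 2.0, 1.0])
loadings, _ = np.linalg.qr(rng.normal(size=(20, 3)))   # orthonormal columns
X = scores @ loadings.T + 0.05 * rng.normal(size=(40, 20))
y = scores @ np.ones(3) + 0.05 * rng.normal(size=40)

rmse = loocv_rmse(X, y, max_components=6)
best_k = int(np.argmin(rmse)) + 1            # complexity with lowest RMSE
print("LOOCV RMSE per complexity:", np.round(rmse, 3))
print("Selected number of latent variables:", best_k)
```

The nested loop refits the model n times for every candidate complexity, which illustrates why the abstract describes cross-validation as computationally intensive for large data sets. An information criterion, by contrast, is evaluated from a single fit per complexity.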
Pages: 308-316
Number of pages: 9