Construction of stable multivariate calibration models using unsupervised segmented principal component regression

Cited by: 13
Authors
Hemmateenejad, Bahram [1, 2]
Karimi, Sadegh [2]
Affiliations
[1] Shiraz Univ Med Sci, Med & Nat Prod Chem Res Ctr, Shiraz, Iran
[2] Shiraz Univ, Dept Chem, Shiraz, Iran
Keywords
multivariate calibration; principal component regression; segmented PCR; clustering; self-organization map; LEAST-SQUARES REGRESSION; NEAR-INFRARED-SPECTRA; WAVELENGTH SELECTION; CORRELATION RANKING; DATA SETS; PREDICTION; PLS; MIXTURES; PHENOL; PCR;
DOI
10.1002/cem.1390
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
In multivariate spectral calibration by principal component regression (PCR), the principal components (PCs) are calculated from the response data measured at all employed instrument channels; however, some channels are redundant and their responses carry no useful information. Thus, the extracted PCs mix information from both useful and redundant channels. In this work, we propose a segmentation approach based on unsupervised pattern recognition to identify the most informative spectral region and then to construct a stable multivariate calibration model by PCR. In this method, the instrument channels are clustered into different segments via a Kohonen self-organizing map. The spectral data of each segment are then subjected to principal component analysis (PCA), and the derived PCs are used as input variables for an inverse least squares (ILS) regression model employing stepwise selection of the informative PCs. The proposed method was evaluated by the analysis of four simulated and six experimental data sets. It modeled these data sets with lower prediction errors than conventional partial least squares (PLS) and PCR methods. In addition, its prediction ability was better than that of previously reported models for these data sets. Copyright (C) 2011 John Wiley & Sons, Ltd. Supporting information may be found in the online version of this article.
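The pipeline described in the abstract (cluster the channels, run PCA per segment, then select PCs stepwise for a least-squares model) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the map size, the number of PCs per segment, or the stepwise criterion, so the tiny 1-D Kohonen map, the defaults, and the forward selection on residual sum of squares used here are all assumptions, and the function names (som_cluster_channels, forward_select, fit_segmented_pcr, predict) are hypothetical.

```python
import numpy as np

def som_cluster_channels(X, n_segments=3, n_iter=500, seed=0):
    """Assign each channel (column of X) to a node of a small 1-D Kohonen map."""
    rng = np.random.default_rng(seed)
    profiles = (X - X.mean(axis=0)).T              # one centered profile per channel
    norms = np.linalg.norm(profiles, axis=1, keepdims=True)
    profiles = profiles / np.where(norms == 0, 1.0, norms)
    start = rng.choice(len(profiles), size=n_segments, replace=False)
    weights = profiles[start].copy()               # initialise nodes from random channels
    grid = np.arange(n_segments)
    for t in range(n_iter):
        frac = 1.0 - t / n_iter
        lr = 0.5 * frac                            # decaying learning rate (assumed schedule)
        sigma = max(0.5 * n_segments * frac, 0.5)  # decaying neighbourhood width
        p = profiles[rng.integers(len(profiles))]
        winner = np.argmin(np.linalg.norm(weights - p, axis=1))
        h = np.exp(-((grid - winner) ** 2) / (2.0 * sigma ** 2))
        weights += lr * h[:, None] * (p - weights)
    dists = np.linalg.norm(profiles[:, None, :] - weights[None, :, :], axis=2)
    return dists.argmin(axis=1)                    # segment label per channel

def forward_select(T, yc, max_terms=5, tol=0.01):
    """Forward stepwise selection: add the PC that lowers the residual sum of
    squares most; stop when the relative gain is at most tol (assumed rule)."""
    selected, rss = [], float(yc @ yc)
    while len(selected) < max_terms:
        best = None
        for j in range(T.shape[1]):
            if j in selected:
                continue
            A = T[:, selected + [j]]
            coef, *_ = np.linalg.lstsq(A, yc, rcond=None)
            r = yc - A @ coef
            cand = float(r @ r)
            if best is None or cand < best[1]:
                best = (j, cand)
        if best is None or rss - best[1] <= tol * rss:
            break
        selected.append(best[0])
        rss = best[1]
    return selected

def fit_segmented_pcr(X, y, n_segments=3, n_pcs=3, max_terms=5, tol=0.01):
    """Cluster channels, run PCA per segment, regress y on the selected PC scores."""
    labels = som_cluster_channels(X, n_segments)
    x_mean, y_mean = X.mean(axis=0), float(y.mean())
    Xc, yc = X - x_mean, y - y_mean
    blocks = []
    for seg in np.unique(labels):
        cols = np.flatnonzero(labels == seg)
        _, _, Vt = np.linalg.svd(Xc[:, cols], full_matrices=False)
        blocks.append((cols, Vt[:n_pcs].T))        # loadings of the leading PCs
    T = np.hstack([Xc[:, cols] @ V for cols, V in blocks])
    selected = forward_select(T, yc, max_terms, tol)
    if selected:
        coef, *_ = np.linalg.lstsq(T[:, selected], yc, rcond=None)
    else:
        coef = np.zeros(0)                         # nothing selected: predict the mean
    return {"x_mean": x_mean, "y_mean": y_mean, "blocks": blocks,
            "selected": selected, "coef": coef}

def predict(model, X):
    """Project new spectra onto the stored segment loadings and apply the model."""
    Xc = X - model["x_mean"]
    T = np.hstack([Xc[:, cols] @ V for cols, V in model["blocks"]])
    return model["y_mean"] + T[:, model["selected"]] @ model["coef"]
```

In this sketch, X is a samples-by-channels array of spectra and y a vector of reference concentrations; fit_segmented_pcr(X_train, y_train) returns a dictionary that predict(model, X_new) consumes. A cross-validated selection criterion would be a natural replacement for the fixed tolerance used here.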
Pages: 139-150
Number of pages: 12