Key wavelengths screening using competitive adaptive reweighted sampling method for multivariate calibration

Cited by: 1571
Authors
Li, Hongdong [1]
Liang, Yizeng [1]
Xu, Qingsong [2]
Cao, Dongsheng [1]
Affiliations
[1] Cent S Univ, Coll Chem & Chem Engn, Res Ctr Modernizat Tradit Chinese Med, Changsha 410083, Peoples R China
[2] Cent S Univ, Sch Math Sci, Changsha 410083, Peoples R China
Keywords
Wavelength selection; Monte Carlo; Adaptive reweighted sampling; Model sampling; Near infrared; Multivariate calibration; PARTIAL LEAST-SQUARES; UNINFORMATIVE VARIABLE ELIMINATION; SUCCESSIVE PROJECTIONS ALGORITHM; GENETIC-ALGORITHM; REGRESSION APPLICATION; SELECTION; OPTIMIZATION; COMPONENTS; MODELS; TOOL;
DOI
10.1016/j.aca.2009.06.046
Chinese Library Classification number
O65 [Analytical Chemistry];
Discipline classification code
070302 ; 081704 ;
Abstract
By employing the simple but effective principle 'survival of the fittest' on which Darwin's Evolution Theory is based, a novel strategy for selecting an optimal combination of key wavelengths of multi-component spectral data, named competitive adaptive reweighted sampling (CARS), is developed. Key wavelengths are defined as the wavelengths with large absolute coefficients in a multivariate linear regression model, such as partial least squares (PLS). In the present work, the absolute values of the regression coefficients of the PLS model are used as an index for evaluating the importance of each wavelength. Then, based on the importance level of each wavelength, CARS sequentially selects N subsets of wavelengths from N Monte Carlo (MC) sampling runs in an iterative and competitive manner. In each sampling run, a fixed ratio (e.g. 80%) of samples is first randomly selected to establish a calibration model. Next, based on the regression coefficients, a two-step procedure consisting of exponentially decreasing function (EDF) based enforced wavelength selection and adaptive reweighted sampling (ARS) based competitive wavelength selection is adopted to select the key wavelengths. Finally, cross validation (CV) is applied to choose the subset with the lowest root mean square error of CV (RMSECV). The performance of the proposed procedure is evaluated using one simulated dataset together with one near infrared dataset of two properties. The results reveal an outstanding characteristic of CARS: it can usually locate an optimal combination of some key wavelengths which are interpretable in terms of the chemical property of interest. Additionally, our study shows that better prediction is obtained by CARS when compared to full spectrum PLS modeling, Monte Carlo uninformative variable elimination (MC-UVE) and moving window partial least squares regression (MWPLSR). © 2009 Elsevier B.V. All rights reserved.
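The sampling loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The function names (`edf_ratio`, `cars_subsets`, `covariance_coefs`) are hypothetical. The EDF constants are assumed to be chosen so that all wavelengths are kept in the first run and two in the last. A simple per-wavelength covariance stands in for the PLS regression coefficients so that the block stays dependency-free. The final RMSECV-based choice among the N subsets is omitted.

```python
import math
import random


def edf_ratio(i, n_runs, n_vars):
    """Fraction of wavelengths kept at run i (1-based): r_i = a * exp(-k * i).

    Assumed boundary conditions: r_1 = 1 (keep all), r_N = 2 / p (keep two).
    """
    k = math.log(n_vars / 2) / (n_runs - 1)
    a = (n_vars / 2) ** (1 / (n_runs - 1))
    return a * math.exp(-k * i)


def covariance_coefs(X_sub, y_sub):
    """Stand-in for PLS coefficients: covariance of each column with y."""
    n = len(y_sub)
    y_mean = sum(y_sub) / n
    coefs = []
    for j in range(len(X_sub[0])):
        col = [row[j] for row in X_sub]
        x_mean = sum(col) / n
        cov = sum((x - x_mean) * (v - y_mean) for x, v in zip(col, y_sub))
        coefs.append(cov / (n - 1))
    return coefs


def cars_subsets(X, y, fit_coefs=covariance_coefs, n_runs=50,
                 sample_ratio=0.8, seed=0):
    """Return the wavelength subset (column indices) retained after each run."""
    rng = random.Random(seed)
    n_samples, n_vars = len(X), len(X[0])
    kept = list(range(n_vars))
    subsets = []
    for i in range(1, n_runs + 1):
        # Monte Carlo step: fit on a random fixed-ratio subset of the samples.
        rows = rng.sample(range(n_samples), max(2, int(sample_ratio * n_samples)))
        coefs = fit_coefs([[X[r][j] for j in kept] for r in rows],
                          [y[r] for r in rows])
        weights = [abs(c) for c in coefs]
        # EDF step: enforce removal of the wavelengths with the smallest weights.
        n_keep = round(edf_ratio(i, n_runs, n_vars) * n_vars)
        n_keep = max(1, min(len(kept), n_keep))
        order = sorted(range(len(kept)), key=lambda j: weights[j], reverse=True)
        order = order[:n_keep]
        # ARS step: weighted sampling with replacement; wavelengths that are
        # never drawn drop out, so large weights tend to survive.
        w = [weights[j] for j in order]
        if sum(w) > 0:
            drawn = set(rng.choices(order, weights=w, k=n_keep))
        else:
            drawn = set(order)
        kept = sorted(kept[j] for j in drawn)
        subsets.append(kept)
    return subsets
```

In a fuller version, `fit_coefs` would wrap a PLS fit, and each returned subset would be scored by cross validation so that the one with the lowest RMSECV could be selected.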
Pages: 77-84
Number of pages: 8
Related papers
38 in total
  • [1] RELATIONSHIP BETWEEN VARIABLE SELECTION AND DATA AUGMENTATION AND A METHOD FOR PREDICTION
    ALLEN, DM
    [J]. TECHNOMETRICS, 1974, 16 (01) : 125 - 127
  • [2] The successive projections algorithm for variable selection in spectroscopic multicomponent analysis
    Araújo, MCU
    Saldanha, TCB
    Galvao, RKH
    Yoneyama, T
    Chame, HC
    Visani, V
    [J]. CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS, 2001, 57 (02) : 65 - 73
  • [3] Genetic algorithm-based method for selecting wavelengths and model size for use with partial least-squares regression: Application to near-infrared spectroscopy
    Bangalore, AS
    Shaffer, RE
    Small, GW
    Arnold, MA
    [J]. ANALYTICAL CHEMISTRY, 1996, 68 (23) : 4200 - 4212
  • [4] OCCAM RAZOR
    BLUMER, A
    EHRENFEUCHT, A
    HAUSSLER, D
    WARMUTH, MK
    [J]. INFORMATION PROCESSING LETTERS, 1987, 24 (06) : 377 - 380
  • [5] A variable selection method based on uninformative variable elimination for multivariate calibration of near-infrared spectra
    Cai, Wensheng
    Li, Yankun
    Shao, Xueguang
    [J]. CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS, 2008, 90 (02) : 188 - 194
  • [6] Elimination of uninformative variables for multivariate calibration
    Centner, V
    Massart, DL
    deNoord, OE
    deJong, S
    Vandeginste, BM
    Sterna, C
    [J]. ANALYTICAL CHEMISTRY, 1996, 68 (21) : 3851 - 3858
  • [7] Bayesian linear regression and variable selection for spectroscopic calibration
    Chen, Tao
    Martin, Elaine
    [J]. ANALYTICA CHIMICA ACTA, 2009, 631 (01) : 13 - 21
  • [8] Least angle regression - Rejoinder
    Efron, B
    Hastie, T
    Johnstone, I
    Tibshirani, R
    [J]. ANNALS OF STATISTICS, 2004, 32 (02) : 494 - 499
  • [9] PARTIAL LEAST-SQUARES REGRESSION - A TUTORIAL
    GELADI, P
    KOWALSKI, BR
    [J]. ANALYTICA CHIMICA ACTA, 1986, 185 : 1 - 17
  • [10] Gemperline P. J., 1989, J CHEMOMETR, V3, P343, DOI 10.1002/CEM.1180030204