Variable selection in multivariate calibration based on clustering of variable concept

Cited by: 23
Authors
Farrokhnia, Maryam [1]
Karimi, Sadegh [2]
Affiliations
[1] Bushehr Univ Med Sci, Persian Gulf Marine Biotechnol Res Ctr, Bushehr, Iran
[2] Persian Gulf Univ, Coll Sci, Dept Chem, Bushehr, Iran
Keywords
Variable selection; Partial least square; Clustering of variable-partial least square; Self organization map; Interval based partial least square; PARTIAL LEAST-SQUARES; WAVELENGTH INTERVAL SELECTION; NEAR-INFRARED SPECTROSCOPY; DATA DIMENSION REDUCTION; REGRESSION; CLASSIFICATION; ALGORITHMS; PEPTIDES; SPECTRA; MODELS;
DOI
10.1016/j.aca.2015.11.002
Chinese Library Classification number
O65 [Analytical Chemistry];
Discipline classification codes
070302; 081704;
Abstract
We recently proposed a variable selection algorithm for classification problems based on the clustering of variables concept (CLoVA). Here the same concept is applied to regression, and the results are compared with conventional variable selection strategies for PLS. The basic idea is that the instrument channels are first grouped into clusters by a clustering algorithm, and the spectral data of each cluster are then subjected to PLS regression. Several real data sets (Cargill corn, Biscuit dough, ACE QSAR, Soy, and Tablet) were used to evaluate the influence of variable clustering on the prediction performance of PLS. In almost all cases, the statistical parameters, especially the prediction error, show the superiority of CLoVA-PLS over the other variable selection strategies. Finally, synergy clustering of variables (sCLoVA-PLS), which uses combinations of clusters, is proposed as an efficient modification of the CLoVA algorithm. The statistical parameters obtained indicate that variable clustering can separate the useful part of the data from the redundant parts, so that a stable model can be built on the informative clusters. (C) 2015 Elsevier B.V. All rights reserved.
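The procedure outlined in the abstract (cluster the instrument channels, fit PLS on each cluster, keep the best cluster or combination of clusters) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes plain k-means as the clustering step (the keywords mention a self-organizing map, which is not reproduced here), a basic NIPALS PLS1, and validation-set RMSE as the selection criterion. All function names (`cluster_variables`, `pls1_fit`, `pls1_predict`, `subset_rmse`, `clova_pls`) and the synthetic data are hypothetical.

```python
from itertools import combinations
import numpy as np

def cluster_variables(X, k, n_iter=50, seed=0):
    """Assumed clustering step: plain k-means on the columns (channels) of X."""
    rng = np.random.default_rng(seed)
    V = X.T  # one row per variable
    centers = V[rng.choice(len(V), size=k, replace=False)]
    for _ in range(n_iter):
        d = ((V[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = V[labels == j].mean(axis=0)
    return labels

def pls1_fit(X, y, n_comp):
    """Basic NIPALS PLS1; returns (x_mean, y_mean, regression vector)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_comp):
        w = Xr.T @ yr
        norm = np.linalg.norm(w)
        if norm < 1e-12:  # nothing left in X that covaries with y
            break
        w = w / norm
        t = Xr @ w
        tt = t @ t
        p = Xr.T @ t / tt
        c = (yr @ t) / tt
        Xr = Xr - np.outer(t, p)  # deflate X
        yr = yr - c * t           # deflate y
        W.append(w); P.append(p); q.append(c)
    if not W:
        return x_mean, y_mean, np.zeros(X.shape[1])
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    b = W @ np.linalg.solve(P.T @ W, q)
    return x_mean, y_mean, b

def pls1_predict(model, X):
    x_mean, y_mean, b = model
    return y_mean + (X - x_mean) @ b

def subset_rmse(cols, X_tr, y_tr, X_val, y_val, n_comp):
    """Fit PLS on the selected columns and return the validation RMSE."""
    model = pls1_fit(X_tr[:, cols], y_tr, min(n_comp, len(cols)))
    resid = pls1_predict(model, X_val[:, cols]) - y_val
    return float(np.sqrt(np.mean(resid ** 2)))

def clova_pls(X_tr, y_tr, X_val, y_val, k=3, n_comp=2, max_combo=1):
    """max_combo=1 scores single clusters; max_combo>1 also scores
    combinations of clusters (a rough stand-in for the synergy idea)."""
    labels = cluster_variables(X_tr, k)
    ids = [int(j) for j in np.unique(labels)]
    results = {}
    for size in range(1, max_combo + 1):
        for combo in combinations(ids, size):
            cols = np.flatnonzero(np.isin(labels, combo))
            results[combo] = subset_rmse(cols, X_tr, y_tr, X_val, y_val, n_comp)
    best = min(results, key=results.get)
    return best, results, labels

# Synthetic data (an assumption for illustration): 10 channels follow the
# analyte signal c, 10 channels follow an unrelated interferent z.
rng = np.random.default_rng(1)
n = 80
c, z = rng.normal(size=n), rng.normal(size=n)
X = np.hstack([np.outer(c, rng.uniform(0.8, 1.2, 10)),
               np.outer(z, rng.uniform(0.8, 1.2, 10))])
X = X + 0.05 * rng.normal(size=(n, 20))
y = c + 0.05 * rng.normal(size=n)

best, results, labels = clova_pls(X[:60], y[:60], X[60:], y[60:],
                                  k=2, n_comp=2, max_combo=2)
print("best cluster combination:", best, "validation RMSE:", results[best])
```

In practice the number of clusters, the number of latent variables, and the error estimate (for example cross-validation instead of a single hold-out split) would all need tuning; the values above are arbitrary choices for the sketch.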
Pages: 70-81
Number of pages: 12