Comparison of the variable importance in projection (VIP) and of the selectivity ratio (SR) methods for variable selection and interpretation

Cited by: 433
Authors
Farres, Mireia [1]
Platikanov, Stefan [1]
Tsakovski, Stefan [2]
Tauler, Roma [1]
Affiliations
[1] CSIC, IDAEA, Dept Environm Chem, ES-08034 Barcelona, Spain
[2] Univ Sofia, Fac Chem, Dept Analyt Chem, Sofia 1164, Bulgaria
Funding
European Research Council;
Keywords
variable importance in projection; selectivity ratio; variable selection; partial least squares; PARTIAL LEAST-SQUARES; MASS-SPECTRAL PROFILES; MICROARRAY DATA; REGRESSION; CLASSIFICATION; IDENTIFICATION; PERFORMANCE; INDEX; WATER; PANEL;
DOI
10.1002/cem.2736
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
This study compares the application of two variable selection methods in partial least squares regression (PLSR), the variable importance in projection (VIP) method and the selectivity ratio (SR) method. For this purpose, three different data sets were analysed: (a) physicochemical water quality parameters related to sensorial data, (b) gas chromatography-mass spectrometry (GC-MS) chemical (organic compound) profiles from fossil sea sediment samples related to sea surface temperature (SST) changes, and (c) exposed genes of Daphnia magna female samples related to their total offspring production. Correlation coefficients (r), levels of significance (p-value) and interpretation of the underlying experimental phenomena allowed the discussion about the best approach for variable selection in each case. The comparison of the two variable selection methods in the first water quality data set showed that the SR method is more accurate for sensorial prediction. For the climate data set, when raw total ion current (TIC) GC-MS chromatograms were considered, variables selected using the VIP method were easier to interpret compared with those selected by the SR method. However, when only some chromatographic peak areas (concentrations) were considered, the SR method was more efficient for prediction, and the VIP method selected the most relevant variables for the interpretation of SST changes. Finally, for the transcriptomic data set, the SR method was found again to be more reliable for prediction purposes. Copyright (c) 2015 John Wiley & Sons, Ltd.
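The abstract names the two scores but does not state how they are computed. The following is a minimal sketch, not the authors' implementation, of the commonly used definitions for a single-response PLS model. VIP weights each variable's squared PLS weight by the response variance its component explains. SR is the ratio of explained to residual variance of each variable after projecting X onto the target component, which for a single response is proportional to the fitted values. The function name `pls1_vip_sr`, the NIPALS fitting loop, and the toy data are all assumptions made for illustration; it is written in plain Python so that it runs without dependencies.

```python
import math

def center(cols):
    """Mean-center each column (one list per variable)."""
    return [[v - sum(c) / len(c) for v in c] for c in cols]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def pls1_vip_sr(X_cols, y, n_components):
    """Fit PLS1 by NIPALS and return (vip, sr), one value per variable.

    X_cols: list of p columns, each holding n samples.
    y: list of n responses.
    """
    X = center(X_cols)            # centered copy, kept intact for SR
    E = [col[:] for col in X]     # working copy, deflated per component
    f = center([y])[0]            # centered (then deflated) response
    n, p = len(y), len(X_cols)
    weights, ss, y_hat = [], [], [0.0] * n
    for _ in range(n_components):
        w = [dot(col, f) for col in E]          # weight vector, proportional to X'y
        norm = math.sqrt(dot(w, w))
        w = [v / norm for v in w]               # unit length
        t = [sum(E[j][i] * w[j] for j in range(p)) for i in range(n)]
        tt = dot(t, t)
        loadings = [dot(col, t) / tt for col in E]
        q = dot(f, t) / tt                      # response loading
        E = [[E[j][i] - t[i] * loadings[j] for i in range(n)] for j in range(p)]
        f = [f[i] - q * t[i] for i in range(n)]
        y_hat = [y_hat[i] + q * t[i] for i in range(n)]
        weights.append(w)
        ss.append(q * q * tt)                   # response variance explained

    # VIP_j = sqrt(p * sum_a(SS_a * w_ja^2) / sum_a(SS_a))
    total = sum(ss)
    vip = [math.sqrt(p * sum(s * w[j] ** 2 for s, w in zip(ss, weights)) / total)
           for j in range(p)]

    # SR_j = explained / residual variance after projecting X on the target
    # scores; for one response these are proportional to the fitted values.
    hh = dot(y_hat, y_hat)
    sr = []
    for col in X:
        load = dot(col, y_hat) / hh
        explained = sum((load * h) ** 2 for h in y_hat)
        residual = sum((x - load * h) ** 2 for x, h in zip(col, y_hat))
        sr.append(explained / residual)
    return vip, sr

# Toy data (made up): x1 tracks y closely, x2 is noise, x3 tracks y loosely.
y = [1, 2, 3, 4, 5, 6]
X_cols = [
    [1.1, 1.9, 3.2, 3.9, 5.1, 5.8],
    [0.5, -0.3, 0.2, -0.4, 0.1, 0.3],
    [2, 1, 4, 3, 6, 5],
]
vip, sr = pls1_vip_sr(X_cols, y, n_components=2)
print("VIP:", [round(v, 3) for v in vip])
print("SR: ", [round(s, 3) for s in sr])
```

Because each weight vector has unit length, the squared VIP scores sum to the number of variables, which is why VIP > 1 is a commonly used cutoff. SR has no such fixed scale, and a threshold for it is usually derived separately, for example from an F-test.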
Pages: 528 - 536
Page count: 9
Related papers
50 in total
  • [41] Assessing Variable Importance for Best Subset Selection
    Seedorff, Jacob
    Cavanaugh, Joseph E.
    ENTROPY, 2024, 26 (09)
  • [42] Variable Selection Methods in QSAR: An Overview
    Perez Gonzalez, Maykel
    Teran, Carmen
    Saiz-Urra, Liane
    Teijeira, Marta
    CURRENT TOPICS IN MEDICINAL CHEMISTRY, 2008, 8 (18) : 1606 - 1627
  • [43] Empirical Bayes methods in variable selection
    Bar, Haim
    Liu, Kangyan
    WILEY INTERDISCIPLINARY REVIEWS-COMPUTATIONAL STATISTICS, 2019, 11 (02)
  • [44] An assessment of the jackknife and bootstrap procedures on uncertainty estimation in the variable importance in the projection metric
    Afanador, N. L.
    Tran, T. N.
    Buydens, L. M. C.
    CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS, 2014, 137 : 162 - 172
  • [45] RECYCLING-ORIENTED CHARACTERIZATION OF PET WASTE STREAM BY SWIR HYPERSPECTRAL IMAGING AND VARIABLE SELECTION METHODS
    Bonifazi, Giuseppe
    Capobianco, Giuseppe
    Cucuzza, Paola
    Serranti, Silvia
    Uzzo, Andrea
    DETRITUS, 2022, 18 : 42 - 49
  • [46] Comparison of different measurement techniques and variable selection methods for FT-MIR in wine analysis
    Friedel, Matthias
    Patz, Claus-Dieter
    Dietrich, Helmut
    FOOD CHEMISTRY, 2013, 141 (04) : 4200 - 4207
  • [47] Comparison of variable selection methods in predictive models applied to near-infrared and genomic data
    Ferreira, R. A.
    Peternelli, L. A.
    GENETICS AND MOLECULAR RESEARCH, 2021, 20 (03)
  • [48] A numeric comparison of variable selection algorithms for supervised learning
    Palombo, G.
    Narsky, I.
    NUCLEAR INSTRUMENTS & METHODS IN PHYSICS RESEARCH SECTION A-ACCELERATORS SPECTROMETERS DETECTORS AND ASSOCIATED EQUIPMENT, 2009, 612 (01) : 187 - 195
  • [49] Marriage between variable selection and prediction methods to model plant disease risk
    Suarez, Franco
    Bruno, Cecilia
    Giannini, Franca Kurina
    Pecci, M. Paz Gimenez
    Pardina, Patricia Rodriguez
    Balzarini, Monica
    EUROPEAN JOURNAL OF AGRONOMY, 2023, 151
  • [50] LASSO-type instrumental variable selection methods with an application to Mendelian randomization
    Qasim, Muhammad
    Mansson, Kristofer
    Balakrishnan, Narayanaswamy
    STATISTICAL METHODS IN MEDICAL RESEARCH, 2025, 34 (02) : 201 - 223