A new and efficient variable selection algorithm based on ant colony optimization. Applications to near infrared spectroscopy/partial least-squares analysis

Cited by: 113
Authors
Allegrini, Franco [1]
Olivieri, Alejandro C. [1]
Affiliations
[1] Univ Nacl Rosario, Dept Quim Analit, Fac Ciencias Bioquim & Farmaceut, Inst Quim Rosario, IQUIR CONICET, RA-2000 Rosario, Santa Fe, Argentina
Keywords
Ant colony optimization; Variable selection; Near infrared spectroscopy; Partial least-squares regression; WAVELENGTH SELECTION; GENETIC ALGORITHM; PLS-REGRESSION; PREDICTION; MODELS; SIZE; QSAR; TOOL;
DOI
10.1016/j.aca.2011.04.061
Chinese Library Classification (CLC) number
O65 [Analytical Chemistry]
Subject classification code
070302; 081704
Abstract
A new variable selection algorithm is described, based on ant colony optimization (ACO). The aim of the algorithm is to choose, from a large number of available spectral wavelengths, those relevant to the estimation of analyte concentrations or sample properties when spectroscopic analysis is combined with multivariate calibration techniques such as partial least-squares (PLS) regression. The new algorithm employs the concept of cooperative pheromone accumulation, which is typical of ACO selection methods, and optimizes PLS models using a pre-defined number of variables, employing a Monte Carlo approach to discard irrelevant sensors. The performance has been tested on a simulated system, where it shows a significant superiority over other commonly employed selection methods, such as genetic algorithms. Several near infrared spectroscopic experimental data sets have been subjected to the present ACO algorithm, with PLS leading to improved analytical figures of merit upon wavelength selection. The method could be helpful in other chemometric activities such as classification or quantitative structure-activity relationship (QSAR) problems. © 2011 Elsevier B.V. All rights reserved.
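The general idea of pheromone-guided selection of a fixed number of variables can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the one-latent-variable PLS1 model used as a stand-in for a full PLS model, the rule that only the best ant of each iteration deposits pheromone, the evaporation rate, and the simulated data are all assumptions made for illustration. The Monte Carlo step for discarding irrelevant sensors is not reproduced, because the abstract does not specify it.

```python
import math
import random


def pls1_one_component_rmse(X_train, y_train, X_val, y_val, cols):
    """Validation RMSE of a one-latent-variable PLS1 model built on `cols` only."""
    n = len(X_train)
    # Centre X and y using training means.
    x_mean = [sum(row[j] for row in X_train) / n for j in cols]
    y_mean = sum(y_train) / n
    Xc = [[row[j] - m for j, m in zip(cols, x_mean)] for row in X_train]
    yc = [v - y_mean for v in y_train]
    # First PLS weight vector w = Xc^T yc (left unnormalised; the scale cancels).
    w = [sum(Xc[i][a] * yc[i] for i in range(n)) for a in range(len(cols))]
    # Scores t = Xc w and inner regression coefficient q = (t.y) / (t.t).
    t = [sum(x * wa for x, wa in zip(row, w)) for row in Xc]
    tt = sum(v * v for v in t)
    q = sum(ti * yi for ti, yi in zip(t, yc)) / tt if tt > 0 else 0.0
    # Predict the validation samples and accumulate squared error.
    sq_err = 0.0
    for row, target in zip(X_val, y_val):
        score = sum((row[j] - m) * wa for j, m, wa in zip(cols, x_mean, w))
        sq_err += (y_mean + q * score - target) ** 2
    return math.sqrt(sq_err / len(X_val))


def weighted_sample_without_replacement(weights, k, rng):
    """Draw k distinct indices with probability proportional to `weights`."""
    candidates = list(range(len(weights)))
    chosen = []
    for _ in range(k):
        w = [weights[j] for j in candidates]
        pick = rng.choices(candidates, weights=w, k=1)[0]
        chosen.append(pick)
        candidates.remove(pick)
    return sorted(chosen)


def aco_select(X_train, y_train, X_val, y_val, k,
               n_ants=20, n_iter=30, evaporation=0.2, seed=0):
    """Hypothetical ACO loop: each ant proposes k variables, guided by pheromone."""
    rng = random.Random(seed)
    n_vars = len(X_train[0])
    pheromone = [1.0] * n_vars
    best_cols, best_err = None, float("inf")
    for _ in range(n_iter):
        trials = []
        for _ in range(n_ants):
            cols = weighted_sample_without_replacement(pheromone, k, rng)
            err = pls1_one_component_rmse(X_train, y_train, X_val, y_val, cols)
            trials.append((err, cols))
        iter_err, iter_cols = min(trials)
        if iter_err < best_err:
            best_err, best_cols = iter_err, iter_cols
        # Evaporate all trails, then reinforce the iteration's best subset
        # (assumed deposit rule, chosen for simplicity).
        pheromone = [(1.0 - evaporation) * p for p in pheromone]
        for j in iter_cols:
            pheromone[j] += 1.0
    return best_cols, best_err, pheromone


def make_data(n_samples, n_vars, informative, rng):
    """Simulated data: y depends only on the `informative` columns plus noise."""
    X = [[rng.gauss(0.0, 1.0) for _ in range(n_vars)] for _ in range(n_samples)]
    y = [sum(row[j] for j in informative) + rng.gauss(0.0, 0.05) for row in X]
    return X, y


rng = random.Random(1)
informative = [3, 7]
X_train, y_train = make_data(40, 15, informative, rng)
X_val, y_val = make_data(20, 15, informative, rng)
best_cols, best_err, pheromone = aco_select(X_train, y_train, X_val, y_val, k=2)
print(best_cols, round(best_err, 3))
```

In this sketch, variables that repeatedly appear in low-error subsets accumulate pheromone and are sampled more often in later iterations, which is the cooperative behaviour the abstract refers to. A practical version would likely use a multi-component PLS model with cross-validation rather than a single validation split.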
Pages: 18-25
Number of pages: 8