An efficient variable selection method based on random frog for the multivariate calibration of NIR spectra

Cited by: 17
Authors
Sun, Jingjing [1, 2]
Yang, Wude [1 ]
Feng, Meichen [1 ]
Liu, Qifang [3 ]
Kubar, Muhammad Saleem [1 ]
Affiliations
[1] Shanxi Agr Univ, Coll Agr, South Min Xian Rd, Taigu, Shanxi, Peoples R China
[2] Shanxi Agr Univ, Coll Arts & Sci, South Min Xian Rd, Taigu, Shanxi, Peoples R China
[3] Shanxi Agr Univ, Coll Informat Sci & Engn, South Min Xian Rd, Taigu, Shanxi, Peoples R China
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China;
Keywords
PARTIAL LEAST-SQUARES; WAVELENGTH INTERVAL SELECTION; NEAR-INFRARED SPECTROSCOPY; SUCCESSIVE PROJECTIONS ALGORITHM; POPULATION ANALYSIS; GENETIC ALGORITHMS; SUBSET-SELECTION; PLS-REGRESSION; ELIMINATION; TOOL;
DOI
10.1039/d0ra00922a
Chinese Library Classification number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
Variable selection is a critical step in spectral modeling. In this study, a new variable interval selection method based on random frog (RF), termed Interval Selection based on Random Frog (ISRF), is developed. In the ISRF algorithm, RF is first used to search for the most likely informative variables, and a local search is then applied to expand the interval width around those variables. Through multiple runs and visualization of the results, the most informative variable intervals are obtained. The method was tested on three near-infrared (NIR) datasets. Four variable selection methods, namely genetic algorithm PLS (GA-PLS), random frog, interval random frog (iRF), and interval variable iterative space shrinkage approach (iVISSA), were used for comparison. The results show that the proposed method efficiently finds the best variable intervals and improves the model's prediction performance and interpretability.
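The abstract describes the local search step only in outline, so the following is a minimal sketch of one way such an interval expansion could work, not the published ISRF implementation. It assumes a seed variable index has already been proposed by a prior RF run, and that a caller-supplied `score(lo, hi)` function returns a model error for the interval (for example, a cross-validated RMSE of a PLS model; lower is better). The names `expand_interval`, `score`, `max_width`, and `toy_score` are hypothetical and exist only for illustration.

```python
from typing import Callable, Tuple


def expand_interval(
    seed: int,
    n_vars: int,
    score: Callable[[int, int], float],
    max_width: int = 50,
) -> Tuple[int, int]:
    """Greedily widen [lo, hi] around `seed` while the score (lower is better) improves."""
    lo = hi = seed
    best = score(lo, hi)
    while hi - lo + 1 < max_width:
        candidates = []
        if lo > 0:
            # Try adding one variable on the left edge.
            candidates.append((score(lo - 1, hi), lo - 1, hi))
        if hi < n_vars - 1:
            # Try adding one variable on the right edge.
            candidates.append((score(lo, hi + 1), lo, hi + 1))
        if not candidates:
            break
        cand_score, cand_lo, cand_hi = min(candidates)
        if cand_score >= best:
            break  # neither neighbour improves the model, so stop expanding
        best, lo, hi = cand_score, cand_lo, cand_hi
    return lo, hi


def toy_score(lo: int, hi: int) -> float:
    """Illustrative stand-in for a model error: variables 10..14 are 'informative'."""
    informative = set(range(10, 15))
    selected = set(range(lo, hi + 1))
    missed = len(informative - selected)
    extra = len(selected - informative)
    return missed + 2 * extra  # penalise uninformative variables more heavily


if __name__ == "__main__":
    print(expand_interval(seed=12, n_vars=30, score=toy_score))
```

In practice `score` would wrap whatever calibration model and validation scheme is in use; the greedy one-step expansion shown here is an assumption chosen for brevity, and the paper's own search rule may differ.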
Pages: 16245-16253
Number of pages: 9