A Bootstrapping Soft Shrinkage Approach and Interval Random Variables Selection Hybrid Model for Variable Selection in Near-Infrared Spectroscopy

Cited by: 9
Authors
Gamal Al-Kaf, Hasan Ali [1]
Mohammed Alduais, Nayef Abdulwahab [2]
Saad, Abdul-Malik H. Y. [3]
Chia, Kim Seng [4]
Mohsen, Abdulqader M. [5]
Alhussian, Hitham [6]
Haidar Mahdi, Ammar Abdo Mohammed [5, 7]
Wan Salam, Wan Saiful-Islam [7]
Affiliations
[1] Univ Teknol Petronas, Dept Comp & Informat Sci, Seri Iskandar 32610, Perak, Malaysia
[2] Univ Tun Hussein Onn Malaysia, Fac Comp Sci & Informat Technol FSKTM, Parit Raja 86400, Malaysia
[3] Univ Sains Malaysia USM, Sch Elect & Elect Engn, Nibong Tebal 14300, Malaysia
[4] Univ Tun Hussein Onn Malaysia, Fac Elect & Elect Engn, Batu Pahat 86400, Malaysia
[5] Univ Sci & Technol, Dept Comp Sci, Sanaa, Yemen
[6] Univ Teknol Petronas, Inst Autonomous Syst IAS, Ctr Res Data Sci CERDAS, Seri Iskandar 32610, Perak, Malaysia
[7] Univ Tun Hussein Onn Malaysia, Fac Mech & Mfg Engn, Parit Raja 86400, Malaysia
Keywords
Input variables; Spectroscopy; Proteins; Sociology; Statistics; Analytical models; Adaptation models; Hybrid variable selection; model population analysis; weighted bootstrap sampling; partial least squares; near infrared spectroscopy; GENETIC ALGORITHM-PLS; POPULATION ANALYSIS; RANDOM FROG; STRATEGY; CALIBRATION; REGRESSION; OPTIMIZES; NETWORK; SUBSET
DOI
10.1109/ACCESS.2020.3023681
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
High dimensionality in spectral datasets is a significant challenge for researchers and calls for effective methods that extract an optimal variable subset to improve the accuracy of predictions or classifications. In this study, a hybrid variable selection method based on the incremental number of variables, combining the bootstrapping soft shrinkage (BOSS) method and the interval random variable selection (IRVS) method, is proposed and named BOSS-IRVS. The BOSS method determines the informative intervals, while the IRVS method searches for informative variables within the intervals that BOSS identifies. The proposed BOSS-IRVS method was tested on seven publicly accessible near-infrared (NIR) spectroscopic datasets covering corn, diesel fuel, soy, wheat protein, and hemoglobin. Its performance was compared with that of two well-regarded variable selection methods: BOSS and a hybrid variable selection strategy based on continuous shrinkage of the variable space (VCPA-IRIV). The experimental results show that BOSS-IRVS outperforms both VCPA-IRIV and BOSS on all tested datasets. The percentage improvements in prediction accuracy over VCPA-IRIV and BOSS were 15.4 and 15.3 for corn moisture, 13.4 and 49.8 for corn oil, 41.5 and 50.6 for corn protein, 12.6 and 5.6 for soy moisture, 0.6 and 6.3 for total diesel fuel, 19.9 and 14.3 for wheat protein, and 5.8 and 20.3 for hemoglobin.
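The two-stage idea in the abstract (weights learned by weighted bootstrap sampling, then a random search restricted to the most informative intervals) can be illustrated with a loose sketch. This is not the authors' implementation. It rests on several assumptions: ridge-stabilised least squares stands in for partial least squares, intervals are fixed-width windows ranked by summed weight, and the function names `fit_ls`, `rmse_cv`, `soft_shrinkage_weights`, and `random_search_in_intervals`, along with all default parameters, are hypothetical.

```python
import numpy as np

def fit_ls(A, y, ridge=1e-6):
    """Ridge-stabilised least squares on centred data (assumed stand-in for PLS)."""
    mu_x, mu_y = A.mean(axis=0), y.mean()
    Ac = A - mu_x
    beta = np.linalg.solve(Ac.T @ Ac + ridge * np.eye(A.shape[1]), Ac.T @ (y - mu_y))
    return beta, mu_x, mu_y

def rmse_cv(X, y, cols, folds=5):
    """Cross-validated RMSE using only the columns listed in `cols`."""
    cols = np.asarray(cols)
    idx = np.arange(len(y))
    sq_err = []
    for k in range(folds):
        test = idx[k::folds]                      # every folds-th sample is held out
        train = np.setdiff1d(idx, test)
        beta, mu_x, mu_y = fit_ls(X[np.ix_(train, cols)], y[train])
        pred = (X[np.ix_(test, cols)] - mu_x) @ beta + mu_y
        sq_err.append((pred - y[test]) ** 2)
    return float(np.sqrt(np.concatenate(sq_err).mean()))

def soft_shrinkage_weights(X, y, rng, n_iter=10, n_models=100, keep=0.1):
    """Stage 1 (assumed form): weighted bootstrap sampling with soft weight updates."""
    p = X.shape[1]
    w = np.full(p, 1.0 / p)                       # start with equal weights
    n_draw = p
    for _ in range(n_iter):
        # Sample variables with replacement according to w; duplicates collapse.
        subsets = [np.unique(rng.choice(p, size=n_draw, replace=True, p=w))
                   for _ in range(n_models)]
        scores = np.array([rmse_cv(X, y, s) for s in subsets])
        best = np.argsort(scores)[: max(1, int(keep * n_models))]
        new_w = np.zeros(p)
        for b in best:
            beta, _, _ = fit_ls(X[:, subsets[b]], y)
            new_w[subsets[b]] += np.abs(beta) / np.abs(beta).sum()
        w = new_w / new_w.sum()                   # variables never selected drop to 0
        n_draw = max(2, int(np.mean([len(subsets[b]) for b in best])))
    return w

def random_search_in_intervals(X, y, w, rng, width=5, n_intervals=2, n_trials=200):
    """Stage 2 (assumed form): random subsets drawn only from the top-weight windows."""
    window_sum = np.convolve(w, np.ones(width), mode="valid")  # summed weight per window
    starts = np.argsort(window_sum)[::-1][:n_intervals]
    pool = np.unique(np.concatenate([np.arange(s, s + width) for s in starts]))
    best_cols, best_score = pool, rmse_cv(X, y, pool)
    for _ in range(n_trials):
        size = rng.integers(1, len(pool) + 1)
        cols = np.sort(rng.choice(pool, size=size, replace=False))
        score = rmse_cv(X, y, cols)
        if score < best_score:
            best_cols, best_score = cols, score
    return best_cols, best_score
```

To try it, pass a spectra matrix `X` (samples by wavelengths), a response vector `y`, and a generator from `np.random.default_rng(seed)` to `soft_shrinkage_weights`, then hand the returned weights to `random_search_in_intervals`. The ridge term only keeps the solve stable when a subset has more variables than training samples; a real NIR workflow would more likely use PLS with a tuned number of latent components.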
Pages: 168036-168052
Page count: 17