Comparison of Gaussian process regression, partial least squares, random forest and support vector machines for a near infrared calibration of paracetamol samples

Cited by: 4
Authors
Sow, Aminata [1]
Traore, Issiaka [1]
Diallo, Tidiane [2, 3]
Traore, Mohamed [4]
Ba, Abdramane [1]
Affiliations
[1] Univ Sci Tech & Technol Bamako, Fac Sci & Tech FST, Lab Opt Spect & Sci Atmospher LOSSA, Bamako, Mali
[2] Univ Sci Tech & Technol Bamako, Fac Pharm, Dept Sci Medicament, Bamako, Mali
[3] Lab Natl Sante LNS, Bamako, Mali
[4] Ecole Natl Ingn Abderhamane Baba Toure, Bamako, Mali
Keywords
Paracetamol; Near Infrared Spectroscopy; Data preprocessing; Nonlinear regression models; Linear regression techniques; COMPONENTS; TABLETS
DOI
10.1016/j.rechem.2022.100508
Chinese Library Classification number
O6 [Chemistry]
Discipline classification code
0703
Abstract
In this article, we analyze the near-infrared (NIR) spectra of fifty-eight (58) commercial 500 mg paracetamol tablets of different origins (that is, with different batch numbers) from the local markets in Bamako. The NIR spectra were recorded over the spectral range 930 nm to 1700 nm. The samples are divided into forty-eight (48) samples forming the calibration (training) set and ten (10) samples used as the validation (test) set. To perform multivariate calibration, we apply three nonlinear regression techniques, Gaussian process regression (GPR), random forest (RF), and support vector machines (KSVM), along with the traditional linear partial least squares regression (PLSR), to several data pretreatments of the 58 samples. The results show that the three nonlinear calibrations give better prediction performance than PLSR in terms of root mean square error (RMSE). To select the best regression model, we avoid R2, since this quantity is not a good criterion for this purpose, and instead compare the multivariate models by RMSE. Additionally, to assess the impact of data preprocessing, we apply the above regression techniques to the original data and to data treated with multiplicative scatter correction (MSC), standard normal variate (SNV) correction, smoothing, first derivative (FD), and second derivative (SD). The overall results reveal that GPR applied to the smoothed data gives the lowest errors: RMSEP = 2.303053e-06 for validation (prediction) and RMSEC = 2.112316e-06 for calibration. We also observe that the developed GPR model is more accurate than the other models and behaves better no matter which data preprocessing is used. All in all, GPR can be seen as a powerful alternative regression tool for NIR spectra of paracetamol samples. The statistical parameters of the proposed model are compared with the results of other models reported in the literature.
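The workflow summarized in the abstract (pretreat the spectra, calibrate on 48 samples, validate on 10, compare by RMSE) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' code: the spectra are synthetic, the function names (`snv`, `msc`, `rmse`, `gpr_predict`, `demo`) are hypothetical, the GPR is a bare NumPy posterior mean with an RBF kernel, and the median-distance length scale is an assumed heuristic rather than anything stated in the paper.

```python
import numpy as np

def snv(spectra):
    """Standard normal variate: centre and scale each spectrum (row) on its own."""
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std

def msc(spectra, reference=None):
    """Multiplicative scatter correction against a reference (default: mean spectrum)."""
    ref = spectra.mean(axis=0) if reference is None else reference
    corrected = np.empty_like(spectra, dtype=float)
    for i, row in enumerate(spectra):
        slope, intercept = np.polyfit(ref, row, 1)  # fit row = intercept + slope * ref
        corrected[i] = (row - intercept) / slope
    return corrected

def rmse(y_true, y_pred):
    """Root mean square error between reference and predicted values."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

def gpr_predict(X_train, y_train, X_test, length_scale, noise=1e-4):
    """GPR posterior mean with an RBF kernel, fitted on mean-centred targets."""
    def rbf(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
        return np.exp(-0.5 * d2 / length_scale ** 2)
    y_mean = y_train.mean()
    K = rbf(X_train, X_train) + noise * np.eye(len(X_train))
    alpha = np.linalg.solve(K, y_train - y_mean)
    return rbf(X_test, X_train) @ alpha + y_mean

def demo(seed=0):
    """Return {pretreatment: (RMSEC, RMSEP)} on SYNTHETIC spectra (not the paper's data)."""
    rng = np.random.default_rng(seed)
    wl = np.linspace(930.0, 1700.0, 100)           # wavelength axis in nm
    y = rng.uniform(0.8, 1.2, size=58)             # synthetic relative content
    peak = np.exp(-((wl - 1400.0) / 60.0) ** 2)    # one synthetic absorption band
    gain = rng.uniform(0.9, 1.1, size=(58, 1))     # multiplicative scatter
    offset = rng.uniform(-0.05, 0.05, size=(58, 1))  # additive baseline shift
    X = gain * (0.2 + y[:, None] * peak) + offset + rng.normal(0.0, 1e-3, (58, 100))
    treatments = {
        "raw": lambda s: s,
        "SNV": snv,
        "MSC": msc,
        "FD": lambda s: np.gradient(s, wl, axis=1),  # first derivative along wavelength
    }
    results = {}
    for name, treat in treatments.items():
        Z = treat(X)                                # pretreatment applied to all 58 spectra
        Zc, Zv, yc, yv = Z[:48], Z[48:], y[:48], y[48:]  # 48 calibration / 10 validation
        d = np.sqrt(((Zc[:, None, :] - Zc[None, :, :]) ** 2).sum(axis=-1))
        ls = np.median(d[np.triu_indices_from(d, k=1)])  # assumed length-scale heuristic
        results[name] = (rmse(yc, gpr_predict(Zc, yc, Zc, ls)),
                         rmse(yv, gpr_predict(Zc, yc, Zv, ls)))
    return results

if __name__ == "__main__":
    for name, (rmsec, rmsep) in demo().items():
        print(f"{name}: RMSEC={rmsec:.3e}  RMSEP={rmsep:.3e}")
```

Ranking the pretreatments by the second value of each pair (RMSEP) mirrors the selection criterion described in the abstract. The numbers produced here depend entirely on the synthetic data and say nothing about the paper's results.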
Pages: 7