Comparison of Gaussian process regression, partial least squares, random forest and support vector machines for a near infrared calibration of paracetamol samples

Cited by: 4
Authors
Sow, Aminata [1]
Traore, Issiaka [1]
Diallo, Tidiane [2, 3]
Traore, Mohamed [4]
Ba, Abdramane [1]
Affiliations
[1] Univ Sci Tech & Technol Bamako, Fac Sci & Tech FST, Lab Opt Spect & Sci Atmospher LOSSA, Bamako, Mali
[2] Univ Sci Tech & Technol Bamako, Fac Pharm, Dept Sci Medicament, Bamako, Mali
[3] Lab Natl Sante LNS, Bamako, Mali
[4] Ecole Natl Ingn Abderhamane Baba Toure, Bamako, Mali
Keywords
Paracetamol; Near Infrared Spectroscopy; Data preprocessing; Nonlinear regression models; Linear regression techniques; COMPONENTS; TABLETS
DOI
10.1016/j.rechem.2022.100508
Chinese Library Classification (CLC) number
O6 [Chemistry]
Discipline classification code
0703
Abstract
In this article, we analyze the near-infrared (NIR) spectra of fifty-eight (58) commercial 500 mg paracetamol tablets from different origins (that is, with different batch numbers) sold in the local markets of Bamako. The NIR spectra were recorded in the spectral range 930-1700 nm. The samples are divided into a calibration (training) set of forty-eight (48) samples and a validation (test) set of ten (10) samples. To perform multivariate calibration, we apply three nonlinear regression techniques, Gaussian process regression (GPR), random forest (RF), and support vector machine (KSVM), along with the traditional linear partial least squares regression (PLSR), to several data pretreatments of the 58 samples. The results show that the three nonlinear regression calibrations have better prediction performance than PLSR as far as RMSE is concerned. To select the best regression model, we avoid R² since this quantity is not a good parameter for this purpose, and instead consider RMSE when comparing the different multivariate models. Additionally, to assess the impact of data preprocessing, we apply the above regression techniques to the original data and to data treated with multiplicative scatter correction (MSC), standard normal variate (SNV) correction, smoothing, first derivative (FD), and second derivative (SD) correction. The overall results reveal that GPR applied to the smoothed data gives the lowest errors: RMSEP = 2.303053e-06 for validation (prediction) and RMSEC = 2.112316e-06 for calibration. We also observe that the developed GPR model is more accurate and performs better regardless of which data preprocessing is used. All in all, GPR can be seen as a powerful alternative regression tool for NIR spectra of paracetamol samples. The statistical parameters of the proposed model are compared with the results of some other models reported in the literature.
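As a rough illustration of two quantities named in the abstract, the sketch below computes a standard normal variate (SNV) correction for a single spectrum, a simple first difference as a stand-in for the first derivative, and the RMSE used to compare models (RMSEC when evaluated on the calibration set, RMSEP on the test set). It is not the authors' code: the function names, the use of the sample standard deviation in SNV, and all numeric values are assumptions chosen for illustration only.

```python
import math


def snv(spectrum):
    """Standard normal variate: center one spectrum and scale it by its own standard deviation."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    # Assumption: sample standard deviation (n - 1 in the denominator).
    std = math.sqrt(sum((x - mean) ** 2 for x in spectrum) / (n - 1))
    return [(x - mean) / std for x in spectrum]


def first_derivative(spectrum):
    """Plain first difference between adjacent wavelength points (no smoothing applied)."""
    return [b - a for a, b in zip(spectrum, spectrum[1:])]


def rmse(y_true, y_pred):
    """Root mean square error between reference and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


# Hypothetical toy values, not data from the paper.
spectrum = [0.10, 0.20, 0.40, 0.30]
corrected = snv(spectrum)
derivative = first_derivative(spectrum)

reference = [500.0, 498.0, 502.0]
predicted = [499.0, 499.0, 501.0]
print(rmse(reference, predicted))
```

A spectrum with identical absorbance at every wavelength has zero standard deviation, so this sketch would raise a division error for it; a production version would need to handle that case.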
Pages: 7
Related papers
50 in total
  • [21] A partial least squares-based consensus regression method for the analysis of near-infrared complex spectral data of plant samples
    Su, Zhenqiang
    Tong, Weida
    Shi, Leming
    Shao, Xueguang
    Cai, Wensheng
    ANALYTICAL LETTERS, 2006, 39 (09) : 2073 - 2083
  • [22] Assessment of partial least-squares calibration and wavelength selection for complex near-infrared spectra
    McShane, MJ
    Cote, GL
    Spiegelman, CH
    APPLIED SPECTROSCOPY, 1998, 52 (06) : 878 - 884
  • [23] Investigation of partial least squares (PLS) calibration performance based on different resolutions of near infrared spectra
    Chung, H
    Choi, SY
    Choo, J
    Lee, Y
    BULLETIN OF THE KOREAN CHEMICAL SOCIETY, 2004, 25 (05) : 647 - 651
  • [24] Estimation of physical properties of kraft paper by near infrared spectroscopy and partial least squares regression
    Samistraro, Gisely
    de Muniz, Graciela I. B.
    Peralta-Zamora, Patricio
    Cordeiro, Gilcelia A.
    QUIMICA NOVA, 2009, 32 (06) : 1422 - 1425
  • [25] Multiblock partial least squares regression based on wavelet transform for quantitative analysis of near infrared spectra
    Jing, Ming
    Cai, Wensheng
    Shao, Xueguang
    CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS, 2010, 100 (01) : 22 - 27
  • [26] Discrimination of raw and processed Dipsacus asperoides by near infrared spectroscopy combined with least squares-support vector machine and random forests
    Xin, Ni
    Gu, Xiao-Feng
    Wu, Hao
    Hu, Yu-Zhu
    Yang, Zhong-Lin
    SPECTROCHIMICA ACTA PART A-MOLECULAR AND BIOMOLECULAR SPECTROSCOPY, 2012, 89 : 18 - 24
  • [27] Comparison of the vibration mode of metals in HNO3 by a partial least-squares regression analysis of near-infrared spectra
    Sakudo, Akikazu
    Tsenkova, Roumiana
    Tei, Kyoko
    Onozuka, Taisuke
    Ikuta, Kazuyoshi
    Yoshimura, Etsuro
    Onodera, Takashi
    BIOSCIENCE BIOTECHNOLOGY AND BIOCHEMISTRY, 2006, 70 (07) : 1578 - 1583
  • [28] Near-infrared spectroscopy quantitative determination of Pefloxacin mesylate concentration in pharmaceuticals by using partial least squares and principal component regression multivariate calibration
    Xie, Yunfei
    Song, Yan
    Zhang, Yong
    Zhao, Bing
    SPECTROCHIMICA ACTA PART A-MOLECULAR AND BIOMOLECULAR SPECTROSCOPY, 2010, 75 (05) : 1535 - 1539
  • [29] Near infrared spectroscopy for simultaneous quantification of five chemical components in Arnebiae Radix (AR) with partial least squares and support vector machine algorithms
    Zhong, Yong-Qi
    Li, Jia-Qi
    Li, Xiao-Long
    Dai, Sheng-Yun
    Sun, Fei
    VIBRATIONAL SPECTROSCOPY, 2023, 127
  • [30] Near-Infrared Spectroscopy and Partial Least-Squares Regression for Determination of Arachidonic Acid in Powdered Oil
    Yang, Meiyan
    Nie, Shaoping
    Li, Jing
    Xie, Mingyong
    Xiong, Hua
    Deng, Zeyuan
    Zheng, Weiwan
    Li, Lin
    Zhang, Xiaoming
    LIPIDS, 2010, 45 (06) : 559 - 565