Free variable selection QSPR study to predict 19F chemical shifts of some fluorinated organic compounds using Random Forest and RBF-PLS methods

Cited by: 7
Authors
Goudarzi, Nasser [1]
Affiliations
[1] Univ Shahrood, Fac Chem, POB 316, Shahrood, Iran
Keywords
Quantitative structure-property relationship (QSPR); Random forest; Radial basis function-partial least square (RBF-PLS); F-19 chemical shift; Fluorinated organic compounds (FOCs); SURFACE-WATER SAMPLES; COEFFICIENTS K-OW; PERFLUOROOCTANE SULFONATE; PERFLUORINATED ACIDS; ACCUMULATION; REGRESSION; INDEXES; BIOTA; QSAR;
DOI
10.1016/j.saa.2016.01.023
Chinese Library Classification (CLC) number
O433 [Spectroscopy];
Discipline classification code
0703 ; 070302 ;
Abstract
In this work, two new and powerful chemometrics methods are applied to the modeling and prediction of the 19F chemical shift values of some fluorinated organic compounds. The radial basis function-partial least square (RBF-PLS) and random forest (RF) methods are employed to construct the models that predict the 19F chemical shifts. No separate variable selection method was used in this study, because the RF method can serve as both a variable selection and a modeling technique. The effects of the important parameters affecting the prediction power of RF, such as the number of trees (n_t) and the number of randomly selected variables used to split each node (m), were investigated. The root-mean-square errors of prediction (RMSEP) for the training set and the prediction set for the RBF-PLS and RF models were 44.70, 23.86, 29.77, and 23.69, respectively. Also, the correlation coefficients of the prediction set for the RBF-PLS and RF models were 0.8684 and 0.9313, respectively. The results obtained reveal that the RF model can be used as a powerful chemometrics tool for quantitative structure-property relationship (QSPR) studies. © 2016 Elsevier B.V. All rights reserved.
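The two figures of merit quoted in the abstract, RMSEP and the correlation coefficient, can be illustrated with a minimal sketch. It assumes the usual textbook definitions (root of the mean squared residual, and Pearson's r between observed and predicted values); the record does not state which exact formulas the paper used. The function names `rmsep` and `pearson_r` and the shift values are hypothetical, invented only for illustration.

```python
import math


def rmsep(observed, predicted):
    """Root-mean-square error of prediction over paired values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between observed and predicted values."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


# Hypothetical 19F shifts in ppm, made up for this example (not from the paper).
observed = [-63.0, -113.0, -162.0, -76.0]
predicted = [-60.0, -117.0, -162.0, -81.0]

print(f"RMSEP = {rmsep(observed, predicted):.2f} ppm")
print(f"r     = {pearson_r(observed, predicted):.4f}")
```

A lower RMSEP and a correlation coefficient closer to 1 on the prediction set both indicate better agreement between predicted and observed shifts, which is the basis on which the abstract compares the two models.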
Pages: 60-64
Number of pages: 5