Development of models predicting biodegradation rate rating with multiple linear regression and support vector machine algorithms

Cited by: 45
Authors
Tang, Weihao [1]
Li, Yanying [1]
Yu, Yang [2]
Wang, Zhongyu [1]
Xu, Tong [1]
Chen, Jingwen [1]
Lin, Jun [2]
Li, Xuehua [1]
Affiliations
[1] Dalian Univ Technol, Sch Environm Sci & Technol, Key Lab Ind Ecol & Environm Engn MOE, Dalian 116024, Peoples R China
[2] Minist Ecol & Environm MEE, Solid Waste & Chem Management Ctr, Beijing 100029, Peoples R China
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China
Keywords
Biodegradability; Quantitative structure-activity relationship; Multiple linear regression; Support vector machine; Molecular structure descriptors; AEROBIC BIODEGRADATION; READY BIODEGRADABILITY; BIOACCUMULATIVE ORGANICS; CHEMICALS; PERSISTENT; QSAR; POLLUTANTS;
DOI
10.1016/j.chemosphere.2020.126666
Chinese Library Classification number
X [Environmental science, safety science]
Discipline classification code
08; 0830
Abstract
Biodegradation is a significant process for removing organic chemicals from water, soil and sediment environments, so biodegradability is critical for evaluating the environmental persistence of organic chemicals. In this study, based on a dataset of 171 compounds, four quantitative structure-activity relationship (QSAR) models were developed for predicting primary and ultimate biodegradation rate ratings with multiple linear regression (MLR) and support vector machine (SVM) algorithms. Two MLR models were built on the compounds with carbon atom number ≤ 9, and two SVM models were built on the compounds with carbon atom number > 9. In the MLR models, nArX (number of X on aromatic ring) is the most important descriptor governing primary and ultimate biodegradation of organic chemicals. For the SVM models, the determination coefficient ($R^2$), leave-one-out cross-validated coefficient ($Q^2_{\mathrm{LOO}}$) and external validation coefficient ($Q^2_{\mathrm{ext}}$) values are all over 0.9, indicating that the SVM models have satisfactory goodness-of-fit, robustness and external predictive ability. The applicability domains of these models were visualized with Williams plots. The developed models can be used as effective tools to predict the biodegradability of organic chemicals. © 2020 Elsevier Ltd. All rights reserved.
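The validation statistics named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the data are synthetic, the two descriptors are placeholders, and the formulas are one common set of QSAR definitions ($Q^2_{\mathrm{ext}}$ referenced to the training-set mean, warning leverage $h^* = 3(p+1)/n$), none of which the abstract itself confirms. Only an MLR model is shown; the same $R^2$ and $Q^2$ functions would apply to predictions from an SVM regressor.

```python
import numpy as np


def design(X):
    """Prepend an intercept column to the descriptor matrix."""
    return np.column_stack([np.ones(len(X)), X])


def fit_mlr(X, y):
    """Ordinary least-squares fit; returns [intercept, slopes...]."""
    coef, *_ = np.linalg.lstsq(design(X), y, rcond=None)
    return coef


def predict(coef, X):
    return design(X) @ coef


def r2(y, y_hat):
    """Determination coefficient (goodness-of-fit on the training set)."""
    return 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)


def q2_loo(X, y):
    """Leave-one-out Q2: refit n times, each time predicting the held-out compound."""
    n = len(y)
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i
        coef = fit_mlr(X[keep], y[keep])
        press += (y[i] - predict(coef, X[i:i + 1])[0]) ** 2
    return 1.0 - press / np.sum((y - y.mean()) ** 2)


def q2_ext(y_train, y_ext, y_ext_hat):
    """External Q2, referenced to the training-set mean (assumed definition)."""
    return 1.0 - np.sum((y_ext - y_ext_hat) ** 2) / np.sum((y_ext - y_train.mean()) ** 2)


def leverages(X_train, X):
    """Leverage h_i = x_i (A'A)^-1 x_i', the x-axis of a Williams plot."""
    A_train, A = design(X_train), design(X)
    G = np.linalg.inv(A_train.T @ A_train)
    return np.einsum("ij,jk,ik->i", A, G, A)


# Synthetic stand-in data: 40 "compounds", 2 placeholder descriptors.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2))
y = 3.0 - 0.8 * X[:, 0] + 0.3 * X[:, 1] + rng.normal(scale=0.1, size=40)
X_train, y_train, X_ext, y_ext = X[:30], y[:30], X[30:], y[30:]

coef = fit_mlr(X_train, y_train)
r2_train = r2(y_train, predict(coef, X_train))
q2_loo_train = q2_loo(X_train, y_train)
q2_external = q2_ext(y_train, y_ext, predict(coef, X_ext))

# Applicability domain: flag compounds whose leverage exceeds h*.
h_train = leverages(X_train, X_train)
h_ext = leverages(X_train, X_ext)
h_star = 3 * (X_train.shape[1] + 1) / len(y_train)  # assumed convention

print(f"R2 = {r2_train:.3f}, Q2_LOO = {q2_loo_train:.3f}, Q2_ext = {q2_external:.3f}")
print(f"h* = {h_star:.2f}, external compounds outside domain: {(h_ext > h_star).sum()}")
```

A Williams plot pairs each compound's leverage with its standardized residual; compounds with leverage above $h^*$ are structurally far from the training set, so predictions for them are treated with caution.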
Pages: 7