Development of models predicting biodegradation rate rating with multiple linear regression and support vector machine algorithms

Cited by: 45
Authors
Tang, Weihao [1]
Li, Yanying [1]
Yu, Yang [2]
Wang, Zhongyu [1]
Xu, Tong [1]
Chen, Jingwen [1]
Lin, Jun [2]
Li, Xuehua [1]
Affiliations
[1] Dalian Univ Technol, Sch Environm Sci & Technol, Key Lab Ind Ecol & Environm Engn MOE, Dalian 116024, Peoples R China
[2] Minist Ecol & Environm MEE, Solid Waste & Chem Management Ctr, Beijing 100029, Peoples R China
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China
Keywords
Biodegradability; Quantitative structure-activity relationship; Multiple linear regression; Support vector machine; Molecular structure descriptors; AEROBIC BIODEGRADATION; READY BIODEGRADABILITY; BIOACCUMULATIVE ORGANICS; CHEMICALS; PERSISTENT; QSAR; POLLUTANTS;
DOI
10.1016/j.chemosphere.2020.126666
Chinese Library Classification
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
Biodegradation is a significant process for removing organic chemicals from water, soil and sediment environments, and biodegradability is therefore critical for evaluating the environmental persistence of organic chemicals. In this study, based on a dataset of 171 compounds, four quantitative structure-activity relationship (QSAR) models were developed for predicting primary and ultimate biodegradation rate ratings with multiple linear regression (MLR) and support vector machine (SVM) algorithms. Two MLR models were built on the subset of compounds with carbon atom number <= 9, and two SVM models were built on the subset with carbon atom number > 9. In the MLR models, n(ArX) (number of X on aromatic ring) is the most important descriptor governing primary and ultimate biodegradation of organic chemicals. For the SVM models, the determination coefficient ($R^2$), leave-one-out cross-validated coefficient ($Q^2_{LOO}$) and external validation coefficient ($Q^2_{ext}$) values are all over 0.9, indicating that the SVM models have satisfactory goodness-of-fit, robustness and external predictive ability. The applicability domains of these models were visualized with Williams plots. The developed models can be used as effective tools to predict the biodegradability of organic chemicals. © 2020 Elsevier Ltd. All rights reserved.
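The validation statistics named in the abstract (R^2, Q^2_LOO, Q^2_ext) and the leverage values behind a Williams plot can be illustrated with a short sketch. This is not the authors' code or data: the descriptors and responses are synthetic, every function name is hypothetical, the Q^2_ext formula is one common variant (referenced to the training-set mean), and the warning leverage h* = 3(p + 1)/n is a widely used convention that the abstract does not confirm. Only the MLR case is shown, since the abstract gives no SVM settings.

```python
import numpy as np

def fit_mlr(X, y):
    """Ordinary least squares with an intercept; returns the coefficients."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_mlr(coef, X):
    return np.column_stack([np.ones(len(X)), X]) @ coef

def r2(y, y_hat):
    """Determination coefficient (goodness-of-fit)."""
    return 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

def q2_loo(X, y):
    """Leave-one-out Q^2: refit without each compound, then predict it."""
    press = 0.0
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        coef = fit_mlr(X[keep], y[keep])
        press += (y[i] - predict_mlr(coef, X[i:i + 1])[0]) ** 2
    return 1.0 - press / np.sum((y - y.mean()) ** 2)

def q2_ext(y_train, y_test, y_test_hat):
    """External Q^2 (assumed variant: deviations from the training mean)."""
    ss_res = np.sum((y_test - y_test_hat) ** 2)
    ss_ref = np.sum((y_test - y_train.mean()) ** 2)
    return 1.0 - ss_res / ss_ref

def leverages(X_train, X_query):
    """Leverage h_i = x_i (A^T A)^-1 x_i^T, the x-axis of a Williams plot."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    Q = np.column_stack([np.ones(len(X_query)), X_query])
    G = np.linalg.inv(A.T @ A)
    return np.einsum("ij,jk,ik->i", Q, G, Q)

def warning_leverage(X_train):
    """Assumed convention: h* = 3 (p + 1) / n."""
    n, p = X_train.shape
    return 3.0 * (p + 1) / n

# Synthetic stand-in for a descriptor matrix and a rate rating (not real data).
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 3))
y = 3.0 + X @ np.array([0.8, -0.4, 0.2]) + rng.normal(scale=0.2, size=60)
X_tr, y_tr, X_te, y_te = X[:45], y[:45], X[45:], y[45:]

coef = fit_mlr(X_tr, y_tr)
print("R2     :", r2(y_tr, predict_mlr(coef, X_tr)))
print("Q2_LOO :", q2_loo(X_tr, y_tr))
print("Q2_ext :", q2_ext(y_tr, y_te, predict_mlr(coef, X_te)))
print("outside domain:", np.sum(leverages(X_tr, X_te) > warning_leverage(X_tr)))
```

In a Williams plot, each compound's leverage is plotted against its standardized residual; compounds with leverage above h* are usually treated as lying outside the model's applicability domain.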
Pages: 7
Related papers
50 in total
  • [31] Prediction of COVID-19 pandemic measuring criteria using support vector machine, prophet and linear regression models in Indian scenario
    Gupta, Amit Kumar
    Singh, Vijander
    Mathur, Priya
    Travieso-Gonzalez, Carlos M.
    JOURNAL OF INTERDISCIPLINARY MATHEMATICS, 2021, 24 (01) : 89 - 108
  • [32] Optimization of support vector machine through the use of metaheuristic algorithms in forecasting TBM advance rate
    Zhou, Jian
    Qiu, Yingui
    Zhu, Shuangli
    Armaghani, Danial Jahed
    Li, Chuanqi
    Hoang Nguyen
    Yagiz, Saffet
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2021, 97
  • [33] Predicting the Crest Settlement of Concrete Face Rockfill Dams by Combining Threshold Regression and Support Vector Machine
    Wen, Lifeng
    Li, Yanlong
    Zhang, Haiyang
    Liu, Yunhe
    Zhou, Heng
    INTERNATIONAL JOURNAL OF GEOMECHANICS, 2022, 22 (06)
  • [34] Multi-Objective Models for Sparse Optimization in Linear Support Vector Machine Classification
    Pirouz, Behzad
    Pirouz, Behrouz
    MATHEMATICS, 2023, 11 (17)
  • [35] Biomass estimation of a high Andean plant community with multispectral images acquired using UAV remote sensing and Multiple Linear Regression, Support Vector Machine and Random Forests models
    Estrada Zuniga, Andres C.
    Cardenas Rodriguez, Jim
    Bejar Saya, Juan Victor
    Naupari Vasquez, Javier
    SCIENTIA AGROPECUARIA, 2022, 13 (03) : 301 - 310
  • [36] Forecasting Daily Electricity Consumption in Thailand Using Regression, Artificial Neural Network, Support Vector Machine, and Hybrid Models
    Pannakkong, Warut
    Harncharnchai, Thanyaporn
    Buddhakulsomsiri, Jirachai
    ENERGIES, 2022, 15 (09)
  • [37] Computational Models Using Multiple Machine Learning Algorithms for Predicting Drug Hepatotoxicity with the DILIrank Dataset
    Ancuceanu, Robert
    Hovanet, Marilena Viorica
    Anghel, Adriana Iuliana
    Furtunescu, Florentina
    Neagu, Monica
    Constantin, Carolina
    Dinu, Mihaela
    INTERNATIONAL JOURNAL OF MOLECULAR SCIENCES, 2020, 21 (06)
  • [38] Seasonal prediction of PM2.5 based on support vector machine model and multiple regression model
    Yang, Shuran
    INTERNATIONAL CONFERENCE ON ALGORITHMS, HIGH PERFORMANCE COMPUTING, AND ARTIFICIAL INTELLIGENCE (AHPCAI 2021), 2021, 12156
  • [39] Support Vector Machine versus Multiple Logistic Regression for Prediction of Postherpetic Neuralgia in Outpatients with Herpes Zoster
    Zhang, Jie
    Ding, Qiao
    Li, Xiu-Liang
    Hao, Yi-Wei
    Yang, Ying
    PAIN PHYSICIAN, 2022, 25 (03) : E481 - E488
  • [40] Novel hybrid machine learning models including support vector machine with meta-heuristic algorithms in predicting unconfined compressive strength of organic soils stabilised with cement and lime
    Trinh Quoc Ngo
    Linh Quy Nguyen
    Van Quan Tran
    INTERNATIONAL JOURNAL OF PAVEMENT ENGINEERING, 2023, 24 (02)