Direct Prediction of Bioaccumulation of Organic Contaminants in Plant Roots from Soils with Machine Learning Models Based on Molecular Structures

Cited by: 40
Authors
Gao, Feng [1]
Shen, Yike [2]
Sallach, Jonathan Brett [3]
Li, Hui [4]
Liu, Cun [5]
Li, Yuanbo [5]
Affiliations
[1] Yale Univ, Sch Med, Dept Genet, 333 Cedar St, New Haven, CT 06510 USA
[2] Columbia Univ, Mailman Sch Publ Hlth, Dept Environm Hlth Sci, New York, NY 10032 USA
[3] Univ York, Dept Geog & Environm, Heslington, York YO10 5NG, N Yorkshire, England
[4] Michigan State Univ, Dept Plant Soil & Microbial Sci, E Lansing, MI 48824 USA
[5] Chinese Acad Sci, Inst Soil Sci, Key Lab of Soil Environm & Pollut Remediat, Nanjing 210008, Peoples R China
Keywords
machine learning; root concentration factor (RCF); plant uptake; extended connectivity fingerprints (ECFP); molecular structure; gradient boosting regression tree (GBRT); polycyclic aromatic hydrocarbons; partition-limited model; acropetal translocation; agricultural soils; crop plants; bioconcentration; pharmaceuticals; triclosan; wheat; chlorobenzenes
DOI
10.1021/acs.est.1c02376
Chinese Library Classification number
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
Root concentration factor (RCF) is an important characterization parameter for describing the accumulation of organic contaminants in plants from soils in life cycle impact assessment (LCIA) and phytoremediation potential assessment. However, building robust predictive models remains challenging because of the complex interactions within the chemical-soil-plant root system. Here we developed end-to-end machine learning models to resolve the complex relationship between molecular structure and RCF by training on a unified RCF data set with 341 data points covering 72 chemicals. We demonstrate the efficacy of the proposed gradient boosting regression tree (GBRT) model based on extended connectivity fingerprints (ECFP): in 5-fold cross validation it predicted RCF values with an R-squared of 0.77 and a mean absolute error (MAE) of 0.22. In addition, our results reveal nonlinear relationships among the properties of chemicals, soils, and plants. Further in-depth analyses identify the key chemical topological substructures (e.g., -O, -Cl, aromatic rings, and large conjugated pi-systems) related to RCF. Owing to its simplicity and universality, the GBRT-ECFP model provides a valuable tool for LCIA and other environmental assessments to better characterize chemical risks to human health and ecosystems.
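The workflow summarized in the abstract (binary fingerprint bits in, a boosted ensemble of regression trees, 5-fold cross validation scored by R-squared and MAE) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes synthetic random bit vectors in place of real ECFP fingerprints (which would normally come from a cheminformatics toolkit), an invented target that depends on three bits, and depth-1 trees (stumps) in place of a full GBRT. All function names and parameter values are hypothetical.

```python
import random

def fit_stump(X, residuals):
    """Pick the bit whose on/off split most reduces squared error."""
    n, total, best = len(X), sum(residuals), None
    for j in range(len(X[0])):
        on = [r for row, r in zip(X, residuals) if row[j]]
        if not on or len(on) == n:
            continue  # bit is constant in this sample, no split possible
        mean_on = sum(on) / len(on)
        mean_off = (total - sum(on)) / (n - len(on))
        # Maximizing this quantity is equivalent to minimizing the split's SSE.
        gain = len(on) * mean_on ** 2 + (n - len(on)) * mean_off ** 2
        if best is None or gain > best[0]:
            best = (gain, j, mean_on, mean_off)
    return best[1:] if best else None

def fit_gbrt(X, y, n_rounds=100, lr=0.1):
    """Gradient boosting for squared loss: each stump fits the residuals."""
    base = sum(y) / len(y)
    preds, stumps = [base] * len(y), []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(X, residuals)
        if stump is None:
            break
        j, m_on, m_off = stump
        stumps.append(stump)
        preds = [p + lr * (m_on if row[j] else m_off) for p, row in zip(preds, X)]
    return base, stumps, lr

def predict(model, X):
    base, stumps, lr = model
    return [base + lr * sum(m_on if row[j] else m_off for j, m_on, m_off in stumps)
            for row in X]

def cross_validate(X, y, k=5, seed=0):
    """Pooled out-of-fold R-squared and MAE over k folds."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    y_true, y_pred = [], []
    for fold in (idx[i::k] for i in range(k)):
        held = set(fold)
        train = [i for i in idx if i not in held]
        model = fit_gbrt([X[i] for i in train], [y[i] for i in train])
        y_pred += predict(model, [X[i] for i in fold])
        y_true += [y[i] for i in fold]
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)
    return 1 - ss_res / ss_tot, mae

# Synthetic stand-in data (assumption): 200 "chemicals", 32 fingerprint bits,
# and a made-up target driven by three bits plus small Gaussian noise.
rng = random.Random(42)
X = [[rng.randint(0, 1) for _ in range(32)] for _ in range(200)]
y = [0.8 * r[3] - 0.5 * r[7] + 0.3 * r[12] + rng.gauss(0, 0.05) for r in X]

r2, mae = cross_validate(X, y)
print(f"5-fold CV: R2={r2:.2f}, MAE={mae:.2f}")
```

Because the synthetic target is additive in a few bits, stumps recover it easily. Scores on real RCF data, such as the R-squared of 0.77 reported above, depend on the actual fingerprints, data set, and tuned tree depth, none of which this sketch reproduces.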
Pages: 16358-16368
Page count: 11