Predictive Modeling of Pesticides Reproductive Toxicity in Earthworms Using Interpretable Machine-Learning Techniques on Imbalanced Data

Authors
Kotli, Mihkel [1]
Piir, Geven [1]
Maran, Uko [1]
Affiliations
[1] Univ Tartu, Inst Chem, EE-50411 Tartu, Estonia
Source
ACS OMEGA | 2025 / Vol. 10 / Issue 05
Funding
EU Horizon 2020;
Keywords
QSAR MODELS; CHEMICALS; SORPTION;
DOI
10.1021/acsomega.4c09719
Chinese Library Classification (CLC) number
O6 [Chemistry];
Subject classification code
0703;
Abstract
The earthworm is a key indicator species in soil ecosystems. The reproductive toxicity of chemical compounds to earthworms is therefore an important property to determine, and computational models are needed for both descriptive and predictive purposes. The aim of this work was to develop an advanced Quantitative Structure-Activity Relationship (QSAR) modeling approach for this complex property using imbalanced data. The approach integrated gradient-boosted decision trees as classifiers with a genetic algorithm for feature selection and Bayesian optimization for hyperparameter tuning. An additional goal was to use SHAP values to analyze and interpret the structural features, encoded by the molecular descriptors, that contribute to pesticide toxicity and nontoxicity; the most notable of these are solvation entropy and the number of hydrolyzable bonds. The final model was constructed as a stacked ensemble that combined the strengths of the individual models. Evaluation of this model on an external test set of 147 compounds demonstrated a well-defined applicability domain and sufficient predictive capability, with a Balanced Accuracy of 77%. The model representation follows FAIR principles and is available on QsarDB.org.
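The abstract reports Balanced Accuracy rather than plain accuracy, which is the usual choice when one class dominates the data. The sketch below uses the standard binary definition (the mean of sensitivity and specificity); the abstract does not spell out how the authors computed the metric, and the labels here are hypothetical values invented for illustration, not data from the paper.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity (recall on class 1) and specificity (recall on class 0)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # toxic, predicted toxic
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # toxic, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # nontoxic, predicted nontoxic
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # nontoxic, flagged toxic
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Hypothetical imbalanced set: 90 nontoxic (0) and 10 toxic (1) compounds.
y_true = [0] * 90 + [1] * 10

# A trivial classifier that always predicts the majority class.
y_majority = [0] * 100

# A hypothetical classifier that recovers 72 of 90 nontoxic and 8 of 10 toxic.
y_model = [0] * 72 + [1] * 18 + [1] * 8 + [0] * 2

acc_majority = accuracy(y_true, y_majority)
bal_majority = balanced_accuracy(y_true, y_majority)
acc_model = accuracy(y_true, y_model)
bal_model = balanced_accuracy(y_true, y_model)
```

On these made-up labels the majority-class predictor reaches high plain accuracy while detecting no toxic compound, and its balanced accuracy drops to chance level. The second classifier has lower plain accuracy but a higher balanced accuracy, which is the behavior that makes the metric suitable for imbalanced toxicity data.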
Pages: 4732-4744
Page count: 13