Development of a robust machine learning model for Ames test outcome prediction

Citations: 0
Authors
Borah, Gori Sankar [1]
Nagamani, Selvaraman [2, 3]
Affiliations
[1] Assam Kaziranga Univ, Sch Comp Sci, Jorhat 785006, India
[2] CSIR North East Inst Sci & Technol, Jorhat 785006, India
[3] Acad Sci & Innovat Res AcSIR, Ghaziabad 201002, India
Keywords
Ames mutagenicity; Multi-step feature selection; Machine learning; XGBoost; ACCURATE;
DOI
10.1016/j.cplett.2024.141663
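Chinese Library Classification number
O64 [Physical chemistry (theoretical chemistry), chemical physics];
Discipline classification codes
070304; 081704;
Abstract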
Mutagenicity is an essential parameter for evaluating the safety of pharmaceuticals, chemicals, consumer products, and environmentally related compounds, and the Ames assay is a key test for predicting the mutagenicity of chemical compounds. In the data-driven era, developing robust models that predict mutagenicity efficiently, before compounds are synthesized and tested in vitro, has gained increasing attention. In this study, a machine learning model that predicts Ames mutagenicity from 2D molecular descriptors was developed. A multistep filtering process was adopted to identify significant descriptors. Three sets of descriptors, namely RDKit, Mordred, and CDK, were used to train three machine learning algorithms: random forest, XGBoost, and CatBoost. The datasets were collected from different sources to develop a robust machine learning model. The robustness of the model was further validated by comparing it with available ML and DL models for Ames genotoxicity. Specifically, 12 models, including our XGBoost model, were evaluated on an external dataset, and our model performed well, with an AUC of 0.97. The code for predicting the genotoxicity of a molecule is available at https://github.com/Naga270588/Genotoxicity.
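The abstract does not say which steps the multistep filter uses or how the AUC was computed, so the following is only a minimal sketch under stated assumptions. It assumes a two-step filter (drop near-constant descriptors, then greedily drop descriptors that are highly correlated with one already kept) and a rank-based AUC. The function names, the thresholds, and the toy descriptor values are hypothetical and are not taken from the paper or its repository.

```python
from math import sqrt
from statistics import mean, pvariance


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def filter_descriptors(columns, min_variance=1e-8, max_abs_corr=0.95):
    """Two-step descriptor filter (assumed steps; thresholds are illustrative).

    columns maps a descriptor name to its list of values, one per molecule.
    Returns the names of the descriptors that survive both steps.
    """
    # Step 1: remove descriptors that are (nearly) constant across molecules.
    varying = [name for name, vals in columns.items()
               if pvariance(vals) > min_variance]
    # Step 2: keep a descriptor only if it is not highly correlated
    # with any descriptor that has already been kept.
    selected = []
    for name in varying:
        if all(abs(pearson(columns[name], columns[kept])) <= max_abs_corr
               for kept in selected):
            selected.append(name)
    return selected


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy descriptor table for four hypothetical molecules (values are made up).
toy = {
    "MolWt": [100.0, 150.0, 200.0, 250.0],
    "HeavyAtomMolWt": [95.0, 142.5, 190.0, 237.5],  # proportional to MolWt
    "NumRings": [1.0, 0.0, 2.0, 1.0],
    "Constant": [3.0, 3.0, 3.0, 3.0],
}
selected = filter_descriptors(toy)
print(selected)
print(roc_auc([1, 0, 1, 0], [0.9, 0.2, 0.4, 0.6]))
```

In a real workflow the surviving descriptors would be passed to a classifier such as random forest, XGBoost, or CatBoost, and the AUC would be computed on held-out or external data rather than on toy scores.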
Pages: 9