BioPrint meets the AI age: development of artificial intelligence-based ADMET models for the drug discovery platform SAFIRE

Cited by: 0
Authors
Biehn, Sarah E. [1]
Goncalves, Luis Miguel [1]
Lehmann, Juerg [1]
Marty, Jessica D. [1]
Mueller, Christoph [1]
Ramirez, Samuel A. [1]
Tillier, Fabien [1]
Sage, Carleton R. [1]
Affiliations
[1] Eurofins Panlabs Inc, Eurofins DiscoveryAI, St Charles, MO 63304 USA
Keywords
ADMET; AI; BioPrint database; drug discovery; machine learning; applicability domain; space; prediction; database
DOI
10.4155/fmc-2024-0007
Chinese Library Classification number
R914 [Medicinal chemistry]
Discipline classification code
100701
Abstract
Background: To prioritize compounds with a higher likelihood of success, artificial intelligence models can be used to predict the absorption, distribution, metabolism, excretion and toxicity (ADMET) properties of molecules quickly and efficiently. Methods: Models were trained on proprietary data from the BioPrint database together with public datasets to predict various ADMET end points for the SAFIRE platform. Results: On validation sets, SAFIRE models achieved at least 75% accuracy and a Matthews correlation coefficient of at least 0.4. Training on both proprietary and public data improved model performance and expanded the chemical space covered by the training sets. The platform also provides scoring functionality to guide user decision-making. Conclusion: High-quality datasets, combined with chemical space considerations, yielded ADMET models that perform favorably and are useful in the drug discovery process. Summary: BioPrint meets the artificial intelligence age: researchers trained absorption, distribution, metabolism, excretion and toxicity machine learning models with the BioPrint database for the new SAFIRE platform.
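The two validation metrics named in the abstract, accuracy and the Matthews correlation coefficient (MCC), can be illustrated with a minimal sketch for a binary classifier. The function names, the toy labels, and the way the 75% and 0.4 figures are applied as a joint check are assumptions made for illustration only; this is not the SAFIRE evaluation code, and the abstract does not describe how the metrics were computed.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Fraction of predictions that match the true label."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 by convention if undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


def meets_thresholds(y_true, y_pred, min_acc=0.75, min_mcc=0.4):
    """Hypothetical check against the thresholds quoted in the abstract."""
    counts = confusion_counts(y_true, y_pred)
    return accuracy(*counts) >= min_acc and mcc(*counts) >= min_mcc


# Toy balanced example (made-up labels, not BioPrint data).
balanced_true = [1, 1, 1, 1, 0, 0, 0, 0]
balanced_pred = [1, 1, 1, 0, 0, 0, 0, 1]

# Toy imbalanced example: predicting the majority class everywhere.
skewed_true = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
skewed_pred = [0] * 10
```

In the imbalanced toy case the accuracy is high while the MCC is zero, which is one reason a report may quote both metrics rather than accuracy alone.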
Pages: 587-599
Page count: 14