A practical guide to machine-learning scoring for structure-based virtual screening

Cited by: 0
Authors
Viet-Khoa Tran-Nguyen
Muhammad Junaid
Saw Simeon
Pedro J. Ballester
Affiliations
[1] Centre de Recherche en Cancérologie de Marseille, Department of Bioengineering
[2] Imperial College London
Source
Nature Protocols | 2023 / Volume 18
Abstract
Structure-based virtual screening (SBVS) via docking has been used to discover active molecules for a range of therapeutic targets. Chemical and protein data sets that contain integrated bioactivity information have increased both in number and in size. Artificial intelligence and, more concretely, its machine-learning (ML) branch, including deep learning, have effectively exploited these data sets to build scoring functions (SFs) for SBVS against targets with an atomic-resolution 3D model (e.g., generated by X-ray crystallography or predicted by AlphaFold2). Often outperforming their generic and non-ML counterparts, target-specific ML-based SFs represent the state of the art for SBVS. Here, we present a comprehensive and user-friendly protocol to build and rigorously evaluate these new SFs for SBVS. This protocol is organized into four sections: (i) using a public benchmark of a given target to evaluate an existing generic SF; (ii) preparing experimental data for a target from public repositories; (iii) partitioning data into a training set and a test set for subsequent target-specific ML modeling; and (iv) generating and evaluating target-specific ML SFs by using the prepared training-test partitions. All necessary code and input/output data related to three example targets (acetylcholinesterase, HMG-CoA reductase, and peroxisome proliferator-activated receptor-α) are available at https://github.com/vktrannguyen/MLSF-protocol, can be run by using a single computer within 1 week and make use of easily accessible software/programs (e.g., Smina, CNN-Score, RF-Score-VS and DeepCoy) and web resources. Our aim is to provide practical guidance on how to augment training data to enhance SBVS performance, how to identify the most suitable supervised learning algorithm for a data set, and how to build an SF with the highest likelihood of discovering target-active molecules within a given compound library.
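As a rough illustration of steps (iii) and (iv) of the abstract, the sketch below partitions labeled molecules into a training set and a test set and then scores a ranked test set. It is not the protocol's own code, which lives in the repository linked above. The random stratified split, the enrichment factor metric, and the names `stratified_split` and `enrichment_factor` are assumptions chosen for illustration; the protocol may partition and evaluate differently.

```python
import random


def stratified_split(labels, test_fraction=0.25, seed=0):
    """Randomly split indices into train/test, keeping the class ratio.

    Hypothetical helper: a plain random stratified split, not necessarily
    the partitioning scheme used in the protocol.
    """
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(test_fraction * len(idx))
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)


def enrichment_factor(scores, labels, fraction=0.01):
    """Enrichment factor in the top-scoring fraction of a ranked library.

    Assumes a higher score means "predicted more likely active" and that
    labels are 1 (active) or 0 (inactive).
    """
    if len(scores) != len(labels) or not scores:
        raise ValueError("scores and labels must be non-empty and equal in length")
    n_total = len(scores)
    n_actives = sum(labels)
    if n_actives == 0:
        raise ValueError("at least one active is required")
    n_top = max(1, round(fraction * n_total))
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = sum(label for _, label in ranked[:n_top])
    # Hit rate among the top-ranked molecules divided by the overall hit rate.
    return (hits / n_top) / (n_actives / n_total)


# Toy data (made up): 4 actives and 8 inactives.
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
train_idx, test_idx = stratified_split(labels, test_fraction=0.25, seed=0)

# Toy ranking (made up): 10 molecules, 2 actives, one active in the top 2.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_labels = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
ef_top20 = enrichment_factor(toy_scores, toy_labels, fraction=0.2)
```

In the toy ranking, one of the two top-ranked molecules is active while the library-wide active rate is 2 in 10, so the enrichment factor at 20% is 2.5. In practice the scores would come from a trained scoring function applied to docked poses of the test molecules.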
Pages: 3460 - 3511
Page count: 51
Related papers
50 in total
  • [31] Inactive-enriched machine-learning models exploiting patent data improve structure-based virtual screening for PDL1 dimerizers
    Gomez-Sacristan, Pablo
    Simeon, Saw
    Tran-Nguyen, Viet-Khoa
    Patil, Sachin
    Ballester, Pedro J.
    JOURNAL OF ADVANCED RESEARCH, 2025, 67 : 185 - 196
  • [32] Improved method of structure-based virtual screening based on ensemble learning
    Li, Jin
    Liu, WeiChao
    Song, Yongping
    Xia, JiYi
    RSC ADVANCES, 2020, 10 (13) : 7609 - 7618
  • [33] Structure-Based Pharmacophores for Virtual Screening
    Loewer, Martin
    Proschak, Ewgenij
    MOLECULAR INFORMATICS, 2011, 30 (05) : 398 - 404
  • [34] Structure-based virtual screening: an overview
    Lyne, PD
    DRUG DISCOVERY TODAY, 2002, 7 (20) : 1047 - 1055
  • [35] Structure-based prediction of BRAF mutation classes using machine-learning approaches
    Fanny S. Krebs
    Christian Britschgi
    Sylvain Pradervand
    Rita Achermann
    Petros Tsantoulis
    Simon Haefliger
    Andreas Wicki
    Olivier Michielin
    Vincent Zoete
    Scientific Reports, 12
  • [36] Structure-based virtual ligand screening
    Villoutreix, Bruno O.
    CURRENT PROTEIN & PEPTIDE SCIENCE, 2006, 7 (05) : 367 - 367
  • [37] Structure-based prediction of BRAF mutation classes using machine-learning approaches
    Krebs, Fanny S.
    Britschgi, Christian
    Pradervand, Sylvain
    Achermann, Rita
    Tsantoulis, Petros
    Haefliger, Simon
    Wicki, Andreas
    Michielin, Olivier
    Zoete, Vincent
    SCIENTIFIC REPORTS, 2022, 12 (01)
  • [38] SCORCH: Improving structure-based virtual screening with machine learning classifiers, data augmentation, and uncertainty estimation
    McGibbon, Miles
    Money-Kyrle, Sam
    Blay, Vincent
    Houston, Douglas R.
    JOURNAL OF ADVANCED RESEARCH, 2023, 46 : 135 - 147
  • [39] Empirical Scoring Functions for Structure-Based Virtual Screening: Applications, Critical Aspects, and Challenges
    Guedes, Isabella A.
    Pereira, Felipe S. S.
    Dardenne, Laurent E.
    FRONTIERS IN PHARMACOLOGY, 2018, 9
  • [40] How can machine-learning methods assist in virtual screening for hyperuricemia? A healthcare machine-learning approach
    Ichikawa, Daisuke
    Saito, Toki
    Ujita, Waka
    Oyama, Hiroshi
    JOURNAL OF BIOMEDICAL INFORMATICS, 2016, 64 : 20 - 24