A practical guide to machine-learning scoring for structure-based virtual screening

Cited by: 0

Authors:

Viet-Khoa Tran-Nguyen

Muhammad Junaid

Saw Simeon

Pedro J. Ballester

Affiliations:

[1] Centre de Recherche en Cancérologie de Marseille, Department of Bioengineering

[2] Imperial College London

Source:

Nature Protocols | 2023 / Volume 18

DOI:

Not available

Abstract:

Structure-based virtual screening (SBVS) via docking has been used to discover active molecules for a range of therapeutic targets. Chemical and protein data sets that contain integrated bioactivity information have increased both in number and in size. Artificial intelligence and, more concretely, its machine-learning (ML) branch, including deep learning, have effectively exploited these data sets to build scoring functions (SFs) for SBVS against targets with an atomic-resolution 3D model (e.g., generated by X-ray crystallography or predicted by AlphaFold2). Often outperforming their generic and non-ML counterparts, target-specific ML-based SFs represent the state of the art for SBVS. Here, we present a comprehensive and user-friendly protocol to build and rigorously evaluate these new SFs for SBVS. This protocol is organized into four sections: (i) using a public benchmark of a given target to evaluate an existing generic SF; (ii) preparing experimental data for a target from public repositories; (iii) partitioning data into a training set and a test set for subsequent target-specific ML modeling; and (iv) generating and evaluating target-specific ML SFs by using the prepared training-test partitions. All necessary code and input/output data related to three example targets (acetylcholinesterase, HMG-CoA reductase, and peroxisome proliferator-activated receptor-α) are available at https://github.com/vktrannguyen/MLSF-protocol, can be run by using a single computer within 1 week and make use of easily accessible software/programs (e.g., Smina, CNN-Score, RF-Score-VS and DeepCoy) and web resources. Our aim is to provide practical guidance on how to augment training data to enhance SBVS performance, how to identify the most suitable supervised learning algorithm for a data set, and how to build an SF with the highest likelihood of discovering target-active molecules within a given compound library.
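
Sections (iii) and (iv) of the protocol, partitioning data and then evaluating a scoring function on the held-out part, can be pictured with a minimal Python sketch. This is not the authors' code, which is in the repository linked above. The function names, the 80/20 split, the toy labels and the top-fraction hit rate used as the metric are all assumptions chosen for illustration only.

```python
import random


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split molecule indices into training and test sets.

    Hypothetical helper: each class (1 = active, 0 = inactive) is shuffled
    and split separately so both sets keep roughly the same class ratio.
    """
    rng = random.Random(seed)
    train, test = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = max(1, round(len(idx) * test_fraction))
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return sorted(train), sorted(test)


def top_fraction_hit_rate(scores, labels, fraction=0.1):
    """Share of true actives among the top-scored molecules.

    Assumes a higher score means the molecule is predicted more likely active.
    """
    n_top = max(1, int(len(scores) * fraction))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sum(labels[i] for i in ranked[:n_top]) / n_top


# Toy labels (assumption, not data from the protocol): 10 actives, 40 inactives.
labels = [1] * 10 + [0] * 40
train_idx, test_idx = stratified_split(labels)

# Placeholder scores standing in for a scoring function's output on the test set.
test_labels = [labels[i] for i in test_idx]
test_scores = [0.9 if y == 1 else 0.1 for y in test_labels]

print(len(train_idx), len(test_idx))
print(top_fraction_hit_rate(test_scores, test_labels, fraction=0.2))
```

In practice the placeholder scores would come from a model fitted on the training indices only, so that the test-set hit rate reflects performance on molecules the model has not seen.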

Pages: 3460-3511

Page count: 51
