ClassyPose: A Machine-Learning Classification Model for Ligand Pose Selection Applied to Virtual Screening in Drug Discovery

Times cited: 2
Authors
Tran-Nguyen, Viet-Khoa [1]
Camproux, Anne-Claude [1]
Taboureau, Olivier [1]
Affiliations
[1] Univ Paris Cite, CNRS, UMR8251, INSERM U1133, Unite Biol Fonct & Adaptat, F-75013 Paris, France
Keywords
good pose probability; machine-learning; PLEC fingerprints; pose classification; pose selection; support vector machine; virtual screening; MOLECULAR DOCKING; SCORING FUNCTIONS; BINDING-AFFINITY; PREDICTION; SHAPE; PERFORMANCE; INHIBITORS; COMPLEXES; DATABASE; ACCURACY
DOI
10.1002/aisy.202400238
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Determining the target-bound conformation of a drug-like molecule is a crucial step in drug design, as it affects the outcome of virtual screening (VS) and paves the way for hit-to-lead and lead optimization. While most docking programs manage to produce at least one near-native pose for a bioactive molecule inside its binding pocket, their integrated classical scoring functions (SFs) generally fail to prioritize this pose. Many studies have tackled this SF problem, offering multiple pose refinement and/or classification methods, albeit with limitations. This study presents a new support vector machine model for pose classification, called "ClassyPose", which predicts the probability that a receptor-bound ligand conformation is near-native, without any additional pose optimization step. Trained on protein-ligand extended connectivity features extracted from over 21 600 crystal and docking poses of diverse ligands, this model outperformed other machine-learning algorithms and three existing SFs in terms of docking power, identifying the native ligand pose as the top-ranked solution for more than 90% of entries in two test sets. It also achieved high specificity (above 0.96) and improved VS performance when used for pose selection. This efficient, user-friendly tool and all related data are available at https://github.com/vktrannguyen/Classy_Pose.
ClassyPose is a new support vector machine model for correct ligand pose selection. Trained on protein-ligand features extracted from native and redocked binding modes of diverse ligands, it has strong docking power, achieves high specificity, and improves virtual screening performance when used as a pose selection tool. The code and all data are user-friendly and available free of charge.
(c) 2024 WILEY-VCH GmbH
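The two evaluation measures named in the abstract, docking power and specificity, can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration and is not taken from the paper or its repository: the 2.0 Å RMSD cutoff for a "near-native" pose, the 0.5 probability threshold, the dictionary field names (`entry`, `prob`, `rmsd`), and the toy data. The authors' exact definitions may differ.

```python
from collections import defaultdict

# Assumed near-native cutoff in angstroms; the abstract does not state one.
RMSD_CUTOFF = 2.0


def docking_power(poses, cutoff=RMSD_CUTOFF):
    """Fraction of entries whose highest-probability pose is near-native."""
    by_entry = defaultdict(list)
    for pose in poses:
        by_entry[pose["entry"]].append(pose)
    hits = 0
    for entry_poses in by_entry.values():
        # The pose a selection tool would pick: the top-ranked one.
        best = max(entry_poses, key=lambda p: p["prob"])
        if best["rmsd"] <= cutoff:
            hits += 1
    return hits / len(by_entry)


def specificity(poses, cutoff=RMSD_CUTOFF, prob_threshold=0.5):
    """TN / (TN + FP): share of non-near-native poses predicted as bad."""
    negatives = [p for p in poses if p["rmsd"] > cutoff]
    true_negatives = sum(1 for p in negatives if p["prob"] < prob_threshold)
    return true_negatives / len(negatives)


# Hypothetical toy data: two entries, three poses each.
# "prob" stands in for a predicted good-pose probability,
# "rmsd" for the deviation from the crystal pose.
poses = [
    {"entry": "A", "prob": 0.91, "rmsd": 0.0},
    {"entry": "A", "prob": 0.12, "rmsd": 5.3},
    {"entry": "A", "prob": 0.40, "rmsd": 3.1},
    {"entry": "B", "prob": 0.35, "rmsd": 1.4},
    {"entry": "B", "prob": 0.62, "rmsd": 6.8},
    {"entry": "B", "prob": 0.08, "rmsd": 4.2},
]

print(docking_power(poses))
print(specificity(poses))
```

In this toy set the top-ranked pose is near-native for entry A but not for entry B, and one of the four non-near-native poses receives a probability above the threshold, so the two measures capture different failure modes of a pose classifier.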
Pages: 13