Improving Structure-Based Virtual Screening with Ensemble Docking and Machine Learning

Cited by: 33
Authors
Ricci-Lopez, Joel [1, 2]
Aguila, Sergio A. [2 ]
Gilson, Michael K. [3 ]
Brizuela, Carlos A. [1 ]
Affiliations
[1] Ctr Invest Cient & Educ Super Ensenada CICESE, Ensenada 22860, Baja California, Mexico
[2] Univ Nacl Autonoma Mexico, Ctr Nanociencias & Nanotecnol, Ensenada 22860, Baja California, Mexico
[3] Univ Calif San Diego, Skaggs Sch Pharm & Pharmaceut Sci, La Jolla, CA 92093 USA
Keywords
PROTEIN FLEXIBILITY; RECEPTOR FLEXIBILITY; MOLECULAR DOCKING; SCORING FUNCTIONS; BINDING-AFFINITY; DRUG-DISCOVERY; LIGAND DOCKING; CONFORMATIONS; COMBINATION; IMPROVEMENT;
DOI
10.1021/acs.jcim.1c00511
Chinese Library Classification number
R914 [Medicinal Chemistry]
Discipline classification code
100701
Abstract
One of the main challenges of structure-based virtual screening (SBVS) is the incorporation of the receptor's flexibility, as its explicit representation in every docking run implies a high computational cost. Therefore, a common alternative to include the receptor's flexibility is the approach known as ensemble docking. Ensemble docking consists of using a set of receptor conformations and performing the docking assays over each of them. However, there is still no agreement on how to combine the ensemble docking results to obtain the final ligand ranking. A common choice is to use consensus strategies to aggregate the ensemble docking scores, but these strategies offer only a slight improvement over the single-structure approach. Here, we hypothesize that using machine learning (ML) methodologies over the ensemble docking results could improve the predictive power of SBVS. To test this hypothesis, four proteins were selected as case studies: CDK2, FXa, EGFR, and HSP90. Protein conformational ensembles were built from crystallographic structures, whereas the evaluated compound library comprised up to three benchmarking data sets (DUD, DEKOIS 2.0, and CSAR-2012) and cocrystallized molecules. Ensemble docking results were processed through 30 repetitions of 4-fold cross-validation to train and validate two ML classifiers: logistic regression and gradient boosting trees. Our results indicate that the ML classifiers significantly outperform traditional consensus strategies and even the best performance case achieved with single-structure docking. We provide statistical evidence that supports the effectiveness of ML to improve the ensemble docking performance.
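The workflow described in the abstract can be illustrated with a minimal sketch. It assumes scikit-learn and NumPy, and it uses a synthetic score matrix (ligands by receptor conformations) rather than any data from the paper. The matrix size, the share of actives, the hyperparameters, and the choice of ROC AUC as the metric are all illustrative assumptions, not the authors' settings. Two simple consensus strategies (lowest and mean score across conformations) serve as baselines for the two classifiers named in the abstract, which are evaluated with 30 repetitions of stratified 4-fold cross-validation.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for ensemble docking results (NOT data from the paper):
# rows are ligands, columns are receptor conformations, and values mimic
# docking scores where a lower score means better predicted binding.
n_ligands, n_conformations, n_actives = 400, 8, 40
y = np.zeros(n_ligands, dtype=int)
y[:n_actives] = 1  # 1 = active, 0 = decoy
X = rng.normal(-7.0, 1.0, size=(n_ligands, n_conformations))
X[y == 1, :3] -= 1.0  # assumption: actives score better on three conformations


def consensus_auc(y_true, scores, reducer):
    """ROC AUC of a consensus score obtained by reducing across conformations."""
    # Negate so that higher values mean "more likely active".
    return roc_auc_score(y_true, -reducer(scores, axis=1))


results = {
    "consensus: lowest score": consensus_auc(y, X, np.min),
    "consensus: mean score": consensus_auc(y, X, np.mean),
}

# 30 repetitions of stratified 4-fold cross-validation, as in the abstract.
cv = RepeatedStratifiedKFold(n_splits=4, n_repeats=30, random_state=0)
models = {
    "logistic regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "gradient boosting": GradientBoostingClassifier(n_estimators=50, random_state=0),
}
for name, model in models.items():
    fold_aucs = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")
    results[name] = fold_aucs.mean()  # mean over the 120 held-out folds

for name, auc in results.items():
    print(f"{name}: ROC AUC = {auc:.3f}")
```

In this sketch the consensus baselines are scored on the full library while the classifiers are scored on held-out folds only, so the comparison is indicative rather than rigorous. A stricter version would evaluate both on the same test folds and apply a statistical test across repetitions.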
Pages: 5362-5376
Number of pages: 15