Improving Structure-Based Virtual Screening with Ensemble Docking and Machine Learning

Cited by: 33
Authors
Ricci-Lopez, Joel [1, 2]
Aguila, Sergio A. [2]
Gilson, Michael K. [3]
Brizuela, Carlos A. [1]
Affiliations
[1] Ctr Invest Cient & Educ Super Ensenada CICESE, Ensenada 22860, Baja California, Mexico
[2] Univ Nacl Autonoma Mexico, Ctr Nanociencias & Nanotecnol, Ensenada 22860, Baja California, Mexico
[3] Univ Calif San Diego, Skaggs Sch Pharm & Pharmaceut Sci, La Jolla, CA 92093 USA
Keywords
PROTEIN FLEXIBILITY; RECEPTOR FLEXIBILITY; MOLECULAR DOCKING; SCORING FUNCTIONS; BINDING-AFFINITY; DRUG-DISCOVERY; LIGAND DOCKING; CONFORMATIONS; COMBINATION; IMPROVEMENT;
DOI
10.1021/acs.jcim.1c00511
Chinese Library Classification number
R914 [Medicinal Chemistry]
Discipline classification code
100701
Abstract
One of the main challenges of structure-based virtual screening (SBVS) is incorporating the receptor's flexibility, since representing it explicitly in every docking run carries a high computational cost. A common alternative is ensemble docking, which uses a set of receptor conformations and performs the docking assays over each of them. However, there is still no agreement on how to combine the ensemble docking results to obtain the final ligand ranking. A common choice is to aggregate the ensemble docking scores with consensus strategies, but these strategies yield only slight improvement over the single-structure approach. Here, we hypothesize that applying machine learning (ML) methodologies to the ensemble docking results could improve the predictive power of SBVS. To test this hypothesis, four proteins were selected as case studies: CDK2, FXa, EGFR, and HSP90. Protein conformational ensembles were built from crystallographic structures, whereas the evaluated compound library comprised up to three benchmarking data sets (DUD, DEKOIS 2.0, and CSAR-2012) and cocrystallized molecules. Ensemble docking results were processed through 30 repetitions of 4-fold cross-validation to train and validate two ML classifiers: logistic regression and gradient boosting trees. Our results indicate that the ML classifiers significantly outperform traditional consensus strategies and even the best-performing case achieved with single-structure docking. We provide statistical evidence that supports the effectiveness of ML in improving ensemble docking performance.
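The workflow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the score matrix, the labels, and the function names `consensus_min`, `consensus_mean`, and `repeated_stratified_kfold` are all hypothetical, and the sketch assumes that lower docking scores indicate better predicted binding. It shows two simple consensus strategies over a ligands-by-conformations score matrix, and how 30 repetitions of 4-fold cross-validation could generate the train/test splits on which a classifier would be fitted.

```python
import random
from statistics import mean


def consensus_min(scores):
    """Consensus by best score: lowest docking score per ligand (assumes lower is better)."""
    return [min(row) for row in scores]


def consensus_mean(scores):
    """Consensus by average: mean docking score per ligand across conformations."""
    return [mean(row) for row in scores]


def repeated_stratified_kfold(labels, n_splits=4, n_repeats=30, seed=0):
    """Yield (train_idx, test_idx) pairs, keeping the class ratio similar in every fold."""
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    for _ in range(n_repeats):
        folds = [[] for _ in range(n_splits)]
        for members in by_class.values():
            shuffled = members[:]
            rng.shuffle(shuffled)
            # Deal each class round-robin so every fold gets its share.
            for pos, i in enumerate(shuffled):
                folds[pos % n_splits].append(i)
        for k in range(n_splits):
            test = sorted(folds[k])
            train = sorted(i for j, fold in enumerate(folds) if j != k for i in fold)
            yield train, test


# Hypothetical toy data: 8 ligands docked against 3 receptor conformations.
scores = [
    [-9.1, -8.4, -7.9],
    [-6.2, -6.8, -5.9],
    [-8.8, -9.5, -8.1],
    [-5.5, -6.0, -6.3],
    [-9.0, -7.7, -8.6],
    [-6.1, -5.8, -6.4],
    [-8.2, -8.9, -9.3],
    [-5.9, -6.5, -6.0],
]
labels = [1, 0, 1, 0, 1, 0, 1, 0]  # 1 = active, 0 = decoy (made-up labels)

if __name__ == "__main__":
    print("min-score consensus:", consensus_min(scores))
    n_splits_total = sum(1 for _ in repeated_stratified_kfold(labels))
    print("train/test splits generated:", n_splits_total)
```

In the ML setting, each row of `scores` would serve as the feature vector for one ligand, so the classifier sees the score from every conformation rather than a single aggregated value. With scikit-learn available, `RepeatedStratifiedKFold`, `LogisticRegression`, and `GradientBoostingClassifier` are one possible way to fill in the splitting and model-fitting steps.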
Pages: 5362-5376
Page count: 15
Related papers
  • [1] Improved method of structure-based virtual screening based on ensemble learning
    Li, Jin
    Liu, WeiChao
    Song, Yongping
    Xia, JiYi
    RSC ADVANCES, 2020, 10 (13) : 7609 - 7618
  • [2] Machine-learning scoring functions for structure-based virtual screening
    Li, Hongjian
    Sze, Kam-Heung
    Lu, Gang
    Ballester, Pedro J.
    WILEY INTERDISCIPLINARY REVIEWS-COMPUTATIONAL MOLECULAR SCIENCE, 2021, 11 (01)
  • [3] Traditional and machine learning approaches in structure-based drug virtual screening
    Zhang, Hong
    Gao, Yi Qin
    CHINESE JOURNAL OF CHEMICAL PHYSICS, 2024, 37 (02) : 177 - 191
  • [4] SCORCH: Improving structure-based virtual screening with machine learning classifiers, data augmentation, and uncertainty estimation
    McGibbon, Miles
    Money-Kyrle, Sam
    Blay, Vincent
    Houston, Douglas R.
    JOURNAL OF ADVANCED RESEARCH, 2023, 46 : 135 - 147
  • [5] Machine Learning Boosted Docking (HASTEN): An Open-source Tool To Accelerate Structure-based Virtual Screening Campaigns
    Kalliokoski, Tuomo
    MOLECULAR INFORMATICS, 2021, 40 (09)
  • [6] Ligand docking and virtual screening in structure-based drug discovery
    Cavasotto, Claudio N.
    FROM PHYSICS TO BIOLOGY: THE INTERFACE BETWEEN EXPERIMENT AND COMPUTATION, 2006, 851 : 34 - 49
  • [7] Ligand docking and structure-based virtual screening in drug discovery
    Cavasotto, Claudio N.
    Orry, Andrew J. W.
    CURRENT TOPICS IN MEDICINAL CHEMISTRY, 2007, 7 (10) : 1006 - 1014
  • [8] Recent progress on the prospective application of machine learning to structure-based virtual screening
    Ghislat, Ghita
    Rahman, Taufiq
    Ballester, Pedro J.
    CURRENT OPINION IN CHEMICAL BIOLOGY, 2021, 65 : 28 - 34
  • [9] A practical guide to machine-learning scoring for structure-based virtual screening
    Tran-Nguyen, Viet-Khoa
    Junaid, Muhammad
    Simeon, Saw
    Ballester, Pedro J.
    NATURE PROTOCOLS, 2023, 18 : 3460 - 3511
  • [10] Performance of machine-learning scoring functions in structure-based virtual screening
    Wojcikowski, Maciej
    Ballester, Pedro J.
    Siedlecki, Pawel
    SCIENTIFIC REPORTS, 2017, 7