DeepBSP: a Machine Learning Method for Accurate Prediction of Protein-Ligand Docking Structures

Cited by: 45
Authors
Bao, Jingxiao [1]
He, Xiao [1,2]
Zhang, John Z. H. [1,2,3,4]
Affiliations
[1] East China Normal Univ, Sch Chem & Mol Engn, Shanghai Engn Res Ctr Mol Therapeut & New Drug De, Shanghai 200062, Peoples R China
[2] NYU Shanghai, NYU ECNU Ctr Computat Chem, Shanghai 200062, Peoples R China
[3] NYU, Dept Chem, New York, NY 10003 USA
[4] Shanxi Univ, Collaborat Innovat Ctr Extreme Opt, Taiyuan 030006, Shanxi, Peoples R China
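Funding
Natural Science Foundation of Shanghai; National Natural Science Foundation of China; National Key Research and Development Program of China
Keywords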
EMPIRICAL SCORING FUNCTIONS; SEQUENCE SIMILARITY; AFFINITY; INHIBITION; ALGORITHM; IDENTIFICATION; DERIVATIVES; VALIDATION; IMPACT; GLIDE;
DOI
10.1021/acs.jcim.1c00334
Chinese Library Classification (CLC) number
R914 [Medicinal Chemistry]
Discipline classification code
100701
Abstract
In recent years, machine-learning-based scoring functions have significantly improved scoring power. However, many of these methods perform poorly at distinguishing the native structure from docked decoy poses because their training data lack decoy structural information. Here, we developed a machine-learning model, named DeepBSP, that directly predicts the root-mean-square deviation (RMSD) of a ligand docking pose with reference to its native binding pose. Unlike the binding affinity, the RMSD of a docking pose with reference to its native structure can be determined straightforwardly. Trained on a generated data set of 11,925 native complexes and more than 165,000 docked poses, our model shows excellent docking power on our test set and on the CASF-2016 docking decoy set compared to other major scoring functions. Thus, by combining molecular docking, which generates many poses, with DeepBSP, one can more accurately predict the binding pose closest to the native complex structure. The DeepBSP model should be very useful for picking out near-native poses from the many poses generated by a docking application.
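The abstract's central quantity, the RMSD of a docked pose relative to the native pose, is RMSD = sqrt((1/N) * sum_i |r_i - r_i_native|^2) over N matched atoms. The sketch below is a minimal illustration of that formula and of selecting the pose with the lowest score. It is not the authors' code: the function names `pose_rmsd` and `pick_best_pose` and the toy coordinates are hypothetical, atoms are assumed to be listed in the same order in both structures, no superposition is applied (docked poses share the receptor frame), and ligand symmetry is ignored. In the workflow the abstract describes, the scores passed to `pick_best_pose` would be model-predicted RMSD values; here the true RMSD values stand in for them.

```python
import math


def pose_rmsd(pose, native):
    """In-place RMSD (no superposition) between two coordinate lists.

    Assumes both lists contain (x, y, z) tuples for the same atoms in the
    same order; symmetry-equivalent atom mappings are not considered.
    """
    if len(pose) != len(native) or not pose:
        raise ValueError("pose and native must be non-empty with equal atom counts")
    squared = sum(
        (a - b) ** 2
        for p_atom, n_atom in zip(pose, native)
        for a, b in zip(p_atom, n_atom)
    )
    return math.sqrt(squared / len(pose))


def pick_best_pose(scores):
    """Return the index of the pose with the lowest (predicted) RMSD."""
    if not scores:
        raise ValueError("scores must not be empty")
    return min(range(len(scores)), key=scores.__getitem__)


# Toy two-atom "ligand"; coordinates are made up for illustration only.
native = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
poses = [
    [(0.0, 3.0, 4.0), (1.5, 3.0, 4.0)],  # translated by 5 angstroms
    [(1.0, 0.0, 0.0), (2.5, 0.0, 0.0)],  # translated by 1 angstrom along x
    [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)],  # identical to the native pose
]

# True RMSD labels; a trained model would supply predictions instead.
labels = [pose_rmsd(p, native) for p in poses]
best = pick_best_pose(labels)
```

Because the label is computed purely from coordinates, any number of docked poses can be labeled this way, which is the contrast with binding affinity that the abstract draws.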
Pages: 2231-2240
Page count: 10