Structural and Sequence Similarity Makes a Significant Impact on Machine-Learning-Based Scoring Functions for Protein-Ligand Interactions

Cited by: 74
Authors
Li, Yang [1, 2]
Yang, Jianyi [2]
Affiliations
[1] Nankai Univ, Coll Life Sci, Tianjin 300071, Peoples R China
[2] Nankai Univ, Sch Math Sci, Tianjin 300071, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
OUT CROSS-VALIDATION; BINDING-AFFINITY; RANDOM FOREST; PREDICTION; LEAD; OPTIMIZATION; APPROPRIATE; ACCURACY; DOCKING; SET;
DOI
10.1021/acs.jcim.7b00049
Chinese Library Classification number
R914 [Medicinal Chemistry]
Subject classification code
100701
Abstract
The prediction of protein-ligand binding affinity has recently been improved remarkably by machine-learning-based scoring functions. For example, using a set of simple descriptors representing the atomic distance counts, the RF-Score improves the Pearson correlation coefficient to about 0.8 on the core set of the PDBbind 2007 database, which is significantly higher than the performance of any conventional scoring function on the same benchmark. A few studies have discussed the performance of machine-learning-based methods, but the reason for this improvement remains unclear. In this study, by systematically controlling the structural and sequence similarity between the training and test proteins of the PDBbind benchmark, we demonstrate that protein structural and sequence similarity has a significant impact on machine-learning-based methods. After removing the training proteins that structure alignment and sequence alignment identify as highly similar to the test proteins, machine-learning-based methods trained on the new training sets no longer outperform the conventional scoring functions. In contrast, the performance of conventional functions like X-Score is relatively stable no matter what training data are used to fit the weights of its energy terms.
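The similarity-controlled protocol described in the abstract can be illustrated with a minimal sketch. It assumes a precomputed table of pairwise similarity scores between training and test proteins; the identifiers, scores, and cutoff values below are hypothetical and are not taken from the paper, and the function names `filter_training_set` and `pearson` are introduced here for illustration only.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def filter_training_set(train_ids, test_ids, similarity, cutoff):
    """Keep training proteins whose highest similarity to any test protein
    is strictly below `cutoff`.

    `similarity` maps (train_id, test_id) pairs to a score; pairs that are
    missing from the table are treated as having similarity 0.0.
    """
    kept = []
    for train_id in train_ids:
        max_sim = max(similarity.get((train_id, test_id), 0.0)
                      for test_id in test_ids)
        if max_sim < cutoff:
            kept.append(train_id)
    return kept


# Hypothetical toy data: three training proteins, two test proteins.
train_ids = ["trainA", "trainB", "trainC"]
test_ids = ["testX", "testY"]
similarity = {
    ("trainA", "testX"): 0.95,
    ("trainB", "testX"): 0.40,
    ("trainB", "testY"): 0.35,
    ("trainC", "testY"): 0.80,
}

# Lowering the cutoff removes more of the training proteins that
# resemble the test proteins.
for cutoff in (1.0, 0.9, 0.5):
    print(cutoff, filter_training_set(train_ids, test_ids, similarity, cutoff))
```

In such a setup, a scoring function would be retrained on each filtered training set and `pearson` would compare its predicted affinities with the measured ones on the fixed test set, so that performance can be tracked as the cutoff is lowered.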
Pages: 1007-1012
Number of pages: 6