BgN-Score and BsN-Score: Bagging and boosting based ensemble neural networks scoring functions for accurate binding affinity prediction of protein-ligand complexes

Cited by: 49
Authors
Ashtawy, Hossam M. [1]
Mahapatra, Nihar R. [1]
Affiliation
[1] Michigan State Univ, Dept Elect & Comp Engn, E Lansing, MI 48824 USA
Source
BMC BIOINFORMATICS | 2015 / Vol. 16
Funding
National Science Foundation (USA);
Keywords
MOLECULAR DOCKING; RECOGNITION; VALIDATION; DISCOVERY;
DOI
10.1186/1471-2105-16-S4-S8
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: Accurately predicting the binding affinities of large sets of protein-ligand complexes is a key challenge in computational biomolecular science, with applications in drug discovery, chemical biology, and structural biology. Since a scoring function (SF) is used to score, rank, and identify drug leads, the fidelity with which it predicts the affinity of a ligand candidate for a protein's binding site has a significant bearing on the accuracy of virtual screening. Despite intense efforts in developing conventional SFs, which are either force-field based, knowledge-based, or empirical, their limited predictive power has been a major roadblock toward cost-effective drug discovery. Therefore, in this work, we present novel SFs employing a large ensemble of neural networks (NN) in conjunction with a diverse set of physicochemical and geometrical features characterizing protein-ligand complexes to predict binding affinity.
Results: We assess the scoring accuracies of two new ensemble NN SFs based on bagging (BgN-Score) and boosting (BsN-Score), as well as those of conventional SFs in the context of the 2007 PDBbind benchmark that encompasses a diverse set of high-quality protein families. We find that BgN-Score and BsN-Score have more than 25% better Pearson's correlation coefficient (0.804 and 0.816 vs. 0.644) between predicted and measured binding affinities compared to that achieved by a state-of-the-art conventional SF. In addition, these ensemble NN SFs are also at least 19% more accurate (0.804 and 0.816 vs. 0.675) than SFs based on a single neural network that has been traditionally used in drug discovery applications. We further find that ensemble models based on NNs surpass SFs based on the decision-tree ensemble technique Random Forests.
Conclusions: Ensemble neural networks SFs, BgN-Score and BsN-Score, are the most accurate in predicting binding affinity of protein-ligand complexes among the considered SFs. Moreover, their accuracies are even higher when they are used to predict binding affinities of protein-ligand complexes that are related to their training sets.
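The percentage improvements quoted in the abstract can be checked with a short calculation. The sketch below assumes the percentages are relative differences between Pearson correlation coefficients, (new - old) / old; the abstract does not state how they were computed, so this definition, the function name `relative_gain`, and the dictionary labels are illustrative assumptions rather than the authors' procedure.

```python
# Pearson correlation coefficients quoted in the abstract.
reported = {
    "BgN-Score": 0.804,
    "BsN-Score": 0.816,
}
baselines = {
    "conventional SF": 0.644,
    "single NN SF": 0.675,
}


def relative_gain(new: float, old: float) -> float:
    """Percentage improvement of `new` over `old` (assumed definition)."""
    return (new - old) / old * 100.0


# Gain of each ensemble SF over each baseline, keyed by (model, baseline).
gains = {
    (model, base): relative_gain(r_new, r_old)
    for model, r_new in reported.items()
    for base, r_old in baselines.items()
}

for (model, base), gain in gains.items():
    print(f"{model} vs. {base}: {gain:.1f}%")
```

Under this assumed definition, the gains over the single-NN SF are about 19.1% and 20.9%, consistent with "at least 19%". The gains over the conventional SF are about 24.8% and 26.7%, so the "more than 25%" figure holds for BgN-Score only after rounding.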
Pages: 12