BWM: A Novel, Provable, Ensemble-based Dynamic Programming Algorithm for Sparse Approximations of Computational Protein Design

被引:15
|
作者
Jou, Jonathan D. [1 ]
Jain, Swati [1 ,2 ,3 ,4 ]
Georgiev, Ivelin S. [1 ,6 ]
Donald, Bruce R. [1 ,2 ,5 ]
机构
[1] Duke Univ, Dept Comp Sci, Durham, NC 27708 USA
[2] Duke Univ, Med Ctr, Dept Biochem, Durham, NC 27708 USA
[3] Duke Univ, Dept Computat Biol, Durham, NC 27708 USA
[4] Duke Univ, Bioinformat Program, Durham, NC 27708 USA
[5] Duke Univ, Dept Chem, Durham, NC 27708 USA
[6] NIAID, Vaccine Res Ctr, NIH, 9000 Rockville Pike, Bethesda, MD 20892 USA
基金
美国国家卫生研究院;
关键词
protein design; sparse residue interaction graphs; provable algorithms; branch-decomposition; dynamic programming; ensemble-based algorithms; OSPREY; DEAD-END ELIMINATION; SIDE-CHAIN; GRAMICIDIN SYNTHETASE; SEARCH ALGORITHM; PREDICTION; REDESIGN; CONFORMATION; SPECIFICITY; RESISTANCE; POTENCY;
D O I
10.1089/cmb.2015.0194
中图分类号
Q5 [生物化学];
学科分类号
071010 ; 081704 ;
摘要
Sparse energy functions that ignore long range interactions between residue pairs are frequently used by protein design algorithms to reduce computational cost. Current dynamic programming algorithms that fully exploit the optimal substructure produced by these energy functions only compute the GMEC. This disproportionately favors the sequence of a single, static conformation and overlooks better binding sequences with multiple low-energy conformations. Provable, ensemble-based algorithms such as A* avoid this problem, but A* cannot guarantee better performance than exhaustive enumeration. We propose a novel, provable, dynamic programming algorithm called Branch-Width Minimization* (BWM*) to enumerate a gap-free ensemble of conformations in order of increasing energy. Given a branch-decomposition of branch-width w for an n-residue protein design with at most q discrete side-chain conformations per residue, BWM* returns the sparse GMEC in O(nw(2) q(3/2w)) time and enumerates each additional conformation in merely O(n log q) time. We define a new measure, Total Effective Search Space (TESS), which can be computed efficiently a priori before BWM* or A* is run. We ran BWM* on 67 protein design problems and found that TESS discriminated between BWM*-efficient and A*-efficient cases with 100% accuracy. As predicted by TESS and validated experimentally, BWM* outperforms A* in 73% of the cases and computes the full ensemble or a close approximation faster than A*, enumerating each additional conformation in milliseconds. Unlike A*, the performance of BWM* can be predicted in polynomial time before running the algorithm, which gives protein designers the power to choose the most efficient algorithm for their particular design problem.
引用
收藏
页码:413 / 424
页数:12
相关论文
共 50 条
  • [31] Fast and robust map-matching algorithm based on a global measure and dynamic programming for sparse probe data
    Yokota, Takayoshi
    Okude, Mariko
    Sakamoto, Toshiyuki
    Kitahara, Reiji
    IET INTELLIGENT TRANSPORT SYSTEMS, 2019, 13 (11) : 1613 - 1623
  • [32] A Novel Ensemble Learning-Based Computational Method to Predict Protein-Protein Interactions from Protein Primary Sequences
    Pan, Jie
    Wang, Shiwei
    Yu, Changqing
    Li, Liping
    You, Zhuhong
    Sun, Yanmei
    BIOLOGY-BASEL, 2022, 11 (05):
  • [33] Design and Evaluation of Agent Based Prioritized Dynamic Round Robin Scheduling Algorithm on Computational Grids
    Shah, Syed Nasir Mehmood
    Zakaria, M. Nordin B.
    Haron, Nazleeni
    Bin Mahmood, Ahmad Kamil
    Naono, Ken
    AASRI CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND BIOINFORMATICS, 2012, 1 : 531 - 543
  • [34] A novel evolutionary algorithm based on an orthogonal design for dynamic optimization problems (ODEA)
    Zeng, SY
    He, J
    de Garis, H
    Kang, LS
    2005 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION, VOLS 1-3, PROCEEDINGS, 2005, : 1188 - 1195
  • [35] Modularity-based parallel protein design algorithm with an implementation using shared memory programming
    Pal, Abantika
    Mulumudy, Rohith
    Mitra, Pralay
    PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS, 2022, 90 (03) : 658 - 669
  • [36] Computational Analysis of African Swine Fever Virus Protein Space for the Design of an Epitope-Based Vaccine Ensemble
    Ros-Lucas, Albert
    Correa-Fiz, Florencia
    Bosch-Camos, Laia
    Rodriguez, Fernando
    Alonso-Padilla, Julio
    PATHOGENS, 2020, 9 (12): : 1 - 19
  • [37] An Ensemble-Based Protocol for the Computational Prediction of Helix-Helix Interactions in G Protein-Coupled Receptors using Coarse-Grained Molecular Dynamics
    Altwaijry, Nojood A.
    Baron, Michael
    Wright, David W.
    Coveney, Peter V.
    Townsend-Nicholson, Andrea
    JOURNAL OF CHEMICAL THEORY AND COMPUTATION, 2017, 13 (05) : 2254 - 2270
  • [38] Novel iterative neural dynamic programming for data-based approximate optimal control design
    Mu, Chaoxu
    Wang, Ding
    He, Haibo
    AUTOMATICA, 2017, 81 : 240 - 252
  • [39] cOSPREY: A Cloud-Based Distributed Algorithm for Large-Scale Computational Protein Design
    Pan, Yuchao
    Dong, Yuxi
    Zhou, Jingtian
    Hallen, Mark
    Donald, Bruce R.
    Zeng, Jianyang
    Xu, Wei
    JOURNAL OF COMPUTATIONAL BIOLOGY, 2016, 23 (09) : 737 - 749
  • [40] A novel control strategy for active battery thermal management systems based on dynamic programming and a genetic algorithm
    Fan, Yuqian
    Zuo, Xiangang
    Zhan, Di
    Zhao, Jifei
    Zhang, Guifeng
    Wang, Huanyu
    Wang, Ke
    Yang, Shuting
    Tan, Xiaojun
    APPLIED THERMAL ENGINEERING, 2023, 233