A DIRICHLET PROCESS MIXTURE OF HIDDEN MARKOV MODELS FOR PROTEIN STRUCTURE PREDICTION

Cited by: 17
Authors
Lennox, Kristin P. [1]
Dahl, David B. [1]
Vannucci, Marina [2]
Day, Ryan [3]
Tsai, Jerry W. [3]
Affiliations
[1] Texas A&M Univ, Dept Stat, College Stn, TX 77843 USA
[2] Rice Univ, Dept Stat, Houston, TX 77251 USA
[3] Univ Pacific, Dept Chem, Stockton, CA 95211 USA
Keywords
Bayesian nonparametrics; density estimation; dihedral angles; protein structure prediction; torsion angles; von Mises distribution; VON-MISES DISTRIBUTION; NONPARAMETRIC PROBLEMS; PROBABILISTIC MODEL; DENSITY-ESTIMATION; BIOINFORMATICS; DISTRIBUTIONS; INFERENCE; GENOMICS; DATABASE; ANGLES;
DOI
10.1214/09-AOAS296
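Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714
Abstract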
By providing new insights into the distribution of a protein's torsion angles, recent statistical models for these data have pointed the way to more efficient methods for protein structure prediction. Most current approaches have concentrated on bivariate models at a single sequence position. There is, however, considerable value in simultaneously modeling angle pairs at multiple sequence positions in a protein. One area of application for such models is structure prediction for the highly variable loop and turn regions. Such modeling is difficult because the number of known protein structures available to estimate these torsion angle distributions is typically small. Furthermore, the data are "sparse" in that not all proteins have angle pairs at each sequence position. We propose a new semiparametric model for the joint distributions of angle pairs at multiple sequence positions. Our model accommodates sparse data by leveraging known information about the behavior of protein secondary structure. We demonstrate our technique by predicting the torsion angles in a loop from the globin fold family. Our results show that a template-based approach can now be successfully extended to modeling the notoriously difficult loop and turn regions.
Pages: 916-942
Page count: 27
Related papers
50 in total
  • [41] Quantum annealing for Dirichlet process mixture models with applications to network clustering
    Sato, Issei
    Tanaka, Shu
    Kurihara, Kenichi
    Miyashita, Seiji
    Nakagawa, Hiroshi
    Neurocomputing, 2013, 121 : 523 - 531
  • [42] Theory and computations for the Dirichlet process and related models: An overview
    Jara, Alejandro
    INTERNATIONAL JOURNAL OF APPROXIMATE REASONING, 2017, 81 : 128 - 146
  • [43] Bayesian Inversion of Convolved Hidden Markov Models With Applications in Reservoir Prediction
    Fjeldstad, Torstein
    Omre, Henning
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2020, 58 (03) : 1957 - 1968
  • [44] Quantitative prediction of the effect of genetic variation using hidden Markov models
    Liu, Mingming
    Watson, Layne T.
    Zhang, Liqing
    BMC BIOINFORMATICS, 2014, 15
  • [45] COMPARATIVE ANALYSIS OF TRIANGULAR FUZZY HIDDEN MARKOV MODELS AND TRADITIONAL HIDDEN MARKOV MODELS
    Vyshnavi, M.
    Muthukumar, M.
    ADVANCES AND APPLICATIONS IN STATISTICS, 2025, 92 (02) : 171 - 189
  • [46] Combined prediction of Tat and Sec signal peptides with hidden Markov models
    Bagos, Pantelis G.
    Nikolaou, Elisanthi P.
    Liakopoulos, Theodore D.
    Tsirigos, Konstantinos D.
    BIOINFORMATICS, 2010, 26 (22) : 2811 - 2817
  • [47] A novel approach for modeling positive vectors with inverted Dirichlet-based hidden Markov models
    Nasfi, Rim
    Amayri, Manar
    Bouguila, Nizar
    KNOWLEDGE-BASED SYSTEMS, 2020, 192
  • [49] Asymmetric hidden Markov models
    Bueno, Marcos L. P.
    Hommersom, Arjen
    Lucas, Peter J. F.
    Linard, Alexis
    INTERNATIONAL JOURNAL OF APPROXIMATE REASONING, 2017, 88 : 169 - 191
  • [50] Prediction of Protein Interdomain Linker Regions by a Nonstationary Hidden Markov Model
    Bae, Kyounghwa
    Mallick, Bani K.
    Elsik, Christine G.
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2008, 103 (483) : 1085 - 1099