An expectation maximization algorithm for training hidden substitution models

Cited by: 62
Authors
Holmes, I [1]
Rubin, GM [1]
Affiliations
[1] Univ Calif Berkeley, Howard Hughes Med Inst, Berkeley, CA 94720 USA
Keywords
molecular evolution; bioinformatics; amino acid substitution rates; Markov models
DOI
10.1006/jmbi.2002.5405
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
We derive an expectation maximization algorithm for maximum-likelihood training of substitution rate matrices from multiple sequence alignments. The algorithm can be used to train hidden substitution models, where the structural context of a residue is treated as a hidden variable that can evolve over time. We used the algorithm to train hidden substitution matrices on protein alignments in the Pfam database. When the accuracy of multiple alignment algorithms is measured with reference to BAliBASE (a database of structural reference alignments), our substitution matrices consistently outperform the PAM series, with the improvement steadily increasing as up to four hidden site classes are added. We discuss several applications of this algorithm in bioinformatics. © 2002 Elsevier Science Ltd.
Pages: 753-764
Page count: 12