HIDDEN MARKOV-MODELS OF BIOLOGICAL PRIMARY SEQUENCE INFORMATION

Cited by: 264

Authors:

BALDI, P

CHAUVIN, Y

HUNKAPILLER, T

MCCLURE, MA

Affiliations:

[1] NETID INC, SAN FRANCISCO, CA 94107 USA

[2] UNIV WASHINGTON, DEPT MOLEC BIOTECHNOL, SEATTLE, WA 98195 USA

[3] UNIV CALIF IRVINE, DEPT ECOL & EVOLUT BIOL, IRVINE, CA 92717 USA

[4] JET PROP LAB, PASADENA, CA 91109 USA

[5] STANFORD UNIV, DEPT PSYCHOL, STANFORD, CA 94025 USA

Source:

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA | 1994 / Vol. 91 / Issue 3

Keywords:

MULTIPLE SEQUENCE ALIGNMENTS; PROTEIN MODELING; ADAPTIVE ALGORITHMS; SEQUENCE CLASSIFICATION;

DOI:

10.1073/pnas.91.3.1059

Chinese Library Classification (CLC):

O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];

Subject classification codes:

07 ; 0710 ; 09 ;

Abstract:

Hidden Markov model (HMM) techniques are used to model families of biological sequences. A smooth and convergent algorithm is introduced to iteratively adapt the transition and emission parameters of the models from the examples in a given family. The HMM approach is applied to three protein families: globins, immunoglobulins, and kinases. In all cases, the models derived capture the important statistical characteristics of the family and can be used for a number of tasks, including multiple alignments, motif detection, and classification. For K sequences of average length N, this approach yields an effective multiple-alignment algorithm which requires O(KN^2) operations, linear in the number of sequences.
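
The alignment step mentioned in the abstract can be illustrated with a generic Viterbi decoder, which finds the most probable state path for one sequence. The following is a minimal sketch, not the authors' implementation: the two-state model, its state names, and all probabilities are invented for illustration, and the paper's actual model architecture is not reproduced here.

```python
import math


def viterbi(obs, states, log_start, log_trans, log_emit):
    """Return (best_log_prob, best_path) for one observation sequence."""
    # score[s] = best log-probability of any state path ending in s so far
    score = {s: log_start[s] + log_emit[s][obs[0]] for s in states}
    back = []  # one dict per later position: state -> best predecessor
    for symbol in obs[1:]:
        new_score, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: score[p] + log_trans[p][s])
            new_score[s] = score[prev] + log_trans[prev][s] + log_emit[s][symbol]
            pointers[s] = prev
        score = new_score
        back.append(pointers)
    # Trace back from the best final state to recover the full path
    last = max(states, key=lambda s: score[s])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return score[last], path


def log_table(table):
    """Convert a nested dict of probabilities to log-probabilities."""
    return {k: {j: math.log(p) for j, p in row.items()} for k, row in table.items()}


# Hypothetical toy model (assumption): two states over a three-letter alphabet.
STATES = ["hydrophobic", "polar"]
LOG_START = {s: math.log(0.5) for s in STATES}
LOG_TRANS = log_table({
    "hydrophobic": {"hydrophobic": 0.8, "polar": 0.2},
    "polar": {"hydrophobic": 0.2, "polar": 0.8},
})
LOG_EMIT = log_table({
    "hydrophobic": {"L": 0.6, "V": 0.3, "K": 0.1},
    "polar": {"L": 0.1, "V": 0.1, "K": 0.8},
})

best_log_prob, best_path = viterbi("LVLKKK", STATES, LOG_START, LOG_TRANS, LOG_EMIT)
print(best_path)
```

For a fully connected model with S states and a sequence of length T, this decoder performs on the order of T * S^2 steps. If one assumes a model whose number of states grows with the sequence length N and whose states each have only a constant number of predecessors, decoding one sequence costs on the order of N^2, and decoding K sequences independently costs on the order of K * N^2, consistent with the bound stated in the abstract.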

Pages: 1059-1063

Page count: 5
