Multiple sequence alignment using partial order graphs

被引:549
作者
Lee, C [1 ]
Grasso, C [1 ]
Sharlow, MF [1 ]
机构
[1] Univ Calif Los Angeles, Dept Chem & Biochem, Los Angeles, CA 90095 USA
基金
美国国家科学基金会;
关键词
D O I
10.1093/bioinformatics/18.3.452
中图分类号
Q5 [生物化学];
学科分类号
071010 ; 081704 ;
摘要
Motivation: Progressive Multiple Sequence Alignment (MSA) methods depend on reducing an MSA to a linear profile for each alignment step. However, this leads to loss of information needed for accurate alignment, and gap scoring artifacts. Results: We present a graph representation of an MSA that can itself be aligned directly by pairwise dynamic programming, eliminating the need to reduce the MSA to a profile. This enables our algorithm (Partial Order Alignment (POA)) to guarantee that the optimal alignment of each new sequence versus each sequence in the MSA will be considered. Moreover, this algorithm introduces a new edit operator, homologous recombination, important for multidomain sequences. The algorithm has improved speed (linear time complexity) over existing MSA algorithms, enabling construction of massive and complex alignments (e.g. an alignment of 5000 sequences in 4 h on a Pentium II). We demonstrate the utility of this algorithm on a family of multidomain SH2 proteins, and on EST assemblies
引用
收藏
页码:452 / 464
页数:13
相关论文
共 25 条
[1]   GAP COSTS FOR MULTIPLE SEQUENCE ALIGNMENT [J].
ALTSCHUL, SF .
JOURNAL OF THEORETICAL BIOLOGY, 1989, 138 (03) :297-309
[2]   Gapped BLAST and PSI-BLAST: a new generation of protein database search programs [J].
Altschul, SF ;
Madden, TL ;
Schaffer, AA ;
Zhang, JH ;
Zhang, Z ;
Miller, W ;
Lipman, DJ .
NUCLEIC ACIDS RESEARCH, 1997, 25 (17) :3389-3402
[3]  
[Anonymous], METHOD ENZYMOL
[4]   A STRATEGY FOR THE RAPID MULTIPLE ALIGNMENT OF PROTEIN SEQUENCES - CONFIDENCE LEVELS FROM TERTIARY STRUCTURE COMPARISONS [J].
BARTON, GJ ;
STERNBERG, MJE .
JOURNAL OF MOLECULAR BIOLOGY, 1987, 198 (02) :327-337
[5]   ESTABLISHING A HUMAN TRANSCRIPT MAP [J].
BOGUSKI, MS ;
SCHULER, GD .
NATURE GENETICS, 1995, 10 (04) :369-371
[6]   Alternative gene form discovery and candidate gene selection from gene indexing projects [J].
Burke, J ;
Wang, H ;
Hide, W ;
Davison, DB .
GENOME RESEARCH, 1998, 8 (03) :276-290
[7]   Hidden Markov models [J].
Eddy, SR .
CURRENT OPINION IN STRUCTURAL BIOLOGY, 1996, 6 (03) :361-365
[8]   PROGRESSIVE SEQUENCE ALIGNMENT AS A PREREQUISITE TO CORRECT PHYLOGENETIC TREES [J].
FENG, DF ;
DOOLITTLE, RF .
JOURNAL OF MOLECULAR EVOLUTION, 1987, 25 (04) :351-360
[9]   Automated finishing with Autofinish [J].
Gordon, D ;
Desmarais, C ;
Green, P .
GENOME RESEARCH, 2001, 11 (04) :614-625
[10]   Significant improvement in accuracy of multiple protein sequence alignments by iterative refinement as assessed by reference to structural alignments [J].
Gotoh, O .
JOURNAL OF MOLECULAR BIOLOGY, 1996, 264 (04) :823-838