Multiple sequence alignment for phylogenetic purposes

Cited by: 115
Author
Morrison, David A. [1]
Affiliations
[1] Natl Vet Inst, Dept Parasitol, SWEPAR, S-75189 Uppsala, Sweden
[2] Swedish Univ Agr Sci, S-75189 Uppsala, Sweden
DOI
10.1071/SB06020
Chinese Library Classification (CLC) number
Q94 [Botany]
Subject classification code
071001
Abstract
I have addressed the biological rather than bioinformatics aspects of molecular sequence alignment by covering a series of topics that have been under-valued, particularly within the context of phylogenetic analysis. First, phylogenetic analysis is only one of the many objectives of sequence alignment, and the most appropriate multiple alignment may not be the same for all of these purposes. Phylogenetic alignment thus occupies a specific place within a broader context. Second, homology assessment plays an intricate role in phylogenetic analysis, with sequence alignment consisting of primary homology assessment and tree building being secondary homology assessment. The objective of phylogenetic alignment thus distinguishes it from other sorts of alignment. Third, I summarise what is known about the serious limitations of using phenetic similarity as a criterion for automated multiple alignment, and provide an overview of what is currently being done to improve these computerised procedures. This synthesises information that is apparently not widely known among phylogeneticists. Fourth, I then consider the recent development of automated procedures for combining alignment and tree building, thus integrating primary and secondary homology assessment. Finally, I outline various strategies for increasing the biological content of sequence alignment procedures, which consists of taking into account known evolutionary processes when making alignment decisions. These procedures can be objective and repeatable, and can involve computerised algorithms to automate much of the work. Perhaps the most important suggestion is that alignment should be seen as a process where new sequences are added to a pre-existing alignment that has been manually curated by the biologist.
Pages: 479-539
Page count: 61