Evaluation of iterative alignment algorithms for multiple alignment

Cited by: 37
Authors
Wallace, IM [1]
O'Sullivan, O [1]
Higgins, DG [1]
Affiliations
[1] Univ Coll Dublin, Conway Inst Biomol & Biomed Res, Dublin, Ireland
Funding
Science Foundation Ireland
DOI
10.1093/bioinformatics/bti159
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Iteration has been used a number of times as an optimization method to produce multiple alignments, either alone or in combination with other methods. Its great advantage is that it is often very simple, both to code and in its time and memory requirements. In this paper, we systematically test several different iteration strategies by comparing the results on sets of alignment test cases.
Results: We tested three schemes where iteration is used to improve an existing alignment. This was found to be remarkably effective and could induce a significant improvement in the accuracy of alignments from most packages. For example, the average accuracy of ClustalW was improved by over 6% on the hardest test cases. Iteration was found to be even more powerful when it was directly incorporated into a progressive alignment scheme. Here, iteration was used to improve subalignments at each step of progressive alignment. The beneficial effects of iteration come, in part, from the ability to get round the usual local minimum problem with progressive alignment. This ability can also be used to help reduce the complexity of T-Coffee without losing accuracy: T-Coffee can be used to align subgroups of sequences, and the resulting alignments can then be iteratively improved and merged.
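The abstract does not spell out the three iteration schemes, so the following is only a minimal sketch of one common form of iterative refinement, not the authors' implementation: each sequence in turn is removed from the alignment, realigned to the remaining rows, and the change is kept only if a sum-of-pairs score improves. The function names (`sp_score`, `align_to_profile`, `refine`) and the scoring values (match 1, mismatch -1, gap -2, no affine gaps) are illustrative assumptions.

```python
from itertools import combinations

GAP = "-"

def pair_score(a, b, match=1, mismatch=-1, gap=-2):
    # Toy scoring scheme (an assumption): gap-gap pairs are neutral.
    if a == GAP and b == GAP:
        return 0
    if a == GAP or b == GAP:
        return gap
    return match if a == b else mismatch

def sp_score(aln):
    """Sum-of-pairs score of an alignment (list of equal-length strings)."""
    return sum(pair_score(x, y)
               for r1, r2 in combinations(aln, 2)
               for x, y in zip(r1, r2))

def drop_gap_columns(aln):
    """Remove columns that contain only gaps."""
    cols = [c for c in zip(*aln) if any(ch != GAP for ch in c)]
    return ["".join(r) for r in zip(*cols)] if cols else ["" for _ in aln]

def align_to_profile(seq, profile):
    """Global dynamic programming of one ungapped sequence against fixed columns."""
    cols = list(zip(*profile))
    k, n, m = len(profile), len(seq), len(cols)
    gap_col = (GAP,) * k

    def col_vs(ch, col):
        return sum(pair_score(ch, c) for c in col)

    S = [[0] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        S[i][0], back[i][0] = S[i - 1][0] + col_vs(seq[i - 1], gap_col), "up"
    for j in range(1, m + 1):
        S[0][j], back[0][j] = S[0][j - 1] + col_vs(GAP, cols[j - 1]), "left"
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            options = [
                (S[i - 1][j - 1] + col_vs(seq[i - 1], cols[j - 1]), "diag"),
                (S[i - 1][j] + col_vs(seq[i - 1], gap_col), "up"),
                (S[i][j - 1] + col_vs(GAP, cols[j - 1]), "left"),
            ]
            S[i][j], back[i][j] = max(options, key=lambda t: t[0])

    # Trace back, rebuilding the sequence row and the profile columns.
    i, j, new_seq, new_cols = n, m, [], []
    while i > 0 or j > 0:
        move = back[i][j]
        if move == "diag":
            new_seq.append(seq[i - 1]); new_cols.append(cols[j - 1]); i -= 1; j -= 1
        elif move == "up":
            new_seq.append(seq[i - 1]); new_cols.append(gap_col); i -= 1
        else:
            new_seq.append(GAP); new_cols.append(cols[j - 1]); j -= 1
    new_seq.reverse(); new_cols.reverse()
    rows = ["".join(r) for r in zip(*new_cols)] if new_cols else [""] * k
    return "".join(new_seq), rows

def refine(aln, max_rounds=10):
    """Remove each sequence in turn, realign it, keep strict improvements."""
    aln, best = list(aln), sp_score(aln)
    for _ in range(max_rounds):
        improved = False
        for idx in range(len(aln)):
            seq = aln[idx].replace(GAP, "")
            rest = drop_gap_columns(aln[:idx] + aln[idx + 1:])
            new_seq, new_rest = align_to_profile(seq, rest)
            candidate = new_rest[:idx] + [new_seq] + new_rest[idx:]
            score = sp_score(candidate)
            if score > best:
                aln, best, improved = candidate, score, True
        if not improved:
            break
    return aln, best
```

For instance, `refine(["ACGT--", "--ACGT", "ACGT--"])` starts from a deliberately misaligned input and stops once a full pass yields no improvement. Real packages use substitution matrices, affine gap penalties and sequence weighting, all of which this sketch omits.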
Pages: 1408-1414
Number of pages: 7