Development and validation of a consistency based multiple structure alignment algorithm

Cited by: 18
Authors
Ebert, J
Brutlag, D [1]
Affiliations
[1] Stanford Univ, Program Biophys, Stanford, CA 94305 USA
[2] Stanford Univ, Dept Biochem, Stanford, CA 94305 USA
DOI
10.1093/bioinformatics/btl046
Chinese Library Classification
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
We introduce an algorithm that uses the information gained from simultaneous consideration of an entire group of related proteins to create multiple structure alignments (MSTAs). Consistency-based alignment (CBA) first harnesses the information contained within regions that are consistently aligned among a set of pairwise superpositions in order to realign pairs of proteins through both global and local refinement methods. It then constructs a multiple alignment that is maximally consistent with the improved pairwise alignments. We validate CBA's alignments by assessing their accuracy in regions where at least two of the aligned structures contain the same conserved sequence motif. Results: CBA correctly aligns well over 90% of motif residues in superpositions of proteins belonging to the same family or superfamily, and it outperforms a number of previously reported MSTA algorithms.
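The consistency idea in the abstract can be illustrated with a minimal sketch. This is not the authors' CBA implementation: it assumes a simple support count in which a residue pair between two structures gains one vote from their direct pairwise alignment and one vote from each third structure that links the same two residues. The function name `consistency_support`, the dictionary layout of `pairwise`, and the toy data are all hypothetical.

```python
from collections import defaultdict


def consistency_support(pairwise, a, b):
    """Count how many alignments support each residue pair (i, j) between
    structures a and b, either directly or through a third structure."""

    def lookup(x, y):
        # Return the alignment x -> y as a dict, inverting a stored y -> x if needed.
        if (x, y) in pairwise:
            return pairwise[(x, y)]
        if (y, x) in pairwise:
            return {j: i for i, j in pairwise[(y, x)].items()}
        return {}

    names = {name for pair in pairwise for name in pair}
    support = defaultdict(int)

    # One vote for every residue pair in the direct a-b alignment.
    for i, j in lookup(a, b).items():
        support[(i, j)] += 1

    # One vote per third structure c that chains i -> k (in c) -> j.
    for c in names - {a, b}:
        a_to_c = lookup(a, c)
        c_to_b = lookup(c, b)
        for i, k in a_to_c.items():
            if k in c_to_b:
                support[(i, c_to_b[k])] += 1

    return dict(support)


# Toy data (hypothetical): residue-index mappings from pairwise superpositions.
pairwise = {
    ("A", "B"): {0: 0, 1: 1, 2: 3},
    ("A", "C"): {0: 0, 1: 1, 2: 2},
    ("C", "B"): {0: 0, 1: 1, 2: 2},
}
support = consistency_support(pairwise, "A", "B")
```

In this toy case the pairs (0, 0) and (1, 1) are supported both directly and through C, while residue 2 of A receives conflicting single votes for positions 3 and 2 of B. A refinement step of the kind the abstract describes would presumably favor the consistently supported regions when realigning the pair.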
Pages: 1080-1087
Page count: 8