DbClustal: rapid and reliable global multiple alignments of protein sequences detected by database searches

Cited by: 150
Authors
Thompson, JD [1]
Plewniak, F [1]
Thierry, JC [1]
Poch, O [1]
Affiliations
[1] ULP, INSERM, CNRS, Inst Genet & Biol Mol & Cellulaire, Lab Biol & Gen, F-67404 Illkirch Graffenstaden, France
DOI
10.1093/nar/28.15.2919
Chinese Library Classification (CLC)
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
DbClustal addresses the important problem of the automatic multiple alignment of the top scoring full-length sequences detected by a database homology search. By combining the advantages of both local and global alignment algorithms into a single system, DbClustal is able to provide accurate global alignments of highly divergent, complex sequence sets. Local alignment information is incorporated into a ClustalW global alignment in the form of a list of anchor points between pairs of sequences. The method is demonstrated using anchors supplied by the Blast post-processing program, Ballast. The rapidity and reliability of DbClustal have been demonstrated using the recently annotated Pyrococcus abyssi proteome where the number of alignments with totally misaligned sequences was reduced from 20% to <2%. A web site has been implemented proposing BlastP database searches with automatic alignment of the top hits by DbClustal.
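The idea of feeding local anchor points into a global alignment can be illustrated with a minimal pairwise sketch. This is not DbClustal's implementation and does not read Ballast output: the function name `align_with_anchors`, the anchor tuple format `(start_a, start_b, length)`, the linear gap penalty, and all score values are assumptions chosen for illustration. It runs a Needleman-Wunsch global alignment in which residue pairs covered by an anchor receive a score bonus, so the global path is drawn through the locally conserved segment.

```python
def align_with_anchors(a, b, anchors, match=1, mismatch=-1, gap=-2, bonus=3):
    """Global pairwise alignment in which anchored residue pairs get a bonus.

    anchors: list of (start_a, start_b, length) ungapped segments, 0-based.
    All parameter names and values are illustrative assumptions.
    """
    # Expand each anchor segment into the set of residue pairs it covers.
    anchored = {(sa + k, sb + k) for sa, sb, length in anchors for k in range(length)}

    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    move = [[""] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0], move[i][0] = i * gap, "up"
    for j in range(1, m + 1):
        score[0][j], move[0][j] = j * gap, "left"

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            if (i - 1, j - 1) in anchored:
                pair += bonus  # reward aligning residues that an anchor links
            score[i][j], move[i][j] = max(
                (score[i - 1][j - 1] + pair, "diag"),
                (score[i - 1][j] + gap, "up"),
                (score[i][j - 1] + gap, "left"),
            )

    # Trace back from the bottom-right corner to recover the alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        step = move[i][j]
        if step == "diag":
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif step == "up":
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


# An anchor linking a[2:4] to b[0:2] pulls the short sequence to the right end.
print(align_with_anchors("AAAA", "AA", [(2, 0, 2)]))  # ('AAAA', '--AA')
```

In this toy case the residues alone cannot decide where the short sequence belongs; the anchor is what resolves the placement, which is the role the abstract assigns to local alignment information.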
Pages: 2919-2926
Number of pages: 8