Within the twilight zone: A sensitive profile-profile comparison tool based on information theory

Cited by: 203
Authors
Yona, G [1]
Levitt, M [1]
Affiliations
[1] Stanford Univ, Dept Biol Struct, Stanford, CA 94305 USA
Keywords
profile-profile comparison; PSI-BLAST; structural alignment; remote homologies
DOI
10.1006/jmbi.2001.5293
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification codes
071010; 081704
Abstract
This paper presents a novel approach to profile-profile comparison. The method compares two input profiles (such as those generated by PSI-BLAST) and assigns a similarity score to assess their statistical similarity. Our profile-profile comparison tool, which allows for gaps, can be used to detect weak similarities between protein families. It has also been optimized to produce alignments that are in very good agreement with structural alignments. Tests show that the profile-profile alignments are indeed highly correlated with similarities between secondary structure elements and tertiary structure. Exhaustive evaluations show that our method is significantly more sensitive in detecting distant homologies than the popular profile-based search programs PSI-BLAST and IMPALA. The relative improvement is of the same order of magnitude as the improvement of PSI-BLAST relative to BLAST. Our new tool often detects similarities that fall within the twilight zone of sequence similarity. © 2002 Elsevier Science Ltd.
Pages: 1257-1275
Page count: 19