Detection of protein fold similarity based on correlation of amino acid properties

被引:27
作者
Grigoriev, IV
Kim, SH
机构
[1] Univ Calif Berkeley, Dept Chem, Berkeley, CA 94720 USA
[2] Univ Calif Berkeley, E O Lawrence Berkeley Natl Lab, Berkeley, CA 94720 USA
关键词
D O I
10.1073/pnas.96.25.14318
中图分类号
O [数理科学和化学]; P [天文学、地球科学]; Q [生物科学]; N [自然科学总论];
学科分类号
07 ; 0710 ; 09 ;
摘要
An increasing number of proteins with weak sequence similarity have bean found to assume similar three-dimensional fold and often have similar or related biochemical or biophysical functions. We propose a method for detecting the fold similarity between two proteins with low sequence similarity based on their amino acid properties alone. The method, the proximity correlation matrix (PCM) method, is built on the observation that the physical properties of neighboring amino acid residues in sequence at structurally equivalent positions of two proteins of similar fold are often correlated even when amino acid sequences are different. The hydrophobicity is shown to be the most strongly correlated property for all protein fold classes. The PCM method was tested on 420 proteins belonging to 64 different known folds, each having at least three proteins with little sequence similarity. The method was able to detect fold similarities for 40% of the 420 sequences. Compared with sequence comparison and several fold-recognition methods, the method demonstrates good performance in detecting fold similarities among the proteins with low sequence identity. Applied to the complete genome of Methanococcus jannaschii, the method recognized the folds for 22 hypothetical proteins.
引用
收藏
页码:14318 / 14323
页数:6
相关论文
共 36 条
[1]   Gapped BLAST and PSI-BLAST: a new generation of protein database search programs [J].
Altschul, SF ;
Madden, TL ;
Schaffer, AA ;
Zhang, JH ;
Zhang, Z ;
Miller, W ;
Lipman, DJ .
NUCLEIC ACIDS RESEARCH, 1997, 25 (17) :3389-3402
[2]   BASIC LOCAL ALIGNMENT SEARCH TOOL [J].
ALTSCHUL, SF ;
GISH, W ;
MILLER, W ;
MYERS, EW ;
LIPMAN, DJ .
JOURNAL OF MOLECULAR BIOLOGY, 1990, 215 (03) :403-410
[3]   A METHOD TO IDENTIFY PROTEIN SEQUENCES THAT FOLD INTO A KNOWN 3-DIMENSIONAL STRUCTURE [J].
BOWIE, JU ;
LUTHY, R ;
EISENBERG, D .
SCIENCE, 1991, 253 (5016) :164-170
[4]   Complete genome sequence of the methanogenic archaeon, Methanococcus jannaschii [J].
Bult, CJ ;
White, O ;
Olsen, GJ ;
Zhou, LX ;
Fleischmann, RD ;
Sutton, GG ;
Blake, JA ;
FitzGerald, LM ;
Clayton, RA ;
Gocayne, JD ;
Kerlavage, AR ;
Dougherty, BA ;
Tomb, JF ;
Adams, MD ;
Reich, CI ;
Overbeek, R ;
Kirkness, EF ;
Weinstock, KG ;
Merrick, JM ;
Glodek, A ;
Scott, JL ;
Geoghagen, NSM ;
Weidman, JF ;
Fuhrmann, JL ;
Nguyen, D ;
Utterback, TR ;
Kelley, JM ;
Peterson, JD ;
Sadow, PW ;
Hanna, MC ;
Cotton, MD ;
Roberts, KM ;
Hurst, MA ;
Kaine, BP ;
Borodovsky, M ;
Klenk, HP ;
Fraser, CM ;
Smith, HO ;
Woese, CR ;
Venter, JC .
SCIENCE, 1996, 273 (5278) :1058-1073
[5]  
Chou P Y, 1978, Adv Enzymol Relat Areas Mol Biol, V47, P45
[6]   PROTEIN-STRUCTURE PREDICTION - RECOGNITION OF PRIMARY, SECONDARY, AND TERTIARY STRUCTURAL FEATURES FROM AMINO-ACID-SEQUENCE [J].
EISENHABER, F ;
PERSSON, B ;
ARGOS, P .
CRITICAL REVIEWS IN BIOCHEMISTRY AND MOLECULAR BIOLOGY, 1995, 30 (01) :1-94
[7]  
FAUCHERE JL, 1983, EUR J MED CHEM, V18, P369
[8]  
Fischer D, 1996, PROTEIN SCI, V5, P947
[9]   How representative are the known structures of the proteins in a complete genome? A comprehensive structural census [J].
Gerstein, M .
FOLDING & DESIGN, 1998, 3 (06) :497-512
[10]  
Gerstein M, 1998, PROTEIN SCI, V7, P445