The continuity of protein structure space is an intrinsic property of proteins

Cited by: 64
Authors
Skolnick, Jeffrey [1]
Arakaki, Adrian K. [1]
Lee, Seung Yup [1]
Brylinski, Michal [1]
Affiliations
[1] Georgia Inst Technol, Ctr Study Syst Biol, Atlanta, GA 30318 USA
Funding
US National Institutes of Health
Keywords
completeness of fold space; connectivity of protein structure space; graph representation of protein structural relationships; evolution of protein folds; protein structure alignments; STRUCTURE ALIGNMENT; SCOP DATABASE; FOLD SPACE; TM-SCORE; PREDICTION; ALGORITHM; CLASSIFICATION; BENCHMARKING; ORIGIN; SINGLE;
DOI
10.1073/pnas.0907683106
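Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09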
Abstract
The classical view of the space of protein structures is that it is populated by a discrete set of protein folds. For proteins up to 200 residues long, by using structural alignments and building upon ideas of the completeness and continuity of structure space, we show that nearly any structure is significantly related to any other using a transitive set of no more than 7 intermediate structurally related proteins. This result holds for all structures in the Protein Data Bank, even when structural relationships between evolutionarily related proteins (as detected by threading or functional analyses) are excluded. A similar picture holds for an artificial library of compact, hydrogen-bonded, homopolypeptide structures. The 3 sets share the global connectivity features of random graphs, in which the local connectivity of each node (i.e., the number of neighboring structures per protein) is preserved. This high connectivity supports the continuous view of single-domain protein structure space. More importantly, these results do not depend on evolution, but only on the physics of protein structures. The fact that evolutionary divergence need not be invoked to explain the continuous nature of protein structure space has implications for how the universe of protein structures might have originated, and how function should be transferred between proteins of similar structure.
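The "no more than 7 intermediates" claim in the abstract is a statement about shortest paths in a graph whose nodes are structures and whose edges join significantly related pairs. The following is a minimal sketch of that idea, not the authors' pipeline: the function names, the toy similarity scores, and the 0.4 cutoff are all illustrative assumptions that do not come from this record.

```python
from collections import deque


def build_graph(scores, threshold=0.4):
    """Build an undirected neighbor map from pairwise similarity scores.

    `scores` maps (a, b) pairs to a similarity value (hypothetical data).
    `threshold` is an assumed significance cutoff, not taken from the record.
    """
    neighbors = {}
    for (a, b), score in scores.items():
        neighbors.setdefault(a, set())
        neighbors.setdefault(b, set())
        if score >= threshold:
            neighbors[a].add(b)
            neighbors[b].add(a)
    return neighbors


def intermediates_between(neighbors, start, goal):
    """Count intermediate structures on a shortest path via breadth-first search.

    Returns 0 for identical or directly related structures, and None when
    no chain of related structures connects the two.
    """
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])  # (node, edges walked from start)
    while queue:
        node, dist = queue.popleft()
        for nxt in neighbors.get(node, ()):
            if nxt in seen:
                continue
            if nxt == goal:
                # Path has dist + 1 edges, hence dist intermediate nodes.
                return dist
            seen.add(nxt)
            queue.append((nxt, dist + 1))
    return None


# Toy, made-up scores for four hypothetical structures A-D.
toy_scores = {
    ("A", "B"): 0.55,
    ("B", "C"): 0.47,
    ("C", "D"): 0.62,
    ("A", "D"): 0.21,  # below the cutoff, so no direct edge
}
graph = build_graph(toy_scores)
print(intermediates_between(graph, "A", "D"))
```

In this toy graph A and D are not directly related, but they are linked through B and C. Breadth-first search is enough here because every edge counts equally; an all-pairs method would be the natural choice when path lengths are needed for every pair of structures at once.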
Pages: 15690-15695
Page count: 6