The structure of the protein universe and genome evolution

Cited by: 407
Authors
Koonin, EV [1 ]
Wolf, YI [1 ]
Karev, GP [1 ]
Affiliations
[1] Natl Lib Med, Natl Ctr Biotechnol Informat, NIH, Bethesda, MD 20894 USA
Keywords
DOI
10.1038/nature01256
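Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification codes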
07 ; 0710 ; 09 ;
Abstract
Despite the practically unlimited number of possible protein sequences, the number of basic shapes in which proteins fold seems not only to be finite, but also to be relatively small, with probably no more than 10,000 folds in existence. Moreover, the distribution of proteins among these folds is highly non-homogeneous: some folds and superfamilies are extremely abundant, but most are rare. Protein folds and families encoded in diverse genomes show similar size distributions with notable mathematical properties, which also extend to the number of connections between domains in multidomain proteins. All these distributions follow asymptotic power laws, such as have been identified in a wide variety of biological and physical systems, and which are typically associated with scale-free networks. These findings suggest that genome evolution is driven by extremely general mechanisms based on the preferential attachment principle.
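The link the abstract draws between preferential attachment and skewed family-size distributions can be illustrated with a toy simulation. The sketch below is a generic Simon-type process, not the model used in the paper: each new gene either founds a new family with probability `p_new` or joins an existing family with probability proportional to that family's current size. The function name, the parameter values, and the seed are all assumptions chosen for illustration.

```python
import random
from collections import Counter


def simulate_family_sizes(n_genes, p_new, seed=0):
    """Toy Simon-type preferential attachment (illustrative assumption,
    not the paper's model). Returns family sizes, largest first."""
    rng = random.Random(seed)
    owner = [0]        # owner[i] = index of the family that gene i belongs to
    n_families = 1
    for _ in range(n_genes - 1):
        if rng.random() < p_new:
            # Innovation: the new gene founds a new family.
            owner.append(n_families)
            n_families += 1
        else:
            # Copying a uniformly chosen existing gene selects its family
            # with probability proportional to the family's size.
            owner.append(rng.choice(owner))
    return sorted(Counter(owner).values(), reverse=True)


sizes = simulate_family_sizes(n_genes=20000, p_new=0.1)
histogram = Counter(sizes)  # histogram[k] = number of families of size k

print("families:", len(sizes))
print("largest family:", sizes[0])
print("families of size 1, 2, 3:", histogram[1], histogram[2], histogram[3])
```

In runs of this kind, a few families become very large while most stay at size one or two, which is the qualitative pattern the abstract describes. Plotting `histogram` on log-log axes would be the natural next step for checking how closely the tail follows a straight line.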
Pages: 218-223
Number of pages: 6
Related papers
77 in total
[1]   Statistical mechanics of complex networks [J].
Albert, R ;
Barabási, AL .
REVIEWS OF MODERN PHYSICS, 2002, 74 (01) :47-97
[2]  
ALEXANDROV NN, 1994, PROTEIN SCI, V3, P866
[3]   Comparative genomics and evolution of proteins involved in RNA metabolism [J].
Anantharaman, V ;
Koonin, EV ;
Aravind, L .
NUCLEIC ACIDS RESEARCH, 2002, 30 (07) :1427-1464
[4]   Regulatory potential, phyletic distribution and evolution of ancient, intracellular small-molecule-binding domains [J].
Anantharaman, V ;
Koonin, EV ;
Aravind, L .
JOURNAL OF MOLECULAR BIOLOGY, 2001, 307 (05) :1271-1292
[5]  
[Anonymous], 2012, Introduction to protein structure
[6]  
[Anonymous], 1949, Human behaviour and the principle of least-effort
[7]  
Apic G, 2001, Bioinformatics, V17 Suppl 1, pS83
[8]   Guilt by association: Contextual information in genome analysis [J].
Aravind, L .
GENOME RESEARCH, 2000, 10 (08) :1074-1077
[9]   Trends in protein evolution inferred from sequence and structure analysis [J].
Aravind, L ;
Mazumder, R ;
Vasudevan, S ;
Koonin, EV .
CURRENT OPINION IN STRUCTURAL BIOLOGY, 2002, 12 (03) :392-399
[10]  
Barabasi A.L., 2002, The formula: the universal laws of success