Touring Protein Space with Matt

Cited by: 9
Authors
Daniels, Noah M. [1]
Kumar, Anoop [1]
Cowen, Lenore J. [1]
Menke, Matt [1]
Affiliations
[1] Tufts Univ, Medford, MA 02155 USA
Keywords
SCOP; hierarchical classification; structure alignment; fold space; automated classification; structure classifications; structural domains; CATH; database; alignment; sequence; assignment; evolution
DOI
10.1109/TCBB.2011.70
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Using the Matt structure alignment program, we take a tour of protein space, producing a hierarchical clustering scheme that divides protein structural domains into clusters based on geometric dissimilarity. While it was known that purely structural, geometric, distance-based measures of structural similarity, such as Dali/FSSP, could largely replicate hand-curated schemes such as SCOP at the family level, it was an open question as to whether any such scheme could approximate SCOP at the more distant superfamily and fold levels. We partially answer this question in the affirmative, by designing a clustering scheme based on Matt that approximately matches SCOP at the superfamily level, and demonstrates qualitative differences in performance between Matt and DaliLite. Implications for the debate over the organization of protein fold space are discussed. Based on our clustering of protein space, we introduce the Mattbench benchmark set, a new collection of structural alignments useful for testing sequence aligners on more distantly homologous proteins.
Pages: 286-293
Page count: 8