SCOPe: Manual Curation and Artifact Removal in the Structural Classification of Proteins - extended Database

被引:70
作者
Chandonia, John-Marc [1 ,2 ]
Fox, Naomi K. [2 ,4 ]
Brenner, Steven E. [1 ,3 ]
机构
[1] Lawrence Berkeley Natl Lab, Environm Genom & Syst Biol Div, Berkeley, CA 94720 USA
[2] Lawrence Berkeley Natl Lab, Mol Biophys & Integrated Bioimaging Div, Berkeley, CA 94720 USA
[3] Univ Calif Berkeley, Dept Plant & Microbial Biol, 461A Koshland Hall, Berkeley, CA 94720 USA
[4] Invitae, 458 Brannan St, San Francisco, CA 94107 USA
基金
美国国家卫生研究院;
关键词
structure classification; protein evolution; database; ASTRAL COMPENDIUM; DOMAIN; CATH; SEQUENCES;
D O I
10.1016/j.jmb.2016.11.023
中图分类号
Q5 [生物化学]; Q7 [分子生物学];
学科分类号
071010 ; 081704 ;
摘要
SCOPe (Structural Classification of Proteins extended, http://scop.berkeley.edu) is a database of relationships between protein structures that extends the Structural Classification of Proteins (SCOP) database. SCOP is an expert-curated ordering of domains from the majority of proteins of known structure in a hierarchy according to structural and evolutionary relationships. SCOPe classifies the majority of protein structures released since SCOP development concluded in 2009, using a combination of manual curation and highly precise automated tools, aiming to have the same accuracy as fully hand-curated SCOP releases. SCOPe also incorporates and updates the ASTRAL compendium, which provides several databases and tools to aid in the analysis of the sequences and structures of proteins classified in SCOPe. SCOPe continues high-quality manual classification of new superfamilies, a key feature of SCOP. Artifacts such as expression tags are now separated into their own class, in order to distinguish them from the homology-based annotations in the remainder of the SCOPe hierarchy. SCOPe 2.06 contains 77,439 Protein Data Bank entries, double the 38,221 structures classified in SCOP. (C) 2016 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
引用
收藏
页码:348 / 355
页数:8
相关论文
共 30 条
[1]   SCOP database in 2004: refinements integrate structure and sequence family data [J].
Andreeva, A ;
Howorth, D ;
Brenner, SE ;
Hubbard, TJP ;
Chothia, C ;
Murzin, AG .
NUCLEIC ACIDS RESEARCH, 2004, 32 :D226-D229
[2]   Data growth and its impact on the SCOP database: new developments [J].
Andreeva, Antonina ;
Howorth, Dave ;
Chandonia, John-Marc ;
Brenner, Steven E. ;
Hubbard, Tim J. P. ;
Chothia, Cyrus ;
Murzin, Alexey G. .
NUCLEIC ACIDS RESEARCH, 2008, 36 :D419-D425
[3]   SCOP2 prototype: a new approach to protein structure mining [J].
Andreeva, Antonina ;
Howorth, Dave ;
Chothia, Cyrus ;
Kulesha, Eugene ;
Murzin, Alexey G. .
NUCLEIC ACIDS RESEARCH, 2014, 42 (D1) :D310-D314
[4]  
[Anonymous], STRUCT LOND ENGL
[5]  
Bateman A, 2002, NUCLEIC ACIDS RES, V30, P276, DOI [10.1093/nar/gkp985, 10.1093/nar/gkh121, 10.1093/nar/gkr1065]
[6]   The Protein Data Bank [J].
Berman, HM ;
Westbrook, J ;
Feng, Z ;
Gilliland, G ;
Bhat, TN ;
Weissig, H ;
Shindyalov, IN ;
Bourne, PE .
NUCLEIC ACIDS RESEARCH, 2000, 28 (01) :235-242
[7]   The ASTRAL compendium for protein structure and sequence analysis [J].
Brenner, SE ;
Koehl, P ;
Levitt, R .
NUCLEIC ACIDS RESEARCH, 2000, 28 (01) :254-256
[8]  
Brenner SE, 2000, PROTEIN SCI, V9, P197
[9]  
Brenner SE, 1996, METHOD ENZYMOL, V266, P635
[10]   The ASTRAL Compendium in 2004 [J].
Chandonia, JM ;
Hon, G ;
Walker, NS ;
Lo Conte, L ;
Koehl, P ;
Levitt, M ;
Brenner, SE .
NUCLEIC ACIDS RESEARCH, 2004, 32 :D189-D192