RepeatsDB: a database of tandem repeat protein structures

Cited by: 51
Authors
Di Domenico, Tomas [1]
Potenza, Emilio [1]
Walsh, Ian [1]
Parra, R. Gonzalo [2]
Giollo, Manuel [1,3]
Minervini, Giovanni [1]
Piovesan, Damiano [1]
Ihsan, Awais [1,4]
Ferrari, Carlo [3]
Kajava, Andrey V. [5,6]
Tosatto, Silvio C. E. [1]
Affiliations
[1] Univ Padua, Dept Biomed Sci, I-35131 Padua, Italy
[2] Univ Buenos Aires, Dept Biol Chem, Buenos Aires, DF, Argentina
[3] Univ Padua, Dept Informat Engn, I-35121 Padua, Italy
[4] COMSATS Inst Informat Technol, Dept Biosci, Sahiwal, Pakistan
[5] CNRS, Ctr Rech Biochim Macromol, F-34293 Montpellier 5, France
[6] Inst Biol Computat, F-34293 Montpellier 5, France
Keywords
LEUCINE-RICH REPEAT; SEQUENCES; RECOGNITION; PERIODICITY; VALIDATION; EVOLUTION
DOI
10.1093/nar/gkt1175
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification code
071010; 081704
Abstract
RepeatsDB (http://repeatsdb.bio.unipd.it/) is a database of annotated tandem repeat protein structures. Tandem repeats pose a difficult problem for the analysis of protein structures, as the underlying sequence can be highly degenerate. Several repeat types have been studied over the years, but their annotation was done on a case-by-case basis, thus making large-scale analysis difficult. We developed RepeatsDB to fill this gap. Using state-of-the-art repeat detection methods and manual curation, we systematically annotated the Protein Data Bank, predicting 10 745 repeat structures. In all, 2797 structures were classified according to a recently proposed classification schema, which was expanded to accommodate new findings. In addition, detailed annotations were performed on a subset of 321 proteins. These annotations feature information on start and end positions for the repeat regions and units. RepeatsDB is an ongoing effort to systematically classify and annotate structural protein repeats in a consistent way. It provides users with the possibility to access and download high-quality datasets either interactively or programmatically through web services.
Pages: D352-D357
Number of pages: 6