RepSeq - A database of amino acid repeats present in lower eukaryotic pathogens

Cited by: 27
Authors
Depledge, Daniel P. [1]
Lower, Ryan P. J. [1]
Smith, Deborah F. [1]
Affiliation
[1] Univ York, Dept Biol, Immunol & Infect Unit, York YO10 5YW, N Yorkshire, England
Funding
Wellcome Trust (UK)
Keywords
DOI
10.1186/1471-2105-8-122
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Background: Amino acid repeat-containing proteins have a broad range of functions and their identification is of relevance to many experimental biologists. In human-infective protozoan parasites (such as the Kinetoplastid and Plasmodium species), they are implicated in immune evasion and have been shown to influence virulence and pathogenicity. RepSeq (http://repseq.gugbe.com) is a new database of amino acid repeat-containing proteins found in lower eukaryotic pathogens. The RepSeq database is accessed via a web-based application which also provides links to related online tools and databases for further analyses.
Results: The RepSeq algorithm typically identifies more than 98% of repeat-containing proteins and is capable of identifying both perfect and mismatch repeats. The proportion of proteins that contain repeat elements varies greatly between different families and even species (3-35% of the total protein content). The most common motif type is the Sequence Repeat Region (SRR), a repeated motif containing multiple different amino acid types. Proteins containing Single Amino Acid Repeats (SAARs) and Di-Peptide Repeats (DPRs) typically account for 0.5-1.0% of the total protein number. Notable exceptions are P. falciparum and D. discoideum, in which 33.67% and 34.28% respectively of the predicted proteomes consist of repeat-containing proteins. These numbers are due to large insertions of low complexity single and multi-codon repeat regions.
Conclusion: The RepSeq database provides a repository for repeat-containing proteins found in parasitic protozoa. The database allows for both individual and cross-species proteome analyses and also allows users to upload sequences of interest for analysis by the RepSeq algorithm. Identification of repeat-containing proteins provides researchers with a defined subset of proteins which can be analysed by expression profiling and functional characterisation, thereby facilitating study of pathogenicity and virulence factors in the parasitic protozoa. While primarily designed for kinetoplastid work, the RepSeq algorithm and database retain full functionality when used to analyse other species.
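As an illustration of the two simplest repeat classes named in the abstract (SAARs and DPRs), the following is a minimal Python sketch of perfect tandem-repeat detection in a protein sequence. It is not the RepSeq algorithm: the abstract does not describe the implementation, so the function name `find_perfect_repeats`, the minimum of five repeat units, and the example sequence are all assumptions made for this sketch, and mismatch repeats and SRRs are not handled.

```python
import re

# Assumed threshold: the abstract does not state the minimum repeat
# length used by RepSeq, so five units is chosen purely for illustration.
MIN_UNITS = 5


def find_perfect_repeats(seq, unit_len, min_units=MIN_UNITS):
    """Return (start, unit, count) tuples for perfect tandem repeats.

    unit_len=1 finds single amino acid repeats (SAARs);
    unit_len=2 finds di-peptide repeats (DPRs).
    """
    # One unit of unit_len residues, followed by at least
    # (min_units - 1) further exact copies of that unit.
    pattern = re.compile(r"([A-Z]{%d})\1{%d,}" % (unit_len, min_units - 1))
    hits = []
    for match in pattern.finditer(seq):
        unit = match.group(1)
        # A multi-residue unit made of one residue type (e.g. "QQ")
        # is really a SAAR, so it is not reported as a longer-unit repeat.
        if unit_len > 1 and len(set(unit)) == 1:
            continue
        count = len(match.group(0)) // unit_len
        hits.append((match.start(), unit, count))
    return hits


# Hypothetical example sequence: six Q residues and five SA units.
example = "MKQQQQQQLSASASASASAT"
saars = find_perfect_repeats(example, unit_len=1)
dprs = find_perfect_repeats(example, unit_len=2)
```

Here `saars` holds the poly-Q tract and `dprs` holds the SA repeat, each as a zero-based start position, the repeated unit, and the number of units. A proteome-level figure such as the proportions quoted in the abstract would follow from counting the proteins for which either list is non-empty.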
Pages: 10