SARP: A Novel Algorithm to Assess Compositional Biases in Protein Sequences

Cited by: 13
Authors
Antonets, Kirill S. [1]
Nizhnikov, Anton A. [1, 2]
Affiliations
[1] St Petersburg State Univ, Dept Genet & Biotechnol, St Petersburg 199034, Russia
[2] Russian Acad Sci, St Petersburg Branch, NI Vavilov Inst Gen Genet, St Petersburg 196140, Russia
Source
EVOLUTIONARY BIOINFORMATICS | 2013 / Vol. 9
Keywords
algorithm; protein; sequence analysis; probability; composition; REGIONS; DOMAINS;
DOI
10.4137/EBO.S12299
Chinese Library Classification number
Q [Biological Sciences];
Discipline classification code
07; 0710; 09;
Abstract
The composition of a defined set of subunits (nucleotides, amino acids) is one of the key features of biological sequences. Compositional biases are local shifts in amino acid or nucleotide frequencies that can occur as an adaptation of an organism to an extreme ecological niche, or as the signature of a specific function or localization of the corresponding protein. The calculation of probability is a method for annotating compositional bias and providing accurate detection of biased subsequences. Here, we present a Sequence Analysis based on the Ranking of Probabilities (SARP), a novel algorithm for the annotation of compositional biases based on ranking subsequences by their probabilities. SARP provides the same accuracy as the previously published Lower Probability Sub-sequences (LPS) algorithm but performs at an approximately 230-fold faster rate. It can be recommended for use when working with large datasets to reduce the time and resources required.
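The abstract does not give the details of SARP, so the following is not the published procedure. It is only a minimal sketch of the general idea of scoring subsequences by probability and ranking them, under stated assumptions: a single residue type, a fixed window length, and a binomial model with a user-supplied background frequency. The function names `binomial_tail` and `rank_windows` are hypothetical.

```python
from math import comb


def binomial_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def rank_windows(seq, residue, background, window):
    """Score every fixed-length window and rank by ascending probability.

    Assumption (illustrative only): the probability of a window is the
    binomial chance of seeing at least the observed number of `residue`
    given its `background` frequency. Returns (probability, start, count)
    tuples, most strongly biased window first.
    """
    scored = []
    for start in range(len(seq) - window + 1):
        count = seq[start:start + window].count(residue)
        scored.append((binomial_tail(count, window, background), start, count))
    return sorted(scored)


if __name__ == "__main__":
    # Toy sequence with a glutamine-rich stretch in the middle.
    for prob, start, count in rank_windows("AAAAQQQQQQQQAAAA", "Q", 0.05, 8)[:3]:
        print(f"start={start} count={count} p={prob:.3e}")
```

The lowest-probability windows are the candidates for compositionally biased regions. A real tool would also have to handle variable window lengths, several residue types, and overlapping hits, none of which this sketch attempts.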
Pages: 263-273
Page count: 11