Revealing aperiodic aspects of solenoid proteins from sequence information

Cited by: 2
Authors
Hrabe, Thomas [1 ]
Jaroszewski, Lukasz [1 ]
Godzik, Adam [1 ]
Affiliations
[1] Sanford Burnham Prebys Med Discovery Inst, Dept Bioinformat & Syst Biol, La Jolla, CA 92037 USA
Keywords
LEUCINE-RICH-REPEAT; PEAK DETECTION; RECOGNITION; PERIODICITY; ALIGNMENT; FEATURES;
DOI
10.1093/bioinformatics/btw319
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Motivation: Repeat proteins, which contain multiple repeats of short sequence motifs, form a large but seldom-studied group of proteins. Methods that analyze the 3D structures of such proteins have identified many subtle effects in the length distribution of individual motifs that are important for their functions. However, similar analysis has not yet been applied to the vast majority of repeat proteins, whose 3D structures are unknown, mostly because the extreme diversity of the underlying motifs makes them difficult to detect. Results: We developed FAIT, a sequence-based algorithm for the precise assignment of individual repeats in repeat proteins, and introduced a framework to classify and compare aperiodicity patterns for large protein families. FAIT extracts repeat positions by post-processing FFAS alignment matrices with image processing methods. Using proteins with leucine-rich repeat (LRR) domains and other solenoid-like proteins as examples, we show that automated analysis with FAIT correctly identifies the exact lengths of individual repeats based entirely on sequence information.
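The idea of reading repeat structure off a self-comparison matrix can be illustrated with a minimal sketch. This is not FAIT: it does not use FFAS profile alignments or image processing, and the function names `diagonal_profile` and `dominant_period` are hypothetical. It scores each off-diagonal of a plain identity matrix and reports one dominant period, so it cannot recover the per-repeat length variation that the abstract describes.

```python
def diagonal_profile(seq):
    """Score each offset k by the fraction of positions i with seq[i] == seq[i + k].

    Offset k corresponds to the k-th off-diagonal of the sequence's
    self-comparison (identity) matrix. Assumption: plain residue identity
    stands in for the profile-profile scores a real method would use.
    """
    n = len(seq)
    profile = {}
    for k in range(1, n // 2 + 1):
        matches = sum(1 for i in range(n - k) if seq[i] == seq[i + k])
        profile[k] = matches / (n - k)
    return profile


def dominant_period(seq):
    """Return the smallest offset whose diagonal score is maximal."""
    profile = diagonal_profile(seq)
    best = max(profile.values())
    return min(k for k, score in profile.items() if score == best)


# Toy input: a 7-residue motif of distinct letters repeated four times.
toy = "ABCDEFG" * 4
print(dominant_period(toy))
```

On real sequences the diagonal scores are noisy and repeat lengths vary, so a single period is only a first approximation of the repeat structure.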
Pages: 2776 - 2782
Number of pages: 7
Related papers
30 in total
  • [1] Protein repeats: Structures, functions, and evolution
    Andrade, MA
    Perez-Iratxeta, C
    Ponting, CP
    [J]. JOURNAL OF STRUCTURAL BIOLOGY, 2001, 134 (2-3) : 117 - 131
  • [2] Insights into the species-specific TLR4 signaling mechanism in response to Rhodobacter sphaeroides lipid A detection
    Anwar, Muhammad Ayaz
    Panneerselvam, Suresh
    Shah, Masaud
    Choi, Sangdun
    [J]. SCIENTIFIC REPORTS, 2015, 5
  • [3] Designs on a curve
    Bazan, J. Fernando
    Kajava, Andrey V.
    [J]. NATURE STRUCTURAL & MOLECULAR BIOLOGY, 2015, 22 (02) : 103 - 105
  • [4] De novo identification of highly diverged protein repeats by probabilistic consistency
    Biegert, A.
    Soeding, J.
    [J]. BIOINFORMATICS, 2008, 24 (06) : 807 - 814
  • [5] RepeatsDB: a database of tandem repeat protein structures
    Di Domenico, Tomas
    Potenza, Emilio
    Walsh, Ian
    Parra, R. Gonzalo
    Giollo, Manuel
    Minervini, Giovanni
    Piovesan, Damiano
    Ihsan, Awais
    Ferrari, Carlo
    Kajava, Andrey V.
    Tosatto, Silvio C. E.
    [J]. NUCLEIC ACIDS RESEARCH, 2014, 42 (D1) : D352 - D357
  • [6] Forsyth D.A., 2003, COMPUTER VISION
  • [7] Detecting Repetitions and Periodicities in Proteins by Tiling the Structural Space
    Gonzalo Parra, R.
    Espada, Rocio
    Sanchez, Ignacio E.
    Sippl, Manfred J.
    Ferreiro, Diego U.
    [J]. JOURNAL OF PHYSICAL CHEMISTRY B, 2013, 117 (42) : 12887 - 12897
  • [8] Heger A, 2000, PROTEINS, V41, P224, DOI 10.1002/1097-0134(20001101)41:2<224::AID-PROT70>3.0.CO;2-Z
  • [9] AMINO-ACID SUBSTITUTION MATRICES FROM PROTEIN BLOCKS
    HENIKOFF, S
    HENIKOFF, JG
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1992, 89 (22) : 10915 - 10919