A method for partitioning the information contained in a protein sequence between its structure and function

Cited by: 5
Authors
Possenti, Andrea [1 ,2 ,3 ,4 ]
Vendruscolo, Michele [4 ]
Camilloni, Carlo [5 ]
Tiana, Guido [1 ,2 ,3 ]
Affiliations
[1] Univ Milan, Ctr Complex & Biosyst, Via Celoria 16, I-20133 Milan, Italy
[2] Univ Milan, Dept Phys, Via Celoria 16, I-20133 Milan, Italy
[3] INFN, Via Celoria 16, I-20133 Milan, Italy
[4] Univ Cambridge, Dept Chem, Lensfield Rd, Cambridge CB2 1EW, England
[5] Univ Milan, Dipartimento Biosci, Via Celoria 26, I-20133 Milan, Italy
Keywords
designed proteins; information content; intrinsically disordered proteins; protein folding/function; structure prediction; TRANSITION-STATE; PREDICTION; RESIDUES; ENTROPY; AGGREGATION; FRUSTRATION; PRINCIPLES; STABILITY; MECHANISM; DATABASE;
DOI
10.1002/prot.25527
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Proteins employ the information stored in the genetic code and translated into their sequences to carry out well-defined functions in the cellular environment. The possibility of encoding such functions is controlled by the balance between the amount of information supplied by the sequence and the amount left after the protein has folded into its structure. We study the amount of information necessary to specify the protein structure, providing an estimate that takes into account the thermodynamic properties of protein folding. We thus show that the information remaining in the protein sequence after encoding its structure (the 'information gap') is very close to what is needed to encode its function and interactions. Then, by predicting the information gap directly from the protein sequence, we show that it may be possible to use these insights from information theory to discriminate between ordered and disordered proteins, to identify unknown functions, and to optimize artificially designed protein sequences.
Pages: 956-964
Page count: 9