A method for partitioning the information contained in a protein sequence between its structure and function

Cited by: 5

Authors:

Possenti, Andrea ^{[1,2,3,4]}

Vendruscolo, Michele ^{[4]}

Camilloni, Carlo ^{[5]}

Tiana, Guido ^{[1,2,3]}

Affiliations:

[1] Univ Milan, Ctr Complex & Biosyst, Via Celoria 16, I-20133 Milan, Italy

[2] Univ Milan, Dept Phys, Via Celoria 16, I-20133 Milan, Italy

[3] INFN, Via Celoria 16, I-20133 Milan, Italy

[4] Univ Cambridge, Dept Chem, Lensfield Rd, Cambridge CB2 1EW, England

[5] Univ Milan, Dipartimento Biosci, Via Celoria 26, I-20133 Milan, Italy

Source:

PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS | 2018 / Volume 86 / Issue 09

Keywords:

designed proteins; information content; intrinsically disordered proteins; protein folding/function; structure prediction; TRANSITION-STATE; PREDICTION; RESIDUES; ENTROPY; AGGREGATION; FRUSTRATION; PRINCIPLES; STABILITY; MECHANISM; DATABASE;

DOI:

10.1002/prot.25527

Chinese Library Classification (CLC) number:

Q5 [Biochemistry]; Q7 [Molecular Biology];

Subject classification code:

071010 ; 081704 ;

Abstract:

Proteins employ the information stored in the genetic code and translated into their sequences to carry out well-defined functions in the cellular environment. The ability to encode such functions is controlled by the balance between the amount of information supplied by the sequence and the amount left after the protein has folded into its structure. We study the amount of information necessary to specify the protein structure, providing an estimate that takes into account the thermodynamic properties of protein folding. We thus show that the information remaining in the protein sequence after encoding its structure (the 'information gap') is very close to what is needed to encode its function and interactions. Then, by predicting the information gap directly from the protein sequence, we show that it may be possible to use these insights from information theory to discriminate between ordered and disordered proteins, to identify unknown functions, and to optimize artificially designed protein sequences.

Citation

Pages: 956-964

Page count: 9
