Inverse statistical physics of protein sequences: a key issues review

Cited by: 142
Authors
Cocco, Simona [1, 2]
Feinauer, Christoph [3]
Figliuzzi, Matteo [3]
Monasson, Remi [2, 4]
Weigt, Martin [3]
Affiliations
[1] Sorbonne Univ UPMC, Ecole Normale Super, Lab Phys Stat, UMR 8549,CNRS, Paris, France
[2] Sorbonne Univ UPMC, PSL Res, Paris, France
[3] Sorbonne Univ, UPMC, Inst Biol Paris Seine, CNRS,Lab Biol Computat & Quantitat,UMR 7238, Paris, France
[4] Sorbonne Univ UPMC, Lab Phys Theor, Ecole Normale Super, UMR 8549,CNRS, Paris, France
Keywords
inverse problems; inverse Ising/Potts problem; statistical inference; protein sequence analysis; coevolution; protein structure prediction; protein-protein interaction; DIRECT-COUPLING ANALYSIS; COEVOLUTIONARY INFORMATION; STRUCTURE PREDICTION; RESIDUE COEVOLUTION; MOLECULAR-DYNAMICS; CONTACT PREDICTION; STRUCTURAL BASIS; HIV EVOLUTION; CO-VARIATION; FAMILIES;
DOI
10.1088/1361-6633/aa9965
Chinese Library Classification number
O4 [Physics];
Discipline classification code
0702;
Abstract
In the course of evolution, proteins undergo important changes in their amino acid sequences, while their three-dimensional folded structure and their biological function remain remarkably conserved. Thanks to modern sequencing techniques, sequence data accumulate at an unprecedented pace. This provides large sets of so-called homologous, i.e. evolutionarily related, protein sequences, to which methods of inverse statistical physics can be applied. Using sequence data as the basis for the inference of Boltzmann distributions from samples of microscopic configurations or observables, it is possible to extract information about evolutionary constraints and thus protein function and structure. Here we give an overview of some biologically important questions, and how statistical-mechanics-inspired modeling approaches can help to answer them. Finally, we discuss some open questions, which we expect to be addressed over the next years.
Pages: 17