Probabilistic models and machine learning in structural bioinformatics

Cited by: 12
Author
Hamelryck, Thomas [1]
Affiliation
[1] Univ Copenhagen, Dept Biol, Bioinformat Ctr, DK-2200 Copenhagen N, Denmark
Keywords
INDIRECT FOURIER TRANSFORMATION; MAXIMUM-LIKELIHOOD; PROTEIN STRUCTURES; STATISTICAL POTENTIALS; HIERARCHICAL-MODELS; INFORMATION-THEORY; MEAN FORCE; FOLD; REFINEMENT; PREDICTION
DOI
10.1177/0962280208099492
Chinese Library Classification number
R19 [Health care organizations and services (health services administration)]
Abstract
Structural bioinformatics is concerned with the molecular structure of biomacromolecules on a genomic scale, using computational methods. Classic problems in structural bioinformatics include the prediction of protein and RNA structure from sequence, the design of artificial proteins or enzymes, and the automated analysis and comparison of biomacromolecules in atomic detail. The determination of macromolecular structure from experimental data (for example, from nuclear magnetic resonance, X-ray crystallography or small-angle X-ray scattering) has close ties with the field of structural bioinformatics. Recently, probabilistic models and machine learning methods based on Bayesian principles have begun to provide efficient and rigorous solutions to challenging problems that were long regarded as intractable. In this review, I will highlight some important recent developments in the prediction, analysis and experimental determination of macromolecular structure that are based on such methods. These developments include generative models of protein structure, the estimation of the parameters of energy functions that are used in structure prediction, the superposition of macromolecules, and structure determination methods that are based on inference. Although this review is not exhaustive, I believe the selected topics give a good impression of the exciting new probabilistic road the field of structural bioinformatics is taking.
Pages: 505-526
Page count: 22