Statistical Framework for Uncertainty Quantification in Computational Molecular Modeling

Cited by: 3
Authors
Rasheed, Muhibur [1]
Clement, Nathan [1]
Bhowmick, Abhishek [1]
Bajaj, Chandrajit [1]
Affiliations
[1] Univ Texas Austin, Dept Comp Sci, Austin, TX 78712 USA
Source
PROCEEDINGS OF THE 7TH ACM INTERNATIONAL CONFERENCE ON BIOINFORMATICS, COMPUTATIONAL BIOLOGY, AND HEALTH INFORMATICS | 2016
Keywords
Uncertainty Quantification; Sampling; Molecular Modeling; SIMULATION; ACCURACY; DOCKING; ENERGY;
DOI
10.1145/2975167.2975182
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Computational molecular modeling often involves noisy data, including uncertainties in model parameters and computational approximations, all of which propagate to uncertainties in the computed quantities of interest (QOI). This is a fundamental problem that is often ignored or treated without sufficient rigor. In this article, we introduce a statistical framework for modeling such uncertainties and providing certificates of accuracy for several QOI. Our framework treats sources of uncertainty as random variables with known distributions, and provides both a theoretical and an empirical technique for propagating those uncertainties to the QOI, which is also modeled as a random variable. The framework also enables one to model uncertainties in a multi-step pipeline, where the outcome of one step cascades into the next. While there are many sources of uncertainty, in this article we apply our framework only to positional uncertainties of atoms in high-resolution models, expressed as B-factors, and to their effect on computed molecular properties. The empirical approach requires sufficient sampling of the joint space of the random variables. We show that, using novel pseudo-random number generation techniques, it is possible to achieve the required coverage with very few samples. We have also developed intuitive visualization models to analyze uncertainties at different stages of molecular modeling. We strongly believe this framework would be immensely valuable in evaluating predicted computational models and in providing statistical guarantees on their accuracy.
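The empirical propagation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes independent, isotropic Gaussian displacements per atom, the common crystallographic convention B = 8*pi^2*<u^2> for converting a B-factor to a per-coordinate variance, and plain pseudo-random sampling in place of the low-sample technique the paper refers to. The function names, the toy coordinates, and the choice of radius of gyration as the QOI are all hypothetical.

```python
import math
import random
import statistics


def sigma_from_b_factor(b_factor):
    """Per-coordinate standard deviation (angstroms) for an isotropic
    B-factor (square angstroms), assuming B = 8 * pi^2 * <u^2>."""
    return math.sqrt(b_factor / (8.0 * math.pi ** 2))


def radius_of_gyration(coords):
    """Unweighted radius of gyration of a list of (x, y, z) tuples.
    Stands in for any computed quantity of interest (QOI)."""
    n = len(coords)
    center = [sum(p[i] for p in coords) / n for i in range(3)]
    sq = sum(sum((p[i] - center[i]) ** 2 for i in range(3)) for p in coords)
    return math.sqrt(sq / n)


def perturb(coords, b_factors, rng):
    """Draw one perturbed structure: each coordinate gets independent
    Gaussian noise whose width is set by that atom's B-factor."""
    return [
        tuple(c + rng.gauss(0.0, sigma_from_b_factor(b)) for c in p)
        for p, b in zip(coords, b_factors)
    ]


def propagate(coords, b_factors, qoi, n_samples=2000, seed=0):
    """Monte Carlo propagation: sample structures, evaluate the QOI on
    each, and summarize the resulting distribution by mean and stdev."""
    rng = random.Random(seed)
    values = [qoi(perturb(coords, b_factors, rng)) for _ in range(n_samples)]
    return statistics.mean(values), statistics.stdev(values)


# Hypothetical four-atom toy structure (angstroms) with made-up B-factors.
coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0), (0.0, 1.5, 1.5)]
b_factors = [10.0, 20.0, 15.0, 30.0]

mean_rg, std_rg = propagate(coords, b_factors, radius_of_gyration)
print(f"radius of gyration: mean={mean_rg:.3f}, stdev={std_rg:.3f}")
```

In a multi-step pipeline, the sampled QOI values from one step would serve as the input distribution for the next. Swapping the plain generator for a low-discrepancy sequence would be the natural place to reduce the number of samples needed.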
Pages: 146 - 155
Page count: 10
Related papers
46 in total
  • [21] SOLVATION ENERGY IN PROTEIN FOLDING AND BINDING
    EISENBERG, D
    MCLACHLAN, AD
    [J]. NATURE, 1986, 319 (6050) : 199 - 203
  • [22] Recent advances in the development and application of implicit solvent models in biomolecule simulations
    Feig, M
    Brooks, CL
    [J]. CURRENT OPINION IN STRUCTURAL BIOLOGY, 2004, 14 (02) : 217 - 224
  • [23] PSEUDORANDOM GENERATORS FOR COMBINATORIAL SHAPES
    Gopalan, Parikshit
    Meka, Raghu
    Reingold, Omer
    Zuckerman, David
    [J]. SIAM JOURNAL ON COMPUTING, 2013, 42 (03) : 1051 - 1076
  • [24] Bayesian inference applied to macromolecular structure determination
    Habeck, M
    Nilges, M
    Rieping, W
    [J]. PHYSICAL REVIEW E, 2005, 72 (03):
  • [26] Protein-protein docking benchmark version 4.0
    Hwang, Howook
    Vreven, Thom
    Janin, Joel
    Weng, Zhiping
    [J]. PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS, 2010, 78 (15) : 3111 - 3114
  • [27] James F., 1996, METHODS
  • [28] X-ray refinement significantly underestimates the level of microscopic heterogeneity in biomolecular crystals
    Kuzmanic, Antonija
    Pannu, Navraj S.
    Zagrovic, Bojan
    [J]. NATURE COMMUNICATIONS, 2014, 5 : 3220
  • [29] Determination of Ensemble-Average Pairwise Root Mean-Square Deviation from Experimental B-Factors
    Kuzmanic, Antonija
    Zagrovic, Bojan
    [J]. BIOPHYSICAL JOURNAL, 2010, 98 (05) : 861 - 871
  • [30] Automated electron-density sampling reveals widespread conformational polymorphism in proteins
    Lang, P. Therese
    Ng, Ho-Leung
    Fraser, James S.
    Corn, Jacob E.
    Echols, Nathaniel
    Sales, Mark
    Holton, James M.
    Alber, Tom
    [J]. PROTEIN SCIENCE, 2010, 19 (07) : 1420 - 1431