Statistical Framework for Uncertainty Quantification in Computational Molecular Modeling

Cited by: 3
Authors
Rasheed, Muhibur [1]
Clement, Nathan [1]
Bhowmick, Abhishek [1]
Bajaj, Chandrajit [1]
Affiliations
[1] Univ Texas Austin, Dept Comp Sci, Austin, TX 78712 USA
Source
PROCEEDINGS OF THE 7TH ACM INTERNATIONAL CONFERENCE ON BIOINFORMATICS, COMPUTATIONAL BIOLOGY, AND HEALTH INFORMATICS | 2016
Keywords
Uncertainty Quantification; Sampling; Molecular Modeling; SIMULATION; ACCURACY; DOCKING; ENERGY;
DOI
10.1145/2975167.2975182
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Subject classification codes
081203 ; 0835 ;
Abstract
Computational molecular modeling often involves noisy data, including uncertainties in model parameters and computational approximations, all of which propagate to uncertainties in the computed quantities of interest (QOI). This fundamental problem is often ignored or treated without sufficient rigor. In this article, we introduce a statistical framework for modeling such uncertainties and providing certificates of accuracy for several QOI. Our framework treats sources of uncertainty as random variables with known distributions, and provides both a theoretical and an empirical technique for propagating those uncertainties to the QOI, which is also modeled as a random variable. The framework also enables one to model uncertainties in a multi-step pipeline, where the outcome of one step cascades into the next. While there are many sources of uncertainty, in this article we apply our framework only to positional uncertainties of atoms in high-resolution models, given in the form of B-factors, and their effect on computed molecular properties. The empirical approach requires sufficient sampling over the joint space of the random variables. We show that, using novel pseudo-random number generation techniques, it is possible to achieve the required coverage using very few samples. We have also developed intuitive visualization models to analyze uncertainties at different stages of molecular modeling. We strongly believe this framework would be immensely valuable in evaluating predicted computational models and in providing statistical guarantees on their accuracy.
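The empirical propagation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes isotropic Gaussian positional noise per atom with per-coordinate variance B / (8π²) (the standard isotropic B-factor relation), uses plain Monte Carlo sampling from Python's standard library instead of the sampling techniques the paper proposes, and takes the radius of gyration as a stand-in QOI. The function names, coordinates, and B-factor values are hypothetical.

```python
import math
import random
import statistics


def sigma_from_b_factor(b):
    """Per-coordinate standard deviation (in angstroms) implied by an
    isotropic B-factor, using B = 8 * pi^2 * <u^2>."""
    return math.sqrt(b / (8.0 * math.pi ** 2))


def radius_of_gyration(coords):
    """Unweighted radius of gyration of a list of (x, y, z) points.
    Used here only as an example quantity of interest (QOI)."""
    n = len(coords)
    cx = sum(p[0] for p in coords) / n
    cy = sum(p[1] for p in coords) / n
    cz = sum(p[2] for p in coords) / n
    sq = sum((x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2
             for x, y, z in coords)
    return math.sqrt(sq / n)


def sample_qoi(coords, b_factors, n_samples=1000, seed=0):
    """Draw perturbed structures and evaluate the QOI on each one.

    Assumption: every atom is displaced independently by zero-mean
    Gaussian noise whose width comes from its B-factor.
    """
    rng = random.Random(seed)
    sigmas = [sigma_from_b_factor(b) for b in b_factors]
    values = []
    for _ in range(n_samples):
        perturbed = [
            tuple(c + rng.gauss(0.0, s) for c in p)
            for p, s in zip(coords, sigmas)
        ]
        values.append(radius_of_gyration(perturbed))
    return values


# Hypothetical four-atom toy structure (angstroms) and B-factors.
coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0), (0.0, 1.5, 1.5)]
b_factors = [10.0, 20.0, 15.0, 30.0]

values = sample_qoi(coords, b_factors)
mean = statistics.fmean(values)
std = statistics.stdev(values)
print(f"QOI mean = {mean:.3f}, std = {std:.3f}")
```

The spread of `values` gives an empirical distribution of the QOI under the assumed noise model, from which an interval or tail probability could be read off.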
Pages: 146-155
Number of pages: 10
Related papers
46 in total
  • [1] [Anonymous], ARXIV14085629
  • [2] Azuma K., 1967, TOHOKU MATH J, V19, P357, DOI DOI 10.2748/TMJ/1178243286
  • [3] TexMol: Interactive visual exploration of large flexible multi-component molecular complexes
    Bajaj, C
    Djeu, P
    Siddavanahalli, V
    Thane, A
    [J]. IEEE VISUALIZATION 2004, PROCEEDINGS, 2004, : 243 - 250
  • [4] Bajaj C., 1997, Proceedings. Fourth Symposium on Solid Modeling and Applications, P217, DOI 10.1145/267734.267787
  • [5] Bajaj C., 2014, ARXIV14117753
  • [6] Bajaj C., 2006, TR0657 CS U TEX AUST
  • [7] AN EFFICIENT HIGHER-ORDER FAST MULTIPOLE BOUNDARY ELEMENT SOLUTION FOR POISSON-BOLTZMANN-BASED MOLECULAR ELECTROSTATICS
    Bajaj, Chandrajit
    Chen, Shun-Chuan
    Rand, Alexander
    [J]. SIAM JOURNAL ON SCIENTIFIC COMPUTING, 2011, 33 (02) : 826 - 848
  • [8] FAST MOLECULAR SOLVATION ENERGETICS AND FORCES COMPUTATION
    Bajaj, Chandrajit
    Zhao, Wenqi
    [J]. SIAM JOURNAL ON SCIENTIFIC COMPUTING, 2010, 31 (06) : 4524 - 4552
  • [9] Dynamic maintenance and visualization of molecular surfaces
    Bajaj, CL
    Pascucci, V
    Shamir, A
    Holt, RJ
    Netravali, AN
    [J]. DISCRETE APPLIED MATHEMATICS, 2003, 127 (01) : 23 - 51
  • [10] Generalized born models of macromolecular solvation effects
    Bashford, D
    Case, DA
    [J]. ANNUAL REVIEW OF PHYSICAL CHEMISTRY, 2000, 51 : 129 - 152