Uncertainty quantification of graph convolution neural network models of evolving processes

Times Cited: 0
Authors
Hauth, Jeremiah [1 ]
Safta, Cosmin [2 ]
Huan, Xun [1 ]
Patel, Ravi G. [3 ]
Jones, Reese E. [2 ]
Affiliations
[1] Univ Michigan, Ann Arbor, MI USA
[2] Sandia Natl Labs, Livermore, CA 94550 USA
[3] Sandia Natl Labs, Albuquerque, NM USA
Keywords
Neural networks; Uncertainty quantification; Recurrent networks; Neural ordinary differential equations; Stein variational gradient descent; FRAMEWORK; INFERENCE; LAWS;
DOI
10.1016/j.cma.2024.117195
CLC Number
T [Industrial Technology];
Subject Classification Code
08 ;
Abstract
The application of neural network models to scientific machine learning tasks has proliferated in recent years. In particular, neural networks have proved adept at modeling processes with spatiotemporal complexity. Nevertheless, these highly parameterized models have garnered skepticism about their ability to produce outputs with quantified error bounds over the regimes of interest. Hence there is a need for uncertainty quantification methods suited to neural networks. In this work we compare the parametric uncertainty quantification of neural networks modeling complex spatiotemporal processes using Hamiltonian Monte Carlo and Stein variational gradient descent, together with its projected variant. Specifically, we apply these methods to graph convolutional neural network models of evolving systems built on recurrent neural network and neural ordinary differential equation architectures. We show that Stein variational inference is a viable alternative to Monte Carlo methods, with some clear advantages for complex neural network models. For our exemplars, Stein variational inference gave pushed-forward uncertainty profiles through time similar to those of Hamiltonian Monte Carlo, albeit with generally larger variance. Projected Stein variational gradient descent also produced uncertainty profiles similar to those of its non-projected counterpart, but large reductions in the active weight space were confounded by the stability of the neural network predictions and the convoluted likelihood landscape.
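The abstract centers on Stein variational gradient descent (SVGD), which transports an ensemble of parameter samples ("particles") along a kernelized gradient flow toward the posterior. As a rough illustration only (not code from the paper), a minimal NumPy sketch of the standard SVGD update on a toy 1-D target, assuming an RBF kernel with a fixed bandwidth `h`:

```python
import numpy as np

def svgd_step(x, grad_log_p, h=1.0, eps=0.05):
    """One SVGD update for 1-D particles x with RBF kernel of bandwidth h."""
    n = x.shape[0]
    diff = x[:, None] - x[None, :]           # diff[j, i] = x_j - x_i
    K = np.exp(-diff**2 / (2.0 * h**2))      # k(x_j, x_i), symmetric
    grad_K = -diff / h**2 * K                # d k(x_j, x_i) / d x_j
    # Attractive (kernel-weighted gradient) plus repulsive (kernel gradient) terms
    phi = (K @ grad_log_p(x) + grad_K.sum(axis=0)) / n
    return x + eps * phi

# Toy target: standard normal posterior, so grad log p(x) = -x.
rng = np.random.default_rng(0)
particles = rng.normal(-5.0, 0.5, size=50)   # initialize far from the target
for _ in range(2000):
    particles = svgd_step(particles, lambda x: -x)
```

In the paper's setting the particles would instead be full weight vectors of the graph convolutional network and `grad_log_p` the gradient of the unnormalized log posterior; the projected variant applies the same update within a reduced active subspace of the weights.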
Pages: 23