An adaptively weighted stochastic gradient MCMC algorithm for Monte Carlo simulation and global optimization

Cited by: 7
Authors
Deng, Wei [1 ]
Lin, Guang [2 ]
Liang, Faming [2 ,3 ]
Affiliations
[1] Purdue Univ, Dept Stat, W Lafayette, IN 47907 USA
[2] Purdue Univ, Dept Math, W Lafayette, IN 47907 USA
[3] Purdue Univ, Sch Mech Engn, W Lafayette, IN 47907 USA
Funding
US National Science Foundation; US National Institutes of Health
Keywords
Adaptive stochastic gradient Langevin dynamics; Dynamic importance sampling; Local traps; Stochastic approximation; APPROXIMATION; CONVERGENCE; LANGEVIN
DOI
10.1007/s11222-022-10120-3
Chinese Library Classification
TP301 [Theory, Methods]
Discipline code
081202
Abstract
We propose an adaptively weighted stochastic gradient Langevin dynamics (AWSGLD) algorithm for Bayesian learning on big-data problems. The proposed algorithm is scalable and possesses a self-adjusting mechanism: it adaptively flattens the high-energy region and protrudes the low-energy region during simulations, so that both Monte Carlo simulation and global optimization can be greatly facilitated in a single run. The self-adjusting mechanism renders the proposed algorithm essentially immune to local traps. Theoretically, by showing the stability of the mean-field system and verifying the existence and regularity properties of the solution of the Poisson equation, we establish the convergence of the AWSGLD algorithm, including both the convergence of the self-adapting parameters and the convergence of the weighted averaging estimators. Empirically, the AWSGLD algorithm is tested on multiple benchmark datasets, including CIFAR100 and SVHN, for both optimization and uncertainty estimation tasks. The numerical results indicate its great potential in Monte Carlo simulation and global optimization for modern machine learning tasks.
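The paper's AWSGLD details (its energy partitioning, weight function, and step-size schedules) are not reproduced in this record. The following is a minimal toy sketch of the general "flatten the high-energy region by reweighting visited energy levels" idea behind such samplers, applied to a one-dimensional double-well potential with a Wang-Landau-style penalty on visited energy bins. The grid bounds, gains, and the clipped gradient multiplier are illustrative assumptions for stability of this toy, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def U(x):        # double-well potential with modes at x = -1 and x = +1
    return (x**2 - 1.0) ** 2

def grad_U(x):
    return 4.0 * x * (x**2 - 1.0)

# Hypothetical energy discretization for the self-adjusting weights.
lo, hi, nbins = 0.0, 4.0, 40
width = (hi - lo) / nbins
log_w = np.zeros(nbins)   # adaptive log-weights, one per energy bin

eta = 1e-3                # Langevin step size
gamma = 1e-3              # stochastic-approximation gain (fixed, for simplicity)

x = -1.0
left_visits = 0
n_steps = 100_000
for _ in range(n_steps):
    k = int(np.clip((U(x) - lo) / width, 0, nbins - 1))
    # Finite-difference slope of the learned log-weights at the current
    # energy level; frequently visited (low-energy) bins accumulate weight,
    # so the slope is negative there and the inward drift is weakened.
    km, kp = max(k - 1, 0), min(k + 1, nbins - 1)
    slope = (log_w[kp] - log_w[km]) / ((kp - km) * width)
    # Clipping the multiplier to [0, 1] is a crude stabilization for this
    # toy; the actual algorithm uses a properly normalized weighting.
    mult = float(np.clip(1.0 + slope, 0.0, 1.0))
    x += -eta * mult * grad_U(x) + np.sqrt(2.0 * eta) * rng.standard_normal()
    log_w[k] += gamma     # penalize the visited energy bin
    left_visits += x < 0.0

print("fraction of time in left well:", left_visits / n_steps)
```

With the weights weakening the drift near well bottoms, the chain crosses the energy barrier at x = 0 far more readily than plain Langevin dynamics would at the same temperature, so both modes are visited in a single run.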
Pages: 24