Adaptive sampling quasi-Newton methods for zeroth-order stochastic optimization

Cited by: 0
Authors
Raghu Bollapragada
Stefan M. Wild
Affiliations
[1] Operations Research and Industrial Engineering, The University of Texas at Austin
[2] Applied Mathematics and Computational Research Division, Lawrence Berkeley National Laboratory
Source
Mathematical Programming Computation | 2023, Vol. 15
Keywords
Derivative-free optimization; Stochastic oracles; Adaptive sampling; Common random numbers; MSC codes: 90C56; 65K05; 90C15; 90C30; 90C53
DOI: not available
Abstract
We consider unconstrained stochastic optimization problems with no available gradient information. Such problems arise in settings from derivative-free simulation optimization to reinforcement learning. We propose an adaptive sampling quasi-Newton method where we estimate the gradients using finite differences of stochastic function evaluations within a common random number framework. We develop modified versions of a norm test and an inner product quasi-Newton test to control the sample sizes used in the stochastic approximations and provide global convergence results to the neighborhood of a locally optimal solution. We present numerical experiments on simulation optimization problems to illustrate the performance of the proposed algorithm. When compared with classical zeroth-order stochastic gradient methods, we observe that our strategies of adapting the sample sizes significantly improve performance in terms of the number of stochastic function evaluations required.
Pages: 327-364 (37 pages)
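To make the ingredients described in the abstract more concrete, the Python sketch below illustrates two of them: a central finite-difference gradient estimate in which each pair of function evaluations reuses the same random seed (common random numbers), and a simplified norm-style test for judging whether the current sample size is adequate. The names `oracle`, `fd_gradient_crn`, and `norm_test_satisfied` are hypothetical, and this is only a minimal stand-in under those assumptions, not the paper's actual algorithm or its modified norm and inner product quasi-Newton tests.

```python
import numpy as np


def fd_gradient_crn(oracle, x, seeds, h=1e-4):
    """Central finite-difference gradient estimates of a stochastic oracle.

    `oracle(x, seed)` is a hypothetical noisy function evaluation; reusing
    the same seed for the forward and backward points implements common
    random numbers, so much of the noise cancels in the difference.
    Returns one gradient sample per seed (shape: len(seeds) x len(x)).
    """
    n = x.size
    grads = np.zeros((len(seeds), n))
    for s, seed in enumerate(seeds):
        for i in range(n):
            e = np.zeros(n)
            e[i] = h
            grads[s, i] = (oracle(x + e, seed) - oracle(x - e, seed)) / (2.0 * h)
    return grads


def norm_test_satisfied(grads, theta=0.9):
    """Simplified norm-style test (a stand-in for the paper's modified test).

    The sample variance of the gradient estimates should be small relative
    to the squared norm of their mean; if the test fails, the caller would
    enlarge the sample (add more seeds) before taking the next step.
    Requires at least two gradient samples.
    """
    m = grads.shape[0]
    g_bar = grads.mean(axis=0)
    sample_var = grads.var(axis=0, ddof=1).sum()
    return sample_var / m <= theta ** 2 * np.dot(g_bar, g_bar), g_bar
```

In an outer loop, one would call `fd_gradient_crn` with a small batch of seeds, grow the batch until `norm_test_satisfied` holds, and feed the averaged estimate `g_bar` into a quasi-Newton (e.g., L-BFGS-style) update; the adaptive sampling tests developed in the paper refine this basic idea.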