Faster Gradient-Free Algorithms for Nonsmooth Nonconvex Stochastic Optimization

Cited by: 0
Authors
Chen, Lesi [1]
Xu, Jing [2]
Luo, Luo [1]
Affiliations
[1] Fudan Univ, Sch Data Sci, Shanghai, Peoples R China
[2] Tsinghua Univ, Inst Interdisciplinary Informat Sci, Beijing, Peoples R China
Source
International Conference on Machine Learning (ICML), Vol. 202, 2023
Funding
National Natural Science Foundation of China
Keywords
variable selection; zeroth-order
DOI
Not available
CLC number
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
We consider optimization problems of the form min_{x in R^d} f(x) := E_xi[F(x; xi)], where the component F(x; xi) is L-mean-squared Lipschitz but possibly nonconvex and nonsmooth. A recently proposed gradient-free method requires at most O(L^4 d^{3/2} epsilon^{-4} + Delta L^3 d^{3/2} delta^{-1} epsilon^{-4}) stochastic zeroth-order oracle calls to find a (delta, epsilon)-Goldstein stationary point of the objective, where Delta = f(x_0) - inf_{x in R^d} f(x) and x_0 is the algorithm's initial point. This paper proposes a more efficient algorithm based on stochastic recursive gradient estimators, which improves the complexity to O(L^3 d^{3/2} epsilon^{-3} + Delta L^2 d^{3/2} delta^{-1} epsilon^{-3}).
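To make the two ingredients of the abstract concrete, the sketch below combines a standard two-point zeroth-order gradient estimator (random unit direction on the sphere, finite differences with smoothing radius delta) with a SPIDER/SARAH-style recursive update that reuses the same sample and direction at consecutive iterates. This is only an illustrative sketch of the general technique, not the paper's exact algorithm; the function names, step sizes, and refresh period q are all assumptions made for the example.

```python
import numpy as np

def zo_grad(F, x, delta, u):
    """Two-point zeroth-order estimate of the gradient of the
    delta-smoothed surrogate of F along the random unit direction u."""
    d = x.size
    return (d / (2.0 * delta)) * (F(x + delta * u) - F(x - delta * u)) * u

def zo_recursive_descent(F_stoch, x0, delta=1e-3, eta=0.02, T=300, q=5, seed=0):
    """Gradient-free descent with a recursive (SPIDER/SARAH-style)
    estimator: refresh the gradient estimate every q iterations, and
    otherwise update it with a finite-difference correction evaluated
    at consecutive iterates using the SAME sample and SAME direction."""
    rng = np.random.default_rng(seed)
    x = x0.astype(float).copy()
    x_prev = x.copy()
    g = np.zeros_like(x)
    for t in range(T):
        xi = rng.integers(1 << 30)            # index of the stochastic sample
        F = lambda z, xi=xi: F_stoch(z, xi)   # fix the sample for this step
        u = rng.standard_normal(x.size)
        u /= np.linalg.norm(u)                # uniform direction on the sphere
        if t % q == 0:
            g = zo_grad(F, x, delta, u)       # full refresh of the estimate
        else:
            # variance-reduced correction with shared randomness
            g = g + zo_grad(F, x, delta, u) - zo_grad(F, x_prev, delta, u)
        x_prev = x
        x = x - eta * g
    return x
```

On a smooth test function such as F(z, xi) = 0.5 * ||z||^2 (ignoring the sample index), the iterates contract toward the minimizer, since the symmetric difference is exact for quadratics; the nonsmooth guarantees analyzed in the paper require the Goldstein-stationarity machinery rather than this toy check.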
Pages: 15