Evolutionary Gradient Descent for Non-convex Optimization

Cited by: 0
Authors
Xue, Ke [1 ]
Qian, Chao [1 ]
Xu, Ling [2 ]
Fei, Xudong [2 ]
Affiliations
[1] Nanjing Univ, State Key Lab Novel Software Technol, Nanjing 210023, Peoples R China
[2] Huawei Technol, 2012 Lab, Shenzhen 518000, Peoples R China
Source
PROCEEDINGS OF THE THIRTIETH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2021 | 2021
Keywords
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Non-convex optimization is often involved in artificial intelligence tasks; it may have many saddle points and is NP-hard to solve. Evolutionary algorithms (EAs) are general-purpose derivative-free optimization algorithms with a good ability to find the global optimum, and can be naturally applied to non-convex optimization. Their performance is, however, limited by low efficiency. Gradient descent (GD) runs efficiently, but converges only to a first-order stationary point, which may be a saddle point and thus arbitrarily bad. Some recent efforts have been put into combining EAs and GD. However, previous works either utilized only a specific component of EAs, or combined them heuristically without theoretical guarantees. In this paper, we propose an evolutionary GD (EGD) algorithm that combines typical components of EAs, i.e., population and mutation, with GD. We prove that EGD converges to a second-order stationary point by escaping saddle points, and is more efficient than previous algorithms. Empirical results on non-convex synthetic functions as well as reinforcement learning (RL) tasks also show its superiority.
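The abstract only outlines the idea of combining a population and mutation with gradient descent. The following is a minimal, illustrative Python sketch of such a population-plus-mutation GD loop; it is not the authors' published EGD algorithm, and the function names, parameters (pop_size, eta, sigma), selection rule, and test function are assumptions made purely for illustration.

```python
import numpy as np

def egd_sketch(f, grad_f, dim, pop_size=10, eta=0.01, sigma=0.1, iters=1000, seed=0):
    """Illustrative population-based gradient descent with Gaussian mutation.

    A minimal sketch of the general idea in the abstract (population + mutation
    combined with GD); NOT the authors' exact EGD algorithm, and all parameter
    names and defaults here are hypothetical.
    """
    rng = np.random.default_rng(seed)
    # Initialize a population of candidate solutions.
    pop = rng.standard_normal((pop_size, dim))
    for _ in range(iters):
        # Gradient descent step on every individual.
        pop = pop - eta * np.array([grad_f(x) for x in pop])
        # Gaussian mutation: random perturbations help escape saddle points.
        offspring = pop + sigma * rng.standard_normal(pop.shape)
        # Survivor selection: keep the best pop_size points among parents and offspring.
        candidates = np.vstack([pop, offspring])
        fitness = np.array([f(x) for x in candidates])
        pop = candidates[np.argsort(fitness)[:pop_size]]
    return pop[np.argmin([f(x) for x in pop])]

# Example usage on a simple non-convex test function (assumed for illustration).
if __name__ == "__main__":
    f = lambda x: np.sum(x**4 - 16 * x**2 + 5 * x) / 2        # Styblinski-Tang function
    grad_f = lambda x: (4 * x**3 - 32 * x + 5) / 2
    x_best = egd_sketch(f, grad_f, dim=5)
    print("best point:", x_best, "value:", f(x_best))
```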
Pages: 3221 - 3227
Number of pages: 7
Related papers
50 records in total
  • [1] Adaptive Stochastic Gradient Descent Method for Convex and Non-Convex Optimization
    Chen, Ruijuan
    Tang, Xiaoquan
    Li, Xiuting
    FRACTAL AND FRACTIONAL, 2022, 6 (12)
  • [2] Gradient Methods for Non-convex Optimization
    Jain, Prateek
    JOURNAL OF THE INDIAN INSTITUTE OF SCIENCE, 2019, 99 (02) : 247 - 256
  • [3] Asynchronous Mini-Batch Gradient Descent with Variance Reduction for Non-Convex Optimization
    Huo, Zhouyuan
    Huang, Heng
    THIRTY-FIRST AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, : 2043 - 2049
  • [4] Projected Gradient Descent for Non-Convex Sparse Spike Estimation
    Traonmilin, Yann
    Aujol, Jean-Francois
    Leclaire, Arthur
    IEEE SIGNAL PROCESSING LETTERS, 2020, 27 : 1110 - 1114
  • [5] Generalization Bound of Gradient Descent for Non-Convex Metric Learning
    Dong, Mingzhi
    Yang, Xiaochen
    Zhu, Rui
    Wang, Yujiang
    Xue, Jing-Hao
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [6] Scaling up stochastic gradient descent for non-convex optimisation
    Mohamad, Saad
    Alamri, Hamad
    Bouchachia, Abdelhamid
    MACHINE LEARNING, 2022, 111 (11) : 4039 - 4079
  • [7] On the Convergence of (Stochastic) Gradient Descent with Extrapolation for Non-Convex Minimization
    Xu, Yi
    Yuan, Zhuoning
    Yang, Sen
    Jin, Rong
    Yang, Tianbao
    PROCEEDINGS OF THE TWENTY-EIGHTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2019, : 4003 - 4009
  • [8] Efficient Smooth Non-Convex Stochastic Compositional Optimization via Stochastic Recursive Gradient Descent
    Hu, Wenqing
    Li, Chris Junchi
    Lian, Xiangru
    Liu, Ji
    Yuan, Huizhuo
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32