Evolutionary Gradient Descent for Non-convex Optimization

Cited by: 0
Authors
Xue, Ke [1 ]
Qian, Chao [1 ]
Xu, Ling [2 ]
Fei, Xudong [2 ]
Affiliations
[1] Nanjing Univ, State Key Lab Novel Software Technol, Nanjing 210023, Peoples R China
[2] Huawei Technol, 2012 Lab, Shenzhen 518000, Peoples R China
Source
PROCEEDINGS OF THE THIRTIETH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2021 | 2021
Keywords
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Non-convex optimization is often involved in artificial intelligence tasks; it may have many saddle points and is NP-hard to solve. Evolutionary algorithms (EAs) are general-purpose derivative-free optimization algorithms with a good ability to find the global optimum, and can be naturally applied to non-convex optimization. Their performance is, however, limited by low efficiency. Gradient descent (GD) runs efficiently, but converges only to a first-order stationary point, which may be a saddle point and thus arbitrarily bad. Some recent efforts have been put into combining EAs and GD. However, previous works either utilized only a specific component of EAs, or combined them heuristically without theoretical guarantees. In this paper, we propose an evolutionary GD (EGD) algorithm that combines typical components of EAs, i.e., population and mutation, with GD. We prove that EGD converges to a second-order stationary point by escaping saddle points, and is more efficient than previous algorithms. Empirical results on non-convex synthetic functions as well as reinforcement learning (RL) tasks also show its superiority.
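The abstract only outlines the idea of combining a population and mutation with gradient descent. The following is a minimal, illustrative Python sketch of such a population-plus-mutation GD loop; it is not the authors' published EGD algorithm, and the function names, parameters (pop_size, eta, sigma), selection rule, and test function are assumptions made purely for illustration.

```python
import numpy as np

def egd_sketch(f, grad_f, dim, pop_size=10, eta=0.01, sigma=0.1, iters=1000, seed=0):
    """Illustrative population-based gradient descent with Gaussian mutation.

    A minimal sketch of the general idea in the abstract (population + mutation
    combined with GD); NOT the authors' exact EGD algorithm, and all parameter
    names and defaults here are hypothetical.
    """
    rng = np.random.default_rng(seed)
    # Initialize a population of candidate solutions.
    pop = rng.standard_normal((pop_size, dim))
    for _ in range(iters):
        # Gradient descent step on every individual.
        pop = pop - eta * np.array([grad_f(x) for x in pop])
        # Gaussian mutation: random perturbations help escape saddle points.
        offspring = pop + sigma * rng.standard_normal(pop.shape)
        # Survivor selection: keep the best pop_size points among parents and offspring.
        candidates = np.vstack([pop, offspring])
        fitness = np.array([f(x) for x in candidates])
        pop = candidates[np.argsort(fitness)[:pop_size]]
    return pop[np.argmin([f(x) for x in pop])]

# Example usage on a simple non-convex test function (assumed for illustration).
if __name__ == "__main__":
    f = lambda x: np.sum(x**4 - 16 * x**2 + 5 * x) / 2        # Styblinski-Tang function
    grad_f = lambda x: (4 * x**3 - 32 * x + 5) / 2
    x_best = egd_sketch(f, grad_f, dim=5)
    print("best point:", x_best, "value:", f(x_best))
```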
Pages: 3221 - 3227
Number of pages: 7
Related papers
50 records in total
  • [1] Adaptive Stochastic Gradient Descent Method for Convex and Non-Convex Optimization
    Chen, Ruijuan
    Tang, Xiaoquan
    Li, Xiuting
    FRACTAL AND FRACTIONAL, 2022, 6 (12)
  • [2] Gradient Methods for Non-convex Optimization
    Jain, Prateek
    JOURNAL OF THE INDIAN INSTITUTE OF SCIENCE, 2019, 99 (02) : 247 - 256
  • [3] Asynchronous Mini-Batch Gradient Descent with Variance Reduction for Non-Convex Optimization
    Huo, Zhouyuan
    Huang, Heng
    THIRTY-FIRST AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, : 2043 - 2049
  • [4] Projected Gradient Descent for Non-Convex Sparse Spike Estimation
    Traonmilin, Yann
    Aujol, Jean-Francois
    Leclaire, Arthur
    IEEE SIGNAL PROCESSING LETTERS, 2020, 27 : 1110 - 1114
  • [5] Generalization Bound of Gradient Descent for Non-Convex Metric Learning
    Dong, Mingzhi
    Yang, Xiaochen
    Zhu, Rui
    Wang, Yujiang
    Xue, Jing-Hao
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [6] Scaling up stochastic gradient descent for non-convex optimisation
    Mohamad, Saad
    Alamri, Hamad
    Bouchachia, Abdelhamid
    MACHINE LEARNING, 2022, 111 (11) : 4039 - 4079
  • [7] On the Convergence of (Stochastic) Gradient Descent with Extrapolation for Non-Convex Minimization
    Xu, Yi
    Yuan, Zhuoning
    Yang, Sen
    Jin, Rong
    Yang, Tianbao
    PROCEEDINGS OF THE TWENTY-EIGHTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2019, : 4003 - 4009
  • [8] Efficient Smooth Non-Convex Stochastic Compositional Optimization via Stochastic Recursive Gradient Descent
    Hu, Wenqing
    Li, Chris Junchi
    Lian, Xiangru
    Liu, Ji
    Yuan, Huizhuo
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32