A mini-batch stochastic conjugate gradient algorithm with variance reduction

Cited by: 0
Authors
Caixia Kou
Han Yang
Institutions
[1] Beijing University of Posts and Telecommunications, School of Science
Source
Journal of Global Optimization
Keywords
Deep learning; Empirical risk minimization; Stochastic conjugate gradient; Linear convergence
DOI
Not available
Abstract
The stochastic gradient descent method is popular for large-scale optimization, but its asymptotic convergence is slow because of the inherent variance of the stochastic gradients. Many explicit variance reduction methods have been proposed to remedy this problem, such as SVRG (Johnson and Zhang, Advances in Neural Information Processing Systems, 2013, pp. 315–323), SAG (Roux et al., Advances in Neural Information Processing Systems, 2012, pp. 2663–2671), and SAGA (Defazio et al., Advances in Neural Information Processing Systems, 2014, pp. 1646–1654). We consider the conjugate gradient method, whose per-iteration computation cost is the same as that of gradient descent. In this paper, in the spirit of SAGA, we propose a stochastic conjugate gradient algorithm, which we call SCGA. With Fletcher-Reeves type choices, we prove a linear convergence rate for smooth and strongly convex functions. We experimentally demonstrate that SCGA converges faster than popular SGD-type algorithms on four machine learning models, which may be convex, nonconvex, or nonsmooth. On regression problems, SCGA is competitive with CGVR, which is, to our knowledge, the only existing stochastic conjugate gradient algorithm with variance reduction.
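To make the two ingredients named in the abstract concrete, the sketch below combines a SAGA-style variance-reduced gradient estimate with a Fletcher-Reeves conjugate direction. It is a minimal illustration under stated assumptions: the function name scga_like_sketch, the fixed step size, the stored-gradient table, and the loop structure are chosen here for exposition and do not reproduce the authors' SCGA algorithm, its mini-batch scheme, or its convergence guarantees.

```python
import numpy as np

def scga_like_sketch(grad_i, x0, n, step=0.05, epochs=20, seed=0):
    """Illustrative sketch (not the authors' SCGA): a SAGA-style
    variance-reduced gradient estimate combined with a
    Fletcher-Reeves conjugate direction.

    grad_i(x, i) must return the gradient of the i-th component
    function of the finite-sum objective at the point x.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()

    # SAGA keeps the most recently evaluated gradient of every component.
    table = np.stack([grad_i(x, i) for i in range(n)])
    table_mean = table.mean(axis=0)

    d = None        # current conjugate direction
    g_prev = None   # previous variance-reduced gradient
    for _ in range(epochs * n):
        i = rng.integers(n)
        g_new = grad_i(x, i)

        # SAGA variance-reduced gradient estimate.
        g = g_new - table[i] + table_mean

        # Update the stored gradient and its running average.
        table_mean = table_mean + (g_new - table[i]) / n
        table[i] = g_new

        # Fletcher-Reeves update of the conjugate direction.
        if d is None:
            d = -g
        else:
            beta_fr = (g @ g) / (g_prev @ g_prev + 1e-12)
            d = -g + beta_fr * d
        g_prev = g

        x = x + step * d
    return x
```

As a usage example, for a least-squares objective with data rows a_i and targets b_i, grad_i(x, i) would return a_i * (a_i @ x - b_i); the same interface covers the regularized regression problems mentioned in the abstract.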
Pages: 1009–1025
Number of pages: 16
Related papers
50 items in total
  • [1] A mini-batch stochastic conjugate gradient algorithm with variance reduction
    Kou, Caixia
    Yang, Han
    JOURNAL OF GLOBAL OPTIMIZATION, 2023, 87 (2-4) : 1009 - 1025
  • [2] Stochastic Conjugate Gradient Algorithm With Variance Reduction
    Jin, Xiao-Bo
    Zhang, Xu-Yao
    Huang, Kaizhu
    Geng, Guang-Gang
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2019, 30 (05) : 1360 - 1369
  • [3] Accelerating Stochastic Variance Reduced Gradient Using Mini-Batch Samples on Estimation of Average Gradient
    Huang, Junchu
    Zhou, Zhiheng
    Xu, Bingyuan
    Huang, Yu
    ADVANCES IN NEURAL NETWORKS, PT I, 2017, 10261 : 346 - 353
  • [4] Asynchronous Mini-Batch Gradient Descent with Variance Reduction for Non-Convex Optimization
    Huo, Zhouyuan
    Huang, Heng
    THIRTY-FIRST AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, : 2043 - 2049
  • [5] A MINI-BATCH STOCHASTIC GRADIENT METHOD FOR SPARSE LEARNING TO RANK
    Cheng, Fan
    Wang, Dongliang
    Zhang, Lei
    Su, Yansen
    Qiu, Jianfeng
    Suo, Yi
    INTERNATIONAL JOURNAL OF INNOVATIVE COMPUTING INFORMATION AND CONTROL, 2018, 14 (04): : 1207 - 1221
  • [6] An Asynchronous Mini-batch Algorithm for Regularized Stochastic Optimization
    Feyzmahdavian, Hamid Reza
    Aytekin, Arda
    Johansson, Mikael
    2015 54TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2015, : 1384 - 1389
  • [7] An Asynchronous Mini-Batch Algorithm for Regularized Stochastic Optimization
    Feyzmahdavian, Hamid Reza
    Aytekin, Arda
    Johansson, Mikael
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2016, 61 (12) : 3740 - 3754
  • [8] A Mini-Batch Proximal Stochastic Recursive Gradient Algorithm with Diagonal Barzilai–Borwein Stepsize
    Teng-Teng Yu
    Xin-Wei Liu
    Yu-Hong Dai
    Jie Sun
    Journal of the Operations Research Society of China, 2023, 11 : 277 - 307
  • [9] An adaptive mini-batch stochastic gradient method for AUC maximization
    Cheng, Fan
    Zhang, Xia
    Zhang, Chuang
    Qiu, Jianfeng
    Zhang, Lei
    NEUROCOMPUTING, 2018, 318 : 137 - 150
  • [10] A Mini-Batch Proximal Stochastic Recursive Gradient Algorithm with Diagonal Barzilai-Borwein Stepsize
    Yu, Teng-Teng
    Liu, Xin-Wei
    Dai, Yu-Hong
    Sun, Jie
    JOURNAL OF THE OPERATIONS RESEARCH SOCIETY OF CHINA, 2023, 11 (02) : 277 - 307