Adaptive step size rules for stochastic optimization in large-scale learning

Cited by: 3
Authors
Yang, Zhuang [1 ]
Ma, Li [2 ]
Affiliations
[1] Soochow Univ, Sch Comp Sci & Technol, Suzhou 215006, Peoples R China
[2] Xizang Minzu Univ, Sch Management, Xianyang 712000, Peoples R China
Funding
China Postdoctoral Science Foundation;
Keywords
Adaptive step size; Quasi-Newton; Stochastic optimization; Large-scale learning; MINI-BATCH ALGORITHMS; GRADIENT DESCENT; CONVERGENCE; APPROXIMATION; RATES;
DOI
10.1007/s11222-023-10218-2
Chinese Library Classification
TP301 [Theory, Methods];
Discipline Classification Code
081202 ;
Abstract
The importance of the step size in stochastic optimization has been confirmed both theoretically and empirically over the past few decades and has been reconsidered in recent years, especially for large-scale learning. Different rules for selecting the step size have been discussed since the emergence of stochastic approximation methods. The first part of this work reviews studies on several representative techniques for setting the step size, covering heuristic rules, meta-learning procedures, adaptive step size techniques and line search techniques. The second part of this work proposes a novel class of accelerated stochastic optimization methods by resorting to the Barzilai-Borwein (BB) technique with a diagonal selection rule for the metric, termed DBB. We first explore the theoretical and empirical properties of variance-reduced stochastic optimization algorithms with DBB. In particular, we study the theoretical and numerical properties of the resulting method in the strongly convex and non-convex cases, respectively. To further demonstrate the efficacy of the DBB step size schedule, we extend it to more general stochastic optimization methods. The theoretical and empirical properties of this extension are also developed under different settings. Extensive numerical results on machine learning problems are reported, suggesting that the proposed algorithms show much promise.
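The abstract describes DBB only at a high level, so the following is a minimal sketch of how a Barzilai-Borwein step size is commonly combined with a variance-reduced method such as SVRG: a scalar BB step is recomputed once per epoch from successive snapshot iterates and their full gradients, and then used for all inner updates. It does not reproduce the paper's diagonal selection rule for the metric; the function grad_f, the inner-loop length m, the initial step eta0 and the curvature safeguard are illustrative assumptions.

```python
import numpy as np

def svrg_bb(grad_f, x0, n, m, eta0=0.1, epochs=20, rng=None):
    """SVRG with a scalar Barzilai-Borwein (BB) step size, recomputed once per epoch.

    grad_f(x, i) -- gradient of the i-th component function at x
    n            -- number of component functions
    m            -- inner-loop length (e.g. m = 2n)
    eta0         -- step size used before the first BB step is available
    """
    rng = np.random.default_rng(rng)
    x = x0.copy()
    x_prev, g_prev = None, None
    eta = eta0

    for _ in range(epochs):
        # Full gradient at the current snapshot.
        g_full = np.mean([grad_f(x, i) for i in range(n)], axis=0)

        # BB step from successive snapshots: eta = ||s||^2 / (m * s^T y).
        if x_prev is not None:
            s, y = x - x_prev, g_full - g_prev
            sy = s @ y
            if sy > 1e-12:          # safeguard against tiny or negative curvature
                eta = (s @ s) / (m * sy)
        x_prev, g_prev = x.copy(), g_full.copy()

        # Inner loop with variance-reduced stochastic gradients.
        w = x.copy()
        for _ in range(m):
            i = rng.integers(n)
            v = grad_f(w, i) - grad_f(x, i) + g_full
            w -= eta * v
        x = w
    return x
```

A diagonal variant in the spirit of DBB would replace the scalar eta by a per-coordinate step built from elementwise quantities of s and y, but the exact selection rule is specified in the paper rather than in this sketch.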
Pages: 22