Adaptive step size rules for stochastic optimization in large-scale learning

Cited by: 3
Authors
Yang, Zhuang [1 ]
Ma, Li [2 ]
Affiliations
[1] Soochow Univ, Sch Comp Sci & Technol, Suzhou 215006, Peoples R China
[2] Xizang Minzu Univ, Sch Management, Xianyang 712000, Peoples R China
Funding
China Postdoctoral Science Foundation;
Keywords
Adaptive step size; Quasi-Newton; Stochastic optimization; Large-scale learning; MINI-BATCH ALGORITHMS; GRADIENT DESCENT; CONVERGENCE; APPROXIMATION; RATES;
DOI
10.1007/s11222-023-10218-2
Chinese Library Classification
TP301 [Theory, Methods];
Discipline Classification Code
081202 ;
Abstract
The importance of the step size in stochastic optimization has been confirmed both theoretically and empirically over the past few decades and has been reconsidered in recent years, especially for large-scale learning. Different rules for selecting the step size have been discussed since the emergence of stochastic approximation methods. The first part of this work reviews studies on several representative techniques for setting the step size, covering heuristic rules, meta-learning procedures, adaptive step size techniques and line search techniques. The second part of this work proposes a novel class of accelerated stochastic optimization methods by resorting to the Barzilai-Borwein (BB) technique with a diagonal selection rule for the metric, termed DBB. We first explore the theoretical and empirical properties of variance-reduced stochastic optimization algorithms with DBB. In particular, we study the theoretical and numerical properties of the resulting method in the strongly convex and non-convex cases, respectively. To further demonstrate the efficacy of the DBB step size schedule, we extend it to more general stochastic optimization methods. The theoretical and empirical properties of this extension are also developed under different settings. Extensive numerical results on machine learning problems are reported, suggesting that the proposed algorithms show much promise.
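The abstract describes DBB only at a high level, so the following is a minimal sketch of how a Barzilai-Borwein step size is commonly combined with a variance-reduced method such as SVRG: a scalar BB step is recomputed once per epoch from successive snapshot iterates and their full gradients, and then used for all inner updates. It does not reproduce the paper's diagonal selection rule for the metric; the function grad_f, the inner-loop length m, the initial step eta0 and the curvature safeguard are illustrative assumptions.

```python
import numpy as np

def svrg_bb(grad_f, x0, n, m, eta0=0.1, epochs=20, rng=None):
    """SVRG with a scalar Barzilai-Borwein (BB) step size, recomputed once per epoch.

    grad_f(x, i) -- gradient of the i-th component function at x
    n            -- number of component functions
    m            -- inner-loop length (e.g. m = 2n)
    eta0         -- step size used before the first BB step is available
    """
    rng = np.random.default_rng(rng)
    x = x0.copy()
    x_prev, g_prev = None, None
    eta = eta0

    for _ in range(epochs):
        # Full gradient at the current snapshot.
        g_full = np.mean([grad_f(x, i) for i in range(n)], axis=0)

        # BB step from successive snapshots: eta = ||s||^2 / (m * s^T y).
        if x_prev is not None:
            s, y = x - x_prev, g_full - g_prev
            sy = s @ y
            if sy > 1e-12:          # safeguard against tiny or negative curvature
                eta = (s @ s) / (m * sy)
        x_prev, g_prev = x.copy(), g_full.copy()

        # Inner loop with variance-reduced stochastic gradients.
        w = x.copy()
        for _ in range(m):
            i = rng.integers(n)
            v = grad_f(w, i) - grad_f(x, i) + g_full
            w -= eta * v
        x = w
    return x
```

A diagonal variant in the spirit of DBB would replace the scalar eta by a per-coordinate step built from elementwise quantities of s and y, but the exact selection rule is specified in the paper rather than in this sketch.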
Pages: 22