Stochastic Variance Reduced Gradient Methods Using a Trust-Region-Like Scheme

Cited by: 8
Authors
Yu, Tengteng [1]
Liu, Xin-Wei [2]
Dai, Yu-Hong [3, 4]
Sun, Jie [2, 5]
Affiliations
[1] Hebei Univ Technol, Sch Artificial Intelligence, Tianjin 300401, Peoples R China
[2] Hebei Univ Technol, Inst Math, Tianjin 300401, Peoples R China
[3] Chinese Acad Sci, Acad Math & Syst Sci, Inst Computat Math & Sci Engn Comp, State Key Lab Sci & Engn Comp, Beijing 100190, Peoples R China
[4] Univ Chinese Acad Sci, Sch Math Sci, Beijing 100049, Peoples R China
[5] Natl Univ Singapore, Sch Business, Singapore 119245, Singapore
Keywords
Stochastic variance reduced gradient; Trust region; Barzilai-Borwein stepsizes; Mini-batches; Empirical risk minimization; 90C06; 90C30; 90C90; 90C25; OPTIMIZATION; DESCENT
DOI
10.1007/s10915-020-01402-x
CLC number
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
Stochastic variance reduced gradient (SVRG) methods are important approaches for minimizing the average of a large number of cost functions, a problem that arises frequently in machine learning and many other applications. In this paper, based on SVRG, we propose an SVRG-TR method that employs a trust-region-like scheme for selecting stepsizes. It is proved that the SVRG-TR method is linearly convergent in expectation for smooth strongly convex functions and enjoys a faster convergence rate than SVRG methods. To overcome the difficulty of tuning stepsizes by hand, we propose combining SVRG-TR with the Barzilai-Borwein (BB) method to compute stepsizes automatically, which yields the SVRG-TR-BB method. By incorporating a mini-batching scheme into SVRG-TR and SVRG-TR-BB, respectively, we further propose two extended methods, mSVRG-TR and mSVRG-TR-BB. Linear convergence and complexity results for mSVRG-TR are given. Numerical experiments on some standard datasets show that SVRG-TR and SVRG-TR-BB are generally better than or comparable to SVRG with best-tuned stepsizes and some modern stochastic gradient methods, while mSVRG-TR and mSVRG-TR-BB are very competitive with mini-batch variants of recent successful stochastic gradient methods.
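For orientation, the baseline that the abstract builds on can be sketched as follows. This is a minimal illustration of plain SVRG with a fixed, hand-chosen stepsize, not the SVRG-TR or SVRG-TR-BB methods of the paper: the abstract does not specify the trust-region-like stepsize rule or the BB formula used, so neither is reproduced here. The function names (`svrg`, `grad_i`), the toy least-squares problem, and all parameter values are assumptions made for illustration only.

```python
import math
import random


def svrg(grad_i, n, w0, stepsize, inner_steps, epochs, seed=0):
    """Plain SVRG with a fixed stepsize (baseline only, NOT SVRG-TR).

    grad_i(w, i) returns the gradient of the i-th cost function at w.
    The last inner iterate is used as the next snapshot.
    """
    rng = random.Random(seed)
    snapshot = list(w0)
    dim = len(snapshot)
    for _ in range(epochs):
        # Component gradients and full gradient at the snapshot point.
        snap_grads = [grad_i(snapshot, i) for i in range(n)]
        full_grad = [sum(g[j] for g in snap_grads) / n for j in range(dim)]
        w = list(snapshot)
        for _ in range(inner_steps):
            i = rng.randrange(n)
            g = grad_i(w, i)
            # Variance-reduced direction: g_i(w) - g_i(snapshot) + full gradient.
            w = [
                w[j] - stepsize * (g[j] - snap_grads[i][j] + full_grad[j])
                for j in range(dim)
            ]
        snapshot = w
    return snapshot


# Hypothetical toy problem: least squares with unit-norm rows and exact labels,
# so the minimizer of the average cost is w_true.
n = 12
A = [(math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n)) for i in range(n)]
w_true = [1.0, -2.0]
b = [a[0] * w_true[0] + a[1] * w_true[1] for a in A]


def grad_i(w, i):
    # Gradient of 0.5 * (a_i . w - b_i)^2 with respect to w.
    residual = A[i][0] * w[0] + A[i][1] * w[1] - b[i]
    return [residual * A[i][0], residual * A[i][1]]


w_hat = svrg(grad_i, n, [0.0, 0.0], stepsize=0.1, inner_steps=100, epochs=40)
```

The fixed `stepsize` argument is the quantity that, according to the abstract, SVRG-TR selects through a trust-region-like scheme and SVRG-TR-BB computes automatically; here it is simply tuned by hand for the toy problem.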
Pages: 24