Adaptive Powerball Stochastic Conjugate Gradient for Large-Scale Learning

Cited by: 3
Author
Yang, Zhuang [1 ]
Affiliation
[1] Soochow Univ, Sch Comp Sci & Technol, Suzhou 215006, Peoples R China
Keywords
Machine learning algorithms; Sensitivity; Machine learning; Ordinary differential equations; Information retrieval; Robustness; Computational complexity; Adaptive learning rate; conjugate gradient; large-scale learning; powerball function; stochastic optimization; quasi-Newton method
DOI
10.1109/TBDATA.2023.3300546
Chinese Library Classification (CLC)
TP [Automation technology, computer technology]
Discipline Classification Code
0812
Abstract
The success of stochastic optimization (SO) in large-scale machine learning, information retrieval, bioinformatics, and related fields has been widely reported, especially in recent years. The conjugate gradient (CG) method, an effective acceleration tactic, has been gaining popularity in SO algorithms. This paper develops a new family of stochastic conjugate gradient (SCG) algorithms that combine the Powerball strategy with the hypergradient descent (HD) technique. The key idea behind the resulting methods is inspired by pursuing the equilibrium of ordinary differential equations (ODEs). We elucidate the effect of the Powerball strategy in SCG algorithms, while the introduction of HD equips the resulting methods with an online learning rate. We also provide a theoretical analysis of the resulting algorithms under non-convex assumptions. As a byproduct, we bridge the gap between the learning rate and powered stochastic optimization (PSO) algorithms, which remains an open problem. Through numerical experiments on numerous benchmark datasets, we examine the parameter sensitivity of the proposed methods and demonstrate their superior performance over state-of-the-art algorithms.
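To ground the two ingredients named in the abstract, the following is a minimal Python sketch of one possible Powerball SCG loop with a hypergradient-adapted learning rate. It assumes the elementwise Powerball transform sign(g)|g|^gamma with gamma in (0, 1), a Fletcher-Reeves-style conjugate coefficient, and the hypergradient descent rule of Baydin et al.; the function names, default constants, and the exact way the pieces are combined are illustrative assumptions, not the algorithm published in the paper.

import numpy as np

def powerball(g, gamma):
    # Elementwise Powerball transform: sign(g_i) * |g_i| ** gamma, gamma in (0, 1).
    return np.sign(g) * np.abs(g) ** gamma

def powerball_scg_hd(stoch_grad, w, steps=200, alpha=1e-2, beta=1e-4, gamma=0.6):
    # stoch_grad(w): a stochastic (mini-batch) gradient estimate at w.
    # alpha: learning rate, adapted online by hypergradient descent (HD).
    # beta:  HD meta step size; gamma: Powerball exponent.
    g = powerball(stoch_grad(w), gamma)
    d = -g                                  # first direction: steepest descent
    for _ in range(steps):
        w = w + alpha * d                   # move along the current direction
        g_prev, g = g, powerball(stoch_grad(w), gamma)
        # HD: d f(w_t) / d alpha = <g_t, d_{t-1}>, so take a gradient step on
        # alpha itself, clipped to stay positive.
        alpha = max(alpha - beta * float(np.dot(g, d)), 1e-8)
        # Fletcher-Reeves-style conjugate coefficient on the powered gradients.
        fr = float(np.dot(g, g)) / max(float(np.dot(g_prev, g_prev)), 1e-12)
        d = -g + fr * d                     # new conjugate search direction
    return w

if __name__ == "__main__":
    # Toy least-squares check; a full gradient stands in for a mini-batch one.
    rng = np.random.default_rng(0)
    A, b = rng.normal(size=(200, 20)), rng.normal(size=200)
    grad = lambda w: A.T @ (A @ w - b) / len(b)
    w_hat = powerball_scg_hd(grad, np.zeros(20))
    print("residual norm:", np.linalg.norm(A @ w_hat - b))

In a genuinely stochastic setting, stoch_grad would sample a fresh mini-batch on every call; the hypergradient update then adapts alpha online from one noisy gradient to the next, which is the "online learning rate" behavior the abstract attributes to HD.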
Pages: 1598-1606
Number of pages: 9