Accelerated Variance Reduction Stochastic ADMM for Large-Scale Machine Learning

Cited: 21
Authors
Liu, Yuanyuan [1 ]
Shang, Fanhua [1 ,2 ]
Liu, Hongying [1 ]
Kong, Lin [1 ]
Jiao, Licheng [1 ]
Lin, Zhouchen [3 ]
Affiliations
[1] Xidian Univ, Sch Artificial Intelligence, Minist Educ, Key Lab Intelligent Percept & Image Understanding, Xian 710071, Shaanxi, Peoples R China
[2] Peng Cheng Lab, Shenzhen 518066, Peoples R China
[3] Peking Univ, Sch EECS, Key Lab Machine Percept MoE, Beijing 100871, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Stochastic optimization; ADMM; variance reduction; momentum acceleration; strongly convex and non-strongly convex; smooth and non-smooth; ALTERNATING DIRECTION METHOD; DUAL COORDINATE ASCENT; CONVERGENCE; MULTIPLIERS; ALGORITHMS; DESCENT;
DOI
10.1109/TPAMI.2020.3000512
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recently, many stochastic variance-reduced alternating direction methods of multipliers (ADMMs) (e.g., SAG-ADMM and SVRG-ADMM) have made exciting progress, such as a linear convergence rate for strongly convex (SC) problems. However, their best-known convergence rate for non-strongly convex (non-SC) problems is O(1/T), as opposed to the O(1/T^2) rate of accelerated deterministic algorithms, where T is the number of iterations. Thus, there remains a gap between the convergence rates of existing stochastic ADMM methods and deterministic algorithms. To bridge this gap, we introduce a new momentum acceleration trick into stochastic variance-reduced ADMM and propose a novel accelerated SVRG-ADMM method (called ASVRG-ADMM) for machine learning problems with the linear constraint Ax + By = c. We then design a linearized proximal update rule and a simple proximal one for the two classes of ADMM-style problems with B = τI and B ≠ τI, respectively, where I is an identity matrix and τ is an arbitrary bounded constant. Note that our linearized proximal update rule avoids solving sub-problems iteratively. Moreover, we prove that ASVRG-ADMM converges linearly for SC problems. In particular, ASVRG-ADMM improves the convergence rate from O(1/T) to O(1/T^2) for non-SC problems. Finally, we apply ASVRG-ADMM to various machine learning problems, e.g., graph-guided fused Lasso, graph-guided logistic regression, graph-guided SVM, generalized graph-guided fused Lasso, and multi-task learning, and show that ASVRG-ADMM consistently converges faster than state-of-the-art methods.
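The abstract's key ingredients, variance reduction, momentum acceleration, and the linearized proximal update, can be made concrete with a short sketch. The Python below is a minimal illustration assembled from the abstract's description only, instantiated for graph-guided fused Lasso with a logistic loss; it is not the authors' implementation, and every name (asvrg_admm_sketch, eta, rho, theta, the 2/(s+3) momentum schedule) is an assumption. The x-update takes a single gradient step on the augmented Lagrangian using an SVRG-style variance-reduced estimator, which is what lets the method avoid solving sub-problems iteratively.

```python
import numpy as np

def sigmoid(t):
    """Logistic function."""
    return 1.0 / (1.0 + np.exp(-t))

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def asvrg_admm_sketch(X, labels, A, lam=0.1, rho=1.0, eta=0.05,
                      epochs=30, batch=10, seed=0):
    """Accelerated variance-reduced stochastic ADMM sketch for
    graph-guided fused Lasso with logistic loss:
        min_x (1/n) sum_i log(1 + exp(-b_i * a_i^T x)) + lam * ||y||_1
        s.t. A x - y = 0.
    All parameter names and the momentum schedule are illustrative
    assumptions, not the paper's exact notation."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    x = np.zeros(d)              # primal iterate
    y = np.zeros(A.shape[0])     # auxiliary variable, y ~ A x
    u = np.zeros(A.shape[0])     # scaled dual variable
    x_tilde = x.copy()           # SVRG snapshot
    theta = 1.0                  # momentum weight

    def stoch_grad(w, idx):
        # mini-batch gradient of the average logistic loss
        margins = labels[idx] * (X[idx] @ w)
        coeff = -labels[idx] * sigmoid(-margins)
        return (coeff[:, None] * X[idx]).mean(axis=0)

    for s in range(epochs):
        mu = stoch_grad(x_tilde, np.arange(n))   # full gradient at snapshot
        z = x.copy()                             # momentum auxiliary sequence
        for _ in range(max(1, n // batch)):
            idx = rng.integers(0, n, size=batch)
            # SVRG variance-reduced gradient estimator
            g = stoch_grad(z, idx) - stoch_grad(x_tilde, idx) + mu
            # linearized proximal x-update: a single gradient step on the
            # augmented Lagrangian instead of an iterative inner solve
            z = z - eta * (g + rho * A.T @ (A @ z - y + u))
            # momentum mixing with the epoch snapshot
            x = x_tilde + theta * (z - x_tilde)
            # y-update: exact proximal (soft-thresholding) step
            y = soft_threshold(A @ x + u, lam / rho)
            # dual ascent on the scaled multiplier
            u = u + A @ x - y
        x_tilde = x                              # refresh the snapshot
        theta = 2.0 / (s + 3.0)                  # decaying momentum weight
    return x, y

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.standard_normal((200, 20))
    w_true = np.zeros(20)
    w_true[:5] = 1.0
    labels = np.sign(X @ w_true + 0.1 * rng.standard_normal(200))
    A = np.eye(20)   # identity "graph" for the demo; a real run would use
                     # a sparsity-pattern matrix built from a feature graph
    x, y = asvrg_admm_sketch(X, labels, A)
    print("nonzero fused coefficients:", np.count_nonzero(np.round(y, 2)))
```

The y-update is an exact proximal step because the l1 prox has a closed form (soft-thresholding), while the momentum weight decays with the epoch counter, mirroring the schedule that accelerated deterministic methods use to reach O(1/T^2).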
Pages: 4242 - 4255
Number of Pages: 14