Accelerated Variance Reduction Stochastic ADMM for Large-Scale Machine Learning

Cited by: 21
Authors
Liu, Yuanyuan [1 ]
Shang, Fanhua [1 ,2 ]
Liu, Hongying [1 ]
Kong, Lin [1 ]
Jiao, Licheng [1 ]
Lin, Zhouchen [3 ]
Affiliations
[1] Xidian Univ, Sch Artificial Intelligence, Minist Educ, Key Lab Intelligent Percept & Image Understanding, Xian 710071, Shaanxi, Peoples R China
[2] Peng Cheng Lab, Shenzhen 518066, Peoples R China
[3] Peking Univ, Sch EECS, Key Lab Machine Percept MoE, Beijing 100871, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Stochastic optimization; ADMM; variance reduction; momentum acceleration; strongly convex and non-strongly convex; smooth and non-smooth; ALTERNATING DIRECTION METHOD; DUAL COORDINATE ASCENT; CONVERGENCE; MULTIPLIERS; ALGORITHMS; DESCENT;
DOI
10.1109/TPAMI.2020.3000512
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recently, many stochastic variance-reduced alternating direction methods of multipliers (ADMMs), e.g., SAG-ADMM and SVRG-ADMM, have made exciting progress, such as attaining a linear convergence rate for strongly convex (SC) problems. However, their best-known convergence rate for non-strongly convex (non-SC) problems is O(1/T), as opposed to the O(1/T^2) rate of accelerated deterministic algorithms, where T is the number of iterations. Thus, there remains a gap between the convergence rates of existing stochastic ADMM methods and deterministic algorithms. To bridge this gap, we introduce a new momentum acceleration trick into stochastic variance-reduced ADMM and propose a novel accelerated SVRG-ADMM method (called ASVRG-ADMM) for machine learning problems with the linear constraint Ax + By = c. We then design a linearized proximal update rule and a simple proximal one for the two classes of ADMM-style problems with B = τI and B ≠ τI, respectively, where I is an identity matrix and τ is an arbitrary bounded constant. Note that our linearized proximal update rule avoids solving sub-problems iteratively. Moreover, we prove that ASVRG-ADMM converges linearly for SC problems. In particular, ASVRG-ADMM improves the convergence rate from O(1/T) to O(1/T^2) for non-SC problems. Finally, we apply ASVRG-ADMM to various machine learning problems, e.g., graph-guided fused Lasso, graph-guided logistic regression, graph-guided SVM, generalized graph-guided fused Lasso, and multi-task learning, and show that ASVRG-ADMM consistently converges faster than state-of-the-art methods.
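To make the splitting and the accelerated update concrete, the following is a minimal Python sketch of the momentum-accelerated SVRG-ADMM idea described in the abstract, applied to graph-guided fused Lasso with a logistic loss (so the constraint is Ax - y = 0, i.e., B = -I and c = 0). The function name asvrg_admm_sketch, the step size eta, the penalty rho, and the momentum schedule theta are illustrative assumptions, not the paper's exact algorithm or parameter choices.

# A minimal sketch (not the authors' exact ASVRG-ADMM) of momentum-accelerated
# SVRG-ADMM for graph-guided fused Lasso with logistic loss:
#     min_x (1/n) sum_i log(1 + exp(-b_i * a_i^T x)) + lam * ||A x||_1
# split as y = A x, i.e., the constraint A x + B y = c with B = -I, c = 0.
import numpy as np

def soft_threshold(v, t):
    # proximal operator of t * ||.||_1 (elementwise soft-thresholding)
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def asvrg_admm_sketch(X, b, A, lam=0.1, rho=1.0, eta=0.01, epochs=10, seed=0):
    rng = np.random.default_rng(seed)
    n, d = X.shape
    x = x_snap = np.zeros(d)       # iterate and SVRG snapshot point
    y = np.zeros(A.shape[0])       # splitting variable, y ~ A x
    u = np.zeros(A.shape[0])       # scaled dual variable

    def grad(w, idx):
        # logistic-loss gradient averaged over the sample subset idx
        margins = b[idx] * (X[idx] @ w)
        return X[idx].T @ (-b[idx] / (1.0 + np.exp(margins))) / len(idx)

    for s in range(epochs):
        mu = grad(x_snap, np.arange(n))          # full gradient at snapshot
        theta = 2.0 / (s + 2.0)                  # momentum weight, decays as O(1/s)
        z = x.copy()                             # momentum-mixed point
        for _ in range(n):
            i = rng.integers(n, size=1)
            # SVRG variance-reduced stochastic gradient estimator
            g = grad(z, i) - grad(x_snap, i) + mu
            # linearized proximal x-update: one gradient-type step on the loss
            # plus the linearized augmented-Lagrangian term (no inner solver)
            x = z - eta * (g + rho * A.T @ (A @ z - y + u))
            # y-update is an exact prox (soft-thresholding) since B = -I
            y = soft_threshold(A @ x + u, lam / rho)
            u = u + A @ x - y                    # dual update (scaled form)
            # Nesterov/Katyusha-style mixing of iterate and snapshot
            z = x + (1.0 - theta) * (x - x_snap)
        x_snap = x                               # refresh snapshot each epoch
    return x

Consistent with the abstract's claim, the linearized x-update above takes a single gradient step per sample, so no sub-problem is solved iteratively; the y-update reduces to an exact soft-thresholding prox because B = -I.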
Pages: 4242-4255
Number of Pages: 14
Related Papers
50 records in total
  • [31] Accelerated proximal stochastic variance reduction for DC optimization
    He, Lulu
    Ye, Jimin
    Jianwei, E.
    NEURAL COMPUTING & APPLICATIONS, 2021, 33 (20) : 13163 - 13181
  • [33] Angel: a new large-scale machine learning system
    Jiang, Jie
    Yu, Lele
    Jiang, Jiawei
    Liu, Yuhong
    Cui, Bin
    NATIONAL SCIENCE REVIEW, 2018, 5 (02) : 216 - 236
  • [35] The Powerball Method With Biased Stochastic Gradient Estimation for Large-Scale Learning Systems
    Yang, Zhuang
    IEEE TRANSACTIONS ON COMPUTATIONAL SOCIAL SYSTEMS, 2024 : 7435 - 7447
  • [36] A Class of Parallel Doubly Stochastic Algorithms for Large-Scale Learning
    Mokhtari, Aryan
    Koppel, Alec
    Takac, Martin
    Ribeiro, Alejandro
    JOURNAL OF MACHINE LEARNING RESEARCH, 2020, 21
  • [37] Adaptive Powerball Stochastic Conjugate Gradient for Large-Scale Learning
    Yang, Zhuang
    IEEE TRANSACTIONS ON BIG DATA, 2023, 9 (06) : 1598 - 1606
  • [38] An online conjugate gradient algorithm for large-scale data analysis in machine learning
    Xue, Wei
    Wan, Pengcheng
    Li, Qiao
    Zhong, Ping
    Yu, Gaohang
    Tao, Tao
    AIMS MATHEMATICS, 2021, 6 (02): : 1515 - 1537
  • [39] Accelerated Stochastic Variance Reduction for a Class of Convex Optimization Problems
    He, Lulu
    Ye, Jimin
    Jianwei, E.
    JOURNAL OF OPTIMIZATION THEORY AND APPLICATIONS, 2023, 196 (03) : 810 - 828