Accelerated Variance Reduction Stochastic ADMM for Large-Scale Machine Learning

Cited: 21
Authors
Liu, Yuanyuan [1 ]
Shang, Fanhua [1 ,2 ]
Liu, Hongying [1 ]
Kong, Lin [1 ]
Jiao, Licheng [1 ]
Lin, Zhouchen [3 ]
Affiliations
[1] Xidian Univ, Sch Artificial Intelligence, Minist Educ, Key Lab Intelligent Percept & Image Understanding, Xian 710071, Shaanxi, Peoples R China
[2] Peng Cheng Lab, Shenzhen 518066, Peoples R China
[3] Peking Univ, Sch EECS, Key Lab Machine Percept MoE, Beijing 100871, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Stochastic optimization; ADMM; variance reduction; momentum acceleration; strongly convex and non-strongly convex; smooth and non-smooth; ALTERNATING DIRECTION METHOD; DUAL COORDINATE ASCENT; CONVERGENCE; MULTIPLIERS; ALGORITHMS; DESCENT;
DOI
10.1109/TPAMI.2020.3000512
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recently, many stochastic variance-reduced alternating direction methods of multipliers (ADMMs) (e.g., SAG-ADMM and SVRG-ADMM) have made exciting progress, such as a linear convergence rate for strongly convex (SC) problems. However, their best-known convergence rate for non-strongly convex (non-SC) problems is O(1/T), as opposed to the O(1/T^2) rate of accelerated deterministic algorithms, where T is the number of iterations. Thus, there remains a gap between the convergence rates of existing stochastic ADMM methods and deterministic algorithms. To bridge this gap, we introduce a new momentum acceleration trick into stochastic variance-reduced ADMM and propose a novel accelerated SVRG-ADMM method (called ASVRG-ADMM) for machine learning problems with the linear constraint Ax + By = c. We then design a linearized proximal update rule and a simple proximal one for the two classes of ADMM-style problems with B = τI and B ≠ τI, respectively, where I is an identity matrix and τ is an arbitrary bounded constant. Note that our linearized proximal update rule avoids solving sub-problems iteratively. Moreover, we prove that ASVRG-ADMM converges linearly for SC problems. In particular, ASVRG-ADMM improves the convergence rate from O(1/T) to O(1/T^2) for non-SC problems. Finally, we apply ASVRG-ADMM to various machine learning problems, e.g., graph-guided fused Lasso, graph-guided logistic regression, graph-guided SVM, generalized graph-guided fused Lasso, and multi-task learning, and show that ASVRG-ADMM consistently converges faster than state-of-the-art methods.
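The abstract's key ingredients, variance reduction, momentum acceleration, and the linearized proximal update, can be made concrete with a short sketch. The Python below is a minimal illustration assembled from the abstract's description only, instantiated for graph-guided fused Lasso with a logistic loss; it is not the authors' implementation, and every name (asvrg_admm_sketch, eta, rho, theta, the 2/(s+3) momentum schedule) is an assumption. The x-update takes a single gradient step on the augmented Lagrangian using an SVRG-style variance-reduced estimator, which is what lets the method avoid solving sub-problems iteratively.

```python
import numpy as np

def sigmoid(t):
    """Logistic function."""
    return 1.0 / (1.0 + np.exp(-t))

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def asvrg_admm_sketch(X, labels, A, lam=0.1, rho=1.0, eta=0.05,
                      epochs=30, batch=10, seed=0):
    """Accelerated variance-reduced stochastic ADMM sketch for
    graph-guided fused Lasso with logistic loss:
        min_x (1/n) sum_i log(1 + exp(-b_i * a_i^T x)) + lam * ||y||_1
        s.t. A x - y = 0.
    All parameter names and the momentum schedule are illustrative
    assumptions, not the paper's exact notation."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    x = np.zeros(d)              # primal iterate
    y = np.zeros(A.shape[0])     # auxiliary variable, y ~ A x
    u = np.zeros(A.shape[0])     # scaled dual variable
    x_tilde = x.copy()           # SVRG snapshot
    theta = 1.0                  # momentum weight

    def stoch_grad(w, idx):
        # mini-batch gradient of the average logistic loss
        margins = labels[idx] * (X[idx] @ w)
        coeff = -labels[idx] * sigmoid(-margins)
        return (coeff[:, None] * X[idx]).mean(axis=0)

    for s in range(epochs):
        mu = stoch_grad(x_tilde, np.arange(n))   # full gradient at snapshot
        z = x.copy()                             # momentum auxiliary sequence
        for _ in range(max(1, n // batch)):
            idx = rng.integers(0, n, size=batch)
            # SVRG variance-reduced gradient estimator
            g = stoch_grad(z, idx) - stoch_grad(x_tilde, idx) + mu
            # linearized proximal x-update: a single gradient step on the
            # augmented Lagrangian instead of an iterative inner solve
            z = z - eta * (g + rho * A.T @ (A @ z - y + u))
            # momentum mixing with the epoch snapshot
            x = x_tilde + theta * (z - x_tilde)
            # y-update: exact proximal (soft-thresholding) step
            y = soft_threshold(A @ x + u, lam / rho)
            # dual ascent on the scaled multiplier
            u = u + A @ x - y
        x_tilde = x                              # refresh the snapshot
        theta = 2.0 / (s + 3.0)                  # decaying momentum weight
    return x, y

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.standard_normal((200, 20))
    w_true = np.zeros(20)
    w_true[:5] = 1.0
    labels = np.sign(X @ w_true + 0.1 * rng.standard_normal(200))
    A = np.eye(20)   # identity "graph" for the demo; a real run would use
                     # a sparsity-pattern matrix built from a feature graph
    x, y = asvrg_admm_sketch(X, labels, A)
    print("nonzero fused coefficients:", np.count_nonzero(np.round(y, 2)))
```

The y-update is an exact proximal step because the l1 prox has a closed form (soft-thresholding), while the momentum weight decays with the epoch counter, mirroring the schedule that accelerated deterministic methods use to reach O(1/T^2).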
Pages: 4242 - 4255
Number of Pages: 14