Constrained Stochastic Gradient Descent for Large-scale Least Squares Problem

Cited: 0
|
Authors
Mu, Yang [1 ]
Ding, Wei [1 ]
Zhou, Tianyi [2 ]
Tao, Dacheng [2 ]
Affiliations
[1] Univ Massachusetts, 100 Morrissey Blvd, Boston, MA 02125 USA
[2] Univ Technol Sydney, Ultimo, NSW 2007, Australia
Source
19TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING (KDD'13) | 2013
Keywords
Stochastic optimization; Large-scale least squares; Online learning; Approximation; Algorithms
DOI
Not available
Chinese Library Classification
TP18 [Theory of Artificial Intelligence]
Discipline codes
081104; 0812; 0835; 1405
Abstract
The least squares problem is one of the most important regression problems in statistics, machine learning, and data mining. In this paper, we present the Constrained Stochastic Gradient Descent (CSGD) algorithm for solving large-scale least squares problems. CSGD improves on Stochastic Gradient Descent (SGD) by imposing a provable constraint: the linear regression line must pass through the mean point of all the data points. This yields the best regret bound, o(log T), and the fastest convergence among all first-order approaches. Empirical studies justify the effectiveness of CSGD by comparing it with SGD and other state-of-the-art approaches. An example also shows how CSGD can be used to optimize SGD-based least squares problems for better performance.
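The abstract's central idea, forcing the fitted line through the data's mean point, can be sketched in a few lines. The sketch below is an illustrative reconstruction, not the authors' exact CSGD update: here the constraint is enforced by centering the data, which makes it equivalent to a zero intercept so that plain SGD on the centered data satisfies the constraint exactly at every step. The function name `csgd_least_squares`, the decaying step-size schedule, and all parameter values are assumptions for illustration.

```python
import numpy as np

def csgd_least_squares(X, y, epochs=20, eta0=0.5, seed=0):
    """SGD for least squares, constrained so the regression line
    passes through the mean point (x_bar, y_bar).

    Illustrative sketch only, not the paper's exact CSGD update:
    centering the data reduces the constraint to a zero intercept,
    so ordinary SGD on the centered data never violates it.
    """
    rng = np.random.default_rng(seed)
    x_bar, y_bar = X.mean(axis=0), y.mean()
    Xc, yc = X - x_bar, y - y_bar        # constraint: w @ x_bar + b = y_bar
    w = np.zeros(X.shape[1])
    t = 0
    for _ in range(epochs):
        for i in rng.permutation(len(yc)):
            t += 1
            g = (Xc[i] @ w - yc[i]) * Xc[i]   # gradient of 0.5*(w @ x - y)^2
            w -= eta0 / (1.0 + 0.01 * t) * g  # decaying step size (assumed schedule)
    b = y_bar - w @ x_bar                     # intercept recovered from the constraint
    return w, b
```

By construction, `w @ x_bar + b == y_bar` holds exactly after training, which is the constraint the abstract describes; the step-size schedule and tolerance of the fit are design choices for this sketch, not claims about the paper's rates.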
Pages: 883-891 (9 pages)