DINGO: Distributed Newton-Type Method for Gradient-Norm Optimization

Times Cited: 0
Authors
Crane, Rixon [1 ]
Roosta, Fred [1 ]
Affiliations
[1] Univ Queensland, Brisbane, Qld, Australia
Source
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019) | 2019 / Vol. 32
Funding
Australian Research Council;
Keywords
STOCHASTIC ALGORITHMS;
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
For optimization of a large sum of functions in a distributed computing environment, we present a novel communication-efficient Newton-type algorithm that enjoys a variety of advantages over similar existing methods. Our algorithm, DINGO, is derived by optimizing the norm of the gradient as a surrogate function. DINGO does not impose any specific form on the underlying functions, and its range of application extends far beyond convexity and smoothness. The underlying sub-problems of DINGO are simple linear least-squares problems, for which a plethora of efficient algorithms exist. DINGO involves a few hyper-parameters that are easy to tune, and we theoretically show that a strict reduction in the surrogate objective is guaranteed regardless of the selected hyper-parameters.
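To make the abstract's description concrete, the following is a minimal, single-machine Python sketch of the core idea it states: treat the squared gradient norm as the surrogate objective and obtain each Newton-type update by solving a linear least-squares sub-problem. The toy quadratic objective, the helper names (grad, hess), and the unit step size are illustrative assumptions only; they are not the authors' distributed implementation of DINGO.

import numpy as np

# Toy strongly convex quadratic F(w) = 0.5 w^T A w - b^T w (an assumption;
# DINGO itself imposes no specific form on the underlying functions).
rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
A = M @ M.T + 5 * np.eye(5)   # symmetric positive definite
b = rng.standard_normal(5)

def grad(w):
    # Gradient of the toy objective.
    return A @ w - b

def hess(w):
    # Hessian of the toy objective (constant for a quadratic).
    return A

w = np.zeros(5)
for _ in range(50):
    g = grad(w)
    # Least-squares sub-problem: p = argmin_p ||H p + g||^2, i.e. H p ~ -g,
    # solvable by any off-the-shelf least-squares routine.
    p, *_ = np.linalg.lstsq(hess(w), -g, rcond=None)
    w = w + p  # unit step; in practice a line search on ||grad F(w)||^2 hedges this

print("final gradient norm:", np.linalg.norm(grad(w)))

On this toy problem the update is an exact Newton step, so the gradient norm drops to numerical zero; the sketch only illustrates why the sub-problems reduce to linear least-squares, not the communication structure of the distributed algorithm.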
Pages: 11