DINGO: Distributed Newton-Type Method for Gradient-Norm Optimization

Times Cited: 0
Authors
Crane, Rixon [1 ]
Roosta, Fred [1 ]
Affiliations
[1] Univ Queensland, Brisbane, Qld, Australia
Source
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019) | 2019 / Vol. 32
Funding
Australian Research Council;
Keywords
STOCHASTIC ALGORITHMS;
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
For optimization of a large sum of functions in a distributed computing environment, we present a novel communication-efficient Newton-type algorithm that enjoys a variety of advantages over similar existing methods. Our algorithm, DINGO, is derived by optimizing the norm of the gradient as a surrogate function. DINGO does not impose any specific form on the underlying functions, and its range of application extends far beyond convexity and smoothness. The underlying sub-problems of DINGO are simple linear least-squares problems, for which a plethora of efficient algorithms exist. DINGO involves a few hyper-parameters that are easy to tune, and we theoretically show that a strict reduction in the surrogate objective is guaranteed regardless of the selected hyper-parameters.
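To make the abstract's description concrete, the following is a minimal, single-machine Python sketch of the core idea it states: treat the squared gradient norm as the surrogate objective and obtain each Newton-type update by solving a linear least-squares sub-problem. The toy quadratic objective, the helper names (grad, hess), and the unit step size are illustrative assumptions only; they are not the authors' distributed implementation of DINGO.

import numpy as np

# Toy strongly convex quadratic F(w) = 0.5 w^T A w - b^T w (an assumption;
# DINGO itself imposes no specific form on the underlying functions).
rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
A = M @ M.T + 5 * np.eye(5)   # symmetric positive definite
b = rng.standard_normal(5)

def grad(w):
    # Gradient of the toy objective.
    return A @ w - b

def hess(w):
    # Hessian of the toy objective (constant for a quadratic).
    return A

w = np.zeros(5)
for _ in range(50):
    g = grad(w)
    # Least-squares sub-problem: p = argmin_p ||H p + g||^2, i.e. H p ~ -g,
    # solvable by any off-the-shelf least-squares routine.
    p, *_ = np.linalg.lstsq(hess(w), -g, rcond=None)
    w = w + p  # unit step; in practice a line search on ||grad F(w)||^2 hedges this

print("final gradient norm:", np.linalg.norm(grad(w)))

On this toy problem the update is an exact Newton step, so the gradient norm drops to numerical zero; the sketch only illustrates why the sub-problems reduce to linear least-squares, not the communication structure of the distributed algorithm.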
Pages: 11