NEWTON SKETCH: A NEAR LINEAR-TIME OPTIMIZATION ALGORITHM WITH LINEAR-QUADRATIC CONVERGENCE

Cited by: 137
Authors
Pilanci, Mert [1]
Wainwright, Martin J. [1,2]
Affiliations
[1] Univ Calif Berkeley, Dept Elect Engn & Comp Sci, Berkeley, CA 94720 USA
[2] Univ Calif Berkeley, Dept Stat, Berkeley, CA 94720 USA
Funding
U.S. National Science Foundation
Keywords
convex optimization; large-scale problems; Newton's method; random projection; randomized algorithms; random matrices; self-concordant functions; interior point method; Johnson-Lindenstrauss
DOI
10.1137/15M1021106
Chinese Library Classification
O29 [Applied Mathematics]
Discipline Code
070104
Abstract
We propose a randomized second-order method for optimization known as the Newton sketch: it is based on performing an approximate Newton step using a randomly projected Hessian. For self-concordant functions, we prove that the algorithm has superlinear convergence with exponentially high probability, with convergence and complexity guarantees that are independent of condition numbers and related problem-dependent quantities. Given a suitable initialization, similar guarantees also hold for strongly convex and smooth objectives without self-concordance. When implemented using randomized projections based on a subsampled Hadamard basis, the algorithm typically has substantially lower complexity than Newton's method. We also describe extensions of our methods to programs involving convex constraints that are equipped with self-concordant barriers. We discuss and illustrate applications to linear programs, quadratic programs with convex constraints, logistic regression, and other generalized linear models, as well as semidefinite programs.
Pages: 205-245
Page count: 41
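To illustrate the idea in the abstract, here is a minimal numerical sketch of a Newton-sketch iteration for ℓ2-regularized logistic regression. It uses a plain Gaussian sketch rather than the subsampled-Hadamard sketch analyzed in the paper, and a unit step size in place of the paper's line search; the function name, regularization, and parameter choices are ours, not the authors'.

```python
import numpy as np

def newton_sketch_logistic(A, y, m, lam=1e-3, iters=20, seed=0):
    """Newton-sketch iterations for l2-regularized logistic regression.

    A   : (n, d) design matrix
    y   : (n,) labels in {0, 1}
    m   : sketch dimension (d < m << n)
    lam : ridge penalty, making the objective strongly convex
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-(A @ x)))            # model probabilities
        grad = A.T @ (p - y) / n + lam * x            # gradient of the loss
        w = np.sqrt(p * (1.0 - p) / n)                # Hessian square-root weights
        S = rng.standard_normal((m, n)) / np.sqrt(m)  # Gaussian sketch matrix
        SA = S @ (w[:, None] * A)                     # sketch of D^{1/2} A, (m, d)
        H_sketch = SA.T @ SA + lam * np.eye(d)        # sketched Hessian, (d, d)
        x = x - np.linalg.solve(H_sketch, grad)       # approximate Newton step
    return x
```

Note that a dense Gaussian sketch costs O(mnd) per iteration to apply, which can exceed the O(nd^2) cost of forming the exact Hessian; the complexity gains claimed in the abstract come from structured sketches (e.g., subsampled randomized Hadamard transforms) that can be applied in near-linear time.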