NEWTON SKETCH: A NEAR LINEAR-TIME OPTIMIZATION ALGORITHM WITH LINEAR-QUADRATIC CONVERGENCE

Cited by: 137
Authors
Pilanci, Mert [1]
Wainwright, Martin J. [1,2]
Affiliations
[1] Univ Calif Berkeley, Dept Elect Engn & Comp Sci, Berkeley, CA 94720 USA
[2] Univ Calif Berkeley, Dept Stat, Berkeley, CA 94720 USA
Funding
U.S. National Science Foundation
Keywords
convex optimization; large-scale problems; Newton's method; random projection; randomized algorithms; random matrices; self-concordant functions; interior point method; Johnson-Lindenstrauss
DOI
10.1137/15M1021106
Chinese Library Classification
O29 [Applied Mathematics]
Discipline Code
070104
Abstract
We propose a randomized second-order method for optimization known as the Newton sketch: it is based on performing an approximate Newton step using a randomly projected Hessian. For self-concordant functions, we prove that the algorithm has superlinear convergence with exponentially high probability, with convergence and complexity guarantees that are independent of condition numbers and related problem-dependent quantities. Given a suitable initialization, similar guarantees also hold for strongly convex and smooth objectives without self-concordance. When implemented using randomized projections based on a subsampled Hadamard basis, the algorithm typically has substantially lower complexity than Newton's method. We also describe extensions of our methods to programs involving convex constraints that are equipped with self-concordant barriers. We discuss and illustrate applications to linear programs, quadratic programs with convex constraints, logistic regression, and other generalized linear models, as well as semidefinite programs.
Pages: 205-245
Page count: 41
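To illustrate the idea in the abstract, here is a minimal numerical sketch of a Newton-sketch iteration for ℓ2-regularized logistic regression. It uses a plain Gaussian sketch rather than the subsampled-Hadamard sketch analyzed in the paper, and a unit step size in place of the paper's line search; the function name, regularization, and parameter choices are ours, not the authors'.

```python
import numpy as np

def newton_sketch_logistic(A, y, m, lam=1e-3, iters=20, seed=0):
    """Newton-sketch iterations for l2-regularized logistic regression.

    A   : (n, d) design matrix
    y   : (n,) labels in {0, 1}
    m   : sketch dimension (d < m << n)
    lam : ridge penalty, making the objective strongly convex
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-(A @ x)))            # model probabilities
        grad = A.T @ (p - y) / n + lam * x            # gradient of the loss
        w = np.sqrt(p * (1.0 - p) / n)                # Hessian square-root weights
        S = rng.standard_normal((m, n)) / np.sqrt(m)  # Gaussian sketch matrix
        SA = S @ (w[:, None] * A)                     # sketch of D^{1/2} A, (m, d)
        H_sketch = SA.T @ SA + lam * np.eye(d)        # sketched Hessian, (d, d)
        x = x - np.linalg.solve(H_sketch, grad)       # approximate Newton step
    return x
```

Note that a dense Gaussian sketch costs O(mnd) per iteration to apply, which can exceed the O(nd^2) cost of forming the exact Hessian; the complexity gains claimed in the abstract come from structured sketches (e.g., subsampled randomized Hadamard transforms) that can be applied in near-linear time.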