STOCHASTIC FIRST-ORDER METHODS WITH RANDOM CONSTRAINT PROJECTION

Cited by: 43
Authors
Wang, Mengdi [1]
Bertsekas, Dimitri P. [2,3]
Affiliations
[1] Princeton Univ, Dept Operat Res & Financial Engn, Princeton, NJ 08540 USA
[2] MIT, Dept Elect Engn & Comp Sci, Cambridge, MA 02139 USA
[3] MIT, LIDS, 77 Massachusetts Ave, Cambridge, MA 02139 USA
Keywords
large-scale optimization; subgradient; projection; proximal; stochastic approximation; feasibility; randomized algorithm; online optimization; CONVEX-SETS; CONVERGENCE; OPTIMIZATION; ALGORITHMS
DOI
10.1137/130931278
CLC classification
O29 [Applied Mathematics]
Discipline code
070104
Abstract
We consider convex optimization problems with structures that are suitable for sequential treatment or online sampling. In particular, we focus on problems where the objective function is an expected value and the constraint set is the intersection of a large number of simpler sets. We propose an algorithmic framework for stochastic first-order methods using random projection/proximal updates and random constraint updates, which contains as special cases several known algorithms as well as many new ones. To analyze the convergence of these algorithms in a unified manner, we prove a general coupled convergence theorem. It states that convergence is obtained from an interplay between two coupled processes: progress toward feasibility and progress toward optimality. Under suitable stepsize assumptions, we show that the optimality error decreases at a rate of O(1/sqrt(k)) and the feasibility error decreases at a rate of O(log k / k). We also consider a number of typical sampling processes for generating stochastic first-order information and random constraints, which are common in data-intensive applications, online learning, and simulation optimization. By using the coupled convergence theorem as a modular architecture, we are able to analyze the convergence of stochastic algorithms that use arbitrary combinations of these sampling processes.
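The scheme the abstract describes can be illustrated on a toy instance. The sketch below is hypothetical and not the authors' exact method: it minimizes a noisy quadratic over the intersection of two halfspaces, taking a stochastic gradient step and then projecting onto just one randomly sampled constraint set per iteration, with a diminishing stepsize. The function names and the problem instance are illustrative choices.

```python
# Toy sketch of a random-projection stochastic gradient method
# (illustrative instance, not the paper's exact algorithm):
# minimize f(x) = ||x - (2, 2)||^2, observed through noisy gradients,
# subject to x[0] <= 1 and x[1] <= 1, projecting onto ONE randomly
# sampled constraint per iteration. The optimum is (1, 1).
import math
import random

def project_halfspace(x, i, b=1.0):
    """Euclidean projection onto {x : x[i] <= b}: clip one coordinate."""
    y = list(x)
    y[i] = min(y[i], b)
    return y

def random_projection_sgd(iters=20000, seed=0):
    rng = random.Random(seed)
    x = [5.0, 5.0]                      # infeasible starting point
    for k in range(1, iters + 1):
        alpha = 1.0 / math.sqrt(k)      # diminishing stepsize
        # Stochastic gradient of f at x (exact gradient plus noise).
        g = [2.0 * (x[j] - 2.0) + rng.gauss(0.0, 0.1) for j in range(2)]
        x = [x[j] - alpha * g[j] for j in range(2)]
        # Random constraint update: project onto one sampled set only.
        x = project_halfspace(x, rng.randrange(2))
    return x
```

Because only one constraint is enforced per step, the iterates are only asymptotically feasible; the iterates' drift toward the feasible intersection and toward the optimum happen simultaneously, which is the coupled feasibility/optimality interplay the abstract refers to.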
Pages: 681-717
Page count: 37
Related Papers
27 records in total
[1] Agarwal, Alekh; Bartlett, Peter L.; Ravikumar, Pradeep; Wainwright, Martin J. Information-Theoretic Lower Bounds on the Oracle Complexity of Stochastic Convex Optimization. IEEE Transactions on Information Theory, 2012, 58(5): 3235-3249.
[2] [Anonymous], 1997, Contemporary Mathematics.
[3] [Anonymous], 2009, Stochastic Approximation: A Dynamical Systems Viewpoint.
[4] Bauschke, H. H. Thesis, Simon Fraser University, Burnaby, 1996.
[5] Bauschke, H. H.; Borwein, J. M. Projection Algorithms for Solving Convex Feasibility Problems. SIAM Review, 1996, 38(3): 367-426.
[6] Bertsekas, D. P., 1989, Parallel and Distributed Computation: Numerical Methods.
[7] Bertsekas, D. P., 2001, Studies in Computational Mathematics, Vol. 8, p. 381.
[8] Bertsekas, Dimitri P. Incremental Proximal Methods for Large Scale Convex Optimization. Mathematical Programming, 2011, 129(2): 163-195.
[9] Cegielski, Andrzej; Suchocka, Agnieszka. Relaxed Alternating Projection Methods. SIAM Journal on Optimization, 2008, 19(3): 1093-1106.
[10] Deutsch, Frank; Hundal, Hein. The Rate of Convergence for the Cyclic Projections Algorithm I: Angles Between Convex Sets. Journal of Approximation Theory, 2006, 142(1): 36-55.