Solving Stochastic Compositional Optimization is Nearly as Easy as Solving Stochastic Optimization

Cited by: 40
Authors
Chen, Tianyi [1 ,2 ]
Sun, Yuejiao [3 ]
Yin, Wotao [3 ]
Affiliations
[1] Rensselaer Polytech Inst, Dept Elect Comp & Syst Engn, Troy, NY 12180 USA
[2] Rensselaer Polytech Inst, Inst Data Explorat & Applicat, Troy, NY 12180 USA
[3] Univ Calif Los Angeles, Dept Math, Los Angeles, CA 90095 USA
Funding
U.S. National Science Foundation;
Keywords
Stochastic processes; Optimization; Task analysis; Convergence; Signal processing algorithms; Gradient methods; Complexity theory; Stochastic approximation; stochastic compositional optimization; adaptive gradients; meta learning;
DOI
10.1109/TSP.2021.3092377
Chinese Library Classification (CLC)
TM [Electrical Technology]; TN [Electronic Technology, Communication Technology];
Discipline Classification Code
0808; 0809;
Abstract
Stochastic compositional optimization generalizes classic (non-compositional) stochastic optimization to the minimization of compositions of functions. Each composition may introduce an additional expectation, and the resulting expectations may be nested. Stochastic compositional optimization is gaining popularity in applications such as reinforcement learning and meta learning. This paper presents a new Stochastically Corrected Stochastic Compositional gradient method (SCSC). SCSC runs on a single time scale with a single loop, uses a fixed batch size, and is guaranteed to converge at the same rate as stochastic gradient descent (SGD) for non-compositional stochastic optimization. This is achieved by a careful correction to a popular stochastic compositional gradient method. Techniques developed to improve SGD are readily applicable to SCSC, which helps SCSC attain state-of-the-art performance for stochastic compositional optimization. In particular, we apply Adam to SCSC, and the resulting rate of convergence matches that of the original Adam on non-compositional stochastic optimization. We test SCSC on model-agnostic meta-learning tasks.
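For readers unfamiliar with the problem class, a minimal two-level instance of stochastic compositional optimization can be written as below. The symbols F, f, g, xi, and phi are illustrative notation chosen here for exposition and are not taken verbatim from the paper.

% Two-level stochastic compositional problem (illustrative notation; assumes amsmath/amssymb).
% The inner map g and the outer function f may each hide an expectation,
% and the expectations are nested: f is evaluated at the expected value of g.
\begin{equation*}
  \min_{x \in \mathbb{R}^d} \; F(x) = f\bigl(g(x)\bigr),
  \qquad
  g(x) = \mathbb{E}_{\phi}\bigl[g(x;\phi)\bigr],
  \qquad
  f(y) = \mathbb{E}_{\xi}\bigl[f(y;\xi)\bigr].
\end{equation*}

By the chain rule, the gradient is ∇F(x) = ∇g(x)^T ∇f(g(x)). Because f is applied to an expectation, replacing g(x) with a single sample g(x; phi) generally yields a biased gradient estimate, so plain SGD does not directly apply; this bias is, broadly, the obstacle that stochastic compositional gradient methods, including the one SCSC improves upon, are designed to overcome.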
Pages: 4937-4948
Number of pages: 12