Theoretical Convergence of Multi-Step Model-Agnostic Meta-Learning

Cited by: 0
Authors
Ji, Kaiyi [1 ]
Yang, Junjie [1 ]
Liang, Yingbin [1 ]
Institutions
[1] Ohio State Univ, Dept Elect & Comp Engn, Columbus, OH 43210 USA
Funding
National Science Foundation (USA);
Keywords
Computational complexity; convergence rate; finite-sum; meta-learning; multi-step MAML; nonconvex; resampling;
DOI
None available
CLC number
TP [Automation technology, computer technology];
Subject classification code
0812;
Abstract
As a popular meta-learning approach, the model-agnostic meta-learning (MAML) algorithm has been widely used due to its simplicity and effectiveness. However, the convergence of the general multi-step MAML remains unexplored. In this paper, we develop a new theoretical framework to provide such a convergence guarantee for two types of objective functions that are of interest in practice: (a) the resampling case (e.g., reinforcement learning), where loss functions take the form of an expectation and new data are sampled as the algorithm runs; and (b) the finite-sum case (e.g., supervised learning), where loss functions take the finite-sum form over given samples. For both cases, we characterize the convergence rate and the computational complexity to attain an epsilon-accurate solution for multi-step MAML in the general nonconvex setting. In particular, our results suggest that the inner-stage stepsize needs to be chosen inversely proportional to the number N of inner-stage steps in order for N-step MAML to have guaranteed convergence. From the technical perspective, we develop novel techniques to deal with the nested structure of the meta gradient for multi-step MAML, which may be of independent interest.
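The nested structure described in the abstract can be illustrated with a small sketch (not the paper's implementation): the meta gradient of N-step MAML chains the Jacobians of the N inner gradient steps via the chain rule. The sketch below uses toy quadratic per-task losses L(w) = 0.5 wᵀAw − bᵀw (a hypothetical choice, so gradients and Hessians are in closed form) and an inner stepsize alpha that, per the paper's guidance, would be taken on the order of 1/N.

```python
import numpy as np

def n_step_maml_meta_gradient(w, tasks, N, alpha):
    """Exact meta-gradient of N-step MAML on toy quadratic task losses.

    Each task is a pair (A, b) defining L(w) = 0.5 * w^T A w - b^T w,
    so grad L(w) = A w - b and the Hessian is the constant matrix A.
    The inner loop runs N gradient steps w_{k+1} = w_k - alpha * grad L(w_k);
    its Jacobian d w_N / d w is the product of N factors (I - alpha * A),
    which is the nested structure the meta gradient must differentiate through.
    """
    d = len(w)
    meta_grad = np.zeros_like(w)
    for A, b in tasks:
        w_k = w.copy()
        jac = np.eye(d)  # running Jacobian d w_k / d w
        for _ in range(N):
            jac = (np.eye(d) - alpha * A) @ jac  # chain one inner step
            w_k = w_k - alpha * (A @ w_k - b)    # take the inner step
        # Chain rule: outer gradient = (d w_N / d w)^T grad L(w_N).
        meta_grad += jac.T @ (A @ w_k - b)
    return meta_grad / len(tasks)
```

With alpha proportional to 1/N, each factor (I − alpha·A) stays close to the identity, so the product of N such Jacobians remains bounded as N grows, which is the intuition behind the stepsize scaling in the result above.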
Pages: 41