Theoretical Convergence of Multi-Step Model-Agnostic Meta-Learning

Cited by: 0
Authors
Ji, Kaiyi [1 ]
Yang, Junjie [1 ]
Liang, Yingbin [1 ]
Institutions
[1] Ohio State Univ, Dept Elect & Comp Engn, Columbus, OH 43210 USA
Funding
National Science Foundation (USA);
Keywords
Computational complexity; convergence rate; finite-sum; meta-learning; multi-step MAML; nonconvex; resampling;
DOI
None available
CLC number
TP [Automation technology, computer technology];
Subject classification code
0812;
Abstract
As a popular meta-learning approach, the model-agnostic meta-learning (MAML) algorithm has been widely used due to its simplicity and effectiveness. However, the convergence of the general multi-step MAML remains unexplored. In this paper, we develop a new theoretical framework to provide such a convergence guarantee for two types of objective functions that are of interest in practice: (a) the resampling case (e.g., reinforcement learning), where loss functions take the form of an expectation and new data are sampled as the algorithm runs; and (b) the finite-sum case (e.g., supervised learning), where loss functions take the finite-sum form over given samples. For both cases, we characterize the convergence rate and the computational complexity to attain an epsilon-accurate solution for multi-step MAML in the general nonconvex setting. In particular, our results suggest that the inner-stage stepsize needs to be chosen inversely proportional to the number N of inner-stage steps in order for N-step MAML to have guaranteed convergence. From the technical perspective, we develop novel techniques to deal with the nested structure of the meta gradient for multi-step MAML, which may be of independent interest.
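The nested structure described in the abstract can be illustrated with a small sketch (not the paper's implementation): the meta gradient of N-step MAML chains the Jacobians of the N inner gradient steps via the chain rule. The sketch below uses toy quadratic per-task losses L(w) = 0.5 wᵀAw − bᵀw (a hypothetical choice, so gradients and Hessians are in closed form) and an inner stepsize alpha that, per the paper's guidance, would be taken on the order of 1/N.

```python
import numpy as np

def n_step_maml_meta_gradient(w, tasks, N, alpha):
    """Exact meta-gradient of N-step MAML on toy quadratic task losses.

    Each task is a pair (A, b) defining L(w) = 0.5 * w^T A w - b^T w,
    so grad L(w) = A w - b and the Hessian is the constant matrix A.
    The inner loop runs N gradient steps w_{k+1} = w_k - alpha * grad L(w_k);
    its Jacobian d w_N / d w is the product of N factors (I - alpha * A),
    which is the nested structure the meta gradient must differentiate through.
    """
    d = len(w)
    meta_grad = np.zeros_like(w)
    for A, b in tasks:
        w_k = w.copy()
        jac = np.eye(d)  # running Jacobian d w_k / d w
        for _ in range(N):
            jac = (np.eye(d) - alpha * A) @ jac  # chain one inner step
            w_k = w_k - alpha * (A @ w_k - b)    # take the inner step
        # Chain rule: outer gradient = (d w_N / d w)^T grad L(w_N).
        meta_grad += jac.T @ (A @ w_k - b)
    return meta_grad / len(tasks)
```

With alpha proportional to 1/N, each factor (I − alpha·A) stays close to the identity, so the product of N such Jacobians remains bounded as N grows, which is the intuition behind the stepsize scaling in the result above.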
Pages: 41