Learning to Learn Task-Adaptive Hyperparameters for Few-Shot Learning

Cited by: 27
Authors
Baik, Sungyong [1]
Choi, Myungsub [2]
Choi, Janghoon [3]
Kim, Heewon [4]
Lee, Kyoung Mu [5]
Affiliations
[1] Hanyang Univ, Dept Data Sci, Seoul 04763, South Korea
[2] Samsung Adv Inst Technol, Seoul 04763, South Korea
[3] Kyungpook Natl Univ, Grad Sch Data Sci, Seoul 41566, South Korea
[4] Soongsil Univ, Coll IT, Global Sch Media, Seoul 06978, South Korea
[5] Seoul Natl Univ, Automat & Syst Res Inst ASRI, Dept Elect & Comp Engn, Seoul 08826, South Korea
Keywords
Task analysis; Optimization; Mathematical models; Adaptation models; Visualization; Training; Neural networks; Few-shot learning; MAML; meta-learning; video frame interpolation; visual tracking
DOI
10.1109/TPAMI.2023.3261387
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The objective of few-shot learning is to design a system that can adapt to a given task with only a few examples while still generalizing well. Model-agnostic meta-learning (MAML), which has recently gained popularity for its simplicity and flexibility, learns a good initialization for fast adaptation to a task in the few-data regime. However, its performance has been relatively limited, especially when novel tasks differ from the tasks seen during training. In this work, instead of searching for a better initialization, we focus on designing a better fast adaptation process. To this end, we propose a new task-adaptive weight update rule that greatly enhances the fast adaptation process. Specifically, we introduce a small meta-network that generates per-step hyperparameters for each given task: learning rate and weight decay coefficients. The experimental results validate that learning a good weight update rule for fast adaptation is an equally important component, one that has drawn relatively little attention in recent few-shot learning approaches. Surprisingly, fast adaptation from random initialization with the proposed rule, ALFA, can already outperform MAML. Furthermore, the proposed weight update rule is shown to consistently improve the task-adaptation capability of MAML across diverse problem domains: few-shot classification, cross-domain few-shot classification, regression, visual tracking, and video frame interpolation.
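The update rule described in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact form of the rule, so the code below assumes that each inner-loop step scales the weights by a generated decay coefficient `beta` and then steps along the negative gradient with a generated learning rate `alpha`. The function `generate_hyperparams`, its 2x2 parameter matrix `w`, the constants 0.1 and 4.0, and the toy regression task are all hypothetical stand-ins for the meta-network and are not taken from the paper.

```python
import numpy as np

def generate_hyperparams(theta, grad, w):
    """Hypothetical stand-in for the small meta-network.

    Maps a task state (mean weight, mean gradient) to a per-step
    learning rate alpha > 0 and a decay coefficient beta in (0, 1).
    `w` is an assumed 2x2 matrix of meta-parameters.
    """
    state = np.array([theta.mean(), grad.mean()])
    out = w @ state
    alpha = 0.1 * np.exp(out[0])                   # always positive
    beta = 1.0 / (1.0 + np.exp(-(out[1] + 4.0)))   # sigmoid, near 1 by default
    return alpha, beta

def adaptive_step(theta, grad, alpha, beta):
    """One inner-loop update: decay the weights by beta, then step by alpha."""
    return beta * theta - alpha * grad

def adapt(theta, x, y, w, num_steps=5):
    """Fast adaptation on a toy linear-regression task (mean squared error)."""
    for _ in range(num_steps):
        grad = 2.0 * x.T @ (x @ theta - y) / len(y)
        alpha, beta = generate_hyperparams(theta, grad, w)
        theta = adaptive_step(theta, grad, alpha, beta)
    return theta

# Toy task: two examples, two weights, starting from a zero initialization.
x = np.eye(2)
y = np.array([1.0, -1.0])
theta0 = np.zeros(2)
theta_adapted = adapt(theta0, x, y, w=np.zeros((2, 2)))
```

With `beta = 1` and a constant `alpha`, `adaptive_step` reduces to plain gradient descent, which is the fixed inner-loop update used by MAML. In this sketch the meta-parameters `w` are simply fixed; in a meta-learning setup they would be trained in an outer loop across many tasks.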
Pages: 1441-1454
Number of pages: 14