The general framework for few-shot learning by kernel HyperNetworks

Cited by: 3
Authors
Sendera, Marcin [1, 3]
Przewiezlikowski, Marcin [1, 3, 4]
Miksa, Jan [1 ]
Rajski, Mateusz [1 ]
Karanowski, Konrad [2 ]
Zieba, Maciej
Tabor, Jacek [1 ]
Spurek, Przemyslaw [1 ]
Affiliations
[1] Jagiellonian Univ, Fac Math & Comp Sci, Lojasiewicza 6, Krakow, Poland
[2] Wroclaw Univ Sci & Technol, Dept Artificial Intelligence, Wyb Wyspianskiego 27, Wroclaw, Poland
[3] Jagiellonian Univ, Doctoral Sch Exact & Nat Sci, Lojasiewicza 11, Krakow, Poland
[4] IDEAS NCBR, Chmielna 69, Warsaw, Poland
Keywords
Few-shot learning; Meta-learning; HyperNetworks; Kernel methods; Bayesian neural networks
DOI
10.1007/s00138-023-01403-4
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Few-shot models aim to make predictions from a minimal number of labeled examples of a given task. The main challenge in this area is the one-shot setting, in which each class is represented by a single example. We propose a general framework for few-shot learning via kernel HyperNetworks, a fusion of the kernel and hypernetwork paradigms. First, we introduce the classical realization of this framework, dubbed HyperShot. In contrast to reference approaches that adjust parameters through gradient-based updates, our models switch the parameters of the classification module depending on the task's embedding. In practice, we use a hypernetwork that takes aggregated information from the support data and returns classifier parameters tailored to the considered problem. Moreover, we introduce a kernel-based representation of the support examples, which is passed to the hypernetwork to create the parameters of the classification module. Consequently, we rely on relations between the support examples' embeddings rather than on the direct feature values of the backbone model. Thanks to this approach, our model can adapt to highly different tasks. While this method obtains very good results, it suffers from typical problems such as poorly quantified uncertainty caused by the limited amount of data. We further show that incorporating Bayesian neural networks into our general framework, an approach we call BayesHyperShot, solves this issue.
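The mechanism described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the cosine kernel, the two-layer hypernetwork with random untrained weights, the choice to classify queries from their kernel values against the support set, and all function names (`cosine_kernel`, `hypernetwork`, `init_params`, `predict`) are assumptions made for illustration only. The sketch shows the data flow: support embeddings are turned into a kernel matrix, the hypernetwork maps that matrix to classifier parameters, and no gradient step is taken at adaptation time.

```python
import numpy as np

def cosine_kernel(a, b):
    """Pairwise cosine similarity between rows of a and rows of b (assumed kernel)."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

def init_params(n_support, n_classes, hidden=32, seed=0):
    """Random, untrained weights for a hypothetical two-layer hypernetwork."""
    rng = np.random.default_rng(seed)
    n_in = n_support * n_support                  # flattened support kernel matrix
    n_out = n_classes * n_support + n_classes     # classifier weights plus biases
    return {
        "w1": rng.normal(0.0, 0.1, (n_in, hidden)),
        "b1": np.zeros(hidden),
        "w2": rng.normal(0.0, 0.1, (hidden, n_out)),
        "b2": np.zeros(n_out),
    }

def hypernetwork(kernel_matrix, params, n_classes):
    """Map the support kernel matrix to the parameters of a linear classifier."""
    n_support = kernel_matrix.shape[0]
    h = np.tanh(kernel_matrix.reshape(-1) @ params["w1"] + params["b1"])
    out = h @ params["w2"] + params["b2"]
    weights = out[: n_classes * n_support].reshape(n_classes, n_support)
    bias = out[n_classes * n_support:]
    return weights, bias

def predict(support_emb, query_emb, params, n_classes):
    """Generate task-specific classifier parameters, then score the queries."""
    k_ss = cosine_kernel(support_emb, support_emb)   # relations among support examples
    weights, bias = hypernetwork(k_ss, params, n_classes)
    k_qs = cosine_kernel(query_emb, support_emb)     # relations of queries to support
    return k_qs @ weights.T + bias                   # logits, one row per query

# Toy 5-way 1-shot task with random stand-in embeddings of dimension 16.
rng = np.random.default_rng(1)
support = rng.normal(size=(5, 16))
queries = rng.normal(size=(3, 16))
params = init_params(n_support=5, n_classes=5)
logits = predict(support, queries, params, n_classes=5)
print(logits.shape)  # (3, 5)
```

Because the hypernetwork weights here are random, the logits carry no meaning; in a trained system these weights would be learned across many tasks, while each new task would still be handled by a single forward pass.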
Pages: 16