Meta-training with Demonstration Retrieval for Efficient Few-shot Learning

Cited by: 0
Authors
Mueller, Aaron [1]
Narang, Kanika [2]
Mathias, Lambert [2]
Wang, Qifan [2]
Firooz, Hamed [2]
Affiliations
[1] Johns Hopkins Univ, Baltimore, MD 21218 USA
[2] Meta AI, Menlo Pk, CA USA
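Source
FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023 | 2023
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract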
Large language models show impressive results on few-shot NLP tasks. However, these models are memory- and computation-intensive. Meta-training allows one to leverage smaller models for few-shot generalization in a domain-general and task-agnostic manner (Min et al., 2022a; Wei et al., 2022; Chen et al., 2022); however, these methods alone result in models that may not have sufficient parameterization or knowledge to adapt quickly to a large variety of tasks. To overcome this issue, we propose meta-training with demonstration retrieval, where we use a dense passage retriever to retrieve labeled demonstrations that are semantically similar to each example, providing more varied supervision. By separating external knowledge from model parameters, we can use meta-training to train parameter-efficient models that generalize well on a larger variety of tasks. We construct a meta-training set from UNIFIEDQA and CROSSFIT, and propose a demonstration bank based on UNIFIEDQA tasks. To our knowledge, our work is the first to combine retrieval with meta-training, to use DPR models to retrieve demonstrations, and to leverage demonstrations from many tasks simultaneously, rather than randomly sampling demonstrations from the training set of the target task. Our approach outperforms a variety of targeted parameter-efficient and retrieval-augmented few-shot methods on QA, NLI, and text classification tasks (including SQuAD, QNLI, and TREC). Our approach can be meta-trained and fine-tuned quickly on a single GPU.
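The retrieval step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: a toy bag-of-words cosine similarity stands in for the dense passage retriever, and the names `embed`, `retrieve_demonstrations`, `build_prompt`, and `DEMO_BANK`, as well as the `=>` prompt format, are hypothetical choices made only for illustration.

```python
import math
from collections import Counter

def embed(text):
    """Toy stand-in for a dense encoder (assumption): bag-of-words counts."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve_demonstrations(query, bank, k=2):
    """Return the k labeled demonstrations most similar to the query."""
    q = embed(query)
    ranked = sorted(bank, key=lambda d: cosine(q, embed(d["input"])), reverse=True)
    return ranked[:k]

def build_prompt(query, demos):
    """Prepend retrieved demonstrations to the query (hypothetical format)."""
    lines = [f"{d['input']} => {d['label']}" for d in demos]
    lines.append(f"{query} =>")
    return "\n".join(lines)

# Hypothetical demonstration bank mixing examples from several tasks.
DEMO_BANK = [
    {"input": "who wrote the novel Dracula", "label": "Bram Stoker"},
    {"input": "what is the capital of France", "label": "Paris"},
    {"input": "who wrote the novel Frankenstein", "label": "Mary Shelley"},
]

if __name__ == "__main__":
    query = "who wrote the novel Emma"
    demos = retrieve_demonstrations(query, DEMO_BANK, k=2)
    print(build_prompt(query, demos))
```

In this sketch the demonstration bank lives outside the model, so it can be swapped or extended without retraining, which mirrors the abstract's point about separating external knowledge from model parameters.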
Pages: 6049-6064
Page count: 16