Meta-training with Demonstration Retrieval for Efficient Few-shot Learning

Cited by: 0
Authors
Mueller, Aaron [1 ]
Narang, Kanika [2 ]
Mathias, Lambert [2 ]
Wang, Qifan [2 ]
Firooz, Hamed [2 ]
Affiliations
[1] Johns Hopkins Univ, Baltimore, MD 21218 USA
[2] Meta AI, Menlo Pk, CA USA
Source
FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023 | 2023
Keywords
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Large language models show impressive results on few-shot NLP tasks. However, these models are memory- and computation-intensive. Meta-training allows one to leverage smaller models for few-shot generalization in a domain-general and task-agnostic manner (Min et al., 2022a; Wei et al., 2022; Chen et al., 2022); however, these methods alone result in models that may not have sufficient parameterization or knowledge to adapt quickly to a large variety of tasks. To overcome this issue, we propose meta-training with demonstration retrieval, where we use a dense passage retriever to retrieve semantically similar labeled demonstrations for each example, providing more varied supervision. By separating external knowledge from model parameters, we can use meta-training to train parameter-efficient models that generalize well on a larger variety of tasks. We construct a meta-training set from UNIFIEDQA and CROSSFIT, and propose a demonstration bank based on UNIFIEDQA tasks. To our knowledge, our work is the first to combine retrieval with meta-training, to use DPR models to retrieve demonstrations, and to leverage demonstrations from many tasks simultaneously, rather than randomly sampling demonstrations from the training set of the target task. Our approach outperforms a variety of targeted parameter-efficient and retrieval-augmented few-shot methods on QA, NLI, and text classification tasks (including SQuAD, QNLI, and TREC). Our approach can be meta-trained and fine-tuned quickly on a single GPU.
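The core idea described in the abstract, retrieving labeled demonstrations that are semantically similar to each input and prepending them as extra supervision, can be illustrated with a minimal sketch. The snippet below is not the paper's implementation: it substitutes a generic SentenceTransformer encoder for the paper's DPR retriever, uses a tiny hand-written demonstration bank in place of the UNIFIEDQA-based bank, and the helper names (retrieve_demonstrations, build_model_input) are hypothetical.

```python
# Minimal sketch of demonstration retrieval for a single query example.
# Assumptions (not from the paper): a generic SentenceTransformer encoder
# stands in for the DPR retriever, and a tiny in-memory demonstration bank
# stands in for the UNIFIEDQA-based bank described in the abstract.
import numpy as np
from sentence_transformers import SentenceTransformer

# Hypothetical demonstration bank: (input, label) pairs drawn from many tasks.
demo_bank = [
    ("Who wrote Hamlet?", "William Shakespeare"),
    ("Premise: a dog runs. Hypothesis: an animal moves. Entailment?", "entailment"),
    ("What is the capital of France?", "Paris"),
]

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # stand-in dense encoder
demo_embeddings = encoder.encode([x for x, _ in demo_bank], normalize_embeddings=True)

def retrieve_demonstrations(query: str, k: int = 2) -> list[tuple[str, str]]:
    """Return the k demonstrations most similar to the query (cosine similarity)."""
    q = encoder.encode([query], normalize_embeddings=True)[0]
    scores = demo_embeddings @ q
    top = np.argsort(-scores)[:k]
    return [demo_bank[i] for i in top]

def build_model_input(query: str, k: int = 2) -> str:
    """Concatenate retrieved demonstrations with the query as extra supervision."""
    demos = retrieve_demonstrations(query, k)
    demo_text = " ".join(f"input: {x} output: {y}" for x, y in demos)
    return f"{demo_text} input: {query} output:"

print(build_model_input("Who painted the Mona Lisa?"))
```

The sketch only covers the retrieval and input-construction step; in the paper's setup the resulting augmented inputs would feed meta-training and fine-tuning of a parameter-efficient model.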
Pages: 6049 - 6064
Page count: 16
Related Papers
50 records in total
  • [31] Meta-BN Net for few-shot learning
    Gao, Wei
    Shao, Mingwen
    Shu, Jun
    Zhuang, Xinkai
    Frontiers of Computer Science, 2023, 17 (01) : 76 - 83
  • [33] StyleAdv: Meta Style Adversarial Training for Cross-Domain Few-Shot Learning
    Fu, Yuqian
    Xie, Yu
    Fu, Yanwei
    Jiang, Yu-Gang
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 24575 - 24584
  • [34] Task Agnostic Meta-Learning for Few-Shot Learning
    Jamal, Muhammad Abdullah
    Qi, Guo-Jun
    2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 11711 - 11719
  • [35] Few-Shot Conversational Dense Retrieval
    Yu, Shi
    Liu, Zhenghao
    Xiong, Chenyan
    Feng, Tao
    Liu, Zhiyuan
    SIGIR '21 - PROCEEDINGS OF THE 44TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2021, : 829 - 838
  • [36] Meta-RCNN: Meta Learning for Few-Shot Object Detection
    Wu, Xiongwei
    Sahoo, Doyen
    Hoi, Steven
    MM '20: PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, 2020, : 1679 - 1687
  • [37] Few-Shot Learning on Graph Convolutional Network Based on Meta learning
    Liu X.-L.
    Feng L.
    Liao L.-X.
    Gong X.
    Su H.
    Wang J.
    Tien Tzu Hsueh Pao/Acta Electronica Sinica, 2024, 52 (03): : 885 - 897
  • [38] Adversarially Robust Few-Shot Learning: A Meta-Learning Approach
    Goldblum, Micah
    Fowl, Liam
    Goldstein, Tom
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS (NEURIPS 2020), 2020, 33
  • [39] MetaDiff: Meta-Learning with Conditional Diffusion for Few-Shot Learning
    Zhang, Baoquan
    Luo, Chuyao
    Yu, Demin
    Li, Xutao
    Lin, Huiwei
    Ye, Yunming
    Zhang, Bowen
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 15, 2024, : 16687 - 16695
  • [40] Variational Few-Shot Learning
    Zhang, Jian
    Zhao, Chenglong
    Ni, Bingbing
    Xu, Minghao
    Yang, Xiaokang
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 1685 - 1694