True Few-Shot Learning with Language Models

Cited: 0
Authors
Perez, Ethan [1]
Kiela, Douwe [2]
Cho, Kyunghyun [1,3]
Affiliations
[1] NYU, New York, NY 10003, USA
[2] Facebook AI Research, Menlo Park, CA, USA
[3] CIFAR Learning in Machines & Brains, Toronto, ON, Canada
Source
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021
Keywords
INFORMATION
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Pretrained language models (LMs) perform well on many tasks even when learning from a few examples, but prior work uses many held-out examples to tune various aspects of learning, such as hyperparameters, training objectives, and natural language templates ("prompts"). Here, we evaluate the few-shot ability of LMs when such held-out examples are unavailable, a setting we call true few-shot learning. We test two model selection criteria, cross-validation and minimum description length, for choosing LM prompts and hyperparameters in the true few-shot setting. On average, both marginally outperform random selection and greatly underperform selection based on held-out examples. Moreover, selection criteria often prefer models that perform significantly worse than randomly-selected ones. We find similar results even when taking into account our uncertainty in a model's true performance during selection, as well as when varying the amount of computation and number of examples used for selection. Overall, our findings suggest that prior work significantly overestimated the true few-shot ability of LMs given the difficulty of few-shot model selection.
Pages: 17
Related papers
Showing 10 of 50 records
  • [1] Multimodal Few-Shot Learning with Frozen Language Models
    Tsimpoukelli, Maria; Menick, Jacob; Cabi, Serkan; Eslami, S. M. Ali; Vinyals, Oriol; Hill, Felix
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [2] Language Models are Few-Shot Butlers
    Micheli, Vincent; Fleuret, Francois
    2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021: 9312-9318
  • [3] Language Models are Few-Shot Learners
    Brown, Tom B.; Mann, Benjamin; Ryder, Nick; Subbiah, Melanie; Kaplan, Jared; Dhariwal, Prafulla; Neelakantan, Arvind; Shyam, Pranav; Sastry, Girish; Askell, Amanda; Agarwal, Sandhini; Herbert-Voss, Ariel; Krueger, Gretchen; Henighan, Tom; Child, Rewon; Ramesh, Aditya; Ziegler, Daniel M.; Wu, Jeffrey; Winter, Clemens; Hesse, Christopher; Chen, Mark; Sigler, Eric; Litwin, Mateusz; Gray, Scott; Chess, Benjamin; Clark, Jack; Berner, Christopher; McCandlish, Sam; Radford, Alec; Sutskever, Ilya; Amodei, Dario
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33 (NEURIPS 2020), 2020, 33
  • [4] ATLAS: Few-shot Learning with Retrieval Augmented Language Models
    Izacard, Gautier; Lewis, Patrick; Lomeli, Maria; Hosseini, Lucas; Petroni, Fabio; Schick, Timo; Dwivedi-Yu, Jane; Joulin, Armand; Riedel, Sebastian; Grave, Edouard
    JOURNAL OF MACHINE LEARNING RESEARCH, 2023, 24
  • [5] Learning Meta Soft Prompt for Few-Shot Language Models
    Chien, Jen-Tzung; Chen, Ming-Yen; Xue, Jing-Hao
    2023 ASIA PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2023: 57-62
  • [6] Few-shot Subgoal Planning with Language Models
    Logeswaran, Lajanugen; Fu, Yao; Lee, Moontae; Lee, Honglak
    NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES, 2022: 5493-5506
  • [7] A few-shot learning method based on knowledge graph in large language models
    Wang, Feilong; Shi, Donghui; Aguilar, Jose; Cui, Xinyi
    INTERNATIONAL JOURNAL OF DATA SCIENCE AND ANALYTICS, 2024
  • [8] PERFECT: Prompt-free and Efficient Few-shot Learning with Language Models
    Mahabadi, Rabeeh Karimi; Zettlemoyer, Luke; Henderson, James; Saeidi, Marzieh; Mathias, Lambert; Stoyanov, Veselin; Yazdani, Majid
    PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: LONG PAPERS, 2022: 3638-3652
  • [9] Cutting Down on Prompts and Parameters: Simple Few-Shot Learning with Language Models
    Logan, Robert L.; Balazevic, Ivana; Wallace, Eric; Petroni, Fabio; Singh, Sameer; Riedel, Sebastian
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), 2022: 2824-2835
  • [10] Large Language Models Enable Few-Shot Clustering
    Viswanathan, Vijay; Gashteovski, Kiril; Lawrence, Carolin; Wu, Tongshuang; Neubig, Graham
    TRANSACTIONS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, 2024, 12: 321-333