In-Context Learning Creates Task Vectors

Cited by: 0
Authors
Hendel, Roee [1]
Geva, Mor [2]
Globerson, Amir [1,3]
Affiliations
[1] Tel Aviv University, Tel Aviv, Israel
[2] Google DeepMind, Mountain View, CA, USA
[3] Google, Mountain View, CA 94043, USA
Source
FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EMNLP 2023), 2023
Funding
European Research Council
Keywords
DOI
Not available
CLC number
TP18 [Artificial intelligence theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
In-context learning (ICL) in Large Language Models (LLMs) has emerged as a powerful new learning paradigm. However, its underlying mechanism is still not well understood. In particular, it is challenging to map it to the "standard" machine learning framework, where one uses a training set S to find a best-fitting function f(x) in some hypothesis class. Here we make progress on this problem by showing that the functions learned by ICL often have a very simple structure: they correspond to the transformer LLM whose only inputs are the query x and a single "task vector" calculated from the training set. Thus, ICL can be seen as compressing S into a single task vector θ(S) and then using this task vector to modulate the transformer to produce the output. We support the above claim via comprehensive experiments across a range of models and tasks.
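To make the mechanism described in the abstract concrete, the following is a minimal sketch of the task-vector idea in Python, assuming a HuggingFace-style causal LM. The layer index, prompt format, and helper names (task_vector, apply_task_vector) are illustrative assumptions rather than the paper's exact protocol: a hidden state θ(S) is read from a demonstrations-only forward pass, then patched into a zero-shot pass on a new query x.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative setup: any HuggingFace causal LM works in principle; GPT-2 is
# used here only because its module path (model.transformer.h) is well known.
model = AutoModelForCausalLM.from_pretrained("gpt2", output_hidden_states=True)
tok = AutoTokenizer.from_pretrained("gpt2")
model.eval()

LAYER = 6  # which intermediate layer holds the task vector (an assumption)

def task_vector(demos):
    # Forward pass on the demonstrations plus a dummy query; the "task vector"
    # is taken as the hidden state of the final prompt token after LAYER.
    ids = tok(demos, return_tensors="pt").input_ids
    with torch.no_grad():
        hidden = model(ids).hidden_states  # embeddings + one state per layer
    return hidden[LAYER][0, -1]

def apply_task_vector(query, tv):
    # Zero-shot pass on the query alone, overwriting the final token's hidden
    # state at LAYER with the task vector via a forward hook.
    def patch(module, inputs, output):
        h = output[0] if isinstance(output, tuple) else output
        h[0, -1] = tv  # in-place edit of the block's output
        return output
    handle = model.transformer.h[LAYER - 1].register_forward_hook(patch)
    try:
        ids = tok(query, return_tensors="pt").input_ids
        with torch.no_grad():
            logits = model(ids).logits
    finally:
        handle.remove()
    return tok.decode([logits[0, -1].argmax().item()])

# The demonstrations define a country -> capital task; the query carries none.
tv = task_vector("France -> Paris\nItaly -> Rome\nSpain ->")
print(apply_task_vector("Germany ->", tv))

In this sketch the demonstrations define a country-to-capital mapping; if the patched hidden state really encodes the task, the model completes "Germany ->" appropriately despite the query containing no examples.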
Pages: 9318-9333
Page count: 16
Related papers
50 records
  • [1] What In-Context Learning "Learns" In-Context: Disentangling Task Recognition and Task Learning
    Pan, Jane
    Gao, Tianyu
    Chen, Howard
    Chen, Danqi
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023), 2023: 8298-8319
  • [2] In-Context In-Context Learning with Transformer Neural Processes
    Ashman, Matthew
    Diaconu, Cristiana
    Weller, Adrian
    Turner, Richard E.
    SYMPOSIUM ON ADVANCES IN APPROXIMATE BAYESIAN INFERENCE, 2024, 253: 1-29
  • [3] The Learnability of In-Context Learning
    Wies, Noam
    Levine, Yoav
    Shashua, Amnon
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
  • [4] A glance at in-context learning
    Wu, Yongliang
    Yang, Xu
    FRONTIERS OF COMPUTER SCIENCE, 2024, 18(5)
  • [5] Transformers as Statisticians: Provable In-Context Learning with In-Context Algorithm Selection
    Bai, Yu
    Chen, Fan
    Wang, Huan
    Xiong, Caiming
    Mei, Song
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
  • [6] Learning To Retrieve Prompts for In-Context Learning
    Rubin, Ohad
    Herzig, Jonathan
    Berant, Jonathan
    NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES, 2022: 2655-2671
  • [7] In-context learning of state estimators
    Busetto, R.
    Breschi, V.
    Forgione, M.
    Piga, D.
    Formentin, S.
    IFAC PAPERSONLINE, 2024, 58(15): 145-150
  • [8] Generative Calibration for In-context Learning
    Jiang, Zhongtao
    Zhang, Yuanzhe
    Liu, Cao
    Zhao, Jun
    Liu, Kang
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS - EMNLP 2023, 2023: 2312-2333
  • [9] Distinguishability Calibration to In-Context Learning
    Li, Hongjing
    Yan, Hanqi
    Li, Yanran
    Qian, Li
    He, Yulan
    Gui, Lin
    17TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EACL 2023, 2023: 1385-1397
  • [10] Pretraining task diversity and the emergence of non-Bayesian in-context learning for regression
    Raventos, Allan
    Paul, Mansheej
    Chen, Feng
    Ganguli, Surya
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023