In-Context Learning Creates Task Vectors

Cited by: 0
Authors
Hendel, Roee [1]
Geva, Mor [2]
Globerson, Amir [1,3]
Affiliations
[1] Tel Aviv University, Tel Aviv, Israel
[2] Google DeepMind, Mountain View, CA, USA
[3] Google, Mountain View, CA 94043, USA
Source
FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EMNLP 2023), 2023
Funding
European Research Council
Keywords
DOI
Not available
CLC number
TP18 [Artificial intelligence theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
In-context learning (ICL) in Large Language Models (LLMs) has emerged as a powerful new learning paradigm. However, its underlying mechanism is still not well understood. In particular, it is challenging to map it to the "standard" machine learning framework, where one uses a training set S to find a best-fitting function f(x) in some hypothesis class. Here we make progress on this problem by showing that the functions learned by ICL often have a very simple structure: they correspond to the transformer LLM whose only inputs are the query x and a single "task vector" calculated from the training set. Thus, ICL can be seen as compressing S into a single task vector θ(S) and then using this task vector to modulate the transformer to produce the output. We support the above claim via comprehensive experiments across a range of models and tasks.
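To make the mechanism described in the abstract concrete, the following is a minimal sketch of the task-vector idea in Python, assuming a HuggingFace-style causal LM. The layer index, prompt format, and helper names (task_vector, apply_task_vector) are illustrative assumptions rather than the paper's exact protocol: a hidden state θ(S) is read from a demonstrations-only forward pass, then patched into a zero-shot pass on a new query x.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative setup: any HuggingFace causal LM works in principle; GPT-2 is
# used here only because its module path (model.transformer.h) is well known.
model = AutoModelForCausalLM.from_pretrained("gpt2", output_hidden_states=True)
tok = AutoTokenizer.from_pretrained("gpt2")
model.eval()

LAYER = 6  # which intermediate layer holds the task vector (an assumption)

def task_vector(demos):
    # Forward pass on the demonstrations plus a dummy query; the "task vector"
    # is taken as the hidden state of the final prompt token after LAYER.
    ids = tok(demos, return_tensors="pt").input_ids
    with torch.no_grad():
        hidden = model(ids).hidden_states  # embeddings + one state per layer
    return hidden[LAYER][0, -1]

def apply_task_vector(query, tv):
    # Zero-shot pass on the query alone, overwriting the final token's hidden
    # state at LAYER with the task vector via a forward hook.
    def patch(module, inputs, output):
        h = output[0] if isinstance(output, tuple) else output
        h[0, -1] = tv  # in-place edit of the block's output
        return output
    handle = model.transformer.h[LAYER - 1].register_forward_hook(patch)
    try:
        ids = tok(query, return_tensors="pt").input_ids
        with torch.no_grad():
            logits = model(ids).logits
    finally:
        handle.remove()
    return tok.decode([logits[0, -1].argmax().item()])

# The demonstrations define a country -> capital task; the query carries none.
tv = task_vector("France -> Paris\nItaly -> Rome\nSpain ->")
print(apply_task_vector("Germany ->", tv))

In this sketch the demonstrations define a country-to-capital mapping; if the patched hidden state really encodes the task, the model completes "Germany ->" appropriately despite the query containing no examples.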
Pages: 9318-9333
Page count: 16
Related papers
50 records
  • [1] What In-Context Learning "Learns" In-Context: Disentangling Task Recognition and Task Learning
    Pan, Jane
    Gao, Tianyu
    Chen, Howard
    Chen, Danqi
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023), 2023: 8298-8319
  • [2] In-Context In-Context Learning with Transformer Neural Processes
    Ashman, Matthew
    Diaconu, Cristiana
    Weller, Adrian
    Turner, Richard E.
    SYMPOSIUM ON ADVANCES IN APPROXIMATE BAYESIAN INFERENCE, 2024, 253: 1-29
  • [3] The Learnability of In-Context Learning
    Wies, Noam
    Levine, Yoav
    Shashua, Amnon
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
  • [4] A glance at in-context learning
    Wu, Yongliang
    Yang, Xu
    FRONTIERS OF COMPUTER SCIENCE, 2024, 18(5)
  • [5] Transformers as Statisticians: Provable In-Context Learning with In-Context Algorithm Selection
    Bai, Yu
    Chen, Fan
    Wang, Huan
    Xiong, Caiming
    Mei, Song
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
  • [6] Learning To Retrieve Prompts for In-Context Learning
    Rubin, Ohad
    Herzig, Jonathan
    Berant, Jonathan
    NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES, 2022: 2655-2671
  • [7] In-context learning of state estimators
    Busetto, R.
    Breschi, V.
    Forgione, M.
    Piga, D.
    Formentin, S.
    IFAC PAPERSONLINE, 2024, 58(15): 145-150
  • [8] Generative Calibration for In-context Learning
    Jiang, Zhongtao
    Zhang, Yuanzhe
    Liu, Cao
    Zhao, Jun
    Liu, Kang
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS - EMNLP 2023, 2023: 2312-2333
  • [9] Distinguishability Calibration to In-Context Learning
    Li, Hongjing
    Yan, Hanqi
    Li, Yanran
    Qian, Li
    He, Yulan
    Gui, Lin
    17TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EACL 2023, 2023: 1385-1397
  • [10] Pretraining task diversity and the emergence of non-Bayesian in-context learning for regression
    Raventos, Allan
    Paul, Mansheej
    Chen, Feng
    Ganguli, Surya
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023