Grad2Task: Improved Few-shot Text Classification Using Gradients for Task Representation

Cited by: 0

Authors:

Wang, Jixuan [1,2,3]

Wang, Kuan-Chieh [1,2]

Rudzicz, Frank [1,2,4]

Brudno, Michael [1,2,3]

Affiliations:

[1] Univ Toronto, Toronto, ON, Canada

[2] Vector Inst, Toronto, ON, Canada

[3] Univ Hlth Network, Toronto, ON, Canada

[4] Unity Hlth Toronto, Toronto, ON, Canada

Source:

ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021) | 2021 / Vol. 34

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Large pretrained language models (LMs) like BERT have improved performance in many disparate natural language processing (NLP) tasks. However, fine-tuning such models requires a large number of training examples for each target task. Simultaneously, many realistic NLP problems are "few shot", without a sufficiently large training set. In this work, we propose a novel conditional neural process-based approach for few-shot text classification that learns to transfer from other diverse tasks with rich annotation. Our key idea is to represent each task using gradient information from a base model and to train an adaptation network that modulates a text classifier conditioned on the task representation. While previous task-aware few-shot learners represent tasks by input encoding, our novel task representation is more powerful, as the gradient captures input-output relationships of a task. Experimental results show that our approach outperforms traditional fine-tuning, sequential transfer learning, and state-of-the-art meta-learning approaches on a collection of diverse few-shot tasks. We further conducted analysis and ablations to justify our design choices.
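The core idea in the abstract (a task representation built from gradients, fed to a network that modulates a classifier) can be illustrated with a toy sketch. This is not the authors' implementation: the linear softmax base model, the mean-gradient task vector, the single-layer `adapt` function with random untrained weights, and the scale-and-shift form of modulation are all assumptions chosen to keep the example small and dependency-free.

```python
import math
import random

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def logits(W, x):
    # W holds one weight row per class; x is a feature vector.
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def task_gradient(W, support):
    """Mean cross-entropy gradient w.r.t. W over the support set, flattened.

    Assumption: this flattened gradient stands in for the task representation.
    """
    n_cls, dim = len(W), len(W[0])
    g = [[0.0] * dim for _ in range(n_cls)]
    for x, y in support:
        p = softmax(logits(W, x))
        for c in range(n_cls):
            err = p[c] - (1.0 if c == y else 0.0)  # dL/dlogit_c
            for j in range(dim):
                g[c][j] += err * x[j] / len(support)
    return [v for row in g for v in row]

def adapt(A, task_vec, dim):
    """Hypothetical adaptation network: one linear layer that maps the
    task vector to a per-feature scale and shift."""
    out = [sum(a * t for a, t in zip(row, task_vec)) for row in A]
    scale = [1.0 + v for v in out[:dim]]
    shift = out[dim:]
    return scale, shift

def modulated_predict(W, x, scale, shift):
    # Modulate the features, then apply the unchanged base classifier.
    h = [s * v + b for s, v, b in zip(scale, x, shift)]
    return softmax(logits(W, h))

random.seed(0)
dim, n_cls = 3, 2
W = [[0.1, -0.2, 0.0], [0.0, 0.3, -0.1]]                 # toy base model
support = [([1.0, 0.0, 0.5], 0), ([0.0, 1.0, 0.5], 1)]   # toy few-shot task
flipped = [(x, 1 - y) for x, y in support]               # same inputs, other labels

task_vec = task_gradient(W, support)
flipped_vec = task_gradient(W, flipped)

# Untrained random weights; in practice these would be meta-learned.
A = [[random.uniform(-0.1, 0.1) for _ in task_vec] for _ in range(2 * dim)]
scale, shift = adapt(A, task_vec, dim)
probs = modulated_predict(W, [1.0, 0.0, 0.5], scale, shift)
```

In this sketch, `support` and `flipped` share identical inputs but yield different task vectors, which illustrates why a gradient-based representation can reflect input-output relationships that an encoding of the inputs alone would miss.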

Pages: 13
