Grad2Task: Improved Few-shot Text Classification Using Gradients for Task Representation

Cited by: 0
Authors
Wang, Jixuan [1, 2, 3]
Wang, Kuan-Chieh [1, 2]
Rudzicz, Frank [1, 2, 4]
Brudno, Michael [1, 2, 3]
Affiliations
[1] University of Toronto, Toronto, ON, Canada
[2] Vector Institute, Toronto, ON, Canada
[3] University Health Network, Toronto, ON, Canada
[4] Unity Health Toronto, Toronto, ON, Canada
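Source
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021) | 2021 / Volume 34
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract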
Large pretrained language models (LMs) like BERT have improved performance in many disparate natural language processing (NLP) tasks. However, fine-tuning such models requires a large number of training examples for each target task. At the same time, many realistic NLP problems are "few-shot", without a sufficiently large training set. In this work, we propose a novel conditional neural process-based approach for few-shot text classification that learns to transfer from other diverse tasks with rich annotation. Our key idea is to represent each task using gradient information from a base model and to train an adaptation network that modulates a text classifier conditioned on the task representation. While previous task-aware few-shot learners represent tasks by input encoding, our novel task representation is more powerful, as the gradient captures the input-output relationships of a task. Experimental results show that our approach outperforms traditional fine-tuning, sequential transfer learning, and state-of-the-art meta-learning approaches on a collection of diverse few-shot tasks. We further conduct analyses and ablations to justify our design choices.
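
The key idea in the abstract (a task representation built from gradients of a base model, fed to an adaptation network that modulates a classifier) can be illustrated with a toy sketch. This is not the authors' implementation: the base model here is a plain softmax classifier rather than a pretrained LM, the adaptation network is a fixed linear map, and the scale-and-shift modulation is an assumption chosen for brevity. All function and variable names (`task_embedding`, `adapt`, `modulate`, `A_SCALE`, `A_SHIFT`) are hypothetical.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def task_embedding(W, support):
    """Flattened mean gradient of the cross-entropy loss w.r.t. W.

    W is a (num_classes x dim) weight matrix of a toy base classifier;
    support is a list of (feature_vector, label) pairs for one task.
    """
    num_classes, dim = len(W), len(W[0])
    grad = [[0.0] * dim for _ in range(num_classes)]
    for x, y in support:
        probs = softmax([dot(row, x) for row in W])
        for c in range(num_classes):
            err = probs[c] - (1.0 if c == y else 0.0)
            for j in range(dim):
                grad[c][j] += err * x[j] / len(support)
    return [v for row in grad for v in row]

def adapt(embedding, a_scale, a_shift):
    """Hypothetical linear adaptation network: embedding -> (scale, shift)."""
    gamma = [1.0 + dot(row, embedding) for row in a_scale]
    beta = [dot(row, embedding) for row in a_shift]
    return gamma, beta

def modulate(features, gamma, beta):
    """Apply the task-conditioned scale and shift to a feature vector."""
    return [g * f + b for g, f, b in zip(gamma, features, beta)]

# Two tasks share the same inputs but have flipped labels.
W0 = [[0.0, 0.0], [0.0, 0.0]]            # untrained base classifier
x1, x2 = [1.0, 0.0], [0.0, 1.0]
emb_a = task_embedding(W0, [(x1, 0), (x2, 1)])
emb_b = task_embedding(W0, [(x1, 1), (x2, 0)])

# Fixed illustrative weights; in practice these would be learned.
A_SCALE = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
A_SHIFT = [[0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
gamma_a, beta_a = adapt(emb_a, A_SCALE, A_SHIFT)
gamma_b, beta_b = adapt(emb_b, A_SCALE, A_SHIFT)
```

In this toy setting the two tasks have identical inputs, so an embedding computed from inputs alone could not tell them apart, whereas the gradient-based embeddings differ because the labels enter the gradient. That is the sense in which the abstract says the gradient captures input-output relationships.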
Pages: 13