Revisiting Self-Training for Few-Shot Learning of Language Model

Citations: 0
Authors
Chen, Yiming [1 ,2 ]
Zhang, Yan [1 ]
Zhang, Chen [1 ]
Lee, Grandee [1 ]
Cheng, Ran [2 ]
Li, Haizhou [1 ,3 ,4 ]
Affiliations
[1] Natl Univ Singapore, Singapore, Singapore
[2] Southern Univ Sci & Technol, Shenzhen, Peoples R China
[3] Chinese Univ Hong Kong Shenzhen, Shenzhen, Peoples R China
[4] Kriston AI Lab, Shenzhen, Peoples R China
Source
2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021) | 2021
Funding
National Natural Science Foundation of China; National Research Foundation, Singapore
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
As unlabeled data carry rich task-relevant information, they have proven useful for few-shot learning of language models. The question is how to make effective use of such data. In this work, we revisit the self-training technique for language model fine-tuning and present a state-of-the-art prompt-based few-shot learner, SFLM. Given two views of a text sample via weak and strong augmentation techniques, SFLM generates a pseudo label on the weakly augmented version. The model is then fine-tuned to predict the same pseudo label on the strongly augmented version. This simple approach is shown to outperform other state-of-the-art supervised and semi-supervised counterparts on six sentence classification and six sentence-pair classification benchmark tasks. In addition, SFLM relies on only a few in-domain unlabeled examples. We conduct a comprehensive analysis to demonstrate the robustness of our proposed approach under various settings, including augmentation techniques, model scale, and few-shot knowledge transfer across tasks.
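The core training signal described in the abstract can be illustrated with a minimal sketch: take a model's logits on the weakly augmented view, derive a pseudo label when the prediction is confident, and apply cross-entropy on the strongly augmented view against that pseudo label. The function name, confidence threshold, and toy logits below are illustrative assumptions, not details from the paper.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def self_training_loss(logits_weak, logits_strong, threshold=0.9):
    """Pseudo-label consistency loss (illustrative sketch).

    The pseudo label is the argmax prediction on the weakly augmented
    view; the loss is cross-entropy on the strongly augmented view
    against that label, kept only when the weak-view prediction is
    confident enough (threshold is a hypothetical hyperparameter).
    """
    probs_weak = softmax(logits_weak)
    confidence = max(probs_weak)
    pseudo_label = probs_weak.index(confidence)
    if confidence < threshold:
        return 0.0  # low-confidence samples contribute no loss
    probs_strong = softmax(logits_strong)
    return -math.log(probs_strong[pseudo_label])

# Toy usage: the weak view confidently predicts class 0, so the strong
# view is pushed toward class 0 as well.
loss = self_training_loss([4.0, 0.0, 0.0], [2.0, 1.0, 0.0])
```

In the actual method the two views come from text augmentation and the loss is combined with the supervised prompt-based loss on the labeled few-shot examples; this sketch only shows the consistency term in isolation.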
Pages: 9125-9135 (11 pages)