Revisiting Self-Training for Few-Shot Learning of Language Model

Citations: 0
Authors
Chen, Yiming [1 ,2 ]
Zhang, Yan [1 ]
Zhang, Chen [1 ]
Lee, Grandee [1 ]
Cheng, Ran [2 ]
Li, Haizhou [1 ,3 ,4 ]
Affiliations
[1] Natl Univ Singapore, Singapore, Singapore
[2] Southern Univ Sci & Technol, Shenzhen, Peoples R China
[3] Chinese Univ Hong Kong Shenzhen, Shenzhen, Peoples R China
[4] Kriston AI Lab, Shenzhen, Peoples R China
Source
2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021) | 2021
Funding
National Natural Science Foundation of China; National Research Foundation, Singapore
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
As unlabeled data carry rich task-relevant information, they have proven useful for few-shot learning of language models. The question is how to make effective use of such data. In this work, we revisit the self-training technique for language model fine-tuning and present a state-of-the-art prompt-based few-shot learner, SFLM. Given two views of a text sample via weak and strong augmentation techniques, SFLM generates a pseudo label on the weakly augmented version. The model is then fine-tuned to predict the same pseudo label on the strongly augmented version. This simple approach is shown to outperform other state-of-the-art supervised and semi-supervised counterparts on six sentence classification and six sentence-pair classification benchmark tasks. In addition, SFLM relies on only a few in-domain unlabeled examples. We conduct a comprehensive analysis to demonstrate the robustness of our proposed approach under various settings, including augmentation techniques, model scale, and few-shot knowledge transfer across tasks.
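The core training signal described in the abstract can be illustrated with a minimal sketch: take a model's logits on the weakly augmented view, derive a pseudo label when the prediction is confident, and apply cross-entropy on the strongly augmented view against that pseudo label. The function name, confidence threshold, and toy logits below are illustrative assumptions, not details from the paper.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def self_training_loss(logits_weak, logits_strong, threshold=0.9):
    """Pseudo-label consistency loss (illustrative sketch).

    The pseudo label is the argmax prediction on the weakly augmented
    view; the loss is cross-entropy on the strongly augmented view
    against that label, kept only when the weak-view prediction is
    confident enough (threshold is a hypothetical hyperparameter).
    """
    probs_weak = softmax(logits_weak)
    confidence = max(probs_weak)
    pseudo_label = probs_weak.index(confidence)
    if confidence < threshold:
        return 0.0  # low-confidence samples contribute no loss
    probs_strong = softmax(logits_strong)
    return -math.log(probs_strong[pseudo_label])

# Toy usage: the weak view confidently predicts class 0, so the strong
# view is pushed toward class 0 as well.
loss = self_training_loss([4.0, 0.0, 0.0], [2.0, 1.0, 0.0])
```

In the actual method the two views come from text augmentation and the loss is combined with the supervised prompt-based loss on the labeled few-shot examples; this sketch only shows the consistency term in isolation.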
Pages: 9125-9135 (11 pages)