Distillation of encoder-decoder transformers for sequence labelling

Cited by: 0
Authors
Farina, Marco [1]
Pappadopulo, Duccio [1]
Gupta, Anant [1]
Huang, Leslie [1]
Irsoy, Ozan [1]
Solorio, Thamar [1,2]
Affiliations
[1] Bloomberg, New York, NY, USA
[2] University of Houston, Department of Computer Science, Houston, TX, USA
Source
17th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2023), 2023
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Theory of artificial intelligence]
Subject classification codes
081104; 0812; 0835; 1405
摘要
Driven by encouraging results on a wide range of tasks, the field of NLP is experiencing an accelerated race to develop bigger language models. This race for bigger models has also underscored the need to continue the pursuit of practical distillation approaches that can leverage the knowledge acquired by these big models in a compute-efficient manner. With this goal in mind, we build on recent work to propose a hallucination-free framework for sequence tagging that is especially suited for distillation. We show empirical results of new state-of-the-art performance across multiple sequence labelling datasets and validate the usefulness of this framework for distilling a large model in a few-shot learning scenario.
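The abstract refers to distilling a large teacher into a smaller student for sequence labelling. As a generic illustration only (this is not the framework proposed in the paper; the loss weighting, temperature, and tensor shapes below are assumptions), a minimal sketch of token-level knowledge distillation combines cross-entropy on gold tags with a KL term toward the teacher's per-token tag distribution:

```python
# Minimal sketch of token-level knowledge distillation for sequence labelling.
# Generic illustration, not the paper's method: temperature, alpha, and the
# (batch, seq_len, num_tags) logit layout are assumptions for this example.
import torch
import torch.nn.functional as F

def token_distillation_loss(student_logits, teacher_logits, gold_labels,
                            temperature=2.0, alpha=0.5, ignore_index=-100):
    # Hard-label loss: cross-entropy of the student against gold tag ids,
    # skipping padded positions marked with ignore_index.
    ce = F.cross_entropy(
        student_logits.reshape(-1, student_logits.size(-1)),
        gold_labels.reshape(-1),
        ignore_index=ignore_index,
    )
    # Soft-label loss: KL divergence from the (temperature-softened) teacher
    # tag distribution to the student's, scaled by t^2 as in standard KD.
    t = temperature
    kl = F.kl_div(
        F.log_softmax(student_logits / t, dim=-1),
        F.softmax(teacher_logits / t, dim=-1),
        reduction="batchmean",
    ) * (t * t)
    # Interpolate between supervised and distillation objectives.
    return alpha * ce + (1.0 - alpha) * kl
```

In a few-shot setting, the hard-label term would be computed only on the small labelled set, while the distillation term can additionally use unlabelled text scored by the teacher.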
Pages: 2539-2549
Number of pages: 11