Non-Autoregressive Text Generation with Pre-trained Language Models

Cited by: 0
Authors
Su, Yixuan [1 ]
Cai, Deng [2 ]
Wang, Yan [3 ]
Vandyke, David [4 ]
Baker, Simon [1 ]
Li, Piji [3 ]
Collier, Nigel [1 ]
Affiliations
[1] Univ Cambridge, Language Technol Lab, Cambridge, England
[2] Chinese Univ Hong Kong, Hong Kong, Peoples R China
[3] Tencent AI Lab, Bellevue, WA USA
[4] Apple, Cupertino, CA USA
Source
16TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EACL 2021) | 2021
Keywords
DOI
Not available
CLC classification number
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Non-autoregressive generation (NAG) has recently attracted great attention due to its fast inference speed. However, the generation quality of existing NAG models still lags behind their autoregressive counterparts. In this work, we show that BERT can be employed as the backbone of a NAG model to greatly improve performance. Additionally, we devise mechanisms to alleviate the two common problems of vanilla NAG models: the inflexibility of the pre-fixed output length and the conditional independence of individual token predictions. Lastly, to further increase the speed advantage of the proposed model, we propose a new decoding strategy, ratio-first, for applications where the output lengths can be approximately estimated beforehand. For a comprehensive evaluation, we test the proposed model on three text generation tasks, including text summarization, sentence compression and machine translation. Experimental results show that our model significantly outperforms existing non-autoregressive baselines and achieves competitive performance with many strong autoregressive models. In addition, we also conduct extensive analysis experiments to reveal the effect of each proposed component.
Pages: 234-243
Page count: 10
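A minimal sketch of the ratio-first decoding idea described in the abstract, assuming a hypothetical model(src_ids, tgt_placeholder) interface that returns logits for all target positions in one parallel forward pass; the function name, the ratio value, and the mask token id are illustrative assumptions, not the authors' implementation:

import torch

def ratio_first_decode(model, src_ids, ratio=0.3, mask_id=103):
    """Illustrative one-pass non-autoregressive decoding (a sketch, not the paper's code)."""
    batch_size, src_len = src_ids.shape
    # Ratio-first: fix the target length as a fraction of the source length,
    # rather than predicting it or waiting for an end-of-sequence token.
    tgt_len = max(1, int(ratio * src_len))
    # Fill every target slot with a [MASK]-style placeholder token; all positions
    # are then predicted simultaneously in a single forward pass.
    tgt_placeholder = torch.full((batch_size, tgt_len), mask_id, dtype=torch.long)
    logits = model(src_ids, tgt_placeholder)   # assumed shape: (batch, tgt_len, vocab)
    return logits.argmax(dim=-1)               # greedy choice at each position

In this sketch the speed advantage comes from emitting every target token in a single forward pass, in contrast to an autoregressive decoder that needs one pass per generated token.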