Improving Transformer with Sequential Context Representations for Abstractive Text Summarization

Cited by: 23
Authors
Cai, Tian [1 ,2 ]
Shen, Mengjun [1 ,2 ]
Peng, Huailiang [1 ,2 ]
Jiang, Lei [1 ]
Dai, Qiong [1 ]
Affiliations
[1] Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China
[2] School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China
Source
Natural Language Processing and Chinese Computing (NLPCC 2019), Part I | 2019 / Vol. 11838
Funding
National Science Foundation (USA)
Keywords
Transformer; Abstractive summarization
DOI
10.1007/978-3-030-32233-5_40
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Recent dominant approaches to abstractive text summarization are mainly RNN-based encoder-decoder frameworks, which usually suffer from poor semantic representations of long sequences. In this paper, we propose a new abstractive summarization model, called RC-Transformer (RCT). The model is not only capable of learning long-term dependencies, but also addresses the inherent shortcoming of the Transformer, its insensitivity to word order information. We extend the Transformer with an additional RNN-based encoder to capture sequential context representations. To extract salient information effectively, we further construct a convolution module that filters the sequential context by local importance. Experimental results on the Gigaword and DUC-2004 datasets show that our proposed model achieves state-of-the-art performance, even without introducing external information. In addition, our model is also faster than RNN-based models.
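Since the full paper is not reproduced in this record, the following is a minimal PyTorch sketch, based only on the abstract, of how an RNN-based sequential-context encoder and a convolutional filtering module could be combined with a Transformer encoder. The class name RCTEncoderSketch, all layer sizes, the GLU-gated convolution, and the additive fusion of the two encoder outputs are illustrative assumptions, not the authors' implementation.

    # Sketch only: one plausible reading of the abstract, not the RCT model itself.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class RCTEncoderSketch(nn.Module):
        def __init__(self, vocab_size, d_model=512, nhead=8, num_layers=6, kernel_size=3):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, d_model)
            # Self-attention encoder: captures long-range dependencies but is
            # order-insensitive without an extra positional/sequential signal.
            layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
            self.transformer = nn.TransformerEncoder(layer, num_layers)
            # Bidirectional GRU supplies sequential (word-order) context.
            self.rnn = nn.GRU(d_model, d_model // 2, batch_first=True, bidirectional=True)
            # 1-D convolution with GLU gating filters the sequential context
            # by local importance over kernel_size-token windows (assumed design).
            self.conv = nn.Conv1d(d_model, 2 * d_model, kernel_size, padding=kernel_size // 2)

        def forward(self, token_ids):                      # (batch, seq_len)
            x = self.embed(token_ids)                      # (batch, seq_len, d_model)
            global_ctx = self.transformer(x)               # long-range dependencies
            seq_ctx, _ = self.rnn(x)                       # sequential context
            gated = F.glu(self.conv(seq_ctx.transpose(1, 2)), dim=1)
            seq_ctx = gated.transpose(1, 2)                # back to (batch, seq_len, d_model)
            return global_ctx + seq_ctx                    # one possible fusion, assumed

    # Usage example with random token ids.
    enc = RCTEncoderSketch(vocab_size=32000)
    out = enc(torch.randint(0, 32000, (2, 40)))            # -> shape (2, 40, 512)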
Pages: 512-524 (13 pages)