A Semantic Supervision Method for Abstractive Summarization

Cited by: 4
Authors
Hu, Sunqiang [1 ]
Li, Xiaoyu [1 ]
Deng, Yu [1 ]
Peng, Yu [1 ]
Lin, Bin [2 ]
Yang, Shan [3 ]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Informat & Software Engn, Chengdu 610054, Peoples R China
[2] Sichuan Normal Univ, Sch Engn, Chengdu 610066, Peoples R China
[3] Jackson State Univ, Dept Chem Phys & Atmospher Sci, Jackson, MS 39217 USA
Source
CMC-COMPUTERS MATERIALS & CONTINUA | 2021, Vol. 69, Issue 01
Funding
National Natural Science Foundation of China; National Key R&D Program of China;
Keywords
Text summarization; semantic supervision; capsule network;
DOI
10.32604/cmc.2021.017441
CLC Classification
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
In recent years, many text summarization models based on pretraining methods have achieved very good results. However, in these models, semantic deviations easily arise between the original input representation and the representation produced by the multi-layer encoder, which may result in inconsistencies between the generated summary and the source text. Bidirectional Encoder Representations from Transformers (BERT) improves the performance of many tasks in Natural Language Processing (NLP). Although BERT has a strong capability to encode context, it lacks fine-grained semantic representation. To solve these two problems, we proposed a semantic supervision method based on the Capsule Network. Firstly, we extracted fine-grained semantic representations of both the input and the encoded result in BERT using a Capsule Network. Secondly, we used the fine-grained semantic representation of the input to supervise the fine-grained semantic representation of the encoded result. Then we evaluated our model on a popular Chinese social media dataset (LCSTS); the results showed that our model achieved higher ROUGE scores (including R-1 and R-2) and outperformed the baseline systems. Finally, we conducted a comparative study of model stability, and the experimental results showed that our model was more stable.
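The core idea of the abstract can be sketched as follows, assuming a PyTorch setting: a capsule layer distills fine-grained semantic capsules from both the raw input embeddings and the BERT encoder output, and a supervision loss penalizes the distance between the two. All names here (CapsuleLayer, semantic_supervision_loss, num_caps, caps_dim) and the choice of an MSE objective are illustrative assumptions, not the authors' exact implementation.

```python
# Minimal sketch of capsule-based semantic supervision, assuming PyTorch.
# Module/function names and the MSE loss are assumptions for illustration only.
import torch
import torch.nn as nn
import torch.nn.functional as F


def squash(s, dim=-1, eps=1e-8):
    """Squash non-linearity used by capsule networks."""
    norm2 = (s * s).sum(dim=dim, keepdim=True)
    return (norm2 / (1.0 + norm2)) * s / torch.sqrt(norm2 + eps)


class CapsuleLayer(nn.Module):
    """Maps token-level vectors to a fixed set of semantic capsules
    via (simplified) dynamic routing by agreement."""

    def __init__(self, in_dim, num_caps=16, caps_dim=32, num_iters=3):
        super().__init__()
        self.num_caps, self.caps_dim, self.num_iters = num_caps, caps_dim, num_iters
        self.proj = nn.Linear(in_dim, num_caps * caps_dim)

    def forward(self, x):                       # x: (batch, seq_len, in_dim)
        b_sz, seq_len, _ = x.shape
        # Prediction vectors u_hat: (batch, seq_len, num_caps, caps_dim)
        u_hat = self.proj(x).view(b_sz, seq_len, self.num_caps, self.caps_dim)
        logits = torch.zeros(b_sz, seq_len, self.num_caps, device=x.device)
        for _ in range(self.num_iters):         # routing iterations
            c = F.softmax(logits, dim=-1).unsqueeze(-1)        # coupling coefficients
            v = squash((c * u_hat).sum(dim=1))                 # (batch, num_caps, caps_dim)
            logits = logits + (u_hat * v.unsqueeze(1)).sum(-1) # agreement update
        return v


def semantic_supervision_loss(input_embeds, encoder_out, caps_in, caps_enc):
    """Fine-grained capsules of the raw input supervise those of the encoded result."""
    v_in = caps_in(input_embeds).detach()   # target: input-side semantics (no gradient;
                                            # one simple choice, assumed here)
    v_enc = caps_enc(encoder_out)           # prediction: encoder-side semantics
    return F.mse_loss(v_enc, v_in)
```

In training, this term would presumably be added to the usual summarization cross-entropy, e.g. loss = ce_loss + lambda_sem * semantic_supervision_loss(...); the weighting and the MSE form are likewise assumptions rather than the paper's exact objective.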
Pages: 145-158
Page count: 14