Deep Differential Amplifier for Extractive Summarization

Times Cited: 0
Authors
Jia, Ruipeng [1,2]
Cao, Yanan [1,2]
Fang, Fang [1,2]
Zhou, Yuchen [1]
Fang, Zheng [1]
Liu, Yanbing [1,2]
Wang, Shi [3]
Affiliations
[1] Chinese Acad Sci, Inst Informat Engn, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Sch Cyber Secur, Beijing, Peoples R China
[3] Chinese Acad Sci, Inst Comp Technol, Beijing, Peoples R China
Source
59TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 11TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING, VOL 1 (ACL-IJCNLP 2021) | 2021
Funding
National Natural Science Foundation of China
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
In sentence-level extractive summarization, the ratio of selected to unselected sentences is highly disproportionate, which flattens the summary features when the classifier is optimized. This class imbalance is inherent to extractive summarization and cannot easily be addressed by data sampling or data augmentation algorithms. To address this problem, we recast single-document extractive summarization as a rebalancing problem and present a deep differential amplifier framework that enhances the features of summary sentences. Specifically, we calculate and amplify the semantic difference between each sentence and the other sentences, and apply residual units to deepen the differential amplifier architecture. Furthermore, the objective loss of the minority class is boosted by a weighted cross-entropy. In this way, our model attends to the pivotal information of each sentence, unlike previous approaches that model all informative context in the source document. Experimental results on two benchmark datasets show that our summarizer performs competitively against state-of-the-art methods. Our source code will be available on GitHub.
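To make the abstract's two components concrete, here is a minimal PyTorch sketch: a residual differential-amplifier block that amplifies each sentence's difference from the other sentences, and a weighted cross-entropy that boosts the minority summary class. This is an illustration under stated assumptions, not the authors' released implementation: the names (DifferentialAmplifier, scorer), the leave-one-out mean as the "difference" operator, and the 1:4 class weighting are all assumptions made for the example.

import torch
import torch.nn as nn

class DifferentialAmplifier(nn.Module):
    """Hypothetical sketch of the differential-amplifier idea: each layer
    amplifies a sentence's semantic difference from the other sentences
    and adds it back through a residual connection."""

    def __init__(self, hidden_dim: int, num_layers: int = 3):
        super().__init__()
        self.layers = nn.ModuleList([
            nn.Sequential(
                nn.Linear(hidden_dim, hidden_dim),
                nn.ReLU(),
                nn.Linear(hidden_dim, hidden_dim),
            )
            for _ in range(num_layers)
        ])

    def forward(self, sents: torch.Tensor) -> torch.Tensor:
        # sents: (num_sentences, hidden_dim) sentence embeddings
        n = sents.size(0)
        h = sents
        for layer in self.layers:
            # Leave-one-out mean: average of all *other* sentences.
            others = (h.sum(dim=0, keepdim=True) - h) / max(n - 1, 1)
            diff = h - others        # semantic difference signal
            h = h + layer(diff)      # amplified via a residual unit
        return h

# Usage sketch: one document of 12 sentences, 3 of them summary-worthy.
hidden_dim = 768                      # e.g. BERT-base sentence vectors
amp = DifferentialAmplifier(hidden_dim)
scorer = nn.Linear(hidden_dim, 2)     # per-sentence binary classifier

sents = torch.randn(12, hidden_dim)
labels = torch.zeros(12, dtype=torch.long)
labels[:3] = 1                        # 1 = selected (summary) sentence

# Weighted cross-entropy boosts the loss of the minority class;
# the 1:4 weighting is illustrative, not taken from the paper.
criterion = nn.CrossEntropyLoss(weight=torch.tensor([1.0, 4.0]))
loss = criterion(scorer(amp(sents)), labels)

The residual form keeps the original sentence representation intact while each layer adds an amplified difference signal on top, which matches the abstract's point that deepening the amplifier should not wash out the base features.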
Pages: 366-376
Page count: 11