Intra-document and Inter-document Redundancy in Multi-document Summarization

Cited by: 1
Authors
Carrillo-Mendoza, Pabel [1]
Calvo, Hiram [1]
Gelbukh, Alexander [1]
Affiliation
[1] Inst Politecn Nacl, CIC, Ave Juan de Dios Batiz, Mexico City 07738, DF, Mexico
Source
ADVANCES IN COMPUTATIONAL INTELLIGENCE, MICAI 2016, PT I | 2017 / Vol. 10061
Keywords
Multi-document summarization; Graph-based methods; Unsupervised summarization; Doc2vec; Intra-document redundancy; Per-document redundancy; Inter-document redundancy; Cross-documents redundancy;
DOI
10.1007/978-3-319-62434-1_9
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Multi-document summarization differs from single-document summarization in excessive redundancy of mentions of some events or ideas. We show how the amount of redundancy in a document collection can be used for assigning importance to sentences in multi-document extractive summarization: for instance, an idea could be important if it is redundant across documents because of its popularity; on the other hand, an idea could be important if it is not redundant across documents because of its novelty. We propose an unsupervised graph-based technique that, based on proper similarity measures, allows us to experiment with intra-document and inter-document redundancy. Our experiments on DUC corpora show promising results.
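The idea of scoring sentences by their redundancy within and across documents can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes plain bag-of-words cosine similarity in place of the Doc2vec-based similarity named in the keywords, and the function names (`cosine`, `redundancy_scores`, `rank`) and the weights `w_intra` and `w_inter` are hypothetical, introduced only for illustration.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def redundancy_scores(docs):
    """Return (doc_id, sentence, intra, inter) for every sentence.

    intra: mean similarity to the other sentences of the same document.
    inter: mean similarity to the sentences of all other documents.
    Each sentence is a graph node; similarities are the edge weights.
    """
    nodes = [(d, s, Counter(s.lower().split()))
             for d, doc in enumerate(docs) for s in doc]
    results = []
    for i, (d, s, vec) in enumerate(nodes):
        same, other = [], []
        for j, (d2, _, vec2) in enumerate(nodes):
            if i == j:
                continue
            (same if d2 == d else other).append(cosine(vec, vec2))
        intra = sum(same) / len(same) if same else 0.0
        inter = sum(other) / len(other) if other else 0.0
        results.append((d, s, intra, inter))
    return results


def rank(docs, w_intra=0.0, w_inter=1.0):
    """Sort sentences by a weighted sum of the two redundancy scores.

    A positive w_inter rewards ideas repeated across documents (popularity);
    a negative w_inter rewards ideas that appear in one document only (novelty).
    """
    scored = redundancy_scores(docs)
    return sorted(scored,
                  key=lambda r: w_intra * r[2] + w_inter * r[3],
                  reverse=True)


docs = [
    ["the storm hit the coast", "officials closed schools"],
    ["the storm hit the coast hard", "markets fell sharply"],
]
popular = rank(docs, w_inter=1.0)
novel = rank(docs, w_inter=-1.0)
```

Flipping the sign of `w_inter` switches between the two readings of importance mentioned in the abstract, popularity versus novelty, without changing the underlying similarity graph.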
Pages: 105-115
Page count: 11