Multi-document summarization via submodularity

Cited by: 29
Authors
Li, Jingxuan [1]
Li, Lei [1]
Li, Tao [1]
Affiliations
[1] Florida Int Univ, Sch Comp & Informat Sci, Miami, FL 33199 USA
Funding
U.S. National Science Foundation;
Keywords
Multi-document summarization; Submodularity; Greedy algorithm;
DOI
10.1007/s10489-012-0336-1
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Multi-document summarization is becoming an important issue in the Information Retrieval community. It aims to distill the most important information from a set of documents into a compressed summary. Given a set of documents as input, most existing multi-document summarization approaches use various sentence selection techniques to extract a set of sentences from the document set as the summary. The submodularity hidden in term coverage and textual-unit similarity motivates us to incorporate this property into our solution to multi-document summarization tasks. In this paper, we propose a new principled and versatile framework for different multi-document summarization tasks using submodular functions (Nemhauser et al. in Math. Prog. 14(1):265-294, 1978) based on term coverage and textual-unit similarity, which can be efficiently optimized through an improved greedy algorithm. We show that four known summarization tasks (generic, query-focused, update, and comparative summarization) can be modeled as variations derived from the proposed framework. Experiments on benchmark summarization data sets (e.g., the DUC04-06, TAC08, and TDT2 corpora) demonstrate the effectiveness of the proposed framework for general multi-document summarization tasks.
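To make the idea of greedily optimizing a submodular term-coverage objective concrete, here is a minimal sketch. It is not the paper's objective or its improved greedy algorithm, neither of which is specified in this record. It assumes a plain greedy loop, a coverage function that counts distinct whitespace-separated terms, and a budget expressed as a number of sentences. The function name `greedy_summary` and the toy sentences are hypothetical. Coverage of this kind has diminishing returns: a sentence adds fewer new terms the more the summary already covers, which is the property that makes greedy selection a reasonable choice.

```python
from typing import List, Set


def greedy_summary(sentences: List[str], budget: int) -> List[int]:
    """Return indices of up to `budget` sentences chosen by greedy marginal gain.

    Assumption: the objective is the number of distinct lowercase terms
    covered by the selected sentences (an illustrative stand-in only).
    """
    term_sets = [set(s.lower().split()) for s in sentences]
    chosen: List[int] = []
    covered: Set[str] = set()
    while len(chosen) < budget:
        best_idx, best_gain = -1, 0
        for i, terms in enumerate(term_sets):
            if i in chosen:
                continue
            gain = len(terms - covered)  # new terms this sentence would add
            if gain > best_gain:
                best_idx, best_gain = i, gain
        if best_idx < 0:  # no remaining sentence adds a new term
            break
        chosen.append(best_idx)
        covered |= term_sets[best_idx]
    return chosen


# Hypothetical toy input, not drawn from any benchmark corpus.
docs = [
    "the storm hit the coast",
    "the storm hit the coast on monday",
    "rescue teams reached the flooded town",
    "the coast was hit",
]
picked = greedy_summary(docs, budget=2)
print([docs[i] for i in picked])
```

With a budget of two, the sketch first takes the longest storm sentence and then skips its near-duplicates in favor of the rescue sentence, because the duplicates add few or no new terms. A similarity-based term, as mentioned in the abstract, would need a different objective than the one assumed here.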
Pages: 420-430
Page count: 11