Multi-document summarization via submodularity

Cited by: 29
Authors
Li, Jingxuan [1]
Li, Lei [1]
Li, Tao [1]
Affiliations
[1] Florida Int Univ, Sch Comp & Informat Sci, Miami, FL 33199 USA
Funding
National Science Foundation (USA)
Keywords
Multi-document summarization; Submodularity; Greedy algorithm
DOI
10.1007/s10489-012-0336-1
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Multi-document summarization is becoming an important issue in the Information Retrieval community. It aims to distill the most important information from a set of documents into a compressed summary. Given a set of documents as input, most existing multi-document summarization approaches use sentence selection techniques to extract a set of sentences from the document set as the summary. The submodularity hidden in term coverage and textual-unit similarity motivates us to incorporate this property into our solution to multi-document summarization tasks. In this paper, we propose a new principled and versatile framework for different multi-document summarization tasks using submodular functions (Nemhauser et al. in Math. Prog. 14(1):265-294, 1978) based on term coverage and textual-unit similarity, which can be efficiently optimized through an improved greedy algorithm. We show that four known summarization tasks (generic, query-focused, update, and comparative summarization) can be modeled as variations derived from the proposed framework. Experiments on benchmark summarization data sets (e.g., DUC04-06, TAC08, TDT2 corpora) demonstrate the effectiveness of the proposed framework for general multi-document summarization tasks.
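To make the idea in the abstract concrete, here is a minimal sketch of greedy sentence selection under a term-coverage objective. It is not the paper's objective function or its improved greedy algorithm, neither of which is specified in this record. Everything here is an assumption made for illustration: the function name `greedy_summary`, whitespace tokenization, a word-count budget, and scoring each candidate by newly covered terms per word. Term coverage (the number of distinct terms in the selected sentences) is a standard example of a monotone submodular function, because each added sentence can only contribute terms that are not already covered.

```python
from typing import List, Set


def greedy_summary(sentences: List[str], budget: int) -> List[int]:
    """Greedily pick sentence indices by marginal term-coverage gain per word.

    Assumptions (not from the paper): whitespace tokenization, a budget
    counted in words, and plain cost-scaled greedy selection.
    """
    term_sets = [set(s.lower().split()) for s in sentences]
    lengths = [len(s.split()) for s in sentences]
    chosen: List[int] = []
    covered: Set[str] = set()
    used = 0
    while True:
        best, best_ratio = -1, 0.0
        for i, terms in enumerate(term_sets):
            # Skip sentences already chosen, empty ones, and ones over budget.
            if i in chosen or lengths[i] == 0 or used + lengths[i] > budget:
                continue
            gain = len(terms - covered)  # marginal gain in covered terms
            ratio = gain / lengths[i]
            if ratio > best_ratio:
                best, best_ratio = i, ratio
        if best < 0:  # nothing fits or nothing adds new terms
            break
        chosen.append(best)
        covered |= term_sets[best]
        used += lengths[best]
    return chosen


if __name__ == "__main__":
    docs = ["a b c", "a b", "d e", "a"]
    print(greedy_summary(docs, budget=5))
```

The loop stops once no remaining sentence both fits the budget and adds an uncovered term, so redundant sentences are never selected. Query-focused or update variants would need a different objective, which this sketch does not attempt.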
Pages: 420-430
Page count: 11
Related papers
50 in total
  • [1] Multi-document summarization via submodularity
    Jingxuan Li
    Lei Li
    Tao Li
    Applied Intelligence, 2012, 37 : 420 - 430
  • [2] MSSF: A Multi-Document Summarization Framework based on Submodularity
    Li, Jingxuan
    Li, Lei
    Li, Tao
    PROCEEDINGS OF THE 34TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR'11), 2011, : 1247 - 1248
  • [3] On redundancy in multi-document summarization
    Calvo, Hiram
    Carrillo-Mendoza, Pabel
    Gelbukh, Alexander
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2018, 34 (05) : 3245 - 3255
  • [4] Multi-document summarization via group sparse learning
    He, Ruifang
    Tang, Jiliang
    Gong, Pinghua
    Hu, Qinghua
    Wang, Bo
    INFORMATION SCIENCES, 2016, 349 : 12 - 24
  • [5] A Game Theory Approach for Multi-document Summarization
    Amreen Ahmad
    Tanvir Ahmad
    Arabian Journal for Science and Engineering, 2019, 44 : 3655 - 3667
  • [6] A Game Theory Approach for Multi-document Summarization
    Ahmad, Amreen
    Ahmad, Tanvir
    ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING, 2019, 44 (04) : 3655 - 3667
  • [7] Multi-document Summarization via Deep Learning Techniques: A Survey
    Ma, Congbo
    Zhang, Wei Emma
    Guo, Mingyu
    Wang, Hu
    Sheng, Quan Z.
    ACM COMPUTING SURVEYS, 2023, 55 (05)
  • [8] Multi-Document Summarization for Turkish News
    Demirci, Ferhat
    Karabudak, Engin
    Ilgen, Bahar
    2017 INTERNATIONAL ARTIFICIAL INTELLIGENCE AND DATA PROCESSING SYMPOSIUM (IDAP), 2017,
  • [9] Weighted consensus multi-document summarization
    Wang, Dingding
    Li, Tao
    INFORMATION PROCESSING & MANAGEMENT, 2012, 48 (03) : 513 - 523
  • [10] MULTI-DOCUMENT SUMMARIZATION SYSTEMS COMPARISON
    Li, Lei
    Heng, Wei
    Liu, Ping'an
    2012 IEEE 2nd International Conference on Cloud Computing and Intelligent Systems (CCIS) Vols 1-3, 2012, : 1409 - 1413