An unsupervised approach to generating generic summaries of documents

Cited by: 30
Authors
Alguliyev, Rasim M. [1]
Aliguliyev, Ramiz M. [1]
Isazade, Nijat R. [1]
Affiliations
[1] Azerbaijan Natl Acad Sci, Inst Informat Technol, AZ-1141 Baku, Azerbaijan
Keywords
Maximum relevance; Minimum redundancy; Optimization model; Differential evolution algorithm; Combined similarity measure; Differential evolution; Maximum coverage; Optimization; Algorithm; Ensemble; Parameters; Mutation; Ranking; Models
DOI
10.1016/j.asoc.2015.04.050
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present an optimization-based unsupervised approach to automatic document summarization. In the proposed approach, text summarization is modeled as a Boolean programming problem. The model attempts to optimize three properties: (1) relevance: the summary should contain informative textual units that are relevant to the user; (2) redundancy: the summary should not contain multiple textual units that convey the same information; and (3) length: the summary is bounded in length. The approach is applicable to both single- and multi-document summarization. In both tasks, documents are split into sentences during preprocessing. Salient sentences are then selected from the document(s), and the summary is generated by threading the selected sentences together in the order in which they appear in the original document(s). We implemented our model on the multi-document summarization task. Compared with several existing summarization methods on the open DUC2005 and DUC2007 data sets, our method improves the summarization results significantly. This is because, first, when extracting summary sentences, the method considers not only the relevance scores of sentences to the whole sentence collection but also how well the sentences represent the topic. Second, when generating a summary, the method also deals with the problem of repeated information. The methods were evaluated using the ROUGE-1, ROUGE-2 and ROUGE-SU4 metrics. We also demonstrate that the summarization result depends on the similarity measure: the experiments showed that a combination of symmetric and asymmetric similarity measures yields better results than using either kind separately. © 2015 Elsevier B.V. All rights reserved.
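The Boolean programming formulation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the paper solves the model with a differential evolution algorithm and a combined symmetric/asymmetric similarity measure, whereas this sketch assumes plain cosine similarity over bag-of-words vectors and enumerates every 0/1 selection vector by brute force, which is only feasible for a handful of sentences. The names `summarize`, `cosine`, `max_words` and `redundancy_weight` are hypothetical and chosen for illustration.

```python
import re
from collections import Counter
from itertools import product
from math import sqrt


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def summarize(sentences, max_words, redundancy_weight=1.0):
    """Choose a 0/1 vector x that maximizes relevance minus redundancy,
    subject to a bound on the total summary length in words.

    Assumption: relevance of a sentence is its cosine similarity to the
    whole sentence collection; redundancy is pairwise cosine similarity
    between selected sentences. Brute force stands in for the paper's
    differential evolution solver.
    """
    bags = [Counter(re.findall(r"[a-z0-9]+", s.lower())) for s in sentences]
    collection = sum(bags, Counter())          # term counts of the whole collection
    lengths = [sum(b.values()) for b in bags]  # sentence lengths in words
    relevance = [cosine(b, collection) for b in bags]

    n = len(sentences)
    best_score, best_x = float("-inf"), (0,) * n
    for x in product((0, 1), repeat=n):        # every Boolean selection vector
        chosen = [i for i in range(n) if x[i]]
        if sum(lengths[i] for i in chosen) > max_words:
            continue                           # violates the length constraint
        rel = sum(relevance[i] for i in chosen)
        red = sum(
            cosine(bags[i], bags[j])
            for k, i in enumerate(chosen)
            for j in chosen[k + 1:]
        )
        score = rel - redundancy_weight * red
        if score > best_score:
            best_score, best_x = score, x

    # Thread the selected sentences in their original document order.
    return [s for s, xi in zip(sentences, best_x) if xi]
```

Under these assumptions, calling `summarize` on a list that contains two identical sentences returns at most one copy of that sentence when `redundancy_weight` is 1.0, because adding the duplicate costs a redundancy penalty of 1.0 while its relevance gain is below 1.0.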
Pages: 236-250
Page count: 15