CLUSTERING TECHNIQUES AND DISCRETE PARTICLE SWARM OPTIMIZATION ALGORITHM FOR MULTI-DOCUMENT SUMMARIZATION

Cited by: 31
Author
Aliguliyev, Ramiz M. [1]
Affiliation
[1] Natl Acad Sci, Inst Informat Technol, Dept 13, AZ-1141 Baku, Azerbaijan
Keywords
text mining; sentence clustering; generic multi-document summarization; sentence extractive technique; discrete Particle Swarm Optimization algorithm; TEXT; SENTENCES; LEXRANK;
DOI
10.1111/j.1467-8640.2010.00365.x
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Multi-document summarization is the automatic creation of a compressed version of a given collection of documents that provides useful information to users. In this article we propose a generic multi-document summarization method based on sentence clustering. We introduce five clustering methods, which optimize various aspects of intra-cluster similarity, inter-cluster dissimilarity, and their combinations. To solve the clustering problem, we propose a modification of the discrete particle swarm optimization algorithm. The experimental results on open benchmark data sets from DUC2005 and DUC2007 show that our method significantly outperforms the baseline methods for multi-document summarization.
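The abstract does not give the five clustering criteria or the details of the modified algorithm, so the following is only an illustrative sketch of the general idea, not the author's method. It assumes a bag-of-words cosine similarity between sentences, a hypothetical criterion (mean intra-cluster similarity minus mean inter-cluster similarity), and a generic discrete swarm update in which each position of a particle is randomly reassigned, copied from the particle's personal best, copied from the global best, or left unchanged. All function names and parameter values are assumptions made for the example.

```python
import math
import random
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_matrix(sentences):
    """Pairwise cosine similarities over whitespace-tokenized sentences."""
    bags = [Counter(s.lower().split()) for s in sentences]
    return [[cosine(x, y) for y in bags] for x in bags]


def objective(sim, labels):
    """Assumed criterion: mean intra-cluster minus mean inter-cluster similarity."""
    intra, inter = [], []
    n = len(labels)
    for i in range(n):
        for j in range(i + 1, n):
            (intra if labels[i] == labels[j] else inter).append(sim[i][j])
    mean_intra = sum(intra) / len(intra) if intra else 0.0
    mean_inter = sum(inter) / len(inter) if inter else 0.0
    return mean_intra - mean_inter


def discrete_pso(sim, k, n_particles=20, n_iters=200,
                 p_mutate=0.2, p_personal=0.3, p_global=0.3, seed=0):
    """Generic discrete swarm search over cluster assignments (illustrative only)."""
    rng = random.Random(seed)
    n = len(sim)
    # Each particle is a list of cluster labels, one per sentence.
    swarm = [[rng.randrange(k) for _ in range(n)] for _ in range(n_particles)]
    pbest = [p[:] for p in swarm]
    pbest_score = [objective(sim, p) for p in swarm]
    g = max(range(n_particles), key=pbest_score.__getitem__)
    gbest, gbest_score = pbest[g][:], pbest_score[g]

    for _ in range(n_iters):
        for idx, particle in enumerate(swarm):
            for pos in range(n):
                r = rng.random()
                if r < p_mutate:
                    particle[pos] = rng.randrange(k)   # random exploration
                elif r < p_mutate + p_personal:
                    particle[pos] = pbest[idx][pos]    # pull toward personal best
                elif r < p_mutate + p_personal + p_global:
                    particle[pos] = gbest[pos]         # pull toward global best
                # otherwise the current label is kept
            score = objective(sim, particle)
            if score > pbest_score[idx]:
                pbest[idx], pbest_score[idx] = particle[:], score
                if score > gbest_score:
                    gbest, gbest_score = particle[:], score
    return gbest, gbest_score


sentences = [
    "the cat sat on the mat",
    "a cat sat on a mat",
    "stocks fell sharply today",
    "stocks fell again today",
]
sim = similarity_matrix(sentences)
labels, score = discrete_pso(sim, k=2)
print(labels, round(score, 3))
```

In an extractive summarizer built on such a clustering, one representative sentence per cluster would then be selected for the summary; that selection step is not shown here.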
Pages: 420-448
Number of pages: 29