AN OPTIMIZATION APPROACH TO AUTOMATIC GENERIC DOCUMENT SUMMARIZATION

Cited by: 10
Authors
Alguliev, Rasim M. [1 ]
Aliguliyev, Ramiz M. [1 ]
Mehdiyev, Chingiz A. [1 ]
Affiliations
[1] Azerbaijan Natl Acad Sci, Inst Informat Technol, Baku 1141, Az, Azerbaijan
Keywords
generic document summarization; summary diversity; redundancy; optimization models; PSO with nonlinear decreasing inertia weight; PMI-based sentence similarity measure; PARTICLE SWARM; RANKING; LEXRANK;
DOI
10.1111/j.1467-8640.2012.00437.x
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we present an optimization approach to document summarization. The potential of optimization-based document summarization models has not been well explored to date. This is partly due to the difficulty of formulating the criteria used for objective assessment. We model document summarization as linear and nonlinear optimization problems. These models attempt to balance coverage and diversity in the summary simultaneously. To solve the optimization problem, we developed a novel particle swarm optimization (PSO) algorithm. Experiments showed that our linear and nonlinear models produce very competitive results, which significantly outperform the NIST baselines in both years (DUC2005 and DUC2006). More importantly, although the linear and nonlinear models are comparable to the top three systems S24, S15, and S12 in DUC2006, they are even superior to the best participating system in DUC2005.
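The abstract and keywords mention a PSO variant with a nonlinearly decreasing inertia weight, but this record does not give the update rule. The following is only a minimal sketch of the general idea, under stated assumptions: the quadratic decay schedule in `inertia_weight`, the parameter values, and the toy continuous `sphere` objective are all illustrative choices, not the authors' algorithm or their summarization objective.

```python
import random


def inertia_weight(t, t_max, w_max=0.9, w_min=0.4, power=2.0):
    """Assumed schedule: decays nonlinearly from w_max (t=0) to w_min (t=t_max)."""
    return w_min + (w_max - w_min) * (1.0 - t / t_max) ** power


def pso_minimize(f, dim, n_particles=30, t_max=200, bound=5.0,
                 c1=1.5, c2=1.5, seed=0):
    """Plain global-best PSO on a box [-bound, bound]^dim (hypothetical setup)."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-bound, bound) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for t in range(t_max):
        w = inertia_weight(t, t_max)  # large early (explore), small late (exploit)
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp the new position to the search box.
                pos[i][d] = max(-bound, min(bound, pos[i][d] + vel[i][d]))
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def sphere(x):
    """Toy objective standing in for a real coverage/diversity score."""
    return sum(v * v for v in x)


best_x, best_val = pso_minimize(sphere, dim=3)
```

In a summarization setting, the objective would instead score a candidate selection of sentences for coverage and diversity, and the particle encoding would need to be discrete; both are left out of this sketch.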
Pages: 129-155
Page count: 27