Learning Similarity Functions in Graph-Based Document Summarization

Cited by: 0
Authors
Ouyang, You [1]
Li, Wenjie [1]
Wei, Furu [1]
Lu, Qin [1]
Affiliations
[1] Hong Kong Polytech Univ, Dept Comp, Hong Kong, Hong Kong, Peoples R China
Source
COMPUTER PROCESSING OF ORIENTAL LANGUAGES: LANGUAGE TECHNOLOGY FOR THE KNOWLEDGE-BASED ECONOMY | 2009 / Vol. 5459
Keywords
Document summarization; graph-based ranking; sentence similarity calculation; support vector machine;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Graph-based models have been extensively explored in document summarization in recent years. Compared with traditional feature-based models, graph-based models incorporate interrelated information into the ranking process, so they can potentially do a better job of retrieving the important content from documents. In this paper, we investigate how to measure sentence similarity, which is a crucial issue in graph-based summarization models but, in our belief, has not been well defined in the past. We propose a supervised learning approach that brings together multiple similarity measures and uses human-generated summaries to guide the combination process. It can therefore be expected to provide a more accurate estimate than a single cosine similarity measure. Experiments conducted on the DUC2005 and DUC2006 data sets show that the proposed learning approach is successful in measuring similarity. Its competitiveness and adaptability are also demonstrated.
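As a rough illustration of the setting the abstract describes, the sketch below builds a sentence graph weighted by cosine similarity (the single-measure baseline the abstract mentions) and ranks sentences with a PageRank-style power iteration. It is not the paper's method: the learned combination of several similarity measures is not reproduced here, and the function names, the damping factor of 0.85, the whitespace tokenization, and the `similarity` parameter (a hypothetical hook where a learned measure could be plugged in) are all assumptions made for this example.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two sentences using raw term counts."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def rank_sentences(sentences, similarity=cosine_similarity,
                   damping=0.85, iterations=50):
    """PageRank-style power iteration over a weighted sentence graph.

    `similarity` is a hypothetical hook: any function mapping two
    sentences to a non-negative weight can be passed in.
    """
    n = len(sentences)
    # Edge weights are pairwise similarities; self-loops are excluded.
    w = [[similarity(sentences[i], sentences[j]) if i != j else 0.0
          for j in range(n)] for i in range(n)]
    out = [sum(row) for row in w]  # total outgoing weight per sentence
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping * sum(scores[j] * w[j][i] / out[j]
                            for j in range(n) if out[j] > 0)
            for i in range(n)
        ]
    return scores
```

Under these assumptions, calling `rank_sentences(sentences)` returns one score per sentence, and sentences that share vocabulary with many others score higher than isolated ones. Swapping in a different `similarity` function changes only the edge weights, which is the part of the pipeline the abstract proposes to learn.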
Pages: 189-200
Page count: 12