Intertopic Information Mining for Query-Based Summarization

Cited by: 11
Authors
Ouyang, You [1]
Li, Wenjie [1]
Li, Sujian [2]
Lu, Qin [1]
Affiliations
[1] Hong Kong Polytech Univ, Dept Comp, Hong Kong, Hong Kong, Peoples R China
[2] Peking Univ, Minist Educ, Key Lab Computat Linguist, Beijing, Peoples R China
Source
JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY | 2010 / Vol. 61 / No. 5
DOI
10.1002/asi.21299
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In this article, the authors address the problem of sentence ranking in summarization. Whereas most existing summarization approaches rank sentences using only the information embodied in a particular topic (a set of documents and an associated query), the authors propose a novel ranking approach that incorporates intertopic information mining. Intertopic information, in contrast to intratopic information, reveals pairwise topic relationships and can thus be regarded as a bridge across different topics. Here, the intertopic information is used to transfer word importance learned from known topics to unknown topics under a learning-based summarization framework. To mine this information, the authors model the topic relationship by clustering all the words in both known and unknown topics according to various kinds of word conceptual labels, which indicate the roles of the words in the topic. Based on the mined relationships, the authors develop a probabilistic model that uses manually generated summaries provided for known topics to predict ranking scores for sentences in unknown topics. A series of experiments was conducted on the Document Understanding Conference (DUC) 2006 data set. The evaluation results show that intertopic information is effective for sentence ranking and that the resulting summarization system performs comparably to the best-performing DUC participating systems on the same data set.
Pages: 1062-1072
Page count: 11