The Opposite of Smoothing: A Language Model Approach to Ranking Query-Specific Document Clusters

Cited by: 13
Authors
Kurland, Oren [1]
Krikon, Eyal [1]
Affiliations
[1] Technion Israel Inst Technol, Fac Ind Engn & Management, IL-32000 Haifa, Israel
Funding
US National Science Foundation; Israel Science Foundation
Keywords
INFORMATION;
DOI
10.1613/jair.3327
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Exploiting information induced from (query-specific) clustering of top-retrieved documents has long been proposed as a means of improving precision at the very top ranks of the returned results. We present a novel language model approach to ranking query-specific clusters by the presumed percentage of relevant documents that they contain. While most previous cluster ranking approaches focus on the cluster as a whole, our model also utilizes information induced from documents associated with the cluster. Our model substantially outperforms previous approaches for identifying clusters containing a high percentage of relevant documents. Furthermore, using the model to produce a document ranking yields precision-at-top-ranks performance that is consistently better than that of the initial ranking upon which clustering is performed. The performance also compares favorably with that of a state-of-the-art pseudo-feedback-based retrieval method.
Pages: 367-395
Page count: 29