Searching documents based on relevance and type

Cited by: 0
Authors
Xu, Jun [1]
Cao, Yunbo [1]
Li, Hang [1]
Craswell, Nick [2]
Huang, Yalou [3]
Affiliations
[1] Microsoft Res Asia, No 49 Zichun Rd, Beijing, Peoples R China
[2] Microsoft Res Cambridge, Cambridge, England
[3] Nankai Univ, Tianjin, Peoples R China
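Source
ADVANCES IN INFORMATION RETRIEVAL | 2007 / Vol. 4425
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code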
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper extends previous work on document retrieval and document type classification to address the problem of 'typed search'. Specifically, given a query and a designated document type, the search system retrieves and ranks documents based not only on their relevance to the query but also on their likelihood of being of the designated document type. The paper formalizes the problem in a general framework consisting of a 'relevance model' and a 'type model'. The relevance model indicates whether or not a document is relevant to a query. The type model indicates whether or not a document belongs to the designated document type. We consider three methods for combining the models: linear combination of scores, thresholding on the type score, and a hybrid of the previous two methods. We take course page search and instruction document search as examples and conduct a series of experiments. Experimental results show that the proposed approaches can significantly outperform the baseline methods.
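The three combination methods named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the actual formulas, so everything here is an assumption made for illustration: the `ScoredDoc` record, the weight `lam`, the cutoff `theta`, the exact form of each function, and the toy scores are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ScoredDoc:
    """Hypothetical record: one document with its two model scores."""
    doc_id: str
    relevance: float   # assumed output of a relevance model, higher is better
    type_score: float  # assumed output of a type model, higher is better


def linear_combination(docs: List[ScoredDoc], lam: float = 0.5) -> List[ScoredDoc]:
    """Rank by a weighted sum of the two scores (assumed form)."""
    def combined(d: ScoredDoc) -> float:
        return lam * d.relevance + (1.0 - lam) * d.type_score
    return sorted(docs, key=combined, reverse=True)


def threshold_on_type(docs: List[ScoredDoc], theta: float = 0.5) -> List[ScoredDoc]:
    """Drop documents whose type score is below theta, then rank by relevance."""
    kept = [d for d in docs if d.type_score >= theta]
    return sorted(kept, key=lambda d: d.relevance, reverse=True)


def hybrid(docs: List[ScoredDoc], lam: float = 0.5, theta: float = 0.5) -> List[ScoredDoc]:
    """Filter by the type threshold, then rank survivors by the weighted sum."""
    kept = [d for d in docs if d.type_score >= theta]
    return linear_combination(kept, lam)


# Toy scores, invented purely for illustration.
docs = [
    ScoredDoc("a", relevance=0.9, type_score=0.1),
    ScoredDoc("b", relevance=0.6, type_score=0.8),
    ScoredDoc("c", relevance=0.4, type_score=0.9),
]

linear_ids = [d.doc_id for d in linear_combination(docs, lam=0.5)]
threshold_ids = [d.doc_id for d in threshold_on_type(docs, theta=0.5)]
hybrid_ids = [d.doc_id for d in hybrid(docs, lam=0.2, theta=0.5)]
```

In this sketch, thresholding removes document "a" outright despite its high relevance, while the linear combination only demotes it. The hybrid variant does both: it filters first and then lets the type score influence the order of the remaining documents.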
Pages: 629 / +
Page count: 2