Click data as implicit relevance feedback in web search

Cited by: 47
Authors
Jung, Seikyung [1]
Herlocker, Jonathan L.
Webster, Janet
Affiliations
[1] Oregon State Univ, Sch Elect Engn & Comp Sci, Kelly Engn Ctr 1148, Corvallis, OR 97331 USA
[2] Oregon State Univ Lib, Hatfield Marine Sci Ctr, Guin Lib, Newport, OR 97365 USA
Funding
U.S. National Science Foundation
Keywords
click data; implicit feedback; explicit feedback; search engines; information retrieval; collaborative filtering; SERF;
DOI
10.1016/j.ipm.2006.07.021
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Search sessions consist of a person presenting a query to a search engine, followed by that person examining the search results, selecting some of those search results for further review, possibly following some series of hyperlinks, and perhaps backtracking to previously viewed pages in the session. The series of pages selected for viewing in a search session, sometimes called the click data, is intuitively a source of relevance feedback information to the search engine. We are interested in how that relevance feedback can be used to improve the quality of search results for all users, not just the current user. For example, the search engine could learn which documents are frequently visited when certain search queries are given. In this article, we address three issues related to using click data as implicit relevance feedback: (1) how click data beyond the search results page might be more reliable than just the clicks from the search results page; (2) whether we can further subselect from this click data to get even more reliable relevance feedback; and (3) how the reliability of click data for relevance feedback changes when the goal becomes finding one document for the user that completely meets their information needs (if possible). We refer to these documents as the ones that are strictly relevant to the query. Our conclusions are based on empirical data from a live website with manual assessment of relevance. We found that considering all of the click data in a search session as relevance feedback has the potential to increase both precision and recall of the feedback data. We further found that, when the goal is identifying strictly relevant documents, it could be useful to focus on last visited documents rather than all documents visited in a search session. (c) 2006 Elsevier Ltd. All rights reserved.
Pages: 791-807
Page count: 17