Versatile Query Scrambling for Private Web Search

Cited by: 22
Authors
Arampatzis, Avi [1]
Drosatos, George [1]
Efraimidis, Pavlos S. [1]
Affiliations
[1] Democritus Univ Thrace, Dept Elect & Comp Engn, GR-67100 Xanthi, Greece
Source
INFORMATION RETRIEVAL JOURNAL | 2015 / Volume 18 / Issue 04
Keywords
Query scrambler; Search privacy; Query-based document sampling; Mutual information; Set covering; Inter-user agreement
DOI
10.1007/s10791-015-9256-0
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
We consider the problem of privacy leaks suffered by Internet users when they perform web searches, and propose a framework to mitigate them. In brief, given a 'sensitive' search query, the objective of our work is to retrieve the target documents from a search engine without disclosing the actual query. Our approach, which builds upon and improves recent work on search privacy, approximates the target search results by replacing the private user query with a set of blurred or scrambled queries. The results of the scrambled queries are then used to cover the private user interest. We model the problem theoretically, define a set of privacy objectives with respect to web search, and investigate the effectiveness of the proposed solution with a set of queries with privacy issues on a large web collection. Experiments show great improvements in retrieval effectiveness over a baseline previously reported in the literature. Furthermore, the methods are more versatile, predictably behaved, and applicable to a wider range of information needs, and the privacy they provide is more comprehensible to the end-user. Additionally, we investigate the perceived privacy via a user study and measure the system's usefulness, taking into account the trade-off between retrieval effectiveness and privacy. The practical feasibility of the methods is demonstrated in a field experiment, scrambling queries against a popular web search engine. The findings may have implications for other IR research areas, such as query expansion, query decomposition, and distributed retrieval.
Pages: 331-358
Number of pages: 28
Related papers
37 in total
[1]  
[Anonymous], 1997, ICML
[2]  
[Anonymous], 2008, Introduction to information retrieval
[3]  
[Anonymous], 2006, INFOSCALE 06
[4]   A query scrambler for search privacy on the internet [J].
Arampatzis, Avi ;
Efraimidis, Pavlos S. ;
Drosatos, George .
INFORMATION RETRIEVAL, 2013, 16 (06) :657-679
[5]  
Arampatzis A, 2011, LECT NOTES COMPUT SC, V6611, P117, DOI 10.1007/978-3-642-20161-5_13
[6]   Where to Stop Reading a Ranked List? Threshold Optimization using Truncated Score Distributions [J].
Arampatzis, Avi ;
Kamps, Jaap ;
Robertson, Stephen .
PROCEEDINGS 32ND ANNUAL INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2009, :524-531
[7]  
Barbaro M, 2006, FACE IS EXPOSED AOL
[8]   Recommending with an Agenda: Active Learning of Private Attributes using Matrix Factorization [J].
Bhagat, Smriti ;
Weinsberg, Udi ;
Ioannidis, Stratis ;
Taft, Nina .
PROCEEDINGS OF THE 8TH ACM CONFERENCE ON RECOMMENDER SYSTEMS (RECSYS'14), 2014, :65-72
[9]  
Boneh D, 2007, LECT NOTES COMPUT SC, V4392, P535
[10]  
Bouma G., 2009, FORM MEANING PROCESS, P31