The influence of personalization on tag query length in social media search

Cited by: 9
Authors
Clements, M. [1]
de Vries, A. P. [1, 2]
Reinders, M. J. T. [1]
Affiliations
[1] Delft Univ Technol, Fac Elect Engn Math & Comp Sci, ICT Grp, NL-2628 CD Delft, Netherlands
[2] Ctr Math & Comp Sci CWI, NL-1098 SJ Amsterdam, Netherlands
Keywords
Social media; Personalized search; Random walk; Collaborative tagging; Query length
DOI
10.1016/j.ipm.2009.03.006
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Social content systems contain enormous collections of unstructured user-generated content, annotated by the collaborative effort of regular Internet users. Tag-clouds have become popular interfaces that allow users to query the database of these systems by clicking relevant terms. However, these single-click queries are often not expressive enough to effectively retrieve the desired content. Users have to use multiple clicks or type longer queries to satisfy their information need. To enhance the predicted content ranking, we use a random walk model that effectively integrates the user's preference and semantically related query terms. We use the collaborative annotations from a popular online book catalog to create a social annotation graph and study the effect of personalization and smoothing for increasing query lengths. We show that personalization and smoothing allow the user to find equally relevant content with fewer query terms compared to a frequency-based content ranking with TF-IDF weighting. As expected, we see that the influence of the random walk model disappears if users type more detailed queries. Finally, we discuss the observations with respect to synonyms and homographs, which are well known to hamper the performance of information retrieval systems. © 2009 Elsevier Ltd. All rights reserved.
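The abstract does not spell out the random walk model, so the following is only a minimal sketch of the general idea, assuming a random walk with restart over an undirected user-tag-item graph. The function names, the `restart` and `personal_weight` parameters, and the toy annotations are all hypothetical illustrations, not the authors' method or data.

```python
from collections import defaultdict


def build_graph(annotations):
    """Build an undirected, weighted graph from (user, tag, item) triples."""
    graph = defaultdict(lambda: defaultdict(float))
    for user, tag, item in annotations:
        u, t, i = ("user", user), ("tag", tag), ("item", item)
        # Assumption: each annotation links its three nodes pairwise.
        for a, b in ((u, t), (t, i), (u, i)):
            graph[a][b] += 1.0
            graph[b][a] += 1.0
    return graph


def random_walk_with_restart(graph, start, restart=0.2, steps=50):
    """Iterate the walk; `start` maps nodes to probabilities summing to 1."""
    dist = dict(start)
    for _ in range(steps):
        nxt = defaultdict(float)
        for node, mass in dist.items():
            neighbours = graph.get(node, {})
            total = sum(neighbours.values())
            for other, weight in neighbours.items():
                # Follow an edge with probability proportional to its weight.
                nxt[other] += (1.0 - restart) * mass * weight / total
        for node, prob in start.items():
            # Jump back to the user and query nodes with probability `restart`.
            nxt[node] += restart * prob
        dist = nxt
    return dist


def rank_items(graph, user, query_tags, personal_weight=0.5):
    """Rank items for a tag query, splitting start mass between user and tags."""
    start = {("user", user): personal_weight}
    for tag in query_tags:
        start[("tag", tag)] = (1.0 - personal_weight) / len(query_tags)
    dist = random_walk_with_restart(graph, start)
    items = [(node[1], p) for node, p in dist.items() if node[0] == "item"]
    return sorted(items, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data: "python" is a homograph shared by two kinds of books.
annotations = [
    ("alice", "python", "book_py"),
    ("alice", "programming", "book_py"),
    ("bob", "python", "book_snake"),
    ("bob", "reptiles", "book_snake"),
]
graph = build_graph(annotations)
print(rank_items(graph, "alice", ["python"]))
print(rank_items(graph, "bob", ["python"]))
```

In this toy graph the single-term query `python` is ambiguous, yet the walk ranks `book_py` first for `alice` and `book_snake` first for `bob`, because part of the starting probability sits on the user node. Setting `personal_weight` to 0 removes that effect, leaving only the smoothing over related tags.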
Pages: 403-412
Number of pages: 10