A social inverted index for social-tagging-based information retrieval

Cited by: 10
Authors
Lee, Kang-Pyo [1]
Kim, Hong-Gee [1]
Kim, Hyoung-Joo [1]
Affiliations
[1] Seoul Natl Univ, Coll Engn, Sch Comp Sci & Engn, Seoul 151742, South Korea
Funding
National Research Foundation of Singapore;
Keywords
information retrieval; inverted index; social tagging; tags; web search; SEARCH; BOOKMARKING; FILES; TAG;
DOI
10.1177/0165551512438357
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812 ;
Abstract
Keywords have played an important role not only for searchers who formulate a query, but also for search engines that index documents and evaluate the query. Recently, tags chosen by users to annotate web resources have been gaining significance for improving information retrieval (IR) tasks, in that they can act as meaningful keywords bridging the gap between humans and machines. One critical aspect of tagging (besides the tag and the resource) is the user (or tagger); there exists a ternary relationship among the tag, resource, and user. The traditional inverted index, however, does not consider the user aspect, and is based on the binary relationship between term and document. In this paper we propose a social inverted index, a novel inverted index extended for social-tagging-based IR, that maintains a separate user sublist for each resource in a resource-posting list to contain each user's various features as weights. The social inverted index differs from the normal inverted index in that it regards each user as a unique person, rather than simply counting the number of users, and highlights the value of a user who has participated in tagging. This extended structure facilitates the use of dynamic resource weights, which are expected to be more meaningful than simple user-frequency-based weights. It also allows a flexible response to the conditional queries that are increasingly required in tag-based IR. Our experiments have shown that this user-considering indexing performs better in IR tasks than a normal inverted index with no user sublists. The time and space overhead required for index construction and maintenance was also acceptable.
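The structure described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class name `SocialInvertedIndex`, the methods `add` and `search`, the `user_filter` parameter, and the use of a plain sum as the resource weight are all assumptions made for illustration. The sketch only shows the general idea of keeping a per-resource user sublist so that resource weights can be computed at query time and restricted by a condition on users.

```python
from collections import defaultdict


class SocialInvertedIndex:
    """Illustrative tag -> resource -> user-sublist index (hypothetical API)."""

    def __init__(self):
        # Posting list per tag; each resource entry holds a user sublist
        # mapping user -> weight instead of a bare user count.
        self._postings = defaultdict(lambda: defaultdict(dict))

    def add(self, tag, resource, user, weight=1.0):
        """Record that `user` tagged `resource` with `tag`, with a user weight."""
        self._postings[tag][resource][user] = weight

    def search(self, tag, user_filter=None):
        """Rank resources for `tag` by the summed weights of matching users.

        `user_filter` is an optional predicate on the user name; it stands in
        for a conditional query (assumed form, not taken from the paper).
        """
        results = []
        for resource, users in self._postings.get(tag, {}).items():
            score = sum(
                weight
                for user, weight in users.items()
                if user_filter is None or user_filter(user)
            )
            if score > 0:
                results.append((resource, score))
        # Highest score first; ties broken by resource name for determinism.
        results.sort(key=lambda pair: (-pair[1], pair[0]))
        return results


index = SocialInvertedIndex()
index.add("python", "res-a", "alice", 2.0)
index.add("python", "res-a", "bob", 0.5)
index.add("python", "res-b", "carol", 1.0)

print(index.search("python"))
print(index.search("python", user_filter=lambda user: user != "alice"))
```

Because the weights are attached to individual users rather than folded into a count, excluding one user changes the ranking without rebuilding the index, which is the kind of dynamic weighting and conditional querying the abstract refers to.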
Pages: 313-332
Number of pages: 20