SEARCHING FOR RELEVANT TWEETS BASED ON TOPIC-RELATED USER ACTIVITIES

Citations: 0
Authors
Noro, Tomoya [1]
Tokuda, Takehiro [1]
Affiliation
[1] Tokyo Inst Technol, Dept Comp Sci, Meguro Ku, Tokyo 1528552, Japan
Source
JOURNAL OF WEB ENGINEERING | 2016, Vol. 15, No. 3-4
Keywords
Social media; Twitter; social network analysis; search; graph-based approach;
DOI
Not available
Chinese Library Classification (CLC) number
TP31 [Computer Software];
Discipline classification codes
081202 ; 0835 ;
Abstract
Twitter is one of the largest social media platforms. Although it can be used to get information on a topic of interest, finding tweets relevant to the topic is not easy because of the massive volume of tweets and the short length of each tweet. Some relevant tweets may not include any terms explicitly related to the topic, and general content-based keyword search and query expansion techniques are not effective for finding them. To solve this problem, we present a method for finding tweets on a topic of interest based on Twitter user activities related to the topic, such as tweets, retweets, and replies. The method consists of two phases: the preparation phase and the main phase. In the preparation phase, we create a user-tweet reference graph representing the relations between users and tweets based on past user activities related to the topic, calculate the influence of each user and tweet on the topic, and then define two types of power for each user, called "Voice" and "Impact", indicating "how much voice the user has on the topic" and "how much impact the user has on the other users' tweets on the topic". In the main phase, we calculate the relevance of newly arrived tweets to the topic according to the Voice and Impact scores of the users who posted, retweeted, or replied to each tweet, and then rank the tweets by relevance score. The two phases are processed independently: once the preparation phase is completed, the main phase can return the final result at any time. Experimental results show that "who retweeted or replied to the tweet" is more effective for judging the relevance of each tweet to the topic than "who posted the tweet", and that our method can find relevant tweets that do not include any terms explicitly related to the topic. We compare our method with an indegree-based method and a PageRank-based method and show that it outperforms both.
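The main phase described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the scoring formula, so everything concrete here is an assumption made for illustration: the `Tweet` record, the `relevance` and `rank_tweets` helpers, the two weights, and the choice to add the author's Voice score to the summed Impact scores of the users who retweeted or replied. The Voice and Impact values are treated as precomputed output of the preparation phase, and the numbers are made up.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Tweet:
    """Hypothetical record for a newly arrived tweet and the activity on it."""
    tweet_id: str
    author: str
    retweeters: list[str] = field(default_factory=list)
    repliers: list[str] = field(default_factory=list)


def relevance(tweet: Tweet, voice: dict[str, float], impact: dict[str, float],
              author_weight: float = 1.0, reaction_weight: float = 1.0) -> float:
    """Assumed scoring rule (not taken from the paper): the author's Voice
    plus the summed Impact of every user who retweeted or replied.
    Users unseen in the preparation phase contribute 0."""
    author_part = voice.get(tweet.author, 0.0)
    reaction_part = sum(impact.get(user, 0.0)
                        for user in tweet.retweeters + tweet.repliers)
    return author_weight * author_part + reaction_weight * reaction_part


def rank_tweets(tweets: list[Tweet], voice: dict[str, float],
                impact: dict[str, float]) -> list[Tweet]:
    """Rank tweets by descending relevance score."""
    return sorted(tweets, key=lambda t: relevance(t, voice, impact), reverse=True)


# Made-up scores standing in for the output of the preparation phase.
voice = {"alice": 0.5, "bob": 0.25}
impact = {"carol": 0.75, "dave": 0.125}

tweets = [
    Tweet("t1", author="alice"),
    Tweet("t2", author="eve", retweeters=["carol"]),  # author unknown to the topic
    Tweet("t3", author="bob", repliers=["dave"]),
]

ranked = rank_tweets(tweets, voice, impact)
for t in ranked:
    print(t.tweet_id, relevance(t, voice, impact))
```

In this toy data, `t2` is posted by a user with no Voice score but is retweeted by a high-Impact user, which mirrors the abstract's finding that "who retweeted or replied to the tweet" can carry more signal than "who posted the tweet". Because the score dictionaries are plain inputs, the ranking step can run at any time after the preparation phase has produced them.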
Pages: 249-276
Page count: 28