Deriving Topics in Twitter by Exploiting Tweet Interactions

Cited by: 9
Authors
Nugroho, Robertus [1]
Yang, Jian [1]
Zhong, Youliang [1]
Paris, Cecile [2]
Nepal, Surya [2]
Affiliations
[1] Macquarie Univ, Dept Comp, Sydney, NSW, Australia
[2] CSIRO, Canberra, ACT, Australia
Source
2015 IEEE INTERNATIONAL CONGRESS ON BIG DATA - BIGDATA CONGRESS 2015 | 2015
Keywords
Topic Derivation; Twitter; Interactions of Tweets; Joint Matrix Factorization;
DOI
10.1109/BigDataCongress.2015.22
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
As a big data social network, Twitter has become one of the most important sources for capturing up-to-date events happening in the world. Topic derivation from Twitter is important for various applications such as situation awareness, market analysis, content filtering, and recommendations. However, tweets are short messages, which makes topic derivation challenging. Current methods employ various semantic features of tweet content but mostly overlook the interactions among tweets. In this paper, we propose a novel topic derivation method that takes into account the interactions among tweets, defined as the reciprocal activities related to the people who send the tweets, as well as actions and tweet contents. In particular, topics are derived by performing a two-step matrix factorization jointly over the interactions and semantic features of the tweets. We have conducted a number of experiments on tweets collected over a period of time, showing that the proposed method consistently outperforms other advanced topic derivation methods in the literature. Our experiments also reveal that the interactions among tweets significantly relieve the sparsity problem caused by the short-text nature of Twitter.
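The abstract does not give the exact formulation, so the following is only a minimal sketch of what a two-step factorization over interactions and content could look like. It assumes a non-negative tweet-to-tweet interaction matrix `A` and a tweet-to-term matrix `V`, and it uses generic multiplicative-update non-negative matrix factorization rather than the authors' own algorithm. The function name `nmf`, the toy data, and all parameter values are hypothetical.

```python
import numpy as np

def nmf(X, k, W=None, fix_W=False, iters=300, eps=1e-9, seed=0):
    """Approximate X ~= W @ H with non-negative W, H (multiplicative updates).

    If fix_W is True, the supplied W is kept constant and only H is learned.
    """
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], k)) if W is None else W.copy()
    H = rng.random((k, X.shape[1]))
    for _ in range(iters):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        if not fix_W:
            W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Assumed toy interaction matrix: tweets 0-2 interact with each other,
# as do tweets 3-5 (e.g. replies, retweets, shared mentions).
A = np.array([
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
], dtype=float)

# Assumed sparse tweet-term counts: 6 short tweets over a 5-term vocabulary.
V = np.array([
    [1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 1, 0, 1],
], dtype=float)

# Step 1: learn tweet-topic weights W from the interactions alone.
W, _ = nmf(A, k=2)

# Step 2: keep W fixed and learn topic-term weights H from the content.
_, H = nmf(V, k=2, W=W, fix_W=True)

tweet_topic = W.argmax(axis=1)                  # dominant topic per tweet
top_terms = H.argsort(axis=1)[:, ::-1][:, :2]   # two strongest terms per topic
```

In this sketch the interaction matrix is much denser than the tweet-term matrix, which illustrates why interactions might help with short-text sparsity: tweets that share no words can still be grouped through `W` before any term statistics are used.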
Pages: 87-94
Number of pages: 8