Modeling Topic Evolution in Twitter: An Embedding-Based Approach

Cited by: 5
Authors
Abulaish, Muhammad [1]
Fazil, Mohd [2]
Affiliations
[1] South Asian Univ, Dept Comp Sci, New Delhi 110021, India
[2] Jamia Millia Islamia, Dept Comp Sci, New Delhi 110025, India
Keywords
Social network analysis; Twitter data analysis; temporal evolution; topic modeling; word embedding
DOI
10.1109/ACCESS.2018.2878494
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Over the last two decades, online social networks have grown vertically as well as horizontally. Users' activities in these networks generate a huge amount of data, mainly textual, that can be analyzed at different levels of granularity for various purposes, including behavior analysis, sentiment analysis, and predictive modeling. In this paper, we propose a word embedding-based approach that analyzes user-centric tweets to observe how users' behavior evolves in terms of the topics they discuss over a period of time. We also present a word embedding-based proximity measure to monitor temporal transitions between topics using five topic evolution events: emergence, persistence, convergence, divergence, and extinction. The proximity between a pair of topics is defined as a function of the content and contextual similarity between their word distributions, wherein the contextual similarity is calculated using word embedding. The proposed approach is evaluated over three Twitter datasets in line with existing state-of-the-art approaches in the literature, and the experimental results are encouraging.
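The abstract does not give the proximity formula, so the following is only a minimal sketch of the general idea, not the authors' definition. It assumes that a topic is a word-to-probability mapping, that content similarity is the overlapping probability mass on shared words, that contextual similarity is the cosine between probability-weighted mean word vectors, and that the two are mixed with a hypothetical weight `alpha`. The function names and the toy two-dimensional embeddings are likewise illustrative assumptions.

```python
import math

def content_similarity(p, q):
    """Assumed content measure: shared probability mass of two word distributions."""
    return sum(min(p[w], q[w]) for w in p.keys() & q.keys())

def topic_vector(topic, emb):
    """Probability-weighted mean embedding of the topic's words found in emb."""
    dim = len(next(iter(emb.values())))
    vec = [0.0] * dim
    total = 0.0
    for word, prob in topic.items():
        if word in emb:
            total += prob
            for i, x in enumerate(emb[word]):
                vec[i] += prob * x
    return [x / total for x in vec] if total else vec

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (nu * nv)

def proximity(p, q, emb, alpha=0.5):
    """Hypothetical mix of content and contextual similarity (alpha is assumed)."""
    context = cosine(topic_vector(p, emb), topic_vector(q, emb))
    return alpha * content_similarity(p, q) + (1 - alpha) * context

# Toy embeddings and topics, invented purely for illustration.
emb = {"goal": [1.0, 0.0], "match": [0.9, 0.1],
       "vote": [0.0, 1.0], "poll": [0.1, 0.9]}
sports_t1 = {"goal": 0.6, "match": 0.4}
sports_t2 = {"match": 0.7, "goal": 0.3}
politics_t2 = {"vote": 0.5, "poll": 0.5}

print(proximity(sports_t1, sports_t2, emb))    # related topics score higher
print(proximity(sports_t1, politics_t2, emb))  # unrelated topics score lower
```

Under these assumptions, a high proximity between a topic in one time window and a topic in the next would suggest persistence, while a topic with no close predecessor or successor would suggest emergence or extinction. The thresholds and exact event rules are specified in the paper, not in this abstract.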
Pages: 64847-64857
Number of pages: 11