Events Insights Extraction from Twitter Using LDA and Day-Hashtag Pooling

Cited by: 1
Authors
Khan, Muhammad Haseeb Ur Rehman [1]
Wakabayashi, Kei [1]
Fukuyama, Satoshi [1]
Affiliations
[1] Tsukuba Univ, Tsukuba, Ibaraki, Japan
Source
IIWAS2019: THE 21ST INTERNATIONAL CONFERENCE ON INFORMATION INTEGRATION AND WEB-BASED APPLICATIONS & SERVICES | 2019
Keywords
LDA; Hashtag Pooling; Topic Modeling; Time Series Analysis;
DOI
10.1145/3366030.3366090
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
News extraction from Twitter data is a hot topic, but can we extract much more than just news? The purpose of this research is to determine whether news is the only information that can be extracted from Twitter data or whether the data contain further insights about real-life events. To this end, we introduce a technique for analyzing Twitter's raw content. After pre-processing the tweet data, we apply hashtag pooling and extract topics using an existing topic modeling algorithm, Latent Dirichlet Allocation (LDA), without modifying its core machinery. In the second part, the estimated number of tweets per day and the correlated top hashtags for each topic are calculated using day-hashtag pooling. Finally, a continuous time series graph is constructed for topic analysis. Our findings show interesting results on bursty news detection, topic popularity, the way people perceive an event, the transition of real-life events over time, and the before-and-after effects of a specific event.
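The pooling steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `hashtag_pool` and `day_hashtag_counts`, the sample tweets, and the reading of day-hashtag pooling as a per-(day, hashtag) tweet count are all assumptions made for illustration. The LDA step is omitted; the pooled pseudo-documents would simply be passed to any standard LDA implementation.

```python
import re
from collections import Counter, defaultdict

# Assumed hashtag pattern: '#' followed by word characters.
HASHTAG_RE = re.compile(r"#(\w+)")


def hashtag_pool(tweets):
    """Merge all tweets sharing a hashtag into one pseudo-document per hashtag."""
    pools = defaultdict(list)
    for _day, text in tweets:
        # Lowercase so '#NewYear' and '#newyear' fall into the same pool;
        # set() avoids adding a tweet twice when it repeats a hashtag.
        for tag in set(HASHTAG_RE.findall(text.lower())):
            pools[tag].append(text)
    return {tag: " ".join(texts) for tag, texts in pools.items()}


def day_hashtag_counts(tweets):
    """Count tweets for each (day, hashtag) pair (hypothetical reading of
    day-hashtag pooling, usable as raw input for a per-day time series)."""
    counts = Counter()
    for day, text in tweets:
        for tag in set(HASHTAG_RE.findall(text.lower())):
            counts[(day, tag)] += 1
    return counts


# Made-up sample data: (day, tweet text) pairs.
tweets = [
    ("2019-01-01", "Happy new year! #NewYear"),
    ("2019-01-01", "Fireworks tonight #newyear #party"),
    ("2019-01-02", "Back at the office #work"),
]

pooled = hashtag_pool(tweets)        # one long document per hashtag
counts = day_hashtag_counts(tweets)  # tweet volume per day and hashtag
```

Pooling by hashtag gives a topic model longer documents than individual tweets, and the per-day counts are the kind of quantity from which a time series per topic could be built once hashtags are associated with topics.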
Pages: 240-244
Page count: 5