Twitter Topic Modeling for Breaking News Detection

Cited by: 5
Authors
Wold, Henning M. [1]
Vikre, Linn [1]
Gulla, Jon Atle [1]
Ozgobek, Ozlem [1, 2]
Su, Xiaomeng [3]
Affiliations
[1] NTNU, Dept Comp & Informat Sci, Trondheim, Norway
[2] Balikesir Univ, Dept Comp Engn, Balikesir, Turkey
[3] NTNU, Dept Informat & E Learning, Trondheim, Norway
Source
PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON WEB INFORMATION SYSTEMS AND TECHNOLOGIES, VOL 2 (WEBIST) | 2016
Keywords
Twitter; Topic Modeling; News Detection; Text Mining;
DOI
10.5220/0005801902110218
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Social media platforms like Twitter have become increasingly popular for the dissemination and discussion of current events. Twitter makes it possible for people to share stories that they find interesting with their followers, and to write updates on what is happening around them. In this paper we attempt to use topic models of tweets in real time to identify breaking news. Two different methods, Latent Dirichlet Allocation (LDA) and Hierarchical Dirichlet Process (HDP), are tested with each tweet in the training corpus as a document by itself, as well as with all the tweets of a unique user regarded as one document. This second approach emulates Author-Topic modeling (AT-modeling). The evaluation of the methods relies on manual scoring of the accuracy of the modeling by volunteer participants. The experiments indicate that topic modeling on tweets in real time is not suitable for detecting breaking news by itself, but may be useful in analyzing and describing news tweets.
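The two document definitions compared in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the toy tweets, the whitespace tokenizer, and the function names `per_tweet_documents` and `per_user_documents` are assumptions made for illustration. It shows only how the corpus might be arranged before being handed to an LDA or HDP implementation.

```python
from collections import defaultdict

# Hypothetical toy data: (user, tweet text) pairs, not taken from the paper.
tweets = [
    ("alice", "earthquake reported near the coast"),
    ("bob", "new phone released today"),
    ("alice", "aftershock felt downtown"),
    ("bob", "phone reviews look good"),
]


def tokenize(text):
    """Naive lowercase whitespace tokenizer (an assumption, not the paper's)."""
    return text.lower().split()


def per_tweet_documents(tweets):
    """First setup: every tweet is a document by itself."""
    return [tokenize(text) for _, text in tweets]


def per_user_documents(tweets):
    """Second setup: all tweets of one user are pooled into one document,
    which is how the abstract describes emulating Author-Topic modeling."""
    pooled = defaultdict(list)
    for user, text in tweets:
        pooled[user].extend(tokenize(text))
    return dict(pooled)


tweet_docs = per_tweet_documents(tweets)
user_docs = per_user_documents(tweets)
```

Either list of token sequences could then serve as the training corpus for a topic model. Pooling yields fewer but longer documents, which is the trade-off the two setups contrast.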
Pages: 211-218
Page count: 8