Extracting time series variation of topic popularity in microblogs

Times cited: 0

Authors:

Fukuyama, Satoshi [1]

Wakabayashi, Kei [2]

Affiliations:

[1] Univ Tsukuba, Grad Sch Lib Informat & Media Studies, Tsukuba, Ibaraki, Japan

[2] Univ Tsukuba, Fac Lib Informat & Media Sci, Tsukuba, Ibaraki, Japan

Source:

IIWAS2018: THE 20TH INTERNATIONAL CONFERENCE ON INFORMATION INTEGRATION AND WEB-BASED APPLICATIONS & SERVICES | 2018

Keywords:

microblogs; topic popularity; Biterm Topic Model;

DOI:

10.1145/3282373.3282409

Chinese Library Classification (CLC) number:

TP31 [Computer software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

Extracting topics and their popularity from microblogs is a promising approach to discovering which topics are popular in the world. To address this task, several methods that estimate topic popularity based on Latent Dirichlet Allocation (LDA) have been proposed. However, LDA fails to extract high-quality topics from a collection of short text documents such as microblogs, because the word co-occurrence information in an individual document is sparse. Therefore, to extract topics from microblogs, we should use a model specialized for short text documents. In this paper, we propose a topic popularity estimation method using the Biterm Topic Model (BTM), which alleviates the problem caused by document-level word co-occurrence sparsity. We extract topics from the microblog documents with BTM for each time period and estimate how frequently each topic occurs. The proposed method can analyze topic popularity in real time because we apply an efficient inference algorithm for BTM to small batches of tweets. Experiments on a tweet collection show that some of the topics extracted by the proposed method correspond to real-world events, and that a topic's burstiness rises when the corresponding event occurs.
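The two ingredients named in the abstract, biterms and per-period topic frequency, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `extract_biterms` and `topic_popularity`, the toy tweet, and the hard-coded topic assignments are all hypothetical, and the popularity measure shown (the share of biterms assigned to each topic within a period) is only one plausible reading of "frequency of each topic occurrence". BTM inference itself is not shown; the assignments stand in for its output.

```python
from collections import Counter
from itertools import combinations


def extract_biterms(tokens):
    """Return every unordered word pair (biterm) in one short document."""
    return [tuple(sorted(pair)) for pair in combinations(tokens, 2)]


def topic_popularity(assignments, num_topics):
    """Share of biterms assigned to each topic within one time period.

    `assignments` is a list of topic ids, one per biterm. In a real
    pipeline these would come from BTM inference; here they are assumed.
    """
    total = len(assignments)
    if total == 0:
        return [0.0] * num_topics
    counts = Counter(assignments)
    return [counts[k] / total for k in range(num_topics)]


# Hypothetical toy tweet, already tokenized.
tweet = ["earthquake", "tokyo", "alert"]
biterms = extract_biterms(tweet)

# Hypothetical topic assignments for the biterms seen in two time periods.
assignments_by_period = {
    "t1": [0, 0, 1, 2],
    "t2": [1, 1, 1, 0],
}
popularity = {
    period: topic_popularity(assignments, num_topics=3)
    for period, assignments in assignments_by_period.items()
}

for period, shares in popularity.items():
    print(period, shares)
```

In this toy data, topic 1 grows from a quarter of the biterms in `t1` to three quarters in `t2`, which is the kind of change over time that a burstiness measure would be meant to capture.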

Pages: 365-369

Page count: 5
