Detecting Hotspot Information Using Multi-Attribute Based Topic Model

Cited by: 11
Authors
Wang, Jing [1]
Li, Li [1]
Tan, Feng [1]
Zhu, Ying [1]
Feng, Weisi [1]
Affiliations
[1] Southwest University, School of Computer and Information Science, Chongqing, People's Republic of China
Source
PLOS ONE | 2015 / Vol. 10 / Issue 10
Funding
National Natural Science Foundation of China
Keywords
TWITTER
DOI
10.1371/journal.pone.0140539
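Chinese Library Classification (CLC)
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]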
Subject classification codes
07; 0710; 09
Abstract
Microblogging, as a kind of social network, has become increasingly important in our daily lives. Enormous amounts of information are produced and shared on a daily basis. Detecting hot topics in this mass of information can help people reach the essential information more quickly. However, because microblog posts are short and sparse, and because many tweets are meaningless, traditional topic detection methods are often ineffective at detecting hot topics. In this paper, we propose a new topic model named multi-attribute latent Dirichlet allocation (MA-LDA), in which the time and hashtag attributes of microblogs are incorporated into the LDA model. By introducing the time attribute, the MA-LDA model can decide whether a word should appear in hot topics or not. Meanwhile, compared with the traditional LDA model, applying the hashtag attribute in the MA-LDA model gives the core words an artificially high ranking in the results, which improves the expressiveness of the outcomes. Empirical evaluations on real data sets demonstrate that our method detects hot topics more accurately and efficiently than several baselines. Our method provides strong evidence of the importance of the temporal factor in extracting hot topics.
Pages: 16