Dynamic Online HDP model for discovering evolutionary topics from Chinese social texts

Cited by: 17
Authors
Fu, Xianghua [1]
Li, Jianqiang [1]
Yang, Kun [1]
Cui, Laizhong [1]
Yang, Lei [1]
Affiliations
[1] Shenzhen Univ, Coll Comp Sci & Software Engn, Shenzhen 518060, Guangdong, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Hierarchical Dirichlet Process; Topic probability model; Dynamic topic discovery; Chinese social media; DIRICHLET; INFERENCE;
DOI
10.1016/j.neucom.2015.06.047
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
User-generated content in social media, such as online reviews, evolves rapidly over time. To better understand this content, users want not only to examine what the topics are but also to discover how those topics evolve. In this paper, we propose a Dynamic Online Hierarchical Dirichlet Process (DOHDP) model to discover evolutionary topics in Chinese social texts. In the DOHDP model, topic evolution is considered at two levels: the inter-epoch level and the intra-epoch level. At the inter-epoch level, the corpus of each epoch is modeled with an online HDP topic model, and the social texts are generated sequentially. At the intra-epoch level, the time dependencies on historical epochs are modeled with an exponential decay function, in which more recent epochs have a relatively stronger influence on the model parameters than earlier epochs. Furthermore, we implement the DOHDP model using a two-phase online variational algorithm. In experiments comparing DOHDP with related topic models on the Chinese social media dataset Tianya-80299, DOHDP performs best among the compared models at discovering the evolutionary topics of Chinese social texts. (C) 2015 Elsevier B.V. All rights reserved.
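The exponential decay over historical epochs mentioned in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the actual decay formula or parameterization, so everything here is an assumption made for illustration: the function names `decay_weights` and `decayed_prior`, the `decay_rate` parameter, the normalization of weights, and the use of per-epoch word counts as the statistic being blended. It is not the authors' implementation.

```python
import math


def decay_weights(num_past_epochs, decay_rate=0.5):
    """Return normalized exponential-decay weights for past epochs.

    Index 0 is the oldest epoch; the last index is the most recent one.
    The form exp(-decay_rate * age) is an assumed, illustrative choice.
    """
    raw = [
        math.exp(-decay_rate * (num_past_epochs - i))  # age = epochs ago
        for i in range(num_past_epochs)
    ]
    total = sum(raw)
    return [w / total for w in raw]


def decayed_prior(past_epoch_counts, decay_rate=0.5):
    """Blend per-epoch word statistics for one topic into a single prior.

    past_epoch_counts is a non-empty list (oldest first) of equal-length
    lists, one entry per vocabulary word. This is a hypothetical stand-in
    for whatever model parameters the historical epochs contribute.
    """
    weights = decay_weights(len(past_epoch_counts), decay_rate)
    vocab_size = len(past_epoch_counts[0])
    prior = [0.0] * vocab_size
    for weight, counts in zip(weights, past_epoch_counts):
        for v in range(vocab_size):
            prior[v] += weight * counts[v]
    return prior
```

With three past epochs, `decay_weights(3)` returns weights that sum to 1 and increase toward the most recent epoch, so recent statistics dominate the blended prior while older ones still contribute.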
Pages: 412-424
Number of pages: 13