Topic representation model based on microblogging behavior analysis

Cited by: 10
Authors
Han, Weihong [1]
Tian, Zhihong [1]
Huang, Zizhong [2]
Li, Shudong [1]
Jia, Yan [3]
Affiliations
[1] Guangzhou Univ, Cyberspace Inst Adv Technol, Guangzhou 510006, Peoples R China
[2] Natl Univ Def Technol, Comp Sch, Changsha 410073, Peoples R China
[3] Cyberspace Secur Res Ctr, Peng Cheng Lab, Shenzhen 518000, Peoples R China
Source
WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS | 2020 / Vol. 23 / Issue 06
Keywords
Topic representation model; Behavior analysis; Word distribution; LDA model; Topic detection; INTERNET;
DOI
10.1007/s11280-020-00822-x
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Microblogging has become an important way for people to obtain information, express opinions, and make suggestions. Identifying new topics quickly and accurately from massive microblogging data plays a crucial role in recommending information and controlling public opinion. The topic representation model provides a basis for topic detection. In this paper, we propose a topic representation model based on user behavior analysis, the microblogging behavior analysis-latent Dirichlet allocation (MBA-LDA) model, for microblogging datasets. The topic-word distribution is obtained from an LDA model that takes into account user behaviors (such as posting, forwarding, and commenting) and the distribution of words among documents within one topic and among different topics. The model also re-assesses the importance of words in topic representation. The basic idea is that the distribution of a word within a topic and among different topics strongly influences whether it should be selected as a topic expression word. If a word is evenly distributed among all documents of a certain topic, it is common to all documents in that topic and is therefore more suitable to represent the topic. If a word is evenly distributed among various topics, it is common to all topics and cannot distinguish between them, so it is less suitable to represent any topic. Experiments on an actual Sina Microblogging data set show that the topic model based on the MBA-LDA algorithm gives representative words greater weight and increases the differentiation of topic words, which effectively improves the accuracy of subsequent topic detection and evolutionary analysis.
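The abstract does not state the weighting formula used by MBA-LDA, so the following is only an illustrative sketch of the re-assessment idea, not the authors' method. It assumes normalized Shannon entropy as the measure of how evenly a word is distributed, and it leaves out the user-behavior component entirely. The names `normalized_entropy` and `reweight_word`, the multiplicative scoring rule, and the toy documents are all hypothetical.

```python
import math

def normalized_entropy(counts):
    """Shannon entropy of a count vector scaled to [0, 1] (1 = perfectly even)."""
    total = sum(counts)
    if total == 0 or len(counts) < 2:
        return 0.0
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return entropy / math.log(len(counts))

def reweight_word(word, docs_by_topic, topic, base_weight):
    """Scale a base topic-word weight by two evenness terms (assumed rule).

    - within: evenness of the word across documents of `topic` (higher is better)
    - across: evenness of the word across all topics (higher is worse)
    """
    within = normalized_entropy([doc.count(word) for doc in docs_by_topic[topic]])
    across = normalized_entropy(
        [sum(doc.count(word) for doc in docs) for docs in docs_by_topic.values()]
    )
    return base_weight * within * (1.0 - across)

# Toy corpus (hypothetical): each topic maps to a list of tokenized posts.
docs_by_topic = {
    "earthquake": [
        ["quake", "rescue", "today"],
        ["quake", "damage", "today"],
        ["quake", "rescue", "today"],
    ],
    "football": [
        ["goal", "match", "today"],
        ["goal", "league", "today"],
        ["match", "goal", "today"],
    ],
}

for w in ("quake", "rescue", "today"):
    # base_weight stands in for the word's weight from a plain LDA run
    print(w, reweight_word(w, docs_by_topic, "earthquake", base_weight=1.0))
```

In this toy corpus, "quake" appears in every earthquake post and in no football post, so it keeps its full weight; "rescue" appears in only some earthquake posts and is down-weighted; "today" appears evenly in both topics, so its weight drops to zero.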
Pages: 3083-3097
Page count: 15