Topic representation model based on microblogging behavior analysis

Cited by: 10
Authors
Han, Weihong [1]
Tian, Zhihong [1]
Huang, Zizhong [2]
Li, Shudong [1]
Jia, Yan [3]
Affiliations
[1] Guangzhou Univ, Cyberspace Inst Adv Technol, Guangzhou 510006, Peoples R China
[2] Natl Univ Def Technol, Comp Sch, Changsha 410073, Peoples R China
[3] Cyberspace Secur Res Ctr, Peng Cheng Lab, Shenzhen 518000, Peoples R China
Source
WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS | 2020 / Vol. 23 / No. 06
Keywords
Topic representation model; Behavior analysis; Word distribution; LDA model; Topic detection; INTERNET
DOI
10.1007/s11280-020-00822-x
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Microblogging has become an important way for people to obtain information, express opinions, and make suggestions. Identifying new topics quickly and accurately from massive microblogging data plays a crucial role in recommending information and controlling public opinion, and the topic representation model provides the basis for topic detection. In this paper, we propose a topic representation model for microblogging datasets based on user behavior analysis: the microblogging behavior analysis-latent Dirichlet allocation (MBA-LDA) model. The topic-word distribution is acquired by an LDA model that considers information on user behaviors (such as posting, forwarding, and commenting) together with the distribution of words among documents within one topic and among different topics. The model also re-assesses the importance of words in topic representation. The basic idea is that the distribution of a word within a topic or among different topics strongly influences whether it should be selected as a topic expression word. If a word is evenly distributed among all documents of a certain topic, it is a common word of all documents in that topic and is therefore more suitable to represent it. If a word is evenly distributed among the various topics, it is a common word of all topics, cannot distinguish between them, and is therefore less suitable to represent any topic. Experiments on a real Sina Microblogging data set show that the topic model based on the MBA-LDA algorithm makes the representative words more important and increases the differentiation of topic words, which effectively improves the accuracy of subsequent topic detection and evolutionary analysis.
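The word re-assessment idea in the abstract can be illustrated with a small sketch. This is not the MBA-LDA formula, which the abstract does not give. It is a hypothetical scoring rule that assumes each document is already assigned to a single topic, measures "evenness" with normalized Shannon entropy, and combines the two signals as `within * (1 - across)`. The names `evenness`, `reweight`, and `toy_topics` are made up for this illustration, and user behavior signals (posting, forwarding, commenting) are left out.

```python
import math
from collections import Counter


def evenness(values):
    """Normalized Shannon entropy in [0, 1]; 1.0 means perfectly even."""
    total = sum(values)
    if total == 0 or len(values) < 2:
        return 0.0
    probs = [v / total for v in values if v > 0]
    entropy = sum(-p * math.log(p) for p in probs)
    return entropy / math.log(len(values))


def reweight(topics):
    """Score words per topic (illustrative rule, not the paper's formula).

    topics maps a topic label to its list of tokenized documents.
    """
    # Word counts for every document, grouped by topic.
    per_doc = {t: [Counter(doc) for doc in docs] for t, docs in topics.items()}

    # Relative word frequency per topic, so large topics do not dominate.
    rel_freq = {}
    for t, counters in per_doc.items():
        merged = Counter()
        for c in counters:
            merged.update(c)
        total = sum(merged.values())
        rel_freq[t] = {w: n / total for w, n in merged.items()}

    scores = {}
    for t in topics:
        scores[t] = {}
        for w in rel_freq[t]:
            # Even spread over the topic's documents raises the score.
            within = evenness([c[w] for c in per_doc[t]])
            # Even spread over all topics lowers the score.
            across = evenness([rel_freq[u].get(w, 0.0) for u in topics])
            scores[t][w] = within * (1.0 - across)
    return scores


# Toy data, invented for the example.
toy_topics = {
    "sports": [["match", "goal", "the"], ["match", "team", "the"]],
    "finance": [["stock", "market", "the"], ["stock", "bank", "the"]],
}
scores = reweight(toy_topics)
```

On this toy data, "match" appears in every sports document and in no finance document, so it scores highest for the sports topic. "the" is spread evenly over both topics and "goal" appears in only one sports document, so both score lower.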
Pages: 3083 - 3097
Number of pages: 15
Related papers
50 in total
  • [41] Flow interaction based propagation model and bursty influence behavior analysis of Internet flows
    Wu, Xiao-Yu
    Gu, Ren-Tao
    Ji, Yue-Feng
    PHYSICA A-STATISTICAL MECHANICS AND ITS APPLICATIONS, 2016, 462 : 341 - 349
  • [42] Topic features for machine learning-based sentiment analysis in Indonesian tweets
    Murfi, Hendri
    Siagian, Furida Lusi
    Satria, Yudi
    INTERNATIONAL JOURNAL OF INTELLIGENT COMPUTING AND CYBERNETICS, 2019, 12 (01) : 70 - 81
  • [44] LDA topic model for microblog recommendation
    Duan, Jianyong
    Ai, Yamin
    Ii, Xia
    PROCEEDINGS OF 2015 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING, 2015, : 185 - 188
  • [45] Digital Twins: A Systematic Literature Review Based on Data Analysis and Topic Modeling
    Kukushkin, Kuzma
    Ryabov, Yury
    Borovkov, Alexey
    DATA, 2022, 7 (12)
  • [46] Exploring the topic structure and evolution of associations in information behavior research through co-word analysis
    Deng, Shengli
    Xia, Sudi
    Hu, Jiming
    Li, Hongxiu
    Liu, Yong
    JOURNAL OF LIBRARIANSHIP AND INFORMATION SCIENCE, 2021, 53 (02) : 280 - 297
  • [47] Domain-Oriented Topic Discovery Based on Features Extraction and Topic Clustering
    Lu, Xiaofeng
    Zhou, Xiao
    Wang, Wenting
    Lio, Pietro
    Hui, Pan
    IEEE ACCESS, 2020, 8 (08) : 93648 - 93662
  • [48] A topic detection method based on KM-LSH Fusion algorithm and improved BTM model
    Liu, Wenjun
    Guo, Huan
    Gan, Jiaxin
    Wang, Hai
    Wang, Hailan
    Zhang, Chao
    Peng, Qingcheng
    Sun, Yuyan
    Yu, Bao
    Hou, Mengshu
    Li, Bo
    Li, Xiaolei
    Soft Computing, 2024, 28 (19) : 11421 - 11438
  • [49] Topic Detection Based on User Intention
    Deng, Lu
    Quan, Yong
    Xu, Jing
    Huang, Jiuming
    Zhou, Bin
    2015 IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOP (ICDMW), 2015, : 885 - 891
  • [50] Social sentiment sensor: a visualization system for topic detection and topic sentiment analysis on microblog
    Yanyan Zhao
    Bing Qin
    Ting Liu
    Duyu Tang
    Multimedia Tools and Applications, 2016, 75 : 8843 - 8860