Topic representation model based on microblogging behavior analysis

Cited by: 10
Authors
Han, Weihong [1]
Tian, Zhihong [1]
Huang, Zizhong [2]
Li, Shudong [1]
Jia, Yan [3]
Affiliations
[1] Guangzhou Univ, Cyberspace Inst Adv Technol, Guangzhou 510006, Peoples R China
[2] Natl Univ Def Technol, Comp Sch, Changsha 410073, Peoples R China
[3] Cyberspace Secur Res Ctr, Peng Cheng Lab, Shenzhen 518000, Peoples R China
Source
WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS | 2020 / Vol. 23 / Issue 06
Keywords
Topic representation model; Behavior analysis; Word distribution; LDA model; Topic detection; INTERNET;
DOI
10.1007/s11280-020-00822-x
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Microblogging has become an important way for people to obtain information, express opinions, and make suggestions. Identifying new topics quickly and accurately from massive microblogging data plays a crucial role in recommending information and controlling public opinion, and the topic representation model provides the basis for topic detection. In this paper, we propose a topic representation model for microblogging datasets based on user behavior analysis: the microblogging behavior analysis-latent Dirichlet allocation (MBA-LDA) model. The topic-word distribution is obtained from an LDA model that takes into account user behaviors (such as posting, forwarding, and commenting) and the distribution of words both among the documents within one topic and among different topics. The model also re-assesses the importance of words in topic representation. The basic idea is that how a word is distributed within a topic and among topics strongly influences whether it should be selected to express a topic. If a word is evenly distributed among all documents of a given topic, it is common to all documents in that topic and is therefore more suitable to represent it. If a word is evenly distributed among the various topics, it is common to all topics and cannot distinguish between them, so it is less suitable to represent any topic. In experiments on a real Sina Microblogging data set, the topic model based on the MBA-LDA algorithm gives representative words greater importance and increases the differentiation of topic words, which effectively improves the accuracy of subsequent topic detection and evolutionary analysis.
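The abstract states the re-weighting idea but not its formula. Purely as an illustration, the sketch below scores a word by how evenly it is spread over the documents of one topic and how unevenly it is spread across topics. Normalized Shannon entropy is an assumed stand-in for "evenness", and the function names `normalized_entropy` and `representativeness` are hypothetical. This is not the MBA-LDA weighting itself, which also uses user-behavior information.

```python
import math


def normalized_entropy(counts):
    """Shannon entropy of a count vector, scaled to [0, 1].

    1.0 means the counts are perfectly even; 0.0 means all mass
    sits in a single bin (or the vector is empty or has one bin).
    """
    total = sum(counts)
    if total == 0 or len(counts) < 2:
        return 0.0
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return entropy / math.log(len(counts))


def representativeness(within_topic_counts, across_topic_counts):
    """Assumed illustrative score in [0, 1], higher is better.

    within_topic_counts: occurrences of the word in each document
        of the topic being described.
    across_topic_counts: occurrences of the word in each topic.
    """
    within = normalized_entropy(within_topic_counts)
    across = normalized_entropy(across_topic_counts)
    # Reward evenness inside the topic, penalize evenness across topics.
    return within * (1.0 - across)


if __name__ == "__main__":
    # Toy counts, invented for this example only.
    examples = {
        "even in topic, absent elsewhere": ([2, 2, 2, 2], [8, 0, 0]),
        "even in topic, even everywhere": ([2, 2, 2, 2], [8, 8, 8]),
        "one document only": ([8, 0, 0, 0], [8, 0, 0]),
    }
    for label, (within, across) in examples.items():
        print(label, round(representativeness(within, across), 3))
```

Under this assumed score, only the first toy word ranks high: it appears in every document of its topic and in no other topic. A word shared by all topics, or one confined to a single document, scores near zero, which matches the two cases described in the abstract.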
Pages: 3083-3097
Page count: 15
Related papers
50 in total
  • [31] REPRESENTATION AND ANALYSIS OF BEHAVIOR FOR MULTIPROCESS SYSTEMS BY USING STOCHASTIC PETRI NETS
    JIN, Q
    SUGASAWA, Y
    MATHEMATICAL AND COMPUTER MODELLING, 1995, 22 (10-12) : 109 - 118
  • [32] Derivative Topic Dissemination Model Based on Multitopic Iterative Derivation and Social Psychology
    Wang, Rong
    Ma, Kexin
    Guo, Xiaole
    Wei, Shihong
    Wang, Zhiwei
    Li, Tun
    Xiao, Yunpeng
    IEEE TRANSACTIONS ON COMPUTATIONAL SOCIAL SYSTEMS, 2025, 12 (01): : 390 - 403
  • [33] Determining the Topic Hashtags for Chinese Microblogs Based on 5W Model
    Zhao, Zhibin
    Sun, Jiahong
    Mao, Zhenyu
    Feng, Shi
    Bao, Yubin
    BIG DATA COMPUTING AND COMMUNICATIONS, (BIGCOM 2016), 2016, 9784 : 55 - 67
  • [34] Online information analysis on pancreatic cancer in Korea using structural topic model
    Jo, Wonkwang
    Kim, Yeol
    Seo, Minji
    Lee, Nayoung
    Park, Junli
    SCIENTIFIC REPORTS, 2022, 12 (01)
  • [35] A word embedding topic model for topic detection and summary in social networks
    Shi, Lei
    Cheng, Gang
    Xie, Shang-ru
    Xie, Gang
    MEASUREMENT & CONTROL, 2019, 52 (9-10) : 1289 - 1298
  • [36] Personalized Subject Learning Based on Topic Detection and Canonical Correlation Analysis
    Shi, Zhangzu
    Shi, Steve K.
    Shi, Lucy L.
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2015, 6 (10) : 120 - 126
  • [37] Classroom Behavior Analysis and Evaluation in Physical Education by Using Structure Representation
    Yu, Qiufen
    Liu, Baishan
    INTERNATIONAL JOURNAL OF DISTRIBUTED SYSTEMS AND TECHNOLOGIES, 2022, 13 (03)
  • [38] PLSA-based Topic Detection in Meetings for Adaptation of Lexicon and Language Model
    Akita, Yuya
    Nemoto, Yusuke
    Kawahara, Tatsuya
    INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 1321 - 1324
  • [39] Behavior Analysis-Based IoT Services For Crowd Management
    Noor, Talal H.
    COMPUTER JOURNAL, 2023, 66 (09) : 2208 - 2219
  • [40] MigrO: a plug-in for the analysis of individual mobility behavior based on the stay region model
    Damiani, Maria Luisa
    Issa, Hamza
    Fotino, Giuseppe
    Hachem, Fatima
    Ranc, Nathan
    Cagnacci, Francesca
    23RD ACM SIGSPATIAL INTERNATIONAL CONFERENCE ON ADVANCES IN GEOGRAPHIC INFORMATION SYSTEMS (ACM SIGSPATIAL GIS 2015), 2015,