Semantic Augmented Topic Model over Short Text

Cited by: 0
Authors
Li, Lingyun [1, 2]
Sun, Yawei [1, 2]
Wang, Cong [1, 2]
Affiliations
[1] Beijing Univ Posts & Telecommun, Sch Software Engn, Beijing 100876, Peoples R China
[2] Beijing Univ Posts & Telecommun, Minist Educ, Key Lab Trustworthy Distributed Comp & Serv, Beijing 100876, Peoples R China
Source
PROCEEDINGS OF 2018 5TH IEEE INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND INTELLIGENCE SYSTEMS (CCIS) | 2018
Keywords
topic model; short text; latent semantic; bi-term topic model
DOI
Not available
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract
With the rapid development of the Internet and mobile devices, users produce a vast number of short texts, which pose great challenges to topic modeling because of their severe context sparsity. Traditional topic models perform poorly on short texts because such texts lack word co-occurrence patterns. An effective approach, the bi-term topic model (BTM), has been proposed; it models word co-occurrence directly over the whole corpus and performs better than conventional topic models. However, BTM considers only the frequency of bi-terms and ignores the latent semantic information between bi-terms, so words with similar semantics run a high risk of being grouped under different topics. In this paper, we propose a latent semantic augmented bi-term topic model (LS-BTM), which incorporates semantic information as prior knowledge to infer topics more reasonably. Experimental results show that our model achieves better results than other short text topic models on real-world data.
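To make the idea of corpus-level bi-terms concrete, the following is a minimal sketch of bi-term extraction and counting. It is not the authors' implementation: it assumes each short text is already tokenized and is treated as a single context window, and the names `extract_biterms`, `corpus_biterm_counts`, and the toy `docs` corpus are hypothetical. It covers only the frequency counting that the abstract attributes to BTM, not the semantic prior of LS-BTM.

```python
from collections import Counter
from itertools import combinations


def extract_biterms(tokens):
    """Return every unordered word pair (bi-term) in one short text.

    Assumption: the whole short text is one context window.
    """
    return [tuple(sorted(pair)) for pair in combinations(tokens, 2)]


def corpus_biterm_counts(docs):
    """Aggregate bi-term frequencies over the whole corpus."""
    counts = Counter()
    for tokens in docs:
        counts.update(extract_biterms(tokens))
    return counts


# Hypothetical toy corpus of tokenized short texts.
docs = [
    ["apple", "iphone", "release"],
    ["apple", "iphone", "price"],
    ["banana", "price"],
]
biterm_counts = corpus_biterm_counts(docs)
```

Because the counts are pooled across documents, a pair such as `("apple", "iphone")` accumulates evidence even though each individual text is too short to provide reliable co-occurrence statistics on its own.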
Pages: 652 - 656
Page count: 5
Related papers
50 in total
  • [21] BTM: Topic Modeling over Short Texts
    Cheng, Xueqi
    Yan, Xiaohui
    Lan, Yanyan
    Guo, Jiafeng
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2014, 26 (12) : 2928 - 2941
  • [22] Heterogeneous Latent Topic Discovery for Semantic Text Mining
    Li, Yawen
    Jiang, Di
    Lian, Rongzhong
    Wu, Xueyang
    Tan, Conghui
    Xu, Yi
    Su, Zhiyang
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2023, 35 (01) : 533 - 544
  • [23] Combining Lexical and Semantic Features for Short Text Classification
    Yang, Lili
    Li, Chunping
    Ding, Qiang
    Li, Li
    17TH INTERNATIONAL CONFERENCE IN KNOWLEDGE BASED AND INTELLIGENT INFORMATION AND ENGINEERING SYSTEMS - KES2013, 2013, 22 : 78 - 86
  • [24] User clustering in a dynamic social network topic model for short text streams
    Qiu, Zhangcheng
    Shen, Hong
    INFORMATION SCIENCES, 2017, 414 : 102 - 116
  • [25] Hot Topic Detection on Chinese Short Text
    Zhang, Cheng
    Fan, Xinghua
    Chen, Xianlin
    ADVANCED RESEARCH ON COMPUTER EDUCATION, SIMULATION AND MODELING, PT II, 2011, 176 (02) : 207 - 212
  • [26] Benchmarking short text semantic similarity
    O'Shea, J.
    Bandar, Z.
    Crockett, K.
    McLean, D.
    International Journal of Intelligent Information and Database Systems, 2010, 4 (02) : 103 - 120
  • [27] Utilizing Recurrent Neural Network for topic discovery in short text scenarios
    Lu, Heng-Yang
    Kang, Ning
    Li, Yun
    Zhan, Qian-Yi
    Xie, Jun-Yuan
    Wang, Chong-Jun
    INTELLIGENT DATA ANALYSIS, 2019, 23 (02) : 259 - 277
  • [28] Topic Mining over Asynchronous Text Sequences
    Wang, Xiang
    Jin, Xiaoming
    Chen, Meng-En
    Zhang, Kai
    Shen, Dou
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2012, 24 (01) : 156 - 169
  • [29] Evaluation of the Dirichlet Process Multinomial Mixture Model for Short-Text Topic Modeling
    Karlsson, Alexander
    Duarte, Denio
    Mathiason, Gunnar
    Bae, Juhee
    2018 6TH INTERNATIONAL SYMPOSIUM ON COMPUTATIONAL AND BUSINESS INTELLIGENCE (ISCBI 2018), 2018 : 79 - 83
  • [30] Text Categorization Based on Topic Model
    Zhou, Shibin
    Li, Kan
    Liu, Yushu
    INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE SYSTEMS, 2009, 2 (04) : 398 - 409