Semantic Augmented Topic Model over Short Text

Cited by: 0
Authors
Li, Lingyun [1, 2]
Sun, Yawei [1, 2]
Wang, Cong [1, 2]
Affiliations
[1] Beijing Univ Posts & Telecommun, Sch Software Engn, Beijing 100876, Peoples R China
[2] Beijing Univ Posts & Telecommun, Minist Educ, Key Lab Trustworthy Distributed Comp & Serv, Beijing 100876, Peoples R China
Source
PROCEEDINGS OF 2018 5TH IEEE INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND INTELLIGENCE SYSTEMS (CCIS) | 2018
Keywords
topic model; short text; latent semantic; bi-term topic model;
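DOI
Not available
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Discipline classification code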
0808 ; 0809 ;
Abstract
With the rapid development of the Internet and mobile devices, users produce a vast number of short texts, which pose great challenges to topic modeling because of their severe context sparsity. Traditional topic models perform poorly on short text because it lacks word co-occurrence patterns. An effective approach, the bi-term topic model (BTM), models word co-occurrence directly over the whole corpus and performs better than conventional topic models. However, BTM considers only the frequency of bi-terms and ignores the latent semantic information between them, so words with similar semantics run a high risk of being grouped under different topics. In this paper, we propose a latent semantic augmented bi-term topic model (LS-BTM), which incorporates semantic information as prior knowledge to infer topics more reasonably. Experimental results show that our model achieves better results than other short text topic models on a real-world dataset.
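The corpus-level co-occurrence idea behind BTM can be illustrated with a minimal sketch. It only shows how bi-terms (unordered word pairs) are extracted from each short text and counted over the whole corpus. It is not the LS-BTM inference procedure, which the abstract does not specify. The function names `extract_biterms` and `corpus_biterm_counts` and the toy documents are hypothetical, chosen for illustration.

```python
from collections import Counter
from itertools import combinations


def extract_biterms(tokens):
    """Return every unordered word pair (bi-term) from one short text.

    Each pair is sorted so that (a, b) and (b, a) count as the same bi-term.
    Assumption: tokens are already preprocessed; a repeated token would
    yield a pair of identical words, which real pipelines may filter out.
    """
    return [tuple(sorted(pair)) for pair in combinations(tokens, 2)]


def corpus_biterm_counts(docs):
    """Count bi-terms across the whole corpus rather than per document."""
    counts = Counter()
    for tokens in docs:
        counts.update(extract_biterms(tokens))
    return counts


# Toy corpus of tokenized short texts (illustrative data only).
docs = [
    ["apple", "iphone", "launch"],
    ["apple", "iphone", "price"],
    ["football", "match", "score"],
]
counts = corpus_biterm_counts(docs)
```

Because the counts are pooled over all documents, a pair such as ("apple", "iphone") accumulates evidence even though each individual text is too short to show a stable co-occurrence pattern. A frequency-only view like this is what the abstract says LS-BTM augments with semantic prior knowledge.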
Pages: 652-656
Page count: 5