Semantic Augmented Topic Model over Short Text

Times cited: 0
Authors
Li, Lingyun [1, 2]
Sun, Yawei [1, 2]
Wang, Cong [1, 2]
Affiliations
[1] Beijing Univ Posts & Telecommun, Sch Software Engn, Beijing 100876, Peoples R China
[2] Beijing Univ Posts & Telecommun, Minist Educ, Key Lab Trustworthy Distributed Comp & Serv, Beijing 100876, Peoples R China
Source
PROCEEDINGS OF 2018 5TH IEEE INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND INTELLIGENCE SYSTEMS (CCIS) | 2018
Keywords
topic model; short text; latent semantic; bi-term topic model;
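DOI
Not available
Chinese Library Classification
TM [Electrical engineering]; TN [Electronic and communication technology];
Discipline classification code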
0808 ; 0809 ;
Abstract
With the rapid development of the Internet and mobile devices, users produce a vast number of short texts, which pose great challenges to topic modeling because of their severe context sparsity. Traditional topic models perform poorly on short texts because such texts lack word co-occurrence patterns. The bi-term topic model (BTM), an effective approach, models word co-occurrence directly over the whole corpus and performs better than conventional topic models. However, BTM considers only the frequency of bi-terms and ignores the latent semantic information between them, so words with similar semantics risk being grouped under different topics. In this paper, we propose a latent semantic augmented bi-term topic model (LS-BTM), which incorporates semantic information as prior knowledge to infer topics more reasonably. Experimental results show that our model outperforms other short text topic models on a real-world dataset.
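As a rough illustration of the corpus-level word co-occurrence that BTM relies on, the sketch below extracts bi-terms (unordered word pairs that appear together in one short text) and counts them across a toy corpus. It assumes texts are already tokenized; the function names `extract_biterms` and `corpus_biterm_counts` and the sample documents are hypothetical, and the sketch does not cover topic inference or the semantic prior used by LS-BTM, which the abstract does not specify.

```python
from collections import Counter
from itertools import combinations


def extract_biterms(tokens):
    """Return every unordered word pair (bi-term) from one tokenized short text."""
    # Sorting each pair makes ("b", "a") and ("a", "b") the same bi-term.
    return [tuple(sorted(pair)) for pair in combinations(tokens, 2)]


def corpus_biterm_counts(docs):
    """Count bi-terms over the whole corpus rather than per document."""
    counts = Counter()
    for tokens in docs:
        counts.update(extract_biterms(tokens))
    return counts


# Toy corpus (illustrative only): two short, pre-tokenized texts.
docs = [
    ["apple", "iphone", "release"],
    ["apple", "iphone", "price"],
]
counts = corpus_biterm_counts(docs)
print(counts.most_common(3))
```

Pooling the counts over all texts is what lets the pair `("apple", "iphone")` accumulate evidence even though each individual text is only three words long.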
Pages: 652-656
Page count: 5