Combine Topic Modeling with Semantic Embedding: Embedding Enhanced Topic Model

Cited by: 17
Authors
Zhang, Peng [1 ]
Wang, Suge [2 ,3 ]
Li, Deyu [2 ,3 ]
Li, Xiaoli [4 ]
Xu, Zhikang [5 ]
Affiliations
[1] Shanxi Univ Finance & Econ, Sch Informat, Taiyuan 030006, Peoples R China
[2] Shanxi Univ, Sch Comp & Informat Technol, Taiyuan 030006, Peoples R China
[3] Shanxi Univ, Minist Educ, Key Lab Computat Intelligence & Chinese Informat, Taiyuan 030006, Peoples R China
[4] ASTAR, Inst Infocomm Res, Singapore 138632, Singapore
[5] Shanghai Univ, Sch Comp Engn & Sci, Shanghai 200444, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Topic model; word embedding; topical embedding; representation learning;
DOI
10.1109/TKDE.2019.2922179
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Topic models and word embeddings reflect two perspectives on text semantics. A topic model maps documents into a topic distribution space by utilizing word collocation patterns within and across documents, while a word embedding represents words within a continuous embedding space by exploiting the local word collocation patterns in context windows. Clearly, these two types of patterns are complementary. In this paper, we propose a novel integration framework to combine the two representation methods, where topic information can be transmitted into the corresponding semantic embedding structure. Based on this framework, we construct an Embedding Enhanced Topic Model (EETM), which can improve topic modeling and generate topic embeddings by leveraging word embeddings. Extensive experimental results show that EETM can learn high-quality document representations for common text analysis tasks across multiple data sets, indicating that it is very effective for merging topic models with word embeddings.
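The contrast the abstract draws between document-level and context-window collocation patterns can be illustrated with a small sketch. This is not EETM or any part of the authors' method; it only counts the two kinds of co-occurrence on a toy corpus. The function names `document_cooccurrence` and `window_cooccurrence`, the toy sentences, and the window size are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def document_cooccurrence(docs):
    """Count unordered word pairs that appear in the same document
    (the kind of signal a topic model draws on)."""
    counts = Counter()
    for doc in docs:
        # Each distinct pair is counted once per document.
        for a, b in combinations(sorted(set(doc)), 2):
            counts[(a, b)] += 1
    return counts


def window_cooccurrence(docs, window=1):
    """Count unordered word pairs at most `window` positions apart
    (the kind of signal a word embedding draws on)."""
    counts = Counter()
    for doc in docs:
        for i, word in enumerate(doc):
            for other in doc[i + 1 : i + 1 + window]:
                if other != word:
                    counts[tuple(sorted((word, other)))] += 1
    return counts


# Hypothetical toy corpus, chosen only for illustration.
docs = [
    "topic model maps documents to topics".split(),
    "word embedding maps words to vectors".split(),
]
doc_counts = document_cooccurrence(docs)
win_counts = window_cooccurrence(docs, window=1)

# "maps" and "to" share both documents but are never adjacent,
# so only the document-level count sees the pair.
print(doc_counts[("maps", "to")], win_counts[("maps", "to")])
```

In this toy corpus the pair ("maps", "to") is visible only at the document level, while adjacent pairs such as ("model", "topic") show up in both counts, which is one simple way to see why the two pattern types can carry different information.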
Pages: 2322 - 2335
Page count: 14
Related papers
50 in total
  • [1] Efficient Correlated Topic Modeling with Topic Embedding
    He, Junxian
    Hu, Zhiting
    Berg-Kirkpatrick, Taylor
    Huang, Ying
    Xing, Eric P.
    KDD'17: PROCEEDINGS OF THE 23RD ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2017: 225 - 233
  • [2] Topic Modeling in Embedding Spaces
    Dieng, Adji B.
    Ruiz, Francisco J. R.
    Blei, David M.
    TRANSACTIONS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, 2020, 8: 439 - 453
  • [3] A Semantic Embedding Enhanced Topic Model For User-Generated Textual Content Modeling In Social Ecosystems
    Zhang, Peng
    Liu, Baoxi
    Lu, Tun
    Gu, Hansu
    Ding, Xianghua
    Gu, Ning
    COMPUTER JOURNAL, 2022, 65 (11): 2953 - 2968
  • [4] A Word Embedding Model For Topic Recommendation
    Kannan, Megala S.
    Mahalakshmi, G. S.
    Smitha, E. S.
    Sendhilkumar, S.
    PROCEEDINGS OF THE 2018 SECOND INTERNATIONAL CONFERENCE ON INVENTIVE COMMUNICATION AND COMPUTATIONAL TECHNOLOGIES (ICICCT), 2018: 826 - 830
  • [5] A Word Embedding Model For Topic Recommendation
    Kannan, Megala S.
    Mahalakshmi, G. S.
    Smitha, E. S.
    Sendhilkumar, S.
    PROCEEDINGS OF THE 2018 SECOND INTERNATIONAL CONFERENCE ON INVENTIVE COMMUNICATION AND COMPUTATIONAL TECHNOLOGIES (ICICCT), 2018: 1307 - 1311
  • [6] A word embedding topic model for topic detection and summary in social networks
    Shi, Lei
    Cheng, Gang
    Xie, Shang-ru
    Xie, Gang
    MEASUREMENT & CONTROL, 2019, 52 (9-10): 1289 - 1298
  • [7] A supervised topic embedding model and its application
    Xu, Weiran
    Eguchi, Koji
    PLOS ONE, 2022, 17 (11):
  • [8] Spatial Temporal Topic Embedding: A Semantic Modeling Method for Short Text in Social Network
    Yang, Congxian
    Du, Junping
    Kou, Feifei
    Lee, Jangmyung
    ARTIFICIAL INTELLIGENCE (ICAI 2018), 2018, 888 : 198 - 210
  • [9] Distilled Wasserstein Learning for Word Embedding and Topic Modeling
    Xu, Hongteng
    Wang, Wenlin
    Liu, Wei
    Carin, Lawrence
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [10] An Embedding-Based Topic Model for Document Classification
    Seifollahi, Sattar
    Piccardi, Massimo
    Jolfaei, Alireza
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2021, 20 (03)