Integration of Neural Embeddings and Probabilistic Models in Topic Modeling

Cited by: 0
Authors
Koochemeshkian, Pantea [1]
Bouguila, Nizar [1]
Affiliations
[1] Concordia Inst Informat Syst Engn CIISE, Informat Syst Engn, Montreal, PQ, Canada
Keywords
DIRICHLET; EXTRACTION
DOI
10.1080/08839514.2024.2403904
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Topic modeling, a way to find topics in large volumes of text, has grown with the help of deep learning. This paper presents two novel approaches to topic modeling by integrating embeddings derived from Bert-Topic with the multi-grain clustering topic model (MGCTM). Recognizing the inherent hierarchical and multi-scale nature of topics in corpora, our methods utilize MGCTM to capture topic structures at multiple levels of granularity. We enhance the expressiveness of MGCTM by introducing the Generalized Dirichlet and Beta-Liouville distributions as priors, which provide greater flexibility in modeling topic proportions and capturing richer topic relationships. Comprehensive experiments on various datasets showcase the effectiveness of our proposed models in achieving superior topic coherence and granularity compared to state-of-the-art methods. Our findings underscore the potential of leveraging hybrid architectures, marrying neural embeddings with advanced probabilistic modeling, to push the boundaries of topic modeling.
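The abstract names the Generalized Dirichlet as one of the priors on topic proportions but gives no further detail. As a rough illustration only, and not the authors' inference procedure, the sketch below draws one proportion vector from a Generalized Dirichlet using the standard stick-breaking construction with independent Beta variables. The function name and the parameter values are hypothetical and chosen purely for the example.

```python
import random


def sample_generalized_dirichlet(alphas, betas, rng):
    """Draw one (K+1)-dimensional proportion vector by stick-breaking.

    alphas, betas: length-K lists of positive Beta parameters
    (illustrative values only; not taken from the paper).
    """
    remaining = 1.0  # probability mass not yet assigned to a topic
    proportions = []
    for a, b in zip(alphas, betas):
        v = rng.betavariate(a, b)  # fraction of the remaining mass
        proportions.append(remaining * v)
        remaining *= 1.0 - v
    proportions.append(remaining)  # last topic takes what is left
    return proportions


rng = random.Random(0)
theta = sample_generalized_dirichlet([2.0, 1.0, 3.0], [4.0, 2.0, 1.0], rng)
print(len(theta), round(sum(theta), 6))
```

Each Beta variable has its own pair of parameters, so the components need not share the single covariance pattern that a standard Dirichlet imposes. This is the kind of added flexibility the abstract refers to.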
Pages: 33