Integration of Neural Embeddings and Probabilistic Models in Topic Modeling

Authors
Koochemeshkian, Pantea [1]
Bouguila, Nizar [1]
Affiliations
[1] Concordia Inst Informat Syst Engn CIISE, Informat Syst Engn, Montreal, PQ, Canada
Keywords
DIRICHLET; EXTRACTION
DOI
10.1080/08839514.2024.2403904
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Topic modeling, the task of discovering latent themes in large volumes of text, has advanced considerably with the help of deep learning. This paper presents two novel approaches to topic modeling that integrate embeddings derived from BERTopic with the multi-grain clustering topic model (MGCTM). Because topics in a corpus are inherently hierarchical and multi-scale, our methods use MGCTM to capture topic structure at multiple levels of granularity. We make MGCTM more expressive by introducing the Generalized Dirichlet and Beta-Liouville distributions as priors, which offer greater flexibility in modeling topic proportions and capture richer relationships among topics. Comprehensive experiments on several datasets show that the proposed models achieve higher topic coherence and finer granularity than state-of-the-art methods. These findings underscore the potential of hybrid architectures that combine neural embeddings with advanced probabilistic modeling.
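To illustrate why the two priors named in the abstract are more flexible than a plain Dirichlet, the sketch below draws topic-proportion vectors from each using their standard constructive definitions. It is not the authors' implementation: the function names, parameter names, and parameter values are assumptions chosen for illustration, and the sketch covers only sampling from the priors, not the MGCTM model or the embedding step.

```python
import numpy as np


def sample_generalized_dirichlet(a, b, rng):
    """Draw one point on the simplex with len(a) + 1 components.

    Stick-breaking construction: v_i ~ Beta(a_i, b_i) independently,
    and component i takes the fraction v_i of the stick left over
    after the previous breaks.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    v = rng.beta(a, b)
    # remaining[i] is the stick length still available before break i
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v)))
    x = v * remaining[:-1]
    # The final component is whatever is left of the stick
    return np.append(x, remaining[-1])


def sample_beta_liouville(alphas, alpha, beta, rng):
    """Draw one point on the simplex with len(alphas) + 1 components.

    Construction: u ~ Beta(alpha, beta) sets the total mass of the
    first len(alphas) components, and a Dirichlet(alphas) draw splits
    that mass among them. The last component is 1 - u.
    """
    alphas = np.asarray(alphas, dtype=float)
    u = rng.beta(alpha, beta)
    y = rng.dirichlet(alphas)
    return np.append(u * y, 1.0 - u)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical parameter values, for illustration only
    print(sample_generalized_dirichlet([2.0, 3.0, 1.5], [4.0, 2.0, 1.0], rng))
    print(sample_beta_liouville([1.0, 2.0, 3.0], alpha=2.0, beta=5.0, rng=rng))
```

Both samplers return non-negative vectors that sum to one, so either can stand in for a Dirichlet draw of topic proportions. The extra Beta parameters let the variance and covariance of the components be set more freely than a single Dirichlet concentration vector allows.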
Pages: 33