Integration of Neural Embeddings and Probabilistic Models in Topic Modeling

Authors
Koochemeshkian, Pantea [1]
Bouguila, Nizar [1]
Affiliations
[1] Concordia Inst Informat Syst Engn CIISE, Informat Syst Engn, Montreal, PQ, Canada
Keywords
DIRICHLET; EXTRACTION;
DOI
10.1080/08839514.2024.2403904
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Topic modeling, a way to find topics in large volumes of text, has grown with the help of deep learning. This paper presents two novel approaches to topic modeling by integrating embeddings derived from Bert-Topic with the multi-grain clustering topic model (MGCTM). Recognizing the inherent hierarchical and multi-scale nature of topics in corpora, our methods utilize MGCTM to capture topic structures at multiple levels of granularity. We enhance the expressiveness of MGCTM by introducing the Generalized Dirichlet and Beta-Liouville distributions as priors, which provide greater flexibility in modeling topic proportions and capturing richer topic relationships. Comprehensive experiments on various datasets showcase the effectiveness of our proposed models in achieving superior topic coherence and granularity compared to state-of-the-art methods. Our findings underscore the potential of leveraging hybrid architectures, marrying neural embeddings with advanced probabilistic modeling, to push the boundaries of topic modeling.
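To make the flexibility claim about the Generalized Dirichlet prior concrete, the following minimal sketch draws topic-proportion vectors from a Generalized Dirichlet distribution using the standard stick-breaking construction with independent Beta variables. It is an illustration only, not the authors' implementation: the function name `sample_generalized_dirichlet` and the parameter values are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np


def sample_generalized_dirichlet(alpha, beta, size, rng):
    """Draw `size` vectors of K+1 proportions from a Generalized Dirichlet.

    Hypothetical helper for illustration. `alpha` and `beta` are length-K
    positive parameter vectors; each returned row sums to 1.
    """
    alpha = np.asarray(alpha, dtype=float)
    beta = np.asarray(beta, dtype=float)
    k = alpha.shape[0]
    # One independent Beta draw per stick-breaking step: shape (size, K).
    v = rng.beta(alpha, beta, size=(size, k))
    # Stick length left after each step: (1-v1), (1-v1)(1-v2), ...
    remaining = np.cumprod(1.0 - v, axis=1)
    # Stick length available before each step: 1, (1-v1), ...
    before = np.hstack([np.ones((size, 1)), remaining[:, :-1]])
    head = v * before          # first K proportions
    tail = remaining[:, -1:]   # last proportion takes what is left
    return np.hstack([head, tail])


rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily
theta = sample_generalized_dirichlet(
    alpha=[2.0, 1.0, 3.0], beta=[4.0, 2.0, 1.0], size=5, rng=rng
)
```

Because each step has its own pair of Beta parameters, this prior has roughly twice as many free parameters as a standard Dirichlet over the same number of topics, which is one way to read the "greater flexibility" mentioned in the abstract.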
Pages: 33