Initialization in Gibbs Sampling Implementation of LDA

Times cited: 0
Authors
Tekin, Yasar [1 ]
Affiliation
[1] Techinsoft Bilisim Teknol AS, Merkez, Turkiye
Source
32ND IEEE SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE, SIU 2024 | 2024
Keywords
topic modeling; LDA; Gibbs sampling; random and fixed initialization
DOI
10.1109/SIU61531.2024.10600919
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Latent Dirichlet Allocation (LDA) is a text mining technique for automatically extracting the topics addressed in a document collection. Sampling-based or variational inference algorithms are used to approximate the LDA posterior. This study investigates how random initialization in the Gibbs sampling implementation of LDA affects model coherence. To this end, topic models with identical parameter values are initialized with random and with fixed values, and the resulting model coherence scores are compared.
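The comparison described in the abstract can be illustrated with a minimal collapsed Gibbs sampler whose starting topic assignments are either random or fixed. This is a sketch under stated assumptions, not the author's implementation: the abstract does not say how the fixed values are chosen, so the round-robin scheme below is hypothetical, as are the function names `init_assignments` and `gibbs_lda`, the hyperparameter defaults, and the toy corpus. Coherence scoring is left out; the topic-word counts returned here would feed whatever coherence measure is used.

```python
import random


def init_assignments(docs, num_topics, mode="random", seed=None):
    """Return one starting topic index per token.

    mode="random": draw each topic uniformly at random.
    mode="fixed":  assign topics round-robin by token position
                   (a hypothetical deterministic scheme, assumed here).
    """
    rng = random.Random(seed)
    if mode == "random":
        return [[rng.randrange(num_topics) for _ in doc] for doc in docs]
    return [[i % num_topics for i in range(len(doc))] for doc in docs]


def gibbs_lda(docs, vocab_size, num_topics, z, alpha=0.1, beta=0.01,
              sweeps=50, seed=0):
    """Collapsed Gibbs sampling for LDA, starting from assignments z."""
    rng = random.Random(seed)
    z = [list(row) for row in z]  # copy so the caller's init is untouched
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics

    # Build the count tables implied by the initial assignments.
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1

    topics = range(num_topics)
    for _ in range(sweeps):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = z[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Unnormalized conditional probability of each topic.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in topics
                ]
                # Resample the topic and restore the counts.
                k = rng.choices(topics, weights=weights)[0]
                z[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    return z, doc_topic, topic_word


# Toy corpus: each document is a list of word ids in a 5-word vocabulary.
docs = [[0, 1, 2, 0], [3, 4, 3, 4], [0, 1, 3]]
z_fixed = init_assignments(docs, num_topics=2, mode="fixed")
z_random = init_assignments(docs, num_topics=2, mode="random", seed=42)
fit_fixed = gibbs_lda(docs, vocab_size=5, num_topics=2, z=z_fixed)
fit_random = gibbs_lda(docs, vocab_size=5, num_topics=2, z=z_random)
```

Both runs share the same hyperparameters and the same sampler seed, so any difference between `fit_fixed` and `fit_random` comes from the starting assignments alone, which is the variable the study isolates.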
Pages: 4