Novel Topic Models for Parallel Topics Extraction from Multilingual Text

Cited by: 1
Authors
Maanicshah, Kamal [1]
Manouchehri, Narges [1]
Amayri, Manar [1]
Bouguila, Nizar [1]
Affiliations
[1] Concordia Univ, Concordia Inst Informat Syst Engn, Montreal, PQ, Canada
Source
INTELLIGENT INFORMATION AND DATABASE SYSTEMS, ACIIDS 2023, PT II | 2023 / Vol. 13996
Keywords
Multi-lingual topic models; Generalized Dirichlet distribution; Beta-Liouville distribution; mixture allocation; MIXTURE
DOI
10.1007/978-981-99-5837-5_25
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this work, we propose novel topic models to extract topics from multilingual documents. We add flexibility to conventional LDA by relaxing some of the constraints in its prior. Specifically, we apply two alternative priors, namely the generalized Dirichlet and Beta-Liouville distributions. We also extend the finite mixture model to the infinite case to provide flexibility in modelling various topics. To learn our proposed models, we deploy variational inference. To evaluate our framework, we tested it on English and French documents and compared the topics and their similarities using the Jaccard index. The outcomes indicate that our proposed model could be considered a promising alternative in topic modeling.
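As a rough illustration of the Jaccard index mentioned in the abstract, the sketch below computes it for two sets of topic words. This record does not say which word sets the authors compared or how English and French words were aligned, so the word lists, the assumed translation step, and the function name `jaccard_index` are hypothetical and not taken from the paper.

```python
def jaccard_index(a, b):
    """Return |A ∩ B| / |A ∪ B| for two iterables of hashable items."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 0.0  # convention chosen here for two empty sets
    return len(set_a & set_b) / len(union)


# Hypothetical top words for one English topic and one French topic.
# The French words are assumed to have been mapped to English already;
# that mapping step is an assumption of this sketch.
topic_en = ["market", "bank", "price", "trade", "economy"]
topic_fr_translated = ["market", "bank", "economy", "government", "tax"]

score = jaccard_index(topic_en, topic_fr_translated)
print(f"Jaccard index: {score:.3f}")
```

Here the two lists share three words out of seven distinct words, so the score is 3/7. A value closer to 1 would indicate that the two topics share more of their top words.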
Pages: 297-309
Number of pages: 13