Mongolian Information Retrieval Method Based on LDA Model

Cited by: 0
Authors
Siriguleng [1]
Lin, Min [1]
Tian, Changbo [1]
Affiliation
[1] Inner Mongolia Normal Univ, Coll Comp & Informat Engn, Hohhot 010022, Inner Mongolia, Peoples R China
Source
PROCEEDINGS OF 2015 6TH IEEE INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING AND SERVICE SCIENCE | 2015
Keywords
Mongolian; LDA; Gibbs Sampling; Information retrieval;
DOI
Not available
Chinese Library Classification number
TP31 [Computer Software];
Subject classification code
081202 ; 0835 ;
Abstract
A new method based on Latent Dirichlet Allocation (LDA) is proposed for information retrieval in Mongolian. The method takes the semantic information of Mongolian documents into account when considering the relationship between query keywords and retrieved documents. It models Mongolian documents with LDA, estimates the parameters with Gibbs sampling, and represents word probabilities; this allows it to mine the hidden relationships between topics and words in the documents, obtain the topic distributions, and compute the topic similarity of the keywords. Finally, the documents most relevant to those topics are returned. Experimental results show that the method performs better on topic semantics than the vector space model and the language model.
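The pipeline summarized in the abstract (LDA fitted by Gibbs sampling, then ranking by topic-level relevance) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the exact similarity measure, so the scoring below assumes a query-likelihood form, P(q|d) = prod over query words of sum_k theta[d][k] * phi[k][w]. The function names `lda_gibbs` and `rank_documents`, the hyperparameters `alpha` and `beta`, and the integer word ids (standing in for already-tokenized Mongolian words) are all hypothetical.

```python
import random

def lda_gibbs(docs, n_topics, vocab_size, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA. docs is a list of lists of word ids.
    Returns (theta, phi): per-document topic and per-topic word distributions."""
    rng = random.Random(seed)
    ndk = [[0] * n_topics for _ in docs]               # document-topic counts
    nkw = [[0] * vocab_size for _ in range(n_topics)]  # topic-word counts
    nk = [0] * n_topics                                # tokens assigned to each topic
    z = []                                             # topic assignment of every token
    for d, doc in enumerate(docs):                     # random initialization
        zd = []
        for w in doc:
            k = rng.randrange(n_topics)
            zd.append(k)
            ndk[d][k] += 1
            nkw[k][w] += 1
            nk[k] += 1
        z.append(zd)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]                            # remove the current assignment
                ndk[d][k] -= 1
                nkw[k][w] -= 1
                nk[k] -= 1
                # Full conditional p(z = t | everything else), up to a constant.
                weights = [
                    (ndk[d][t] + alpha) * (nkw[t][w] + beta) / (nk[t] + vocab_size * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                z[d][i] = k                            # record the new assignment
                ndk[d][k] += 1
                nkw[k][w] += 1
                nk[k] += 1
    theta = [
        [(ndk[d][k] + alpha) / (len(doc) + n_topics * alpha) for k in range(n_topics)]
        for d, doc in enumerate(docs)
    ]
    phi = [
        [(nkw[k][w] + beta) / (nk[k] + vocab_size * beta) for w in range(vocab_size)]
        for k in range(n_topics)
    ]
    return theta, phi

def rank_documents(query, theta, phi):
    """Rank document indices by the assumed query likelihood under the topic model."""
    scored = []
    for d, th in enumerate(theta):
        p = 1.0
        for w in query:
            p *= sum(th[k] * phi[k][w] for k in range(len(th)))
        scored.append((p, d))
    return [d for _, d in sorted(scored, reverse=True)]

if __name__ == "__main__":
    # Toy corpus: word ids 0-2 and 3-5 loosely form two themes (illustrative only).
    toy_docs = [[0, 1, 2, 0, 1], [3, 4, 5, 3, 4], [0, 2, 1, 2], [5, 4, 3, 5]]
    theta, phi = lda_gibbs(toy_docs, n_topics=2, vocab_size=6)
    print(rank_documents([0, 1], theta, phi))
```

Because each query word is scored through the topic mixture rather than by exact term matching, a document can rank well even if it does not contain the query word itself, which is the kind of topic-level semantic matching the abstract describes.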
Pages: 353-356
Page count: 4