Sparse Representation Based Query Classification Using LDA Topic Modeling

Cited by: 2
Authors
Bhattacharya, Indrani [1]
Sil, Jaya [1]
Affiliations
[1] Indian Inst Engn Sci & Technol, Sibpur, Howrah, India
Source
PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON DATA ENGINEERING AND COMMUNICATION TECHNOLOGY, ICDECT 2016, VOL 2 | 2017 / Vol. 469
Keywords
Topic modeling; LDA; Sparse classifier; Statistical methods
DOI
10.1007/978-981-10-1678-3_59
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In recent years, the tremendous growth in the number of documents has created both scope and challenges for the interdisciplinary research community working on text processing for information retrieval. Text analytics reveals high-quality information by identifying patterns and their trends using statistical methods. In this paper, we propose a novel approach to classifying a user query in a reduced search space by treating the query as a collection of words distributed over different topics. Latent Dirichlet allocation (LDA) is used for topic modeling, and a collection of topics containing words is obtained following a Dirichlet distribution. We construct a sparse matrix, called the topic-vocabulary matrix (TVM), from the probability distribution of words appearing in the topics. Finally, a sparse representation based classifier (SRC) is applied to classify the query using the TVM, which consists of training patterns. We analyze the effect of the number of patterns on query classification and achieve 90.4% accuracy.
Pages: 621-629
Page count: 9
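
The classification step described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the vocabulary, the four training patterns, their class labels, and the orientation of the TVM (patterns as columns) are hypothetical, and the hand-written word probabilities stand in for LDA output, which is not computed here. A simple iterative soft-thresholding (ISTA) solver is swapped in for the l1 minimization, since the abstract does not say which solver the authors used. The query is assigned to the class whose training patterns reconstruct it with the smallest residual.

```python
from math import sqrt

# Hypothetical toy vocabulary and training patterns; in the paper these
# would come from LDA topic-word distributions, which are not computed here.
VOCAB = ["match", "goal", "team", "bank", "loan", "rate"]
PATTERNS = [  # one word-probability vector per training pattern
    [0.5, 0.3, 0.2, 0.0, 0.0, 0.0],
    [0.2, 0.4, 0.4, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.5, 0.3, 0.2],
    [0.0, 0.0, 0.0, 0.2, 0.3, 0.5],
]
LABELS = ["sports", "sports", "finance", "finance"]

# Topic-vocabulary matrix: rows are words, columns are training patterns
# (an assumed orientation, chosen so that a query is coded over the columns).
TVM = [list(row) for row in zip(*PATTERNS)]


def query_vector(words, vocab=VOCAB):
    """Turn a query into normalized word frequencies over the vocabulary."""
    counts = [float(words.count(w)) for w in vocab]
    total = sum(counts) or 1.0
    return [c / total for c in counts]


def soft_threshold(v, t):
    """Shrink v toward zero by t (the proximal step for the l1 penalty)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def sparse_code(A, y, lam=0.01, iters=500):
    """Approximate argmin_x 0.5*||A x - y||^2 + lam*||x||_1 with ISTA."""
    n = len(A[0])
    # The squared Frobenius norm bounds the gradient's Lipschitz constant,
    # so 1/L is a safe (if conservative) step size.
    L = sum(a * a for row in A for a in row) or 1.0
    x = [0.0] * n
    for _ in range(iters):
        r = [sum(a * xj for a, xj in zip(row, x)) - yi for row, yi in zip(A, y)]
        g = [sum(row[j] * ri for row, ri in zip(A, r)) for j in range(n)]
        x = [soft_threshold(x[j] - g[j] / L, lam / L) for j in range(n)]
    return x


def src_classify(A, labels, y, lam=0.01):
    """Return the class with the smallest class-wise reconstruction residual."""
    x = sparse_code(A, y, lam)
    residuals = {}
    for c in sorted(set(labels)):
        # Reconstruct y using only the coefficients that belong to class c.
        recon = [sum(a * xj for a, xj, lab in zip(row, x, labels) if lab == c)
                 for row in A]
        residuals[c] = sqrt(sum((yi - ri) ** 2 for yi, ri in zip(y, recon)))
    return min(residuals, key=residuals.get), residuals


label, residuals = src_classify(TVM, LABELS, query_vector(["team", "goal", "goal"]))
print(label, residuals)
```

In this toy setup the query shares words only with the two "sports" patterns, so the sparse code puts all of its weight on those columns and the "sports" residual is the smaller one. With a real corpus, the columns of the TVM would be built from fitted topic-word distributions, and the number of training patterns per class would be the quantity whose effect the abstract reports analyzing.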