Sparse Representation Based Query Classification Using LDA Topic Modeling

Cited by: 2
Authors
Bhattacharya, Indrani [1]
Sil, Jaya [1]
Affiliation
[1] Indian Inst Engn Sci & Technol, Sibpur, Howrah, India
Source
PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON DATA ENGINEERING AND COMMUNICATION TECHNOLOGY, ICDECT 2016, VOL 2 | 2017 / Vol. 469
Keywords
Topic modeling; LDA; Sparse classifier; Statistical methods;
DOI
10.1007/978-981-10-1678-3_59
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In recent years, the tremendous growth in the number of documents has created both scope and challenges for the interdisciplinary research community working on text processing for information retrieval. Text analytics reveals high-quality information by identifying patterns and trends using statistical methods. In this paper, we propose a novel approach that classifies a user query in a reduced search space by treating the query as a collection of words distributed over different topics. Latent Dirichlet allocation (LDA) is used for topic modeling, and a collection of topics containing words is obtained following the Dirichlet distribution. We construct a sparse matrix, called the topic-vocabulary matrix (TVM), from the probability distribution of the words appearing in the topics. Finally, a sparse representation based classifier (SRC) is applied to classify the query, using the TVM, which consists of training patterns. We analyze the effect of the number of patterns on query classification and achieve 90.4% accuracy.
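The pipeline described in the abstract (a topic-vocabulary matrix feeding a sparse representation based classifier) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the vocabulary, the TVM values (invented by hand rather than produced by LDA), the class labels attached to topics, and the use of ISTA as the sparse solver. The record does not state which solver or parameters the authors used, so this should be read as one possible instance of SRC, not as the authors' implementation.

```python
import math

# Hypothetical toy topic-vocabulary matrix (TVM): one row per topic, one
# column per vocabulary word. In the paper these entries would come from
# LDA; the numbers below are invented for illustration only.
VOCAB = ["goal", "match", "team", "stock", "market", "price"]
TVM = [
    [0.5, 0.3, 0.2, 0.0, 0.0, 0.0],  # topic 0
    [0.2, 0.4, 0.4, 0.0, 0.0, 0.0],  # topic 1
    [0.0, 0.0, 0.0, 0.5, 0.3, 0.2],  # topic 2
    [0.0, 0.0, 0.0, 0.2, 0.3, 0.5],  # topic 3
]
TOPIC_LABELS = ["sports", "sports", "finance", "finance"]  # assumed labels


def normalize(v):
    """Scale a vector to unit Euclidean length (zero vectors are unchanged)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n > 0 else list(v)


def query_vector(query):
    """Turn a query string into a unit-length word-count vector over VOCAB."""
    words = query.lower().split()
    return normalize([float(words.count(w)) for w in VOCAB])


def reconstruct(atoms, coeffs):
    """Linear combination of the training patterns (atoms)."""
    return [sum(c * a[j] for c, a in zip(coeffs, atoms))
            for j in range(len(atoms[0]))]


def sparse_code(atoms, y, lam=0.01, step=0.2, iters=500):
    """Approximate argmin_x 0.5*||y - D x||^2 + lam*||x||_1 using ISTA.

    ISTA is an assumed solver choice; any l1 solver could be substituted.
    """
    x = [0.0] * len(atoms)
    for _ in range(iters):
        recon = reconstruct(atoms, x)
        resid = [r - t for r, t in zip(recon, y)]
        for i, a in enumerate(atoms):
            grad = sum(ai * ri for ai, ri in zip(a, resid))
            z = x[i] - step * grad
            # Soft-thresholding keeps the coefficient vector sparse.
            x[i] = math.copysign(max(abs(z) - step * lam, 0.0), z)
    return x


def classify(query):
    """Assign the class whose topics reconstruct the query with least error."""
    atoms = [normalize(row) for row in TVM]
    y = query_vector(query)
    x = sparse_code(atoms, y)
    residuals = {}
    for label in sorted(set(TOPIC_LABELS)):
        # Keep only the coefficients that belong to this class.
        kept = [c if l == label else 0.0 for c, l in zip(x, TOPIC_LABELS)]
        recon = reconstruct(atoms, kept)
        residuals[label] = math.sqrt(
            sum((a - b) ** 2 for a, b in zip(y, recon)))
    return min(residuals, key=residuals.get), residuals


if __name__ == "__main__":
    print(classify("team match goal"))
```

In this toy setup a query such as "team match goal" overlaps only with the two topics labeled sports, so the sports residual is the smaller one and that label is returned. With a real corpus the TVM rows would be replaced by the word distributions of the fitted LDA topics.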
Pages: 621-629
Page count: 9