Sparse Representation Based Query Classification Using LDA Topic Modeling

Cited by: 2
Authors
Bhattacharya, Indrani [1]
Sil, Jaya [1]
Affiliations
[1] Indian Inst Engn Sci & Technol, Sibpur, Howrah, India
Source
PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON DATA ENGINEERING AND COMMUNICATION TECHNOLOGY, ICDECT 2016, VOL 2 | 2017, Vol. 469
Keywords
Topic modeling; LDA; Sparse classifier; Statistical methods
DOI
10.1007/978-981-10-1678-3_59
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In recent years, the tremendous growth in the number of documents has created both opportunities and challenges for the interdisciplinary research community working on text processing for information retrieval. Text analytics extracts high-quality information by identifying patterns and trends using statistical methods. In this paper, we propose a novel approach that classifies user queries in a reduced search space by treating each query as a collection of words distributed over different topics. Latent Dirichlet allocation (LDA) is used for topic modeling, and a collection of topics, each containing words, is obtained following a Dirichlet distribution. We construct a sparse matrix, called the topic-vocabulary matrix (TVM), from the probability distribution of the words appearing in the topics. Finally, a sparse representation based classifier (SRC) is applied to classify queries, using the TVM as the set of training patterns. We analyze the effect of the number of patterns on query classification and achieve 90.4% accuracy.
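The classification step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the vocabulary, class labels, and probabilities are hypothetical toy values, the TVM columns are written by hand instead of being learned with LDA, and the sparse coefficients are found with a generic iterative soft-thresholding (ISTA) solver for the l1-regularized least-squares problem, with the label chosen by the smallest class-wise reconstruction residual as in the standard SRC formulation. The function names (`query_vector`, `sparse_code`, `classify`) and the parameters `lam` and `iters` are assumptions made for illustration.

```python
import math

# Hypothetical toy data: none of these words, labels, or numbers come from the paper.
VOCAB = ["goal", "match", "team", "stock", "market", "bank"]

# Each training pattern plays the role of one TVM column: a word-probability
# vector for a topic, tagged with a class label. The zeros make it sparse.
PATTERNS = [
    ("sports", [0.5, 0.3, 0.2, 0.0, 0.0, 0.0]),
    ("sports", [0.2, 0.4, 0.4, 0.0, 0.0, 0.0]),
    ("finance", [0.0, 0.0, 0.0, 0.5, 0.3, 0.2]),
    ("finance", [0.0, 0.0, 0.0, 0.2, 0.3, 0.5]),
]


def query_vector(query):
    """Map a query string to normalized word frequencies over VOCAB."""
    counts = [0.0] * len(VOCAB)
    for word in query.lower().split():
        if word in VOCAB:
            counts[VOCAB.index(word)] += 1.0
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def reconstruct(coeffs):
    """Return the weighted sum of training patterns, sum_j coeffs[j] * pattern_j."""
    out = [0.0] * len(VOCAB)
    for c, (_, pattern) in zip(coeffs, PATTERNS):
        for i, p in enumerate(pattern):
            out[i] += c * p
    return out


def sparse_code(y, lam=0.01, iters=500):
    """Approximately minimize 0.5*||A x - y||^2 + lam*||x||_1 with ISTA."""
    # The squared Frobenius norm bounds the Lipschitz constant of the gradient,
    # so its reciprocal is a safe (conservative) step size.
    step = 1.0 / sum(p * p for _, pattern in PATTERNS for p in pattern)
    x = [0.0] * len(PATTERNS)
    for _ in range(iters):
        err = [r - t for r, t in zip(reconstruct(x), y)]
        new_x = []
        for xj, (_, pattern) in zip(x, PATTERNS):
            grad = sum(p * e for p, e in zip(pattern, err))
            z = xj - step * grad
            # Soft-thresholding shrinks small coefficients to zero.
            new_x.append(math.copysign(max(abs(z) - step * lam, 0.0), z))
        x = new_x
    return x


def classify(query):
    """Pick the class whose own patterns reconstruct the query best."""
    y = query_vector(query)
    x = sparse_code(y)
    best_label, best_residual = None, float("inf")
    for label in sorted({lab for lab, _ in PATTERNS}):
        # Keep only the coefficients that belong to this class.
        kept = [c if lab == label else 0.0 for c, (lab, _) in zip(x, PATTERNS)]
        approx = reconstruct(kept)
        residual = math.sqrt(sum((a - b) ** 2 for a, b in zip(y, approx)))
        if residual < best_residual:
            best_label, best_residual = label, residual
    return best_label


if __name__ == "__main__":
    print(classify("stock market bank market"))
    print(classify("goal team"))
```

In a fuller pipeline the hand-written `PATTERNS` would be replaced by topic-word distributions estimated by an LDA implementation, and the number of patterns per class could be varied to study its effect on accuracy, as the abstract describes.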
Pages: 621-629
Page count: 9
Related papers
50 entries in total
  • [1] Query Classification using LDA Topic Model and Sparse Representation Based Classifier
    Bhattacharya, Indrani
    Sil, Jaya
    PROCEEDINGS OF THE THIRD ACM IKDD CONFERENCE ON DATA SCIENCES (CODS), 2016,
  • [2] Fast Detection of Duplicate Bug Reports using LDA-based Topic Modeling and Classification
    Akilan, Thangarajah
    Shah, Dhruvit
    Patel, Nishi
    Mehta, Rinkal
    2020 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2020, : 1622 - 1629
  • [3] Financial Topic Modeling Based on the BERT-LDA Embedding
    Zhou, Mei
    Kong, Ying
    Lin, Jianwu
    2022 IEEE 20TH INTERNATIONAL CONFERENCE ON INDUSTRIAL INFORMATICS (INDIN), 2022, : 495 - 500
  • [4] LDA Based Topic Modeling of Journal Abstracts
    Anupriya, P.
    Karpagavalli, S.
    ICACCS 2015 PROCEEDINGS OF THE 2ND INTERNATIONAL CONFERENCE ON ADVANCED COMPUTING & COMMUNICATION SYSTEMS, 2015,
  • [5] Bangla News Trend Observation using LDA Based Topic Modeling
    Alam, Kazi Masudul
    Hemel, Md Tanvir Hussain
    Islam, S. M. Muhaiminul
    Akther, Avsha
    2020 23RD INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION TECHNOLOGY (ICCIT 2020), 2020,
  • [6] SHORT TEXT CLASSIFICATION BASED ON LDA TOPIC MODEL
    Chen, Qiuxing
    Yao, Lixiu
    Yang, Jie
    PROCEEDINGS OF 2016 INTERNATIONAL CONFERENCE ON AUDIO, LANGUAGE AND IMAGE PROCESSING (ICALIP), 2016, : 749 - 753
  • [7] Web content topic modeling using LDA and HTML tags
    Altarturi, Hamza H. M.
    Saadoon, Muntadher
    Anuar, Nor Badrul
    PEERJ COMPUTER SCIENCE, 2023, 9
  • [8] LDA-Based Topic Modeling Sentiment Analysis Using Topic/Document/Sentence (TDS) Model
    Farkhod, Akhmedov
    Abdusalomov, Akmalbek
    Makhmudov, Fazliddin
    Cho, Young Im
    APPLIED SCIENCES-BASEL, 2021, 11 (23):
  • [9] Text classification method based on self-training and LDA topic models
    Pavlinek, Miha
    Podgorelec, Vili
    EXPERT SYSTEMS WITH APPLICATIONS, 2017, 80 : 83 - 93
  • [10] Clustering of Business Organisations based on Textual Data - An LDA Topic Modeling Approach
    Tolner, Ferenc
    Takacs, Marta
    Eigner, Gyorgy
    Barta, Balazs
    21ST IEEE INTERNATIONAL SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE AND INFORMATICS (CINTI), 2021, : 79 - 84