Text Categorization Based on Topic Model

Cited by: 0
Authors
School of Computer Science and Technology, China University of Mining and Technology, Jiangsu Province, Xuzhou 221116, China [1]
Unknown, 100081, China [2]
Affiliations
[1] School of Computer Science and Technology, China University of Mining and Technology, Jiangsu Province, Xuzhou
[2] School of Computer Science and Technology, Beijing Institute of Technology, Haidian District, Beijing
Source
Int. J. Comput. Intell. Syst. | 2009, Vol. 2, No. 4, pp. 398-409
Keywords
Category language model; Latent Dirichlet allocation; Topic model; Variational inference
DOI
10.2991/ijcis.2009.2.4.8
Abstract
Many topic models have been proposed in the text-processing literature to represent documents and words in terms of latent topics, so that text can be processed effectively and accurately. In this paper, we propose the Latent Dirichlet Allocation Category Language Model (LDACLM) for text categorization and estimate its parameters by variational inference. As a variant of the Latent Dirichlet Allocation model, LDACLM treats the documents of each category as a language model and uses the variational parameters to obtain maximum a posteriori estimates for terms. Experiments show that LDACLM is effective: it outperforms Naïve Bayes with Laplace smoothing and the Rocchio algorithm, but is slightly inferior to SVM for text categorization. © 2009, the authors.
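To make the idea of a per-category language model concrete, here is a minimal sketch of the Naïve Bayes baseline with Laplace smoothing that the abstract compares against. It is not the LDACLM method itself (no topic model or variational inference is involved). The function names `train` and `classify` and the toy corpus are assumptions made up for illustration.

```python
import math
from collections import Counter


def train(docs_by_category):
    """Build one Laplace-smoothed unigram language model per category.

    docs_by_category maps a category name to a list of tokenized documents
    (hypothetical input format chosen for this sketch).
    """
    vocab = {w for docs in docs_by_category.values() for d in docs for w in d}
    total_docs = sum(len(docs) for docs in docs_by_category.values())
    models = {}
    for cat, docs in docs_by_category.items():
        counts = Counter(w for d in docs for w in d)
        total = sum(counts.values())
        models[cat] = {
            "log_prior": math.log(len(docs) / total_docs),
            # Add-one (Laplace) smoothing: unseen terms keep a nonzero probability.
            "log_prob": {
                w: math.log((counts[w] + 1) / (total + len(vocab))) for w in vocab
            },
        }
    return models, vocab


def classify(models, vocab, tokens):
    """Return the category whose language model gives the highest log score."""
    def score(model):
        # Tokens outside the training vocabulary are ignored.
        return model["log_prior"] + sum(
            model["log_prob"][w] for w in tokens if w in vocab
        )
    return max(models, key=lambda cat: score(models[cat]))


# Toy corpus, invented for illustration only.
toy = {
    "sports": [["goal", "match", "team"], ["team", "score", "goal"]],
    "finance": [["stock", "market", "price"], ["market", "bank", "price"]],
}
models, vocab = train(toy)
label = classify(models, vocab, ["team", "goal", "price"])
```

Under these assumptions each category is scored by its smoothed unigram likelihood plus a log prior. LDACLM, as described in the abstract, instead derives the term estimates from variational parameters of an LDA variant.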
Pages: 398-409
Page count: 12
Related papers
  • [1] Text Categorization Based on Topic Model
    Zhou, Shibin
    Li, Kan
    Liu, Yushu
    INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE SYSTEMS, 2009, 2 (04) : 398 - 409
  • [2] Text categorization based on topic model
    Zhou, Shibin
    Li, Kan
    Liu, Yushu
    ROUGH SETS AND KNOWLEDGE TECHNOLOGY, 2008, 5009 : 572 - 579
  • [3] SLDA-TC: A Novel Text Categorization Approach Based on Supervised Topic Model
    Tang H.-L.
    Dou Q.-S.
    Yu L.-P.
    Song Y.-J.
    Lu M.-Y.
    Tien Tzu Hsueh Pao/Acta Electronica Sinica, 2019, 47 (06) : 1300 - 1308
  • [4] News Text Classification Model Based on Topic Model
    Li, Zhenzhong
    Shang, Wenqian
    Yan, Menghan
    2016 IEEE/ACIS 15TH INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION SCIENCE (ICIS), 2016, : 1197 - 1201
  • [5] Topic Model Based Text Similarity Measure for Chinese Judgment Document
    Wang, Yue
    Ge, Jidong
    Zhou, Yemao
    Feng, Yi
    Li, Chuanyi
    Li, Zhongjin
    Zhou, Xiaoyu
    Luo, Bin
    DATA SCIENCE, PT II, 2017, 728 : 42 - 54
  • [6] Feature Selection based on Supervised Topic Modeling for Boosting-Based Multi-Label Text Categorization
    Al-Salemi, Bassam
    Ayob, Masri
    Noah, Shahrul Azman Mohd
    Ab Aziz, Mohd Juzaiddin
    PROCEEDINGS OF THE 2017 6TH INTERNATIONAL CONFERENCE ON ELECTRICAL ENGINEERING AND INFORMATICS (ICEEI'17), 2017
  • [7] Text Classification of Network Pyramid Scheme based on Topic Model
    Mu, Pengyu
    He, Jingsha
    Zhu, Nafei
    NLPIR 2019: 2019 3RD INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING AND INFORMATION RETRIEVAL, 2019, : 15 - 19
  • [8] Smoothing LDA model for text categorization
    Li, Wenbo
    Sun, Le
    Feng, Yuanyong
    Zhang, Dakun
    INFORMATION RETRIEVAL TECHNOLOGY, 2008, 4993 : 83 - +
  • [9] Scene Categorization Using Topic Model Based Hierarchical Conditional Random Fields
    Garg, Vikram
    Hassan, Ehtesham
    Chaudhury, Santanu
    Gopal, M.
    PATTERN RECOGNITION AND MACHINE INTELLIGENCE, 2011, 6744 : 206 - 212
  • [10] SPARSE TOPIC MODEL FOR TEXT CLASSIFICATION
    Liu, Tao
    PROCEEDINGS OF 2013 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS (ICMLC), VOLS 1-4, 2013, : 1916 - 1920