Text Categorization Based on Topic Model

Authors
School of Computer Science and Technology, China University of Mining and Technology, Jiangsu Province, Xuzhou [1]
221116, China
Unknown [2]
100081, China
Affiliations
[1] School of Computer Science and Technology, China University of Mining and Technology, Jiangsu Province, Xuzhou
[2] School of Computer Science and Technology, Beijing Institute of Technology, Haidian District, Beijing
Source
Int. J. Comput. Intell. Syst. | 2009 / Issue 4 / pp. 398-409
Keywords
Category Language Model; Latent Dirichlet allocation; Topic model; Variational Inference;
DOI
10.2991/ijcis.2009.2.4.8
Abstract
Many topic models have been proposed in the text-processing literature to represent documents and words as topics or latent topics, so that text can be processed effectively and accurately. In this paper, we propose the Latent Dirichlet Allocation Category Language Model (LDACLM) for text categorization and estimate the model parameters by variational inference. As a variant of the Latent Dirichlet Allocation model, LDACLM treats the documents of each category as a language model and uses variational parameters to compute maximum a posteriori estimates of terms. Experiments show that LDACLM is effective: it outperforms Naïve Bayes with Laplace smoothing and the Rocchio algorithm, but is slightly inferior to SVM for text categorization. © 2009, the authors.
Pages: 398-409
Page count: 11
Related papers
50 in total
  • [31] Short text classification using semantically enriched topic model
    Uddin, Farid
    Chen, Yibo
    Zhang, Zuping
    Huang, Xin
    JOURNAL OF INFORMATION SCIENCE, 2025, 51 (02) : 481 - 498
  • [32] Automated Text Document Categorization
    Yasotha, R.
    Charles, E. Y. A.
    2015 IEEE SEVENTH INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING AND INFORMATION SYSTEMS (ICICIS), 2015, : 522 - 528
  • [33] Multi-LDA hybrid topic model with boosting strategy and its application in text classification
    Wang Yongliang
    Guo Qiao
    2014 33RD CHINESE CONTROL CONFERENCE (CCC), 2014, : 4802 - 4806
  • [34] Multi-level Topical Text Categorization with Wikipedia
    Guo, Nan
    He, Yuan
    Yan, ChunGang
    Liu, Lu
    Wang, Cheng
    2016 IEEE/ACM 9TH INTERNATIONAL CONFERENCE ON UTILITY AND CLOUD COMPUTING (UCC), 2016, : 343 - 352
  • [35] I-Topic: An Image-text Topic Modeling Method Based on Community Detection
    Liu, Jiapeng
    Zhang, Leihan
    Yan, Qiang
    2024 5TH INTERNATIONAL CONFERENCE ON COMPUTER ENGINEERING AND APPLICATION, ICCEA 2024, 2024, : 797 - 800
  • [36] Short Text Mining for Fault Diagnosis of Railway System Based on Multi-Granularity Topic Model
    Wu, Shun
    2018 8TH INTERNATIONAL CONFERENCE ON LOGISTICS, INFORMATICS AND SERVICE SCIENCES (LISS), 2018
  • [37] A Novel Chinese Text Topic Extraction Method Based on LDA
    Liu, Qihua
    PROCEEDINGS OF 2015 4TH INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND NETWORK TECHNOLOGY (ICCSNT 2015), 2015, : 53 - 57
  • [38] A short text sentiment-topic model for product review analysis
    Xiong, S.-F.
    Ji, D.-H.
    Zidonghua Xuebao/Acta Automatica Sinica, 2016, 42 (08) : 1227 - 1237
  • [39] An Efficient Framework by Topic Model for Multi-label Text Classification
    Sun, Wei
    Ran, Xiangying
    Luo, Xiangyang
    Wang, Chongjun
    2019 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2019
  • [40] Decarbonization of Turkey: Text Mining Based Topic Modeling for the Literature
    Yilmaz, Selin
    Yesil, Ercem
    Kaya, Tolga
    INTELLIGENT AND FUZZY SYSTEMS: DIGITAL ACCELERATION AND THE NEW NORMAL, INFUS 2022, VOL 2, 2022, 505 : 372 - 379