WINDOW-BASED TOPIC MODEL FOR HDP

Authors
Liu, Di [1]
Zeng, Ye [1]
Luo, Yu [1]
Pang, Hong [1]
Wu, Xiao-Hua [1]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Informat & Software Engn, Chengdu 610054, Peoples R China
Source
2019 16TH INTERNATIONAL COMPUTER CONFERENCE ON WAVELET ACTIVE MEDIA TECHNOLOGY AND INFORMATION PROCESSING (ICWAMTIP) | 2019
Keywords
Hierarchical Dirichlet process; Topic model; Window; Belief propagation;
DOI
10.1109/iccwamtip47768.2019.9067737
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Subject classification code
0812;
Abstract
The hierarchical Dirichlet process (HDP) is a non-parametric Bayesian model that has been widely applied to topic modeling. However, the model rests on the "bag of words" assumption: it ignores the order of words in a document and therefore loses the semantics carried by word context. To address this, this paper proposes a window-based hierarchical Dirichlet process model (WHDP). The model uses windows to divide documents into smaller fragments and preserves the order of words as the window moves, which reduces the semantic confusion of the text. We applied our method to a real dataset and compared it with other existing methods, such as the sampling belief propagation algorithm for HDP, the LDA model, and a sliding-window-based topic model. The results show that the proposed method is superior in convergence rate, perplexity, and generalization ability.
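The windowing step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how WHDP sizes or advances its windows, so the function name `sliding_windows` and the `size` and `step` parameters below are assumptions made for illustration only. This is not the authors' implementation, and it covers only fragmenting a document, not HDP inference.

```python
from typing import List


def sliding_windows(tokens: List[str], size: int, step: int = 1) -> List[List[str]]:
    """Split an ordered token list into fragments of `size` tokens.

    Hypothetical helper: `size` and `step` are illustrative parameters,
    not values taken from the paper.
    """
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    if not tokens:
        return []
    if len(tokens) <= size:
        # A short document forms a single fragment.
        return [list(tokens)]

    # Window start positions, moving left to right so word order is kept.
    starts = list(range(0, len(tokens) - size + 1, step))
    # If the step does not land on the end, add one final window for the tail.
    if starts[-1] + size < len(tokens):
        starts.append(len(tokens) - size)
    return [tokens[s:s + size] for s in starts]


if __name__ == "__main__":
    doc = "the model uses windows to divide documents".split()
    for window in sliding_windows(doc, size=3, step=2):
        print(window)
```

Each fragment keeps its tokens in their original order, so a downstream topic model could treat a fragment, rather than the whole document, as the unit over which words co-occur. With `step` smaller than `size`, neighbouring fragments overlap and share context.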
Pages: 70-75
Number of pages: 6