Clustering Improvement via Integrating with Sparse Topical Coding

Citations: 0
Authors
Ahmadi, Parvin [1]
Kaviani, Razie [1]
Gholampour, Iman [1]
Tabandeh, Mahmoud [1]
Affiliation
[1] Sharif Univ Technol, Dept Elect Engn, Tehran, Iran
Source
2015 23RD IRANIAN CONFERENCE ON ELECTRICAL ENGINEERING (ICEE) | 2015
Keywords
Document clustering; topic model; Sparse Topical Coding (STC); K-means
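DOI
Not available
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract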
Topic modeling can improve document clustering by projecting documents into a topic space. Here "document" is used in a general sense: it can be an image, a video, a text, or any data item that can be described with a bag-of-words model based on the histogram of its features. In this paper, we introduce a clustering method based on Sparse Topical Coding (STC). The proposed method integrates document clustering and topic modeling into a unified framework and performs them jointly to achieve better clustering performance. It uses STC topic modeling to mine the topics and K-means clustering to discover latent groups in the document collection. Experimental results show the effectiveness of the proposed clustering approach.
Pages: 466-471
Page count: 6
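
The abstract describes clustering documents in a topic space rather than in the raw word space. As a rough illustration only, the sketch below shows the simpler two-stage baseline (sparse topic codes first, K-means second), not the joint STC and K-means framework the paper proposes. The topic dictionary `TOPICS`, the toy word counts `DOCS`, the helper names, and the parameters `lam` and `step` are all hypothetical; the sparse coding step is a plain non-negative L1-penalised least-squares solver standing in for STC inference.

```python
def sparse_codes(doc, topics, lam=0.1, step=0.2, iters=300):
    """Non-negative, L1-penalised least squares by projected gradient.

    doc:    bag-of-words count vector.
    topics: list of K topic vectors over the same vocabulary (assumed fixed).
    """
    k, v = len(topics), len(doc)
    code = [0.0] * k
    for _ in range(iters):
        # Reconstruct the document from the current code, then take the residual.
        recon = [sum(code[t] * topics[t][w] for t in range(k)) for w in range(v)]
        resid = [r - d for r, d in zip(recon, doc)]
        for t in range(k):
            grad = sum(resid[w] * topics[t][w] for w in range(v))
            # Gradient step plus L1 shrinkage, clipped at zero to stay non-negative.
            code[t] = max(0.0, code[t] - step * (grad + lam))
    return code


def dist2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20):
    # Farthest-point initialisation keeps this toy example deterministic.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(dist2(p, c) for c in centroids)))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: dist2(p, centroids[j])) for p in points]
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical 6-word vocabulary with two non-overlapping topics.
TOPICS = [[0.5, 0.3, 0.2, 0.0, 0.0, 0.0],
          [0.0, 0.0, 0.0, 0.2, 0.3, 0.5]]
# Toy bag-of-words histograms: two documents per underlying topic.
DOCS = [[5, 3, 2, 0, 0, 0],
        [4, 3, 1, 0, 0, 0],
        [0, 0, 0, 2, 3, 5],
        [0, 0, 1, 1, 4, 4]]

codes = [sparse_codes(d, TOPICS) for d in DOCS]  # documents in topic space
labels = kmeans(codes, k=2)                      # latent groups
print(labels)
```

In this toy setting the first two documents share one label and the last two share the other. In the paper's approach the topic dictionary is learned and, per the abstract, learned jointly with the clustering; the fixed dictionary here is purely a simplification for illustration.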