On the equivalence between Non-negative Matrix Factorization and Probabilistic Latent Semantic Indexing

Cited by: 163
Authors
Ding, Chris [2]
Li, Tao [1]
Peng, Wei [1]
Affiliations
[1] Florida Int Univ, Sch Comp Sci, Miami, FL 33199 USA
[2] Univ Texas Arlington, Dept CSE, Arlington, TX 76019 USA
Funding
National Science Foundation (USA)
DOI
10.1016/j.csda.2008.01.011
Chinese Library Classification number
TP39 [Computer applications]
Subject classification codes
081203; 0835
Abstract
Non-negative Matrix Factorization (NMF) and Probabilistic Latent Semantic Indexing (PLSI) have recently been applied successfully to document clustering. In this paper, we show that PLSI and NMF (with the I-divergence objective function) optimize the same objective function, although PLSI and NMF are different algorithms, as verified by experiments. This provides a theoretical basis for a new hybrid method that runs PLSI and NMF alternately, each jumping out of the local minima of the other method in turn, thus achieving a better final solution. Extensive experiments on five real-life datasets show relations between NMF and PLSI, and indicate that the hybrid method leads to significant improvements over NMF-only or PLSI-only methods. We also show that, at first-order approximation, NMF is identical to the χ²-statistic. © 2008 Published by Elsevier B.V.
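The claim that NMF with the I-divergence and PLSI share an objective can be checked numerically. The sketch below is not the paper's proof or code. It assumes that the data matrix X and the model matrix P = F Gᵀ are both normalized to sum to 1, and all function and variable names are hypothetical. Under that assumption, the I-divergence and the PLSI negative log-likelihood differ only by the term Σ X log X, which does not depend on the model, so both are minimized by the same P.

```python
import numpy as np

EPS = 1e-12  # small constant to keep log() away from zero (illustrative choice)

def i_divergence(X, P):
    """I-divergence (generalized KL): sum of X*log(X/P) - X + P."""
    return float(np.sum(X * (np.log(X + EPS) - np.log(P + EPS)) - X + P))

def plsi_neg_log_likelihood(X, P):
    """Negative log-likelihood of X under a joint distribution P."""
    return float(-np.sum(X * np.log(P + EPS)))

def random_model(rng, n_rows, n_cols, k):
    """Random non-negative rank-k model F @ G.T, normalized to sum to 1."""
    F = rng.random((n_rows, k))
    G = rng.random((n_cols, k))
    P = F @ G.T
    return P / P.sum()

rng = np.random.default_rng(0)

# Toy "word-document" matrix, normalized to a joint distribution (assumption).
X = rng.random((6, 5))
X /= X.sum()

# Two unrelated candidate models of the same data.
P1 = random_model(rng, 6, 5, k=2)
P2 = random_model(rng, 6, 5, k=3)

# The gap between the two objectives should be the same for any model.
gap1 = i_divergence(X, P1) - plsi_neg_log_likelihood(X, P1)
gap2 = i_divergence(X, P2) - plsi_neg_log_likelihood(X, P2)
model_free_term = float(np.sum(X * np.log(X + EPS)))

print(abs(gap1 - gap2) < 1e-9, abs(gap1 - model_free_term) < 1e-9)
```

Because the gap is a constant that depends only on X, any difference between NMF and PLSI in practice comes from how each algorithm searches the objective, not from the objective itself, which is consistent with the abstract's description of the two as different algorithms.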
Pages: 3913-3927
Number of pages: 15