An NMF-framework for Unifying Posterior Probabilistic Clustering and Probabilistic Latent Semantic Indexing

Cited by: 1
Authors
Zhang, Zhong-Yuan [1]
Li, Tao [2]
Ding, Chris [3]
Tang, Jie [4]
Affiliations
[1] Cent Univ Finance & Econ, Sch Math & Stat, Beijing, Peoples R China
[2] Florida Int Univ, Sch Comp & Informat Sci, Miami, FL 33199 USA
[3] Univ Texas Arlington, Dept Comp Sci & Engn, Arlington, TX 76019 USA
[4] Tsinghua Univ, Dept Comp Sci & Technol, Beijing 100084, Peoples R China
Funding
National Natural Science Foundation of China; U.S. National Science Foundation
Keywords
Posterior probabilistic clustering; Probabilistic latent semantic indexing; NMF-framework; MATRIX FACTORIZATION;
DOI
10.1080/03610926.2012.714034
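Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification code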
020208 ; 070103 ; 0714 ;
Abstract
In document clustering, a document may be assigned to multiple clusters, and the probabilities of a document belonging to different clusters are directly normalized. We propose a new Posterior Probabilistic Clustering (PPC) model that has this normalization property. The clustering model is based on Nonnegative Matrix Factorization (NMF) and is flexible: if class-conditional probability normalization is used instead, the model reduces to Probabilistic Latent Semantic Indexing (PLSI). Systematic comparison and evaluation indicate that PPC is competitive with other state-of-the-art clustering methods. Furthermore, the results of PPC are sparser and more orthogonal, both of which are highly desirable.
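The two normalization conventions named in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: it only shows, on a made-up nonnegative document-by-cluster matrix, how normalizing each row (memberships of one document sum to 1, a posterior-style reading) differs from normalizing each column (documents within one cluster sum to 1, a class-conditional reading). The matrix values and the function names `normalize_rows` and `normalize_columns` are assumptions introduced for illustration.

```python
def normalize_rows(G):
    """Scale each row to sum to 1: row i reads as p(cluster k | document i)."""
    out = []
    for row in G:
        total = sum(row)
        # An all-zero row carries no membership information; leave it as zeros.
        out.append([v / total if total > 0 else 0.0 for v in row])
    return out


def normalize_columns(G):
    """Scale each column to sum to 1: column k reads as p(document i | cluster k)."""
    n_cols = len(G[0])
    totals = [sum(row[k] for row in G) for k in range(n_cols)]
    return [
        [row[k] / totals[k] if totals[k] > 0 else 0.0 for k in range(n_cols)]
        for row in G
    ]


# Hypothetical nonnegative factor: 3 documents (rows) by 2 clusters (columns).
G = [
    [3.0, 1.0],
    [0.0, 2.0],
    [1.0, 1.0],
]

posterior = normalize_rows(G)             # each row sums to 1
class_conditional = normalize_columns(G)  # each column sums to 1
```

Under the row convention every document gets a complete membership distribution over clusters, which is the property the abstract attributes to PPC; the column convention is the one the abstract associates with PLSI.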
Pages: 4011-4024
Page count: 14