Adaptive local learning regularized nonnegative matrix factorization for data clustering

Cited by: 17
Authors
Sheng, Yongpan [1 ]
Wang, Meng [2 ]
Wu, Tianxing [3 ]
Xu, Han [1 ]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu, Sichuan, Peoples R China
[2] Southeast Univ, Sch Comp Sci & Engn, Nanjing, Jiangsu, Peoples R China
[3] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore, Singapore
Keywords
Adaptive local structure learning; Manifold regularization; Nonnegative matrix factorization; Data clustering; MANIFOLD; ALGORITHM;
DOI
10.1007/s10489-018-1380-2
CLC classification number
TP18 [Theory of Artificial Intelligence]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405
Abstract
Data clustering aims to group input data instances into clusters such that instances within the same cluster are highly similar to one another; it is a fundamental task, either immediate or intermediate, in machine learning, pattern recognition, and information retrieval. Clustering algorithms based on graph-regularized extensions have attracted considerable interest over the past two decades, and the performance of this category of approaches is largely determined by the data similarity matrix, which is usually computed from a predefined model with a carefully tuned parameter combination. However, such fixed similarity models lack flexibility and may not be optimal in practice. In this paper, we consider both discriminative information and the data manifold from a matrix factorization point of view, and propose an adaptive local learning regularized nonnegative matrix factorization (ALLRNMF) approach for data clustering, which assumes that instance pairs with a smaller distance should have a larger probability of being assigned as probabilistic neighbors. ALLRNMF simultaneously learns the data similarity matrix under this assumption and performs the nonnegative matrix factorization. The constraint on the similarity matrix encodes both the discriminative information and the learned adaptive local structure, which benefits data clustering on the manifold. To solve the optimization problem of our approach, we propose an effective alternating optimization algorithm in which the objective function is decomposed into several subproblems, each with an optimal solution, and its convergence is theoretically guaranteed. Experiments on real-world benchmark datasets demonstrate the superior performance of our approach over existing clustering approaches.
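The following is a minimal, illustrative Python/NumPy sketch of the general idea described in the abstract, not the authors' ALLRNMF algorithm or its exact update rules: it alternates between (i) re-learning a row-normalized neighbor-probability matrix S from distances in the current low-dimensional representation, so that nearer instances receive larger neighbor probabilities, and (ii) standard graph-regularized NMF multiplicative updates using the learned S. All names and parameters here (adaptive_graph_nmf, k_neighbors, lambda_reg, n_iter) are assumptions made for illustration.

import numpy as np

def adaptive_graph_nmf(X, n_clusters, k_neighbors=5, lambda_reg=1.0, n_iter=50, seed=0):
    """Sketch only: alternate adaptive similarity learning and graph-regularized NMF."""
    n_samples, n_features = X.shape
    rng = np.random.default_rng(seed)
    U = rng.random((n_samples, n_clusters)) + 1e-3   # sample (cluster-indicator-like) factor
    V = rng.random((n_features, n_clusters)) + 1e-3  # basis factor

    for _ in range(n_iter):
        # Step 1: adaptive local structure learning -- nearer samples (in the current
        # representation U) get larger neighbor probabilities, normalized per row.
        D = np.square(U[:, None, :] - U[None, :, :]).sum(axis=-1)
        S = np.zeros((n_samples, n_samples))
        for i in range(n_samples):
            idx = np.argsort(D[i])[1:k_neighbors + 1]   # skip the sample itself
            w = 1.0 / (D[i, idx] + 1e-8)
            S[i, idx] = w / w.sum()
        S = (S + S.T) / 2.0                              # symmetrize the learned graph
        deg = S.sum(axis=1)                              # degrees for the Laplacian term

        # Step 2: multiplicative updates for NMF with a Tr(U^T L U) graph regularizer.
        V *= (X.T @ U) / (V @ (U.T @ U) + 1e-8)
        U *= (X @ V + lambda_reg * (S @ U)) / (U @ (V.T @ V) + lambda_reg * (deg[:, None] * U) + 1e-8)

    labels = U.argmax(axis=1)                            # hard cluster assignment
    return U, V, S, labels

Under these assumptions, a call such as adaptive_graph_nmf(X, n_clusters=10) returns the two nonnegative factors, the adaptively learned similarity matrix, and hard cluster labels derived from the sample factor.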
Pages: 2151 - 2168
Page count: 18