ON TENSORS, SPARSITY, AND NONNEGATIVE FACTORIZATIONS

被引:180
作者
Chi, Eric C. [1 ]
Kolda, Tamara G. [2 ]
机构
[1] Univ Calif Los Angeles, Dept Human Genet, Los Angeles, CA 90095 USA
[2] Sandia Natl Labs, Livermore, CA USA
关键词
nonnegative tensor factorization; nonnegative CANDECOMP-PARAFAC; Poisson tensor factorization; Lee-Seung multiplicative updates; majorization-minimization algorithms; CONSTRAINED LEAST-SQUARES; MATRIX FACTORIZATION; RECONSTRUCTION ALGORITHMS; CONVERGENCE; PARAFAC;
D O I
10.1137/110859063
中图分类号
O29 [应用数学];
学科分类号
070104 ;
摘要
Tensors have found application in a variety of fields, ranging from chemometrics to signal processing and beyond. In this paper, we consider the problem of multilinear modeling of sparse count data. Our goal is to develop a descriptive tensor factorization model of such data, along with appropriate algorithms and theory. To do so, we propose that the random variation is best described via a Poisson distribution, which better describes the zeros observed in the data as compared to the typical assumption of a Gaussian distribution. Under a Poisson assumption, we fit a model to observed data using the negative log-likelihood score. We present a new algorithm for Poisson tensor factorization called CANDECOMP-PARAFAC alternating Poisson regression (CP-APR) that is based on a majorization-minimization approach. It can be shown that CP-APR is a generalization of the Lee-Seung multiplicative updates. We show how to prevent the algorithm from converging to non-KKT points and prove convergence of CP-APR under mild conditions. We also explain how to implement CP-APR for large-scale sparse tensors and present results on several data sets, both real and simulated.
引用
收藏
页码:1272 / 1299
页数:28
相关论文
共 57 条
  • [1] Scalable tensor factorizations for incomplete data
    Acar, Evrim
    Dunlavy, Daniel M.
    Kolda, Tamara G.
    Morup, Morten
    [J]. CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS, 2011, 106 (01) : 41 - 56
  • [2] AMARI S.-I., 2007, ICASSP 07
  • [3] [Anonymous], ARXIV08104225
  • [4] [Anonymous], 2006, Advances in Neural Information Processing Systems 18
  • [5] [Anonymous], 2004, Optimization
  • [6] [Anonymous], 2005, TECHNICAL REPORT
  • [7] [Anonymous], 1999, SPRINGER SCI
  • [8] [Anonymous], 2006, KDD
  • [9] [Anonymous], 1983, Generalized Linear Models
  • [10] BADER B. W., 2008, SURVEY TEXT MINING 2