ON TENSORS, SPARSITY, AND NONNEGATIVE FACTORIZATIONS

被引:187
作者
Chi, Eric C. [1 ]
Kolda, Tamara G. [2 ]
机构
[1] Univ Calif Los Angeles, Dept Human Genet, Los Angeles, CA 90095 USA
[2] Sandia Natl Labs, Livermore, CA USA
关键词
nonnegative tensor factorization; nonnegative CANDECOMP-PARAFAC; Poisson tensor factorization; Lee-Seung multiplicative updates; majorization-minimization algorithms; CONSTRAINED LEAST-SQUARES; MATRIX FACTORIZATION; RECONSTRUCTION ALGORITHMS; CONVERGENCE; PARAFAC;
D O I
10.1137/110859063
中图分类号
O29 [应用数学];
学科分类号
070104 ;
摘要
Tensors have found application in a variety of fields, ranging from chemometrics to signal processing and beyond. In this paper, we consider the problem of multilinear modeling of sparse count data. Our goal is to develop a descriptive tensor factorization model of such data, along with appropriate algorithms and theory. To do so, we propose that the random variation is best described via a Poisson distribution, which better describes the zeros observed in the data as compared to the typical assumption of a Gaussian distribution. Under a Poisson assumption, we fit a model to observed data using the negative log-likelihood score. We present a new algorithm for Poisson tensor factorization called CANDECOMP-PARAFAC alternating Poisson regression (CP-APR) that is based on a majorization-minimization approach. It can be shown that CP-APR is a generalization of the Lee-Seung multiplicative updates. We show how to prevent the algorithm from converging to non-KKT points and prove convergence of CP-APR under mild conditions. We also explain how to implement CP-APR for large-scale sparse tensors and present results on several data sets, both real and simulated.
引用
收藏
页码:1272 / 1299
页数:28
相关论文
共 57 条
[1]   Scalable tensor factorizations for incomplete data [J].
Acar, Evrim ;
Dunlavy, Daniel M. ;
Kolda, Tamara G. ;
Morup, Morten .
CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS, 2011, 106 (01) :41-56
[2]  
AMARI S.-I., 2007, ICASSP 07
[3]  
[Anonymous], ARXIV08104225
[4]  
[Anonymous], 2006, Advances in Neural Information Processing Systems 18
[5]  
[Anonymous], 2004, Optimization
[6]  
[Anonymous], 2005, TECHNICAL REPORT
[7]  
[Anonymous], 1999, SPRINGER SCI
[8]  
[Anonymous], 2006, KDD
[9]  
[Anonymous], 1983, Generalized Linear Models
[10]  
BADER B. W., 2008, SURVEY TEXT MINING 2