Enforced Sparse Non-Negative Matrix Factorization

Cited by: 0
Authors
Gavin, Brendan [1, 2]
Gadepally, Vijay [2]
Kepner, Jeremy [2]
Affiliations
[1] Univ Massachusetts, Amherst, MA 01003 USA
[2] MIT, Lincoln Lab, Cambridge, MA 02139 USA
Source
2016 IEEE 30th International Parallel and Distributed Processing Symposium Workshops (IPDPSW) | 2016
Keywords
ALGORITHMS;
DOI
10.1109/IPDPSW.2016.58
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Non-negative matrix factorization (NMF) is a dimensionality reduction algorithm for data that can be represented as an undirected bipartite graph. It has become a common method for generating topic models of text data because it is known to produce good results, despite its relative simplicity of implementation and ease of computation. One challenge with applying the NMF to large datasets is that intermediate matrix products often become dense, thus stressing the memory and compute elements of the underlying system. In this article, we investigate a simple but powerful modification of the alternating least squares method of determining the NMF of a sparse matrix that enforces the generation of sparse intermediate and output matrices. This method enables the application of NMF to large datasets through improved memory and compute performance. Further, we demonstrate, empirically, that this method of enforcing sparsity in the NMF either preserves or improves both the accuracy of the resulting topic model and the convergence rate of the underlying algorithm.
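The idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how sparsity is enforced, so this example assumes one plausible rule: after each alternating least squares solve, negative entries are clipped to zero and only the largest entries of each factor are kept. The names `keep_top_k`, `sparse_als_nmf`, `k_w`, and `k_h` are hypothetical, and dense NumPy arrays are used for readability, so the sketch shows the projection step rather than the memory savings the paper reports.

```python
import numpy as np


def keep_top_k(M, k):
    """Zero all but the k largest entries of M (assumed sparsity rule)."""
    if k <= 0:
        return np.zeros_like(M)
    if k >= M.size:
        return M.copy()
    flat = M.ravel()
    # Indices of the k largest entries (order among them is irrelevant).
    idx = np.argpartition(flat, -k)[-k:]
    out = np.zeros_like(flat)
    out[idx] = flat[idx]
    return out.reshape(M.shape)


def sparse_als_nmf(A, rank, k_w, k_h, iters=50, seed=0):
    """Alternating least squares for A ~ W @ H with projected sparse factors.

    k_w and k_h are hypothetical caps on the number of nonzeros in W and H.
    """
    rng = np.random.default_rng(seed)
    m, _ = A.shape
    W = rng.random((m, rank))
    H = None
    for _ in range(iters):
        # Least-squares solve of W @ H = A for H, then project:
        # clip negatives, keep only the k_h largest entries.
        H = np.linalg.lstsq(W, A, rcond=None)[0]
        H = keep_top_k(np.maximum(H, 0.0), k_h)
        # Least-squares solve of H.T @ W.T = A.T for W, same projection.
        W = np.linalg.lstsq(H.T, A.T, rcond=None)[0].T
        W = keep_top_k(np.maximum(W, 0.0), k_w)
    return W, H


# Usage on a small random non-negative matrix (illustrative data only).
rng = np.random.default_rng(1)
A = rng.random((30, 20)) * (rng.random((30, 20)) < 0.2)
W, H = sparse_als_nmf(A, rank=3, k_w=40, k_h=30)
```

Capping the nonzero count directly bounds the storage of both factors at every iteration, which is the property the abstract attributes to enforcing sparse intermediate and output matrices.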
Pages: 902-911
Page count: 10