Low Rank Approximation with Entrywise $\ell_1$-Norm Error

Cited by: 35
Authors
Song, Zhao [1]
Woodruff, David P. [2]
Zhong, Peilin [3]
Affiliations
[1] Univ Texas Austin, Dept Comp Sci, Austin, TX 78712 USA
[2] IBM Almaden Res Ctr, San Jose, CA 95120 USA
[3] Columbia Univ, Dept Comp Sci, New York, NY 10027 USA
Source
STOC'17: PROCEEDINGS OF THE 49TH ANNUAL ACM SIGACT SYMPOSIUM ON THEORY OF COMPUTING | 2017
Keywords
Entry-wise $\ell_1$ norm; low rank approximation; robust algorithms; sketching; numerical linear algebra; PRINCIPAL COMPONENT ANALYSIS; COMPUTATIONAL-COMPLEXITY; DECISION PROBLEM; 1ST-ORDER THEORY; PRELIMINARIES; ALGORITHMS; GEOMETRY; REALS
DOI
10.1145/3055399.3055431
Chinese Library Classification (CLC) number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
We study the $\ell_1$-low rank approximation problem, where for a given $n \times d$ matrix $A$ and approximation factor $\alpha \geq 1$, the goal is to output a rank-$k$ matrix $\widehat{A}$ for which $\|A - \widehat{A}\|_1 \leq \alpha \cdot \min_{\text{rank-}k \text{ matrices } A'} \|A - A'\|_1$, where for an $n \times d$ matrix $C$, we let $\|C\|_1 = \sum_{i=1}^{n} \sum_{j=1}^{d} |C_{i,j}|$. This error measure is known to be more robust than the Frobenius norm in the presence of outliers and is indicated in models where Gaussian assumptions on the noise may not apply. The problem was shown to be NP-hard by Gillis and Vavasis and a number of heuristics have been proposed. It was asked in multiple places if there are any approximation algorithms. We give the first provable approximation algorithms for $\ell_1$-low rank approximation, showing that it is possible to achieve approximation factor $\alpha = (\log d) \cdot \mathrm{poly}(k)$ in $\mathrm{nnz}(A) + (n + d)\,\mathrm{poly}(k)$ time, where $\mathrm{nnz}(A)$ denotes the number of non-zero entries of $A$. If $k$ is constant, we further improve the approximation ratio to $O(1)$ with a $\mathrm{poly}(nd)$-time algorithm. Under the Exponential Time Hypothesis, we show there is no $\mathrm{poly}(nd)$-time algorithm achieving a $\left(1 + \frac{1}{\log^{1+\gamma}(nd)}\right)$-approximation, for $\gamma > 0$ an arbitrarily small constant, even when $k = 1$. We give a number of additional results for $\ell_1$-low rank approximation: nearly tight upper and lower bounds for column subset selection, CUR decompositions, extensions to low rank approximation with respect to $\ell_p$-norms for $1 \leq p < 2$ and earthmover distance, low-communication distributed protocols and low-memory streaming algorithms, algorithms with limited randomness, and bicriteria algorithms. We also give a preliminary empirical evaluation.
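As a minimal sketch of the objective defined in the abstract (not the paper's algorithm), the following Python evaluates the entrywise $\ell_1$ error of a candidate rank-1 matrix and checks the $\alpha$-approximation condition against a reference error. The toy matrix and the helper names `entrywise_l1_norm`, `outer`, `l1_error`, and `satisfies_guarantee` are assumptions made for illustration only.

```python
def entrywise_l1_norm(C):
    """Entrywise l1 norm: sum of |C[i][j]| over all entries."""
    return sum(abs(x) for row in C for x in row)


def outer(u, v):
    """Rank-1 matrix u v^T as a list of rows."""
    return [[ui * vj for vj in v] for ui in u]


def l1_error(A, A_hat):
    """Entrywise l1 error ||A - A_hat||_1 for equally shaped matrices."""
    diff = [[a - b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(A, A_hat)]
    return entrywise_l1_norm(diff)


def satisfies_guarantee(err, reference_err, alpha):
    """Check err <= alpha * reference_err (reference_err stands in for the optimum)."""
    return err <= alpha * reference_err


# Hypothetical toy data: every row is a multiple of (1, 2),
# except that the last entry is 20 instead of 10 (a single outlier).
A = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0], [5.0, 20.0]]

# Rank-1 candidate that ignores the outlier entry.
candidate = outer([1.0, 2.0, 3.0, 4.0, 5.0], [1.0, 2.0])

err_candidate = l1_error(A, candidate)   # only the outlier entry contributes
err_zero = entrywise_l1_norm(A)          # error of the all-zeros matrix
```

Here `err_candidate` is the quantity $\|A - \widehat{A}\|_1$ from the abstract for one particular $\widehat{A}$. Finding the true minimizer is NP-hard according to the abstract, so `satisfies_guarantee` is only meaningful when a reference error is known or bounded by other means.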
Pages: 688-701
Number of pages: 14