Low Rank Approximation with Entrywise l1-Norm Error

Cited by: 35
Authors
Song, Zhao [1]
Woodruff, David P. [2]
Zhong, Peilin [3]
Affiliations
[1] Univ Texas Austin, Dept Comp Sci, Austin, TX 78712 USA
[2] IBM Almaden Res Ctr, San Jose, CA 95120 USA
[3] Columbia Univ, Dept Comp Sci, New York, NY 10027 USA
Source
STOC'17: PROCEEDINGS OF THE 49TH ANNUAL ACM SIGACT SYMPOSIUM ON THEORY OF COMPUTING | 2017
Keywords
Entry-wise $\ell_1$ norm; low rank approximation; robust algorithms; sketching; numerical linear algebra; PRINCIPAL COMPONENT ANALYSIS; COMPUTATIONAL-COMPLEXITY; DECISION PROBLEM; 1ST-ORDER THEORY; PRELIMINARIES; ALGORITHMS; GEOMETRY; REALS
DOI
10.1145/3055399.3055431
Chinese Library Classification (CLC) number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
We study the $\ell_1$-low rank approximation problem, where for a given $n \times d$ matrix $A$ and approximation factor $\alpha \geq 1$, the goal is to output a rank-$k$ matrix $\widehat{A}$ for which $\|A - \widehat{A}\|_1 \leq \alpha \cdot \min_{\text{rank-}k \text{ matrices } A'} \|A - A'\|_1$, where for an $n \times d$ matrix $C$, we let $\|C\|_1 = \sum_{i=1}^{n} \sum_{j=1}^{d} |C_{i,j}|$. This error measure is known to be more robust than the Frobenius norm in the presence of outliers and is indicated in models where Gaussian assumptions on the noise may not apply. The problem was shown to be NP-hard by Gillis and Vavasis and a number of heuristics have been proposed. It was asked in multiple places if there are any approximation algorithms. We give the first provable approximation algorithms for $\ell_1$-low rank approximation, showing that it is possible to achieve approximation factor $\alpha = (\log d) \cdot \mathrm{poly}(k)$ in $\mathrm{nnz}(A) + (n + d)\,\mathrm{poly}(k)$ time, where $\mathrm{nnz}(A)$ denotes the number of non-zero entries of $A$. If $k$ is constant, we further improve the approximation ratio to $O(1)$ with a $\mathrm{poly}(nd)$-time algorithm. Under the Exponential Time Hypothesis, we show there is no $\mathrm{poly}(nd)$-time algorithm achieving a $\left(1 + \frac{1}{\log^{1+\gamma}(nd)}\right)$-approximation, for $\gamma > 0$ an arbitrarily small constant, even when $k = 1$. We give a number of additional results for $\ell_1$-low rank approximation: nearly tight upper and lower bounds for column subset selection, CUR decompositions, extensions to low rank approximation with respect to $\ell_p$-norms for $1 \leq p < 2$ and earthmover distance, low-communication distributed protocols and low-memory streaming algorithms, algorithms with limited randomness, and bicriteria algorithms. We also give a preliminary empirical evaluation.
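The objective defined in the abstract can be illustrated with a small sketch. This is not the paper's algorithm; it only evaluates the entrywise $\ell_1$ error of a hand-picked rank-1 candidate and checks the $\alpha$-approximation inequality. The function names (`entrywise_l1_norm`, `l1_error_rank1`, `satisfies_guarantee`) and the toy matrix are assumptions made for illustration.

```python
def entrywise_l1_norm(C):
    """Sum of absolute values of all entries: ||C||_1 = sum_i sum_j |C[i][j]|."""
    return sum(abs(x) for row in C for x in row)


def l1_error_rank1(A, u, v):
    """Entrywise l1 error ||A - u v^T||_1 of the rank-1 candidate u v^T."""
    residual = [
        [A[i][j] - u[i] * v[j] for j in range(len(v))]
        for i in range(len(u))
    ]
    return entrywise_l1_norm(residual)


def satisfies_guarantee(candidate_error, reference_error, alpha):
    """Check candidate_error <= alpha * reference_error for some alpha >= 1.

    reference_error stands in for the (generally unknown) optimal error.
    """
    return candidate_error <= alpha * reference_error


# Hypothetical toy input: an all-ones rank-1 matrix with one corrupted entry.
A = [
    [1.0, 1.0, 1.0],
    [1.0, 1.0, 1.0],
    [1.0, 1.0, 100.0],
]
u = [1.0, 1.0, 1.0]
v = [1.0, 1.0, 1.0]

err = l1_error_rank1(A, u, v)
print(err)  # 99.0: only the corrupted entry contributes to the error
```

The outlier contributes 99 to the entrywise $\ell_1$ error, whereas it would contribute $99^2 = 9801$ to the squared Frobenius error, which is one way to see why the abstract describes the $\ell_1$ measure as more robust to outliers.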
Pages: 688-701
Page count: 14