Streaming Algorithms via Precision Sampling

被引:40
作者
Andoni, Alexandr [1 ]
Krauthgamer, Robert [2 ]
Onak, Krzysztof [3 ]
机构
[1] Microsoft Res, Mountain View, CA 94043 USA
[2] Weizmann Inst Sci, Rehovot, Israel
[3] CMU, Pittsburgh, PA USA
来源
2011 IEEE 52ND ANNUAL SYMPOSIUM ON FOUNDATIONS OF COMPUTER SCIENCE (FOCS 2011) | 2011年
基金
美国国家科学基金会; 以色列科学基金会;
关键词
streaming; sampling; moments; cascaded norms;
D O I
10.1109/FOCS.2011.82
中图分类号
TP301 [理论、方法];
学科分类号
081202 ;
摘要
A technique introduced by Indyk and Woodruff (STOC 2005) has inspired several recent advances in data-stream algorithms. We show that a number of these results follow easily from the application of a single probabilistic method called Precision Sampling. Using this method, we obtain simple data-stream algorithms that maintain a randomized sketch of an input vector x = (x(1), x(2), ... , x(n)), which is useful for the following applications: Estimating the F-k-moment of x, for k > 2. Estimating the l(p)-norm of x, for p is an element of [1, 2], with small update time. Estimating cascaded norms l(p)(l(q)) for all p, q > 0. l(1) sampling, where the goal is to produce an element i with probability (approximately) vertical bar x(i)vertical bar/parallel to x parallel to(1). It extends to similarly defined l(p)-sampling, for p is an element of [1, 2]. For all these applications the algorithm is essentially the same: scale the vector x entry-wise by a well-chosen random vector, and run a heavy-hitter estimation algorithm on the resulting vector. Our sketch is a linear function of x, thereby allowing general updates to the vector x. Precision Sampling itself addresses the problem of estimating a sum Sigma(n)(i=1) a(i) from weak estimates of each real a(i) is an element of [0, 1]. More precisely, the estimator first chooses a desired precision u(i) is an element of (0, 1] for each i is an element of [n], and then it receives an estimate of every a(i) within additive ui. Its goal is to provide a good approximation to Sigma a(i) while keeping a tab on the "approximation cost" Sigma(i) (1/u(i)). Here we refine previous work (Andoni, Krauthgamer, and Onak, FOCS 2010) which shows that as long as Sigma a(i) = Omega(1), a good multiplicative approximation can be achieved using total precision of only O(n log n).
引用
收藏
页码:363 / 372
页数:10
相关论文
共 31 条
  • [1] The space complexity of approximating the frequency moments
    Alon, N
    Matias, Y
    Szegedy, M
    [J]. JOURNAL OF COMPUTER AND SYSTEM SCIENCES, 1999, 58 (01) : 137 - 147
  • [2] Andoni A., 2009, P FOCS
  • [3] Andoni A., 2010, P FOCS
  • [4] An information statistics approach to data stream and communication complexity
    Bar-Yossef, Z
    Jayram, TS
    Kumar, R
    Sivakumar, D
    [J]. JOURNAL OF COMPUTER AND SYSTEM SCIENCES, 2004, 68 (04) : 702 - 732
  • [5] Simpler algorithm for estimating frequency moments of data streams
    Bhuvanagiri, Lakshminath
    Ganguly, Sumit
    Kesh, Deepanjan
    Saha, Chandan
    [J]. PROCEEDINGS OF THE SEVENTHEENTH ANNUAL ACM-SIAM SYMPOSIUM ON DISCRETE ALGORITHMS, 2006, : 708 - 713
  • [6] Bhuvanagiri L, 2006, LECT NOTES COMPUT SC, V4168, P148
  • [7] Braverman V., 2010, P STOC
  • [8] Braverman Vladimir., 2010, CORR
  • [9] Near-optimal lower bounds on the multi-party communication complexity of set disjointness
    Chakrabarti, A
    Khot, S
    Sun, XD
    [J]. 18TH IEEE ANNUAL CONFERENCE ON COMPUTATIONAL COMPLEXITY, PROCEEDINGS, 2003, : 107 - 117
  • [10] Charikar M., 2002, P ICALP