Streaming Algorithms via Precision Sampling

被引：40

作者：

Andoni, Alexandr ^{[1
]}

Krauthgamer, Robert ^{[2
]}

Onak, Krzysztof ^{[3
]}

机构：

[1] Microsoft Res, Mountain View, CA 94043 USA

[2] Weizmann Inst Sci, Rehovot, Israel

[3] CMU, Pittsburgh, PA USA

来源：

2011 IEEE 52ND ANNUAL SYMPOSIUM ON FOUNDATIONS OF COMPUTER SCIENCE (FOCS 2011) | 2011年

基金：

美国国家科学基金会; 以色列科学基金会;

关键词：

streaming; sampling; moments; cascaded norms;

D O I：

10.1109/FOCS.2011.82

中图分类号：

TP301 [理论、方法];

学科分类号：

081202 ;

摘要：

A technique introduced by Indyk and Woodruff (STOC 2005) has inspired several recent advances in data-stream algorithms. We show that a number of these results follow easily from the application of a single probabilistic method called Precision Sampling. Using this method, we obtain simple data-stream algorithms that maintain a randomized sketch of an input vector x = (x(1), x(2), ... , x(n)), which is useful for the following applications: Estimating the F-k-moment of x, for k > 2. Estimating the l(p)-norm of x, for p is an element of [1, 2], with small update time. Estimating cascaded norms l(p)(l(q)) for all p, q > 0. l(1) sampling, where the goal is to produce an element i with probability (approximately) vertical bar x(i)vertical bar/parallel to x parallel to(1). It extends to similarly defined l(p)-sampling, for p is an element of [1, 2]. For all these applications the algorithm is essentially the same: scale the vector x entry-wise by a well-chosen random vector, and run a heavy-hitter estimation algorithm on the resulting vector. Our sketch is a linear function of x, thereby allowing general updates to the vector x. Precision Sampling itself addresses the problem of estimating a sum Sigma(n)(i=1) a(i) from weak estimates of each real a(i) is an element of [0, 1]. More precisely, the estimator first chooses a desired precision u(i) is an element of (0, 1] for each i is an element of [n], and then it receives an estimate of every a(i) within additive ui. Its goal is to provide a good approximation to Sigma a(i) while keeping a tab on the "approximation cost" Sigma(i) (1/u(i)). Here we refine previous work (Andoni, Krauthgamer, and Onak, FOCS 2010) which shows that as long as Sigma a(i) = Omega(1), a good multiplicative approximation can be achieved using total precision of only O(n log n).

引用

页码：363 / 372

页数：10

共 31 条

[1] The space complexity of approximating the frequency moments
Alon, N
Matias, Y
Szegedy, M
[J]. JOURNAL OF COMPUTER AND SYSTEM SCIENCES, 1999, 58 (01) : 137 - 147
[2] Andoni A., 2009, P FOCS
[3] Andoni A., 2010, P FOCS
[4] An information statistics approach to data stream and communication complexity
Bar-Yossef, Z
Jayram, TS
Kumar, R
Sivakumar, D
[J]. JOURNAL OF COMPUTER AND SYSTEM SCIENCES, 2004, 68 (04) : 702 - 732
[5] Simpler algorithm for estimating frequency moments of data streams
Bhuvanagiri, Lakshminath
Ganguly, Sumit
Kesh, Deepanjan
Saha, Chandan
[J]. PROCEEDINGS OF THE SEVENTHEENTH ANNUAL ACM-SIAM SYMPOSIUM ON DISCRETE ALGORITHMS, 2006, : 708 - 713
[6] Bhuvanagiri L, 2006, LECT NOTES COMPUT SC, V4168, P148
[7] Braverman V., 2010, P STOC
[8] Braverman Vladimir., 2010, CORR
[9] Near-optimal lower bounds on the multi-party communication complexity of set disjointness
Chakrabarti, A
Khot, S
Sun, XD
[J]. 18TH IEEE ANNUAL CONFERENCE ON COMPUTATIONAL COMPLEXITY, PROCEEDINGS, 2003, : 107 - 117
[10] Charikar M., 2002, P ICALP

← 1 2 3 4 →