Annotations in Data Streams

被引:14
作者
Chakrabarti, Amit [1 ]
Cormode, Graham [2 ]
Mcgregor, Andrew [3 ]
Thaler, Justin [4 ]
机构
[1] Dartmouth Coll, Dept Comp Sci, Hanover, NH 03755 USA
[2] Univ Warwick, Dept Comp Sci, Coventry CV47AL, W Midlands, England
[3] Univ Massachusetts, Dept Comp Sci, Amherst, MA 01003 USA
[4] Yahoo Labs, New York, NY 10018 USA
基金
美国国家科学基金会;
关键词
Interactive proof; annotations; data streams; LOWER BOUNDS; COMMUNICATION COMPLEXITY; COUNTING TRIANGLES; ALGORITHMS; COMPUTATIONS;
D O I
10.1145/2636924
中图分类号
TP301 [理论、方法];
学科分类号
081202 ;
摘要
The central goal of data stream algorithms is to process massive streams of data using sublinear storage space. Motivated by work in the database community on outsourcing database and data stream processing, we ask whether the space usage of such algorithms can be further reduced by enlisting a more powerful "helper" that can annotate the stream as it is read. We do not wish to blindly trust the helper, so we require that the algorithm be convinced of having computed a correct answer. We show upper bounds that achieve a nontrivial tradeoff between the amount of annotation used and the space required to verify it. We also prove lower bounds on such tradeoffs, often nearly matching the upper bounds, via notions related to Merlin-Arthur communication complexity. Our results cover the classic data stream problems of selection, frequency moments, and fundamental graph problems such as triangle-freeness and connectivity. Our work is also part of a growing trend-including recent studies of multipass streaming, read/write streams, and randomly ordered streams-of asking more complexity-theoretic questions about data stream processing. It is a recognition that, in addition to practical relevance, the data stream model raises many interesting theoretical questions in its own right.
引用
收藏
页数:30
相关论文
共 54 条
[1]  
Aaronson S, 2008, ACM S THEORY COMPUT, P731
[2]  
Aaronson S, 2006, ANN IEEE CONF COMPUT, P261
[3]   Lower bounds for one-way probabilistic communication complexity and their application to space complexity [J].
Ablayev, F .
THEORETICAL COMPUTER SCIENCE, 1996, 157 (02) :139-159
[4]   On the streaming model augmented with a sorting primitive [J].
Aggarwal, G ;
Datar, M ;
Rajagopalan, S ;
Ruhl, M .
45TH ANNUAL IEEE SYMPOSIUM ON FOUNDATIONS OF COMPUTER SCIENCE, PROCEEDINGS, 2004, :540-549
[5]   The space complexity of approximating the frequency moments [J].
Alon, N ;
Matias, Y ;
Szegedy, M .
JOURNAL OF COMPUTER AND SYSTEM SCIENCES, 1999, 58 (01) :137-147
[6]   Streaming Algorithms via Precision Sampling [J].
Andoni, Alexandr ;
Krauthgamer, Robert ;
Onak, Krzysztof .
2011 IEEE 52ND ANNUAL SYMPOSIUM ON FOUNDATIONS OF COMPUTER SCIENCE (FOCS 2011), 2011, :363-372
[7]  
[Anonymous], 2006, Proceedings of the twenty-fifth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
[8]  
Babai L., 1986, 27th Annual Symposium on Foundations of Computer Science (Cat. No.86CH2354-9), P337, DOI 10.1109/SFCS.1986.15
[9]  
Bar-Yossef Z, 2002, SIAM PROC S, P623
[10]   Lower Bounds for Randomized Read/Write Stream Algorithms [J].
Beame, Paul ;
Jayram, T. S. ;
Rudra, Atri .
STOC 07: PROCEEDINGS OF THE 39TH ANNUAL ACM SYMPOSIUM ON THEORY OF COMPUTING, 2007, :689-698