Incremental evaluation of sliding-window queries over data streams

被引:36
作者
Ghanem, Thanaa M.
Hammad, Moustafa A.
Mokbel, Mohamed F.
Aref, Walid G.
Elmagarmid, Ahmed K.
机构
[1] Purdue Univ, Dept Comp Sci, W Lafayette, IN 47907 USA
[2] Univ Calgary, Dept Comp Sci, Calgary, AB T2N 1N4, Canada
[3] Univ Minnesota Twin Cities, Dept Comp Sci & Engn, Minneapolis, MN 55455 USA
基金
美国国家科学基金会;
关键词
data stream management systems; pipelined query execution; negative tuples;
D O I
10.1109/TKDE.2007.250585
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Two research efforts have been conducted to realize sliding-window queries in data stream management systems, namely, query reevaluation and incremental evaluation. In the query reevaluation method, two consecutive windows are processed independently of each other. On the other hand, in the incremental evaluation method, the query answer for a window is obtained incrementally from the answer of the preceding window. In this paper, we focus on the incremental evaluation method. Two approaches have been adopted for the incremental evaluation of sliding-window queries, namely, the input-triggered approach and the negative tuples approach. In the input-triggered approach, only the newly inserted tuples flow in the query pipeline and tuple expiration is based on the timestamps of the newly inserted tuples. On the other hand, in the negative tuples approach, tuple expiration is separated from tuple insertion where a tuple flows in the pipeline for every inserted or expired tuple. The negative tuples approach avoids the unpredictable output delays that result from the input-triggered approach. However, negative tuples double the number of tuples through the query pipeline, thus reducing the pipeline bandwidth. Based on a detailed study of the incremental evaluation pipeline, we classify the incremental query operators into two classes according to whether an operator can avoid the processing of negative tuples or not. Based on this classification, we present several optimization techniques over the negative tuples approach that aim to reduce the overhead of processing negative tuples while avoiding the output delay of the query answer. A detailed experimental study, based on a prototype system implementation, shows the performance gains over the input-triggered approach of the negative tuples approach when accompanied with the proposed optimizations.
引用
收藏
页码:57 / 72
页数:16
相关论文
共 30 条
[1]   Aurora: a new model and architecture for data stream management [J].
Abadi, DJ ;
Carney, D ;
Cetintemel, U ;
Cherniack, M ;
Convey, C ;
Lee, S ;
Stonebraker, M ;
Tatbul, N ;
Zdonik, S .
VLDB JOURNAL, 2003, 12 (02) :120-139
[2]  
ABADI DJ, 2005, P C INN DAT SYST RES
[3]  
[Anonymous], SIGMOD RECORD
[4]  
[Anonymous], 2003, P C INN DAT SYST RES
[5]  
[Anonymous], 2003, Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data-SIGMOD'03
[6]  
ARASU A, 2004, P INT C VER LARG DAT
[7]  
ARASU A, 2005, CHAPTER STREAM STANF
[8]  
ARASU A, 2003, P INT WORKSH DAT PRO
[9]  
AYAD A, 2004, P ACM SIGMOD C
[10]  
BABCOCK B, 2002, P ACM SIGMOD PODS C