L-BiX: incremental sliding-window aggregation over data streams using linear bidirectional aggregating indexes

被引:0
作者
Savong Bou
Hiroyuki Kitagawa
Toshiyuki Amagasa
机构
[1] University of Tsukuba,Center for Computational Sciences
[2] Toyota Technological Institute,Smart Vehicle Research Center
[3] University of Tsukuba,Center for Artificial Intelligence Research
来源
Knowledge and Information Systems | 2020年 / 62卷
关键词
Sliding window; Aggregation; Incremental computation; Streams;
D O I
暂无
中图分类号
学科分类号
摘要
The number of real-time information sources, or so-called streams, has rapidly increased, leading to a greater demand for complex analyses over streams. Although many stream analysis methods exist, aggregation is fundamental to ascertain higher levels of knowledge from raw data. In particular, sliding-window aggregation, where aggregations over sliding windows are repeatedly computed, is useful in many real-life applications. Two stacks is the state-of-the-art method to compute sliding-window aggregations incrementally with a O(1) time complexity. However, its performance seriously degrades as the window size increases due to the high overhead to maintain its index. To address this problem, this paper proposes a linear bidirectional index (L-BiX) that exploits two kinds of partial aggregations. Specifically, forward (old-to-new) and backward (new-to-old) aggregations allow efficient computations in an incremental manner. The proposed algorithm requires the same time complexity as two stacks (O(1)). Our experimental evaluation shows that the throughput of L-BiX can be faster by up to 1.71 times than that of two stacks with a 50% smaller memory footprint.
引用
收藏
页码:3107 / 3131
页数:24
相关论文
共 19 条
[1]  
Abadi DJ(2003)Aurora: a new model and architecture for data stream management VLDB J 2 120-139
[2]  
Akidau T(2013)MillWheel: fault-tolerant stream processing at internet scale Proc VLDB Endow 11 1033-1044
[3]  
Ruxton GD(2006)The unequal variance Behav Ecol 17 688-690
[4]  
Gulisano V(2012) test is an underused alternative to student’s IEEE Trans Parallel Distrib Syst 12 2351-2351
[5]  
Arasu A(2006) test and the Mann Whitney VLDB J 2 121-142
[6]  
Akidau T(2015) test Proc VLDB Endow 12 1792-1803
[7]  
Gedik B(2014)Streamcloud: an elastic and scalable data streaming system Softw Pract Exp 9 1105-1128
[8]  
Ali MH(2010)The CQL continuous query language: semantic foundations and query execution Softw Pract Exp 2 69-74
[9]  
Yang J(2003)The dataflow model: a practical approach to balancing correctness, latency, and cost in massive-scale, unbounded, out-of-order data processing VLDB J 3 262-283
[10]  
Widom J(2019)Generic windowing support for extensible stream processing systems Inf Syst 81 117-135