Optimizing Multi-Top-k Queries over Uncertain Data Streams

被引:10
作者
Chen, Tao [1 ]
Chen, Lei [2 ]
Oezsu, M. Tamer [3 ]
Xiao, Nong [4 ]
机构
[1] Beijing Proteome Res Ctr, State Key Lab Prote, Bioinformat Grp, Beijing 102206, Peoples R China
[2] Hong Kong Univ Sci & Technol, Dept Comp Sci & Engn, Kowloon, Hong Kong, Peoples R China
[3] Univ Waterloo, Database Res Grp, David R Cheriton Sch Comp Sci, Waterloo, ON N2L 3G1, Canada
[4] Natl Univ Def Technol, Sch Comp, Changsha 410073, Hunan, Peoples R China
基金
加拿大自然科学与工程研究理事会; 中国国家自然科学基金;
关键词
Data streams; uncertain data streams; top-k query; multiquery optimization;
D O I
10.1109/TKDE.2012.126
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Query processing over uncertain data streams, in particular top-k query processing, has become increasingly important due to its wide application in many fields such as sensor network monitoring and internet traffic control. In many real applications, multiple top-k queries are registered in the system. Sharing the results of these queries is a key factor in saving the computation cost and providing real-time response. However, due to the complex semantics of uncertain top-k query processing, it is nontrivial to implement sharing among different top-k queries and few works have addressed the sharing issue. In this paper, we formulate various types of sharing among multiple top-k queries over uncertain data streams based on the frequency upper bound of each top-k query. We present an optimal dynamic programming solution as well as a more efficient (in terms of time and space complexity) greedy algorithm to compute the execution plan of executing queries for saving the computation cost between them. Experiments have demonstrated that the greedy algorithm can find the optimal solution in most cases, and it can almost achieve the same performance (in terms of latency and throughput) as the dynamic programming approach.
引用
收藏
页码:1814 / 1829
页数:16
相关论文
共 28 条
  • [1] Abbott R., 1988, Proceedings of the Fourteenth International Conference on Very Large Databases, P1
  • [2] Aggarwal CC, 2008, PROC INT CONF DATA, P150, DOI 10.1109/ICDE.2008.4497423
  • [3] [Anonymous], 2003, Proceedings of the 29th international conference on Very large data bases
  • [4] [Anonymous], PROCEEDINGS
  • [5] [Anonymous], 2007, P IEEE INT C DAT ENG
  • [6] The CQL continuous query language: semantic foundations and query execution
    Arasu, A
    Babu, S
    Widom, J
    [J]. VLDB JOURNAL, 2006, 15 (02) : 121 - 142
  • [7] Arasu A., 2004, VLDB, V30, P336, DOI DOI 10.1016/B978-012088469-8.50032-2
  • [8] Operator scheduling in data stream systems
    Babcock, B
    Babu, S
    Datar, M
    Motwani, R
    Thomas, D
    [J]. VLDB JOURNAL, 2004, 13 (04) : 333 - 353
  • [9] Babcock B., 2002, PODS, P1, DOI [DOI 10.1145/543613.543615, 10.1145/543613.543615]
  • [10] Bonifaci V, 2010, PROC APPL MATH, V135, P1350