Filtering Data Streams for Entity-Based Continuous Queries

Cited by: 11
Authors
Cheng, Reynold [1]
Kao, Ben C. M. [1]
Kwan, Alan [1]
Prabhakar, Sunil [2]
Tu, Yi-Cheng [3]
Affiliations
[1] Univ Hong Kong, Dept Comp Sci, Pokfulam, Hong Kong, Peoples R China
[2] Purdue Univ, Dept Comp Sci, W Lafayette, IN 47907 USA
[3] Univ S Florida, Dept Comp Sci & Engn, Tampa, FL 33620 USA
Keywords
Data streams; continuous queries; adaptive filters; fraction-based tolerance
DOI
10.1109/TKDE.2009.63
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The idea of allowing query users to relax their correctness requirements in order to improve the performance of a data stream management system (e.g., for location-based services and sensor networks) has recently been studied. By exploiting the maximum error (or tolerance) allowed in query answers, algorithms for reducing the use of system resources have been developed. In most of these works, however, query tolerance is expressed as a numerical value, which may be difficult to specify. We observe that in many situations, users may not be concerned with the actual value of an answer, but rather with which objects satisfy a query (e.g., "who is my nearest neighbor?"). In particular, an entity-based query returns only the names of objects that satisfy the query. For these queries, it is possible to specify a tolerance that is "nonvalue-based." In this paper, we study fraction-based tolerance, a type of nonvalue-based tolerance, where a user specifies the maximum fractions of a query answer that can be false positives and false negatives. We develop fraction-based tolerance for two major classes of entity-based queries: 1) nonrank-based queries (e.g., range queries) and 2) rank-based queries (e.g., k-nearest-neighbor queries). These definitions give users an alternative way to specify the maximum tolerance allowed in their answers. We further investigate how these definitions can be exploited in a distributed stream environment. We design adaptive filter algorithms that allow updates to be dropped conditionally at the data stream sources without affecting the overall query correctness. Extensive experimental results show that our protocols reduce the use of network and energy resources significantly.
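As an illustration of the fraction-based tolerance described in the abstract, the following is a minimal sketch, not the paper's formal definition. It assumes the false-positive fraction is measured against the size of the returned answer and the false-negative fraction against the size of the correct answer; the function names `error_fractions` and `within_tolerance` and the object names are hypothetical.

```python
from typing import Set, Tuple


def error_fractions(returned: Set[str], correct: Set[str]) -> Tuple[float, float]:
    """Return (false-positive fraction, false-negative fraction).

    Assumption: false positives are normalized by the size of the returned
    answer, false negatives by the size of the correct answer.
    """
    false_positives = len(returned - correct)  # returned but should not be
    false_negatives = len(correct - returned)  # should be returned but missing
    fp_fraction = false_positives / len(returned) if returned else 0.0
    fn_fraction = false_negatives / len(correct) if correct else 0.0
    return fp_fraction, fn_fraction


def within_tolerance(returned: Set[str], correct: Set[str],
                     max_fp: float, max_fn: float) -> bool:
    """Check an entity-based answer against user-specified maximum fractions."""
    fp_fraction, fn_fraction = error_fractions(returned, correct)
    return fp_fraction <= max_fp and fn_fraction <= max_fn


# A range query answer containing one wrong object and missing one object.
answer = {"a", "b", "c", "d"}
truth = {"a", "b", "c", "e"}
ok = within_tolerance(answer, truth, max_fp=0.25, max_fn=0.25)
```

Here the answer contains one false positive out of four returned objects and misses one of four correct objects, so it is acceptable when both maximum fractions are 0.25 but not when either is lower.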
Pages: 234-248
Page count: 15
Related papers
50 in total
  • [21] Characterizing memory requirements for queries over continuous data streams
    Arasu, A
    Babcock, B
    Babu, S
    McAlister, J
    Widom, J
    ACM TRANSACTIONS ON DATABASE SYSTEMS, 2004, 29 (01): : 162 - 194
  • [22] Load shedding for window queries over continuous data streams
    Kim, Kwang Rak
    Kim, Hyeon Gyu
    Lecture Notes in Electrical Engineering, 2015, 373 : 159 - 164
  • [23] A chaos-based predictive algorithm for continuous aggregate queries over data streams
    Yu, Yaxin
    Wang, Guoren
    Chen, Can
    Fu, Chong
    FOURTH INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY, VOL 3, PROCEEDINGS, 2007, : 391 - +
  • [24] Chaos-based predictive algorithm for continuous aggregate queries over data streams
    Yu, Ya-Xin
    Wang, Guo-Ren
    Chen, Can
    Fu, Chong
    Dongbei Daxue Xuebao/Journal of Northeastern University, 2007, 28 (08): : 1105 - 1108
  • [25] Earliest Deadline Scheduling for Continuous Queries over Data Streams
    Li, Xin
    Jia, Zhiping
    Ma, Li
    Zhang, Ruihua
    Wang, Haiyang
    2009 INTERNATIONAL CONFERENCE ON EMBEDDED SOFTWARE AND SYSTEMS, PROCEEDINGS, 2009, : 57 - +
  • [26] Transformation of continuous aggregation join queries over data streams
    Tran, Tri Minh
    Lee, Byung Suk
    ADVANCES IN SPATIAL AND TEMPORAL DATABASES, PROCEEDINGS, 2007, 4605 : 330 - +
  • [27] TweetSpector: Entity-based retrieval of Tweets
    Yerva, Surender Reddy
    Miklos, Zoltan
    Grosan, Flavia
    Tandrau, Alexandru
    Aberer, Karl
    SIGIR 2012: PROCEEDINGS OF THE 35TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2012, : 1016 - 1016
  • [28] The KNOWLEDGESTORE: an Entity-Based Storage System
    Cattoni, R.
    Corcoglioniti, F.
    Girardi, C.
    Magnini, B.
    Serafini, L.
    Zanoli, R.
    LREC 2012 - EIGHTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2012, : 3639 - 3646
  • [29] Semantic load shedding for prioritized continuous queries over data streams
    Park, J
    Cho, H
    COMPUTER AND INFORMATION SCIENCES - ISCIS 2005, PROCEEDINGS, 2005, 3733 : 813 - 822
  • [30] Prioritized Query Shedding Technique for Continuous Queries Over Data Streams
    Helmy, Yehia M.
    El Zanfaly, Doaa S.
    Othman, Nermin A.
    2009 INTERNATIONAL CONFERENCE ON COMPUTER ENGINEERING AND SYSTEMS (ICCES 2009), 2009, : 418 - 422