Filtering Data Streams for Entity-Based Continuous Queries

Cited by: 11
Authors
Cheng, Reynold [1]
Kao, Ben C. M. [1]
Kwan, Alan [1]
Prabhakar, Sunil [2]
Tu, Yi-Cheng [3]
Affiliations
[1] Univ Hong Kong, Dept Comp Sci, Pokfulam, Hong Kong, Peoples R China
[2] Purdue Univ, Dept Comp Sci, W Lafayette, IN 47907 USA
[3] Univ S Florida, Dept Comp Sci & Engn, Tampa, FL 33620 USA
Keywords
Data streams; continuous queries; adaptive filters; fraction-based tolerance
DOI
10.1109/TKDE.2009.63
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The idea of allowing query users to relax their correctness requirements in order to improve the performance of a data stream management system (e.g., for location-based services and sensor networks) has recently been studied. By exploiting the maximum error (or tolerance) allowed in query answers, algorithms for reducing the use of system resources have been developed. In most of these works, however, query tolerance is expressed as a numerical value, which may be difficult to specify. We observe that in many situations, users may not be concerned with the actual value of an answer, but rather with which object satisfies a query (e.g., "who is my nearest neighbor?"). In particular, an entity-based query returns only the names of objects that satisfy the query. For these queries, it is possible to specify a tolerance that is "nonvalue-based." In this paper, we study fraction-based tolerance, a type of nonvalue-based tolerance, where a user specifies the maximum fractions of a query answer that can be false positives and false negatives. We develop fraction-based tolerance for two major classes of entity-based queries: 1) nonrank-based queries (e.g., range queries) and 2) rank-based queries (e.g., k-nearest-neighbor queries). These definitions provide users with an alternative way to specify the maximum tolerance allowed in their answers. We further investigate how these definitions can be exploited in a distributed stream environment. We design adaptive filter algorithms that allow updates to be dropped conditionally at the data stream sources without affecting the overall query correctness. Extensive experimental results show that our protocols reduce the use of network and energy resources significantly.
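As a rough illustration of fraction-based tolerance for a range query, the sketch below compares the answer a server would compute from stale values (as could happen when a filter suppresses updates at the sources) against the answer computed from the true values. It is not taken from the paper: the function names, the sample values, and the exact definitions used here (false positives as a fraction of the returned answer, false negatives as a fraction of the true answer) are assumptions made for this example.

```python
def range_query(values, low, high):
    """Entity-based range query: return only the names whose value lies in [low, high]."""
    return {name for name, v in values.items() if low <= v <= high}


def fraction_errors(returned, truth):
    """Return (false-positive fraction, false-negative fraction).

    Assumed definitions (not confirmed by the abstract):
      false-positive fraction = |returned - truth| / |returned|
      false-negative fraction = |truth - returned| / |truth|
    """
    returned, truth = set(returned), set(truth)
    f_plus = len(returned - truth) / len(returned) if returned else 0.0
    f_minus = len(truth - returned) / len(truth) if truth else 0.0
    return f_plus, f_minus


def within_tolerance(returned, truth, max_fp, max_fn):
    """Check the answer against user-specified maximum fractions."""
    f_plus, f_minus = fraction_errors(returned, truth)
    return f_plus <= max_fp and f_minus <= max_fn


# Hypothetical data: the server's view is stale for objects "a" and "d".
true_values = {"a": 10, "b": 25, "c": 31, "d": 48, "e": 60}
server_view = {"a": 22, "b": 25, "c": 31, "d": 52, "e": 60}

truth = range_query(true_values, 20, 50)     # names truly in range
returned = range_query(server_view, 20, 50)  # names the server reports

f_plus, f_minus = fraction_errors(returned, truth)
ok_loose = within_tolerance(returned, truth, max_fp=0.4, max_fn=0.4)
ok_strict = within_tolerance(returned, truth, max_fp=0.2, max_fn=0.4)
```

With these sample values, one of the three returned names is a false positive and one of the three true names is missing, so the answer is acceptable under the looser tolerance but not under the stricter one. The user states a tolerance only in terms of fractions of the answer set, never as a numerical error bound on the underlying values.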
Pages: 234-248
Number of pages: 15
Related papers
50 items in total
  • [41] Entity-based Neural Local Coherence Modeling
    Jeon, Sungho
    Strube, Michael
    PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS), 2022, : 7787 - 7805
  • [42] Entity-Based Relevance Feedback for Document Retrieval
    Sheetrit, Eilon
    Raiber, Fiana
    Kurland, Oren
    PROCEEDINGS OF THE 2023 ACM SIGIR INTERNATIONAL CONFERENCE ON THE THEORY OF INFORMATION RETRIEVAL, ICTIR 2023, 2023, : 177 - 187
  • [43] Controlled entity-based access control technique
    Yang, Ximin
    Xie, Changsheng
    Huazhong Keji Daxue Xuebao (Ziran Kexue Ban)/Journal of Huazhong University of Science and Technology (Natural Science Edition), 2007, 35 (08): : 56 - 59
  • [44] Active Learning for Entity Filtering in Microblog Streams
    Spina, Damiano
    Peetz, Maria-Hendrike
    de Rijke, Maarten
    SIGIR 2015: PROCEEDINGS OF THE 38TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2015, : 975 - 978
  • [45] Modeling local coherence: An entity-based approach
    Barzilay, Regina
    Lapata, Mirella
    COMPUTATIONAL LINGUISTICS, 2008, 34 (01) : 1 - 34
  • [46] The impact of filtering on spatial continuous queries
    Brinkhoff, T
    ADVANCES IN SPATIAL DATA HANDLING, 2002, : 41 - 54
  • [47] Optimizing Cost of Continuous Overlapping Queries over Data Streams by Filter Adaption
    Xie, Qing
    Zhang, Xiangliang
    Li, Zhixu
    Zhou, Xiaofang
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2016, 28 (05) : 1258 - 1271
  • [48] Queueing Analysis of Continuous Queries for Uncertain Data Streams Over Sliding Windows
    Xiao, Guoqing
    Li, Kenli
    Zhou, Xu
    Li, Keqin
    INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2016, 30 (09)
  • [49] Flexible multi-threaded scheduling for continuous queries over data streams
    Cammert, Michael
    Heinz, Christoph
    Kraemer, Juergen
    Seeger, Bernhard
    Vaupel, Sonny
    Wolske, Udo
    2007 IEEE 23RD INTERNATIONAL CONFERENCE ON DATA ENGINEERING WORKSHOP, VOLS 1-2, 2007, : 624 - 633
  • [50] An Efficient Processing Scheme for Continuous Queries Involving RFID and Sensor Data Streams
    Park, Jeongwoo
    Lee, Kwangjae
    Ryu, Wooseok
    Kwon, Joonho
    Hong, Bonghee
    SECURE AND TRUST COMPUTING, DATA MANAGEMENT, AND APPLICATIONS, 2011, 186 : 187 - +