Filtering Data Streams for Entity-Based Continuous Queries

Cited by: 11
Authors
Cheng, Reynold [1]
Kao, Ben C. M. [1]
Kwan, Alan [1]
Prabhakar, Sunil [2]
Tu, Yi-Cheng [3]
Affiliations
[1] Univ Hong Kong, Dept Comp Sci, Pokfulam, Hong Kong, Peoples R China
[2] Purdue Univ, Dept Comp Sci, W Lafayette, IN 47907 USA
[3] Univ S Florida, Dept Comp Sci & Engn, Tampa, FL 33620 USA
Keywords
Data streams; continuous queries; adaptive filters; fraction-based tolerance
DOI
10.1109/TKDE.2009.63
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The idea of allowing query users to relax their correctness requirements in order to improve the performance of a data stream management system (e.g., location-based services and sensor networks) has recently been studied. By exploiting the maximum error (or tolerance) allowed in query answers, algorithms for reducing the use of system resources have been developed. In most of these works, however, query tolerance is expressed as a numerical value, which may be difficult to specify. We observe that in many situations, users may not be concerned with the actual value of an answer, but rather with which object satisfies a query (e.g., "who is my nearest neighbor?"). In particular, an entity-based query returns only the names of objects that satisfy the query. For these queries, it is possible to specify a tolerance that is "nonvalue-based." In this paper, we study fraction-based tolerance, a type of nonvalue-based tolerance, where a user specifies the maximum fractions of a query answer that can be false positives and false negatives. We develop fraction-based tolerance for two major classes of entity-based queries: 1) nonrank-based queries (e.g., range queries) and 2) rank-based queries (e.g., k-nearest-neighbor queries). These definitions provide users with an alternative way to specify the maximum tolerance allowed in their answers. We further investigate how these definitions can be exploited in a distributed stream environment. We design adaptive filter algorithms that allow updates to be dropped conditionally at the data stream sources without affecting the overall query correctness. Extensive experimental results show that our protocols reduce the use of network and energy resources significantly.
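To make the idea of fraction-based tolerance concrete, the following is a minimal sketch for an entity-based range query. It is not the paper's algorithm. The abstract does not give the exact definitions, so the sketch assumes that the false-positive fraction is measured against the returned answer and the false-negative fraction against the exact answer. All function names, object names, and values are hypothetical.

```python
from typing import Dict, Set, Tuple


def range_query(values: Dict[str, float], low: float, high: float) -> Set[str]:
    """Entity-based range query: return only the names of objects in [low, high]."""
    return {name for name, value in values.items() if low <= value <= high}


def error_fractions(returned: Set[str], exact: Set[str]) -> Tuple[float, float]:
    """Return (false-positive fraction, false-negative fraction).

    Assumed definitions (not taken from the paper):
      false-positive fraction = |returned - exact| / |returned|
      false-negative fraction = |exact - returned| / |exact|
    An empty denominator is treated as a fraction of 0.
    """
    false_positives = returned - exact
    false_negatives = exact - returned
    fp_fraction = len(false_positives) / len(returned) if returned else 0.0
    fn_fraction = len(false_negatives) / len(exact) if exact else 0.0
    return fp_fraction, fn_fraction


def within_tolerance(returned: Set[str], exact: Set[str],
                     max_fp: float, max_fn: float) -> bool:
    """Check an answer against user-specified maximum error fractions."""
    fp_fraction, fn_fraction = error_fractions(returned, exact)
    return fp_fraction <= max_fp and fn_fraction <= max_fn


# Hypothetical data: true values at the stream sources, and the stale values
# the server holds because some updates were dropped by source-side filters.
true_values = {"a": 12.0, "b": 21.0, "c": 25.0, "d": 19.0, "e": 16.0}
server_values = {"a": 12.0, "b": 18.0, "c": 25.0, "d": 31.0, "e": 16.0}

exact = range_query(true_values, 10.0, 20.0)       # names a, d, e
returned = range_query(server_values, 10.0, 20.0)  # names a, b, e
fp, fn = error_fractions(returned, exact)
```

In this toy data the server's answer contains one wrong name out of three and misses one of the three correct names, so it is acceptable to a user who tolerates error fractions of 0.4 but not to one who tolerates only 0.2 false positives. The user never has to state a numerical error bound on the values themselves.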
Pages: 234-248
Page count: 15