Time-decaying bloom filters for data streams with skewed distributions

被引:26
|
作者
Cheng, K [1 ]
Xiang, LM [1 ]
Iwaihara, M [1 ]
Xu, HY [1 ]
Mohania, MM [1 ]
机构
[1] Kyushu Sangyo Univ, Fukuoka, Japan
来源
15th International Workshop on Research Issues in Data Engineering: Stream Data Mining and Applications, Proceedings | 2005年
关键词
D O I
10.1109/RIDE.2005.15
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Bloom Filters are space-efficient data structures for membership queries over sets. To enable queries for multiplicities of multi-sets, the bitmap in a Bloom Filter is replaced by an array of counters whose values increment on each occurrence. In a data stream model, however, data items arrive at varying rates and recent occurrences are often regarded as more significant than past ones. In most data stream applications, it is critical to handle this "time-sensitivity". Furthermore, data streams with skewed distributions are common in many emerging applications, e.g., traffic engineering and billing, intrusion detection, trading surveillance and outlier detection. For such applications, it is inefficient to allocate counters of uniform size to all buckets. In this paper, we present Time-decaying Bloom Filters (TBF), a Bloom Filter that maintains the frequency count for each item in a data stream, and the value of each counter decays with time. For data streams with highly skewed distributions, we proposed further optimization by allowing dynamically allocating free counters to the "large" items. We performed preliminary experiments to verify the optimization.
引用
收藏
页码:63 / 69
页数:7
相关论文
共 50 条
  • [1] Time-Decaying Bloom Filters for Efficient Middle-Tier Data Management
    Cheng, Kai
    COMPUTATIONAL SCIENCE AND ITS APPLICATIONS - ICCSA 2010, PT 3, PROCEEDINGS, 2010, 6018 : 395 - 404
  • [2] On-demand Time-decaying Bloom Filters for Telemarketer Detection
    Bianchi, Giuseppe
    d'Heureuse, Nico
    Niccolini, Saverio
    ACM SIGCOMM COMPUTER COMMUNICATION REVIEW, 2011, 41 (05) : 5 - 12
  • [3] Inferring Insertion Times and Optimizing Error Penalties in Time-decaying Bloom Filters
    Dautrich, Jonathan L., Jr.
    Ravishankar, Chinya V.
    ACM TRANSACTIONS ON DATABASE SYSTEMS, 2019, 44 (02):
  • [4] Time-Decaying Sketches for Sensor Data Aggregation
    Cormode, Graham
    Tirthapura, Srikanta
    Xu, Bojian
    PODC'07: PROCEEDINGS OF THE 26TH ANNUAL ACM SYMPOSIUM ON PRINCIPLES OF DISTRIBUTED COMPUTING, 2007, : 215 - 224
  • [5] TIME-DECAYING SKETCHES FOR ROBUST AGGREGATION OF SENSOR DATA
    Cormode, Graham
    Tirthapura, Srikanta
    Xu, Bojian
    SIAM JOURNAL ON COMPUTING, 2009, 39 (04) : 1309 - 1339
  • [6] Stable Learned Bloom Filters for Data Streams
    Liu, Qiyu
    Zheng, Libin
    Shen, Yanyan
    Chen, Lei
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2020, 13 (11): : 2355 - 2367
  • [7] SKDStream: a dynamic clustering algorithm on time-decaying data stream
    Hui Liu
    Aihua Wu
    Mingkang Wei
    Chin-Chen Chang
    EURASIP Journal on Wireless Communications and Networking, 2022
  • [8] SKDStream: a dynamic clustering algorithm on time-decaying data stream
    Liu, Hui
    Wu, Aihua
    Wei, Mingkang
    Chang, Chin-Chen
    EURASIP JOURNAL ON WIRELESS COMMUNICATIONS AND NETWORKING, 2022, 2022 (01)
  • [9] Dynamically Maintaining Duplicate-Insensitive and Time-Decayed Sum Using Time-Decaying Bloom Filter
    Zhang, Yu
    Shen, Hong
    Tian, Hui
    Zhang, Xianchao
    ALGORITHMS AND ARCHITECTURES FOR PARALLEL PROCESSING, PROCEEDINGS, 2009, 5574 : 741 - +
  • [10] Maintaining time-decaying stream aggregates
    Cohen, E
    Strauss, MJ
    JOURNAL OF ALGORITHMS-COGNITION INFORMATICS AND LOGIC, 2006, 59 (01): : 19 - 36