Bloom filter and its variants for the optimization of MapReduce's algorithms: A review

被引:0
作者
Ezzaki, F. [1 ]
Abghour, N. [1 ]
Elomri, A. [1 ]
Moussaid, K. [1 ]
Rida, M. [1 ]
机构
[1] Univ Hassan 2, Lab Informat Modelisat Syst & Aide Decis LIMSAD, Fac Sci, Dept Math & Informat, Casablanca, Morocco
来源
PROCEEDINGS OF 2020 5TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND ARTIFICIAL INTELLIGENCE: TECHNOLOGIES AND APPLICATIONS (CLOUDTECH'20) | 2020年
关键词
big data storage; optimization; bloom filter; cuckoo filter; membership; data structure; join algorithms; mapreduce;
D O I
10.1109/CloudTech49835.2020.9365876
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
The bloom filter is a probabilistic data model used to test the existence of an element in a set, i.e., for any given item, the bloom filter could test the membership query on this candidate. The bloom filter has many advantages due to its simplicity and efficiency in highly solving the issue of data representation in many fields and to support membership queries, it has been known as space and time-efficient randomized data structure, by filtering out redundant data and optimizing the memory consumption. However, bloom filters are limited to membership tests and don't support the deletion of elements. They also generate the false positive probability as they are based on a probabilistic model, this error rate is generated when an element that doesn't belong to a set is considered as a member of this set by the bloom filter. Our goal is to compare a number of well- existed algorithms related to the boom filter for future work on the optimization of the join's algorithms in MapReduce. This paper provides an overview of the different variants of the bloom filter and analyses the studies that have been interested in this area of research.
引用
收藏
页码:175 / 181
页数:7
相关论文
共 35 条
  • [1] Scalable Bloom Filters
    Almeida, Paulo Sergio
    Baquero, Carlos
    Preguica, Nuno
    Hutchison, David
    [J]. INFORMATION PROCESSING LETTERS, 2007, 101 (06) : 255 - 261
  • [2] Angius Fabio., 2012, Proceedings of the 1st ACM workshop on Emerging Name-Oriented Mobile Networking Design - Architecture, Algorithms, and Applications, NoM '12, P25, DOI DOI 10.1145/2248361.2248369
  • [3] Bender MA, 2012, PROC VLDB ENDOW, V5, P1627
  • [4] SPACE/TIME TRADE/OFFS IN HASH CODING WITH ALLOWABLE ERRORS
    BLOOM, BH
    [J]. COMMUNICATIONS OF THE ACM, 1970, 13 (07) : 422 - &
  • [5] Bonomi F., 2006, BLOOM FILTERS VIA D
  • [6] Location privacy without mutual trust: The spatial Bloom filter
    Calderoni, Luca
    Palmieri, Paolo
    Maio, Dario
    [J]. COMPUTER COMMUNICATIONS, 2015, 68 : 4 - 16
  • [7] Calderonia L., 2019, COMPUTER SCI DATA ST
  • [8] Chazelle B., 2004, P 15 ANN ACM SIAM S, P30
  • [9] Composite Bloom Filters for Secure Record Linkage
    Durham, Elizabeth A.
    Kantarcioglu, Murat
    Xue, Yuan
    Toth, Csaba
    Kuzu, Mehmet
    Malin, Bradley
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2014, 26 (12) : 2956 - 2968
  • [10] Streaming Quotient Filter: A Near Optimal Approximate Duplicate Detection Approach for Data Streams
    Dutta, Sourav
    Narang, Ankur
    Bera, Suman K.
    [J]. PROCEEDINGS OF THE VLDB ENDOWMENT, 2013, 6 (08): : 589 - 600