Optimizing Distributed Joins with Bloom Filters Using MapReduce

Cited by: 0
Authors
Zhang, Changchun [1]
Wu, Lei [1]
Li, Jing [1]
Affiliation
[1] Univ Sci & Technol China, Sch Comp Sci & Technol, Hefei, Peoples R China
Source
COMPUTER APPLICATIONS FOR GRAPHICS, GRID COMPUTING, AND INDUSTRIAL ENVIRONMENT | 2012, Vol. 351
Keywords
Bloom Filter; MapReduce; Query Optimization
DOI
Not available
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The MapReduce framework is increasingly used to process and analyze large-scale datasets on large clusters. Join processing in MapReduce has attracted considerable research attention in recent years. Distributed joins based on Bloom filters have proven to be a successful technique for improving efficiency. However, the potential of the Bloom filter has not been fully exploited, especially in the MapReduce environment. In this paper, we present several strategies for building a Bloom filter over a large dataset using MapReduce, compare several bloom-join algorithms, and point out how to improve the performance of two-way and multi-way joins. Our experiments show that the method is feasible and effective.
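The abstract does not spell out the algorithms, so the following is only a minimal single-process sketch of the general bloom-join idea, not the authors' method. It assumes the filter is built per partition and merged with a bitwise OR (mimicking a map step and a reduce step), and that the larger input is filtered before an exact hash join. All names (`BloomFilter`, `build_filter`, `bloom_join`) and the parameters `num_bits` and `num_hashes` are hypothetical.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter; the sizes are illustrative assumptions, not tuned values."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key):
        # May return True for a key never added (false positive), never the reverse.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))

    def union(self, other):
        # Filters with identical parameters merge with a bitwise OR.
        merged = BloomFilter(self.num_bits, self.num_hashes)
        merged.bits = bytearray(a | b for a, b in zip(self.bits, other.bits))
        return merged


def build_filter(partitions, num_bits=1024, num_hashes=3):
    """'Map': one local filter per partition. 'Reduce': OR them together."""
    merged = BloomFilter(num_bits, num_hashes)
    for partition in partitions:
        local = BloomFilter(num_bits, num_hashes)
        for key, _value in partition:
            local.add(key)
        merged = merged.union(local)
    return merged


def bloom_join(small_partitions, large_rows):
    """Two-way equi-join: prune the large side with the filter, then join exactly."""
    bloom = build_filter(small_partitions)
    survivors = [(k, v) for k, v in large_rows if k in bloom]

    # The exact hash join discards any false positives the filter let through.
    index = {}
    for partition in small_partitions:
        for key, value in partition:
            index.setdefault(key, []).append(value)
    return [(k, sv, lv) for k, lv in survivors for sv in index.get(k, [])]
```

In a real cluster the pruning step is what saves work, since rows of the large input whose keys cannot match are dropped before they are shuffled across the network. The final exact join keeps the result correct regardless of the filter's false-positive rate.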
Pages: 88-95
Page count: 8