Optimizing Distributed Joins with Bloom Filters Using MapReduce

Cited by: 0
Authors
Zhang, Changchun [1]
Wu, Lei [1]
Li, Jing [1]
Affiliations
[1] Univ Sci & Technol China, Sch Comp Sci & Technol, Hefei, Peoples R China
Source
COMPUTER APPLICATIONS FOR GRAPHICS, GRID COMPUTING, AND INDUSTRIAL ENVIRONMENT | 2012 / Vol. 351
Keywords
Bloom Filter; MapReduce; Query Optimization
DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The MapReduce framework is increasingly being used to process and analyze large-scale datasets over large clusters. Join operation using MapReduce is an attractive point to which researchers have been paying attention in recent years. The distributed join based on the bloom filter has been proved to be a successful technique to improve the efficiency. However, the full potential of the bloom filter has not been fully exploited, especially in the MapReduce environment. In this paper, we present several strategies to build the bloom filter for the large dataset using MapReduce, compare some bloom-join algorithms and point out how to improve the performance of two-way and multi-way joins. The experiments we conduct show that our method is feasible and effective.
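The abstract names the technique but does not describe the algorithms themselves. As a rough illustration only, the following Python sketch shows a generic bloom-filter join in a simulated map/reduce style: partial filters are built per split of the smaller relation and merged with a bitwise OR, the map phase drops records of the larger relation whose key cannot match, and the reduce phase joins by key. This is not the authors' method; the class `BloomFilter`, the function `bloom_join`, the filter size, the hash scheme, and the split count are all assumptions made for the example.

```python
import hashlib
from collections import defaultdict


class BloomFilter:
    """Minimal Bloom filter; sizes and hash scheme are illustrative assumptions."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key):
        # May return a false positive, never a false negative.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))

    def union(self, other):
        # Merge two filters with identical parameters via bitwise OR.
        merged = BloomFilter(self.num_bits, self.num_hashes)
        merged.bits = bytearray(a | b for a, b in zip(self.bits, other.bits))
        return merged


def bloom_join(small, large, num_splits=2):
    """Join two lists of (key, value) pairs, filtering `large` with a Bloom filter."""
    # Step 1: each simulated mapper builds a partial filter over one split
    # of the small relation; a simulated reducer ORs the partial filters.
    bloom = BloomFilter()
    for i in range(num_splits):
        partial = BloomFilter()
        for key, _ in small[i::num_splits]:
            partial.add(key)
        bloom = bloom.union(partial)

    # Step 2: map phase. Records of the large relation whose key is
    # definitely absent from the small relation are dropped before the shuffle.
    shuffled = defaultdict(lambda: ([], []))
    for key, value in small:
        shuffled[key][0].append(value)
    for key, value in large:
        if key in bloom:
            shuffled[key][1].append(value)

    # Step 3: reduce phase. Emit the cross product of both sides per key;
    # false positives from the filter produce no output because one side is empty.
    return sorted((key, left, right)
                  for key, (lefts, rights) in shuffled.items()
                  for left in lefts
                  for right in rights)


if __name__ == "__main__":
    small = [(1, "a"), (2, "b")]
    large = [(2, "x"), (3, "y"), (2, "z")]
    print(bloom_join(small, large))
```

In a real cluster the benefit would come from step 2, where filtered records never cross the network during the shuffle; the in-memory dictionary here only stands in for that shuffle.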
Pages: 88-95
Number of pages: 8