A cluster-based data deduplication technology

Cited by: 1
Authors
Tseng, Chuan-Mu [1]
Ciou, Jheng-Rong [2]
Liu, Tzong-Jye [2]
Affiliations
[1] Jeh Teh Jr Coll Med Nursing & Management, Dept Appl Digital Media, Miaoli, Taiwan
[2] Feng Chia Univ, Dept Informat Engn & Comp Sci, Taichung, Taiwan
Source
2014 SECOND INTERNATIONAL SYMPOSIUM ON COMPUTING AND NETWORKING (CANDAR) | 2014
Keywords
Bloom filter; cluster; data deduplication
DOI
10.1109/CANDAR.2014.22
Chinese Library Classification number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Data deduplication systems commonly use a Bloom filter to identify redundant data quickly and accurately. A Bloom filter can determine whether a chunk may already be stored, but it can return false positives. To rule out a false positive, a new chunk must be compared with the chunks that have already been stored. To reduce the time spent on this check, current research stores chunk IDs in many small index tables. However, the target chunk ID is stored in only one index table, so searching the other index tables wastes a great deal of time. In this paper, we cluster the stored chunks to reduce the time needed to rule out false positives caused by the Bloom filter.
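The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how chunks are clustered, so the rule used here (the first byte of the chunk ID selects the cluster) is purely an assumption, as are the names `BloomFilter`, `ClusteredChunkStore`, and all parameter values. The sketch only shows why a Bloom filter hit needs confirmation, and how assigning each chunk ID to one cluster limits that confirmation to a single index table.

```python
import hashlib


class BloomFilter:
    """A small Bloom filter: k hash positions over an m-slot array."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, item):
        # Derive k positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # True may be a false positive; False is always correct.
        return all(self.bits[pos] for pos in self._positions(item))


class ClusteredChunkStore:
    """Chunk index split into clusters; each chunk ID lives in one cluster."""

    def __init__(self, num_clusters=16):
        self.bloom = BloomFilter()
        self.num_clusters = num_clusters
        self.clusters = [set() for _ in range(num_clusters)]

    def _cluster_of(self, chunk_id):
        # Hypothetical clustering rule (an assumption, not from the paper):
        # the first byte of the chunk ID selects the cluster.
        return chunk_id[0] % self.num_clusters

    def store(self, chunk):
        """Return True if the chunk was new and stored, False if duplicate."""
        chunk_id = hashlib.sha1(chunk).digest()
        cluster = self.clusters[self._cluster_of(chunk_id)]
        if self.bloom.might_contain(chunk_id):
            # Possible duplicate: confirm in exactly one index table
            # instead of scanning every table.
            if chunk_id in cluster:
                return False
        # Either the filter said "new" or the hit was a false positive.
        self.bloom.add(chunk_id)
        cluster.add(chunk_id)
        return True


store = ClusteredChunkStore()
first = store.store(b"hello")   # new chunk
second = store.store(b"hello")  # same content again
third = store.store(b"world")   # different chunk
```

In this sketch a Bloom filter hit triggers a lookup in one cluster only, so the cost of ruling out a false positive does not grow with the number of index tables.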
Pages: 226-230
Number of pages: 5