Accelerating Duplicate Data Chunk Recognition Using NN Trained by Locality-Sensitive Hash

Cited by: 0
Authors
Berman, Amit [1]
Birk, Yitzhak [1]
Mendelson, Avi [1]
Affiliations
[1] Technion Israel Inst Technol, Dept Elect Engn, IL-32000 Haifa, Israel
Source
2014 IEEE 28th Convention of Electrical & Electronics Engineers in Israel (IEEEI), 2014
Keywords
Deduplication; Chunking; Cloud Storage; Neural Network; Machine Learning; Locality-Sensitive Hashing
DOI
Not available
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Deduplication is often used in storage systems in order to save storage space, communication bandwidth, write energy, and recovery and error-protection infrastructure. However, deduplication overhead increases latency and computation energy. Determining whether a data chunk is already stored by comparing signatures constitutes a significant fraction of this deduplication overhead. In this paper, we propose a statistical chunk classifier based on a neural network. Our technique is based on learning the patterns of locality-sensitive hashing of the data. Our experiments show an acceleration of chunk processing, leading to reduction in deduplication overhead.
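The abstract does not state which LSH family, network architecture, or decision rule the authors use. As a rough illustration of the general idea only, the Python sketch below assumes SimHash over byte shingles as the locality-sensitive hash, a single-neuron logistic model standing in for the neural network, and a store that skips the exact signature comparison whenever the model predicts that a chunk is new. All names (`simhash_bits`, `TinyClassifier`, `DedupStore`) and parameters are hypothetical and are not taken from the paper.

```python
import hashlib
import math

BITS = 64  # assumed width of the LSH signature


def simhash_bits(chunk: bytes, shingle: int = 4) -> list:
    """SimHash (one LSH family) over byte shingles: similar chunks share most bits."""
    counts = [0] * BITS
    for i in range(max(1, len(chunk) - shingle + 1)):
        digest = hashlib.blake2b(chunk[i:i + shingle], digest_size=8).digest()
        h = int.from_bytes(digest, "big")
        for b in range(BITS):
            counts[b] += 1 if (h >> b) & 1 else -1
    return [1 if c > 0 else 0 for c in counts]


class TinyClassifier:
    """Single-neuron logistic model over LSH bits (assumed stand-in for the NN)."""

    def __init__(self, n_inputs: int = BITS):
        self.w = [0.0] * n_inputs
        self.b = 0.0

    def prob(self, bits) -> float:
        """Estimated probability that the chunk is already stored."""
        z = self.b + sum(w * x for w, x in zip(self.w, bits))
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
        return 1.0 / (1.0 + math.exp(-z))

    def train(self, samples, epochs: int = 20, lr: float = 0.1) -> None:
        """Plain SGD on log-loss; samples are (bits, label) with label 1 = duplicate."""
        for _ in range(epochs):
            for bits, label in samples:
                err = label - self.prob(bits)
                self.b += lr * err
                for i, x in enumerate(bits):
                    self.w[i] += lr * err * x


class DedupStore:
    """Runs the exact signature comparison only when the classifier predicts 'duplicate'."""

    def __init__(self, clf: TinyClassifier, threshold: float = 0.5):
        self.clf = clf
        self.threshold = threshold
        self.index = {}    # exact signature -> position in self.chunks
        self.chunks = []   # stored chunk payloads
        self.lookups = 0   # number of exact signature comparisons performed

    def put(self, chunk: bytes) -> bool:
        """Store a chunk; return True if it was recognized as a duplicate."""
        sig = hashlib.sha256(chunk).digest()
        if self.clf.prob(simhash_bits(chunk)) >= self.threshold:
            self.lookups += 1
            if sig in self.index:
                return True  # confirmed duplicate, nothing written
        # Predicted new (or lookup missed): write without a further comparison.
        self.index[sig] = len(self.chunks)
        self.chunks.append(chunk)
        return False
```

In this sketch a wrong "new" prediction costs only storage space, since the duplicate is written again, while every predicted duplicate is still confirmed by an exact SHA-256 comparison before it is dropped.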
Pages: 5