Accelerating Duplicate Data Chunk Recognition Using NN Trained by Locality-Sensitive Hash

Authors
Berman, Amit [1]
Birk, Yitzhak [1]
Mendelson, Avi [1]
Affiliations
[1] Technion Israel Inst Technol, Dept Elect Engn, IL-32000 Haifa, Israel
Source
2014 IEEE 28TH CONVENTION OF ELECTRICAL & ELECTRONICS ENGINEERS IN ISRAEL (IEEEI) | 2014
Keywords
Deduplication; Chunking; Cloud Storage; Neural Network; Machine Learning; Locality-Sensitive Hashing;
DOI
Not available
Chinese Library Classification code
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Deduplication is often used in storage systems to save storage space, communication bandwidth, write energy, and recovery and error-protection infrastructure. However, deduplication overhead increases latency and computation energy. Determining whether a data chunk is already stored by comparing signatures constitutes a significant fraction of this deduplication overhead. In this paper, we propose a statistical chunk classifier based on a neural network. Our technique is based on learning the patterns of locality-sensitive hashing of the data. Our experiments show an acceleration of chunk processing, leading to a reduction in deduplication overhead.
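The abstract names two ingredients: an exact signature comparison (the cost the paper aims to reduce) and a locality-sensitive hash of the chunk data (the pattern the classifier learns from). The sketch below illustrates both in isolation. It is not the authors' implementation: the abstract does not specify the LSH family, the signature function, or the network architecture, so SimHash, SHA-256, the 4-byte shingle size, and the names `chunk_signature`, `simhash`, `hamming`, and `DedupStore` are all assumptions chosen for illustration. No neural network is included.

```python
import hashlib


def chunk_signature(chunk: bytes) -> str:
    """Exact signature used for the conventional duplicate check (assumed: SHA-256)."""
    return hashlib.sha256(chunk).hexdigest()


def simhash(chunk: bytes, shingle: int = 4, bits: int = 64) -> int:
    """Generic SimHash fingerprint (assumed LSH family, not taken from the paper).

    Similar chunks tend to produce fingerprints with a small Hamming distance.
    """
    counts = [0] * bits
    # Slide a window of `shingle` bytes over the chunk and hash each window.
    for i in range(max(1, len(chunk) - shingle + 1)):
        digest = hashlib.blake2b(chunk[i:i + shingle], digest_size=bits // 8).digest()
        h = int.from_bytes(digest, "big")
        # Each hash bit votes +1 or -1 for the corresponding fingerprint bit.
        for b in range(bits):
            counts[b] += 1 if (h >> b) & 1 else -1
    # A fingerprint bit is set when its votes are positive.
    return sum(1 << b for b in range(bits) if counts[b] > 0)


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


class DedupStore:
    """Baseline store: a chunk is written only if its signature is unseen."""

    def __init__(self) -> None:
        self.index: dict[str, bytes] = {}

    def put(self, chunk: bytes) -> bool:
        """Return True if the chunk was new and stored, False if it was a duplicate."""
        sig = chunk_signature(chunk)
        if sig in self.index:
            return False
        self.index[sig] = chunk
        return True


store = DedupStore()
first = store.put(b"example chunk payload")
second = store.put(b"example chunk payload")
print(first, second)
```

In a design along the lines the abstract describes, fingerprints such as the output of `simhash` would presumably be the input to a trained classifier that predicts whether a chunk is already stored; how that prediction is combined with the signature check is not stated in the abstract.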
Pages: 5