Accelerating Duplicate Data Chunk Recognition Using NN Trained by Locality-Sensitive Hash

Authors
Berman, Amit [1]
Birk, Yitzhak [1]
Mendelson, Avi [1]
Affiliations
[1] Technion Israel Inst Technol, Dept Elect Engn, IL-32000 Haifa, Israel
Source
2014 IEEE 28TH CONVENTION OF ELECTRICAL & ELECTRONICS ENGINEERS IN ISRAEL (IEEEI), 2014
Keywords
Deduplication; Chunking; Cloud Storage; Neural Network; Machine Learning; Locality-Sensitive Hashing
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Deduplication is often used in storage systems to save storage space, communication bandwidth, write energy, and recovery and error-protection infrastructure. However, deduplication overhead increases latency and computation energy. Determining whether a data chunk is already stored by comparing signatures constitutes a significant fraction of this deduplication overhead. In this paper, we propose a statistical chunk classifier based on a neural network. Our technique is based on learning the patterns of locality-sensitive hashing of the data. Our experiments show an acceleration of chunk processing, leading to a reduction in deduplication overhead.
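The two building blocks named in the abstract, signature comparison and locality-sensitive hashing, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the choice of SHA-256 for signatures, SimHash as the LSH scheme, the 4-byte shingle size, and all names (`signature`, `simhash_bits`, `DedupStore`, `hamming`) are assumptions made for illustration. The neural-network classifier itself is omitted because the abstract does not describe its architecture or training procedure.

```python
import hashlib
from typing import Dict, List

N_BITS = 64  # fingerprint width; an assumption, not taken from the paper


def signature(chunk: bytes) -> str:
    """Exact chunk signature. SHA-256 is an assumed choice of hash."""
    return hashlib.sha256(chunk).hexdigest()


def simhash_bits(chunk: bytes, shingle: int = 4) -> List[int]:
    """SimHash, one common LSH scheme: similar inputs tend to share most bits."""
    counts = [0] * N_BITS
    for i in range(max(1, len(chunk) - shingle + 1)):
        digest = hashlib.blake2b(chunk[i:i + shingle], digest_size=8).digest()
        h = int.from_bytes(digest, "big")
        for b in range(N_BITS):
            # Each shingle votes +1 or -1 on every bit position.
            counts[b] += 1 if (h >> b) & 1 else -1
    return [1 if c > 0 else 0 for c in counts]


def hamming(a: List[int], b: List[int]) -> int:
    """Number of bit positions in which two fingerprints differ."""
    return sum(x != y for x, y in zip(a, b))


class DedupStore:
    """Baseline deduplication: every chunk is looked up by exact signature."""

    def __init__(self) -> None:
        self.index: Dict[str, bytes] = {}

    def put(self, chunk: bytes) -> bool:
        """Store the chunk; return True if it was new, False if a duplicate."""
        sig = signature(chunk)
        if sig in self.index:
            return False
        self.index[sig] = chunk
        return True


store = DedupStore()
first = store.put(b"hello deduplication world")
second = store.put(b"hello deduplication world")
fingerprint = simhash_bits(b"hello deduplication world")
```

In this sketch the bit vector returned by `simhash_bits` is the kind of fixed-length feature a statistical classifier could consume, while the lookup in `DedupStore.put` stands in for the signature comparison that the abstract identifies as a significant share of the overhead.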
Pages: 5