Dynamic erasure coding decision for modern block-oriented distributed storage systems

Cited by: 3
Authors
Ahn, Hoo-Young [1]
Lee, Kyong-Ha [2]
Lee, Yoon-Joon [1]
Affiliations
[1] Korea Adv Inst Sci & Technol, Sch Comp, 291 Daehak Ro, Taejon 305701, South Korea
[2] KISTI, Sci Data Res Ctr, 245 Daehak Ro, Daejeon 305806, South Korea
Source
JOURNAL OF SUPERCOMPUTING | 2016 / Vol. 72 / Issue 04
Keywords
Distributed storage system; Storage overhead; Hadoop; HDFS; Data replication; Erasure coding; RAID
DOI
10.1007/s11227-016-1661-7
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Modern block-oriented distributed storage systems such as the Hadoop Distributed File System (HDFS) have proliferated in this era of big data and cloud computing. These systems feature block-level replication: files are partitioned into equal-sized blocks, and multiple copies of each block are distributed arbitrarily across nodes for fault tolerance and data availability. Under this strategy, however, much storage capacity is wasted on block copies whose data may be accessed only infrequently. Distributed storage systems have therefore begun to adopt erasure codes. Classical parity encoding schemes, however, are hard to apply directly to these systems because block copies are placed arbitrarily across nodes. We present a novel technique, called DynaEC, to address these issues in modern block-oriented distributed storage systems. DynaEC provides a unique parity encoding algorithm that encodes data blocks arbitrarily distributed across machines into parities and then places the parities so that fault tolerance is guaranteed. Parity encoding in DynaEC requires no change to the original block placement policy of HDFS, which lets DynaEC work seamlessly with HDFS. Finally, during the encoding procedure each data node encodes its own data blocks without requiring any information about blocks located on other data nodes. As such, the encoding procedure in DynaEC runs fully in parallel without any synchronization issues. With extensive experiments, we show that DynaEC achieves storage savings up to the theoretical limit while outperforming previous approaches by multiple orders of magnitude.
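The trade-off the abstract describes, replication overhead versus parity-based redundancy computed locally on each node, can be illustrated with a minimal sketch. This is not the DynaEC algorithm, whose details this record does not give. It is a generic single-parity XOR scheme in the RAID style, and the names `xor_parity`, `recover_block`, `storage_overhead`, and the sample blocks are all hypothetical.

```python
from functools import reduce


def xor_parity(blocks):
    """XOR equal-sized blocks byte by byte into one parity block.

    Generic RAID-style single parity; NOT the DynaEC encoding itself.
    """
    if not blocks:
        raise ValueError("need at least one block")
    size = len(blocks[0])
    if any(len(b) != size for b in blocks):
        raise ValueError("blocks must be equal-sized")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def recover_block(surviving_blocks, parity):
    """Rebuild a single lost block from the survivors plus the parity."""
    return xor_parity(list(surviving_blocks) + [parity])


def storage_overhead(data_blocks, redundant_blocks):
    """Ratio of stored blocks to original data blocks."""
    return (data_blocks + redundant_blocks) / data_blocks


# Hypothetical blocks held by one data node; the node needs no
# information about blocks on other nodes to compute its parity.
node_blocks = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_parity(node_blocks)

# Lose the middle block and rebuild it from the other two plus the parity.
restored = recover_block([node_blocks[0], node_blocks[2]], parity)

replication_cost = storage_overhead(1, 2)  # one block plus two extra copies
parity_cost = storage_overhead(3, 1)       # three blocks plus one parity
```

Under these assumptions, three-way replication stores three blocks per data block, while one parity over three data blocks stores about 1.33. The parity only protects the data if it is placed on a different node from the blocks it covers, which is the placement problem the abstract says DynaEC addresses.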
Pages: 1312 - 1341
Page count: 30
Related papers
50 in total
  • [1] Dynamic erasure coding decision for modern block-oriented distributed storage systems
    Hoo-Young Ahn
    Kyong-Ha Lee
    Yoon-Joon Lee
    The Journal of Supercomputing, 2016, 72 : 1312 - 1341
  • [2] Demand-Aware Erasure Coding for Distributed Storage Systems
    Li, Jun
    Li, Baochun
    IEEE TRANSACTIONS ON CLOUD COMPUTING, 2021, 9 (02) : 532 - 545
  • [3] Erasure-Coding-Based Storage and Recovery for Distributed Exascale Storage Systems
    Kim, Jeong-Joon
    APPLIED SCIENCES-BASEL, 2021, 11 (08):
  • [4] BPR: An Erasure Coding Batch Parallel Repair Approach in Distributed Storage Systems
    Song, Ying
    Zhao, Wenxuan
    Wang, Bo
    IEEE ACCESS, 2023, 11 : 44509 - 44518
  • [5] Storage vs Repair Bandwidth for Network Erasure Coding in Distributed Storage Systems
    Singal, Swati Mittal
    Rakesh, Nitin
    Matam, Rakesh
    2015 INTERNATIONAL CONFERENCE ON SOFT COMPUTING TECHNIQUES AND IMPLEMENTATIONS (ICSCTI), 2015,
  • [6] Data Management in Erasure-Coded Distributed Storage Systems
    Aatish, Chiniah
    Avinash, Mungur
    2020 20TH IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND INTERNET COMPUTING (CCGRID 2020), 2020, : 902 - 907
  • [7] Accelerating erasure coding by exploiting multiple repair paths in distributed storage systems
    Kim, Chanki
    Chon, Kang-Wook
    CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2024, 27 (06) : 8621 - 8635
  • [8] Erasure Coding-Oriented Data Update for Cloud Storage: A Survey
    Xiao, Yifei
    Zhou, Shijie
    Zhong, Linpeng
    IEEE ACCESS, 2020, 8 (08) : 227982 - 227998
  • [9] Erasure Coding for Cloud Storage Systems: A Survey
    Li, Jun
    Li, Baochun
    TSINGHUA SCIENCE AND TECHNOLOGY, 2013, 18 (03) : 259 - 272
  • [10] Erasure Coding for Cloud Storage Systems: A Survey
    Jun Li
    Baochun Li
    Tsinghua Science and Technology, 2013, 18 (03) : 259 - 272