Storage-Reliability-Repair Trade-offs in Distributed Storage System

Cited by: 0
Authors
Qi, Yichuan [1 ]
Feng, Dan [1 ]
Affiliations
[1] Huazhong Univ Sci & Technol, Wuhan Natl Lab Optoelect, Wuhan, Peoples R China
Source
2019 IEEE 25TH INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED SYSTEMS (ICPADS) | 2019
Keywords
reliability; repair penalty; data redundancy scheme; Markov Chain; latent sector error; correlated failures; ERASURE; CODES;
DOI
10.1109/ICPADS47876.2019.00038
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812 ;
Abstract
For reliability, data redundancy is essential in distributed storage systems. As data redundancy schemes have evolved, storage-reliability-repair trade-offs have always existed. Moving from replication to erasure codes achieves higher reliability at lower storage cost, but the higher repair penalty cannot be ignored. Optimized data redundancy schemes such as regenerating codes and basic pyramid codes have been proposed to relieve the high repair penalty. To study the storage-reliability-repair trade-offs, we propose a new model based on the standard Markov chain. It takes irregular fault tolerance, latent sector errors, and correlated failures into account, and measures storage and repair cost from a system perspective. Through inter-scheme and intra-scheme discussions, we derive principles for choosing data redundancy schemes and their parameters to achieve better storage-reliability-repair trade-offs. Our goal is to provide system designers and administrators with concrete information to help them achieve proper storage-reliability-repair trade-offs.
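To make the kind of trade-off described in the abstract concrete, the following is a minimal sketch of the textbook birth-death Markov chain used to estimate mean time to data loss (MTTDL). It is not the model proposed in the paper: it ignores irregular fault tolerance, latent sector errors, and correlated failures. The function names `mttdl` and `storage_overhead`, the rates `lam` and `mu`, and the assumption of independent exponential failures with one repair in progress at a time are all illustrative assumptions.

```python
def mttdl(n, m, lam, mu):
    """Expected time until more than m of n nodes have failed.

    Assumptions (illustrative only): each healthy node fails independently
    at rate lam, and one failed node at a time is repaired at rate mu.
    State i means i nodes are currently failed; state m + 1 is data loss.
    """
    total = 0.0
    hop = 0.0  # expected first-passage time from state i - 1 to state i
    for i in range(m + 1):
        fail_rate = (n - i) * lam
        repair_rate = mu if i > 0 else 0.0  # nothing to repair in state 0
        # First-passage recurrence: h_i = (1 + r_i * h_{i-1}) / f_i
        hop = (1.0 + repair_rate * hop) / fail_rate
        total += hop
    return total


def storage_overhead(n, k):
    """Raw storage consumed per unit of user data (n stored blocks per k data blocks)."""
    return n / k


# Hypothetical rates, chosen only for illustration.
lam, mu = 1e-5, 0.1

# 3-way replication: 1 data block, 3 copies, tolerates 2 failures.
print(storage_overhead(3, 1), mttdl(3, 2, lam, mu))
# A (9, 6) MDS erasure code: 6 data blocks, 3 parity blocks, tolerates 3 failures.
print(storage_overhead(9, 6), mttdl(9, 3, lam, mu))
```

Holding `mu` fixed for both schemes, as this sketch does, leaves out the repair penalty: repairing an erasure-coded block typically reads more data than copying a replica, which would lower the effective repair rate and is exactly the third axis of the trade-off the abstract refers to.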
Pages: 201-208
Page count: 8