Storage-Reliability-Repair Trade-offs in Distributed Storage System

Cited by: 0

Authors:

Qi, Yichuan [1]

Feng, Dan [1]

Affiliations:

[1] Huazhong Univ Sci & Technol, Wuhan Natl Lab Optoelect, Wuhan, Peoples R China

Source:

2019 IEEE 25TH INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED SYSTEMS (ICPADS) | 2019

Keywords:

reliability; repair penalty; data redundancy scheme; Markov Chain; latent sector error; correlated failures; ERASURE; CODES;

DOI:

10.1109/ICPADS47876.2019.00038

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology]

Discipline classification code:

0812

Abstract:

For reliability, data redundancy is essential in distributed storage systems. As data redundancy schemes have evolved, storage-reliability-repair trade-offs have always existed. Moving from replication to erasure codes achieves higher reliability at lower storage cost, but the higher repair penalty cannot be ignored. Optimized data redundancy schemes such as regenerating codes and basic pyramid codes have been proposed to relieve the high repair penalty. To study the storage-reliability-repair trade-offs, we propose a new model based on the standard Markov chain. It takes irregular fault tolerance, latent sector errors, and correlated failures into account, and measures storage and repair cost from a system perspective. Through inter-scheme and intra-scheme discussions, we identify some principles for choosing data redundancy schemes and their parameters to achieve better storage-reliability-repair trade-offs. Our goal is to provide system designers and administrators with concrete information to help them achieve proper storage-reliability-repair trade-offs.

Citation

Pages: 201-208

Page count: 8

