On Fault Tolerance, Locality, and Optimality in Locally Repairable Codes

Times cited: 0
Authors
Kolosov, Oleg [1]
Yadgar, Gala [1, 2]
Liram, Matan [2]
Tamo, Itzhak [1]
Barg, Alexander [3]
Affiliations
[1] Tel Aviv Univ, Sch Elect Engn, Tel Aviv, Israel
[2] Technion, Comp Sci Dept, Haifa, Israel
[3] Univ Maryland, Dept ECE ISR, College Pk, MD 20742 USA
Source
PROCEEDINGS OF THE 2018 USENIX ANNUAL TECHNICAL CONFERENCE | 2018
Keywords
DISTRIBUTED STORAGE;
DOI
Not available
Chinese Library Classification number
TP301 [Theory, methods];
Subject classification code
081202;
Abstract
Erasure codes are used in large-scale storage systems to allow recovery of data from a failed node. A recently developed class of erasure codes, termed locally repairable codes (LRCs), offers tradeoffs between storage overhead and repair cost. LRCs facilitate more efficient recovery scenarios by storing additional parity blocks in the system, but these additional blocks may eventually increase the number of blocks that must be reconstructed. Existing codes differ in their use of the additional parity blocks, but also in their locality semantics and in the parameters for which they are defined. As a result, existing theoretical models cannot be used to directly compare different LRCs to determine which code will offer the best recovery performance, and at what cost. In this study, we perform the first systematic comparison of existing LRC approaches. We analyze Xorbas, Azure's LRCs, and the recently proposed Optimal-LRCs in light of two new metrics: the average degraded read cost and the normalized repair cost. We show the tradeoff between these costs and the code's fault tolerance, and that different approaches offer different choices in this tradeoff. Our experimental evaluation on a Ceph cluster deployed on Amazon EC2 further demonstrates the different effects of realistic network and storage bottlenecks on the benefit from each examined LRC approach. Despite these differences, the normalized repair cost metric can reliably identify the LRC approach that would achieve the lowest repair cost in each setup.
Pages: 865-877
Number of pages: 13
Related papers
33 in total
[1]  
[Anonymous], 2013, 11 USENIX C FIL STOR
[2]   Network Coding for Distributed Storage Systems [J].
Dimakis, Alexandros G. ;
Godfrey, P. Brighten ;
Wu, Yunnan ;
Wainwright, Martin J. ;
Ramchandran, Kannan .
IEEE TRANSACTIONS ON INFORMATION THEORY, 2010, 56 (09) :4539-4551
[3]  
En-Gad E., 2013, IEEE INT S INF THEOR
[4]   On the Locality of Codeword Symbols [J].
Gopalan, Parikshit ;
Huang, Cheng ;
Simitci, Huseyin ;
Yekhanin, Sergey .
IEEE TRANSACTIONS ON INFORMATION THEORY, 2012, 58 (11) :6925-6934
[5]  
Guruswami V., 2016, 48 ANN ACM SIGACT S
[6]  
Huang C., 2012, USENIX ANN TECHN C A, P1
[7]   Pyramid Codes: Flexible Schemes to Trade Space for Access Efficiency in Reliable Data Storage Systems [J].
Huang, Cheng ;
Chen, Minghua ;
Li, Jin .
ACM TRANSACTIONS ON STORAGE, 2013, 9 (01)
[8]  
Khan O., 2012, 10 US C FIL STOR TEC
[9]  
Kolosov O., 2018, ABS180200157 ARXIV
[10]   Optimal Exact Repair Strategy for the Parity Nodes of the (k+2, k) Zigzag Code [J].
Li, Jie ;
Tang, Xiaohu .
IEEE TRANSACTIONS ON INFORMATION THEORY, 2016, 62 (09) :4848-4856