STAR: An efficient coding scheme for correcting triple storage node failures

被引:135
作者
Huang, Cheng [1 ]
Xu, Lihao [2 ]
机构
[1] Microsoft Res, Redmond, WA 98052 USA
[2] Wayne State Univ, Dept Comp Sci, Detroit, MI 48202 USA
基金
美国国家科学基金会;
关键词
fault tolerance; high availability; error control codes; storage systems;
D O I
10.1109/TC.2007.70830
中图分类号
TP3 [计算技术、计算机技术];
学科分类号
0812 ;
摘要
Proper data placement schemes based on erasure correcting codes are one of the most important components for a highly available data storage system. For such schemes, low decoding complexity for correcting (or recovering) storage node failures is essential for practical systems. In this paper, we describe a new coding scheme, which we call the STAR code, for correcting triple storage node failures (erasures). The STAR code is an extension of the double-erasure-correcting EVENODD code and a modification of the generalized triple-erasure-correcting EVENODD code. The STAR code is an Maximum Distance Separable (MDS) code and thus is optimal in terms of node failure recovery capability for a given data redundancy. We provide detailed STAR code decoding algorithms for correcting various triple node failures. We show that the decoding complexity of the STAR code is much lower than those of existing comparable codes; thus, the STAR code is practically very meaningful for storage systems that need higher reliability.
引用
收藏
页码:889 / 901
页数:13
相关论文
共 44 条
[1]  
ALVAREZ GA, 1997, P 24 INT S COMP ARCH, P62
[2]   Serverless network file systems [J].
Anderson, TE ;
Dahlin, MD ;
Neefe, JM ;
Patterson, DA ;
Roselli, DS ;
Wang, RY .
ACM TRANSACTIONS ON COMPUTER SYSTEMS, 1996, 14 (01) :41-79
[3]   A CASE FOR NOW (NETWORKS OF WORKSTATIONS) [J].
ANDERSON, TE ;
CULLER, DE ;
PATTERSON, DA .
IEEE MICRO, 1995, 15 (01) :54-64
[4]  
[Anonymous], P 7 INT C ARCH SUPP
[5]  
[Anonymous], P 5 IEEE INT S NETW
[6]  
[Anonymous], 1995, TR95048 ICSI
[7]  
BHIDE A, 1991, PROCEEDINGS OF THE WINTER 1991 USENIX CONFERENCE, P199
[8]  
Bindel D., 2000, P 9 INT C ARCH SUPP
[9]   EVENODD - AN EFFICIENT SCHEME FOR TOLERATING DOUBLE-DISK FAILURES IN RAID ARCHITECTURES [J].
BLAUM, M ;
BRADY, J ;
BRUCK, J ;
MENON, J .
IEEE TRANSACTIONS ON COMPUTERS, 1995, 44 (02) :192-202
[10]   NEW ARRAY CODES FOR MULTIPLE PHASED BURST CORRECTION [J].
BLAUM, M ;
ROTH, RM .
IEEE TRANSACTIONS ON INFORMATION THEORY, 1993, 39 (01) :66-77