An Efficient I/O-Redirection-Based Reconstruction Scheme for Erasure-Coded Storage Clusters

Cited by: 2
Authors
Huang, Jianzhong [1]
Qin, Xiao [2]
Liang, Xianhai [1]
Xie, Changsheng [1]
Affiliations
[1] Huazhong Univ Sci & Technol, Wuhan Natl Lab Optoelect, Wuhan 430074, Peoples R China
[2] Auburn Univ, Dept Comp Sci & Software Engn, Shelby Ctr Engn Technol, Samuel Ginn Coll Engn, Auburn, AL 36849 USA
Funding
National Science Foundation (USA);
Keywords
Erasure-coded storage cluster; reconstruction; I/O redirection; deferred write; RECOVERY; RELIABILITY; FAILURE;
DOI
10.1109/TC.2015.2394399
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
This paper addresses an I/O interference problem in the online reconstruction of erasure-coded storage clusters, where user I/Os compete with reconstruction I/Os for both disk and network bandwidth. We propose a redirection scheme called RAM-RS to minimize I/O interference between user and reconstruction requests. RAM-RS redirects user reads and writes targeted at failed nodes to an RS-coded RAM region, which is formed from pre-allocated main memory in the surviving nodes and organized in an RS-coded manner. The RS-coded RAM region quickly serves all user read/write misses, so a rebuilding node can devote its disk and network bandwidth to node reconstruction. The RAM region also substantially reduces the amount of data rebuilt by the rebuilding node, because (1) missed writes are buffered in the RAM region and (2) missed reads are satisfied by using surviving nodes to co-rebuild failed blocks. We build two Markov models to estimate the reliability of the RAM-RS system. Modeling results demonstrate that the mean time to data loss (MTTDL) of the RS-coded RAM region in a storage cluster is larger than that of the same cluster comprised of surviving nodes. We implement both RAM-RS and the traditional Redirection scheme in an erasure-coded storage cluster, on which real-world I/O traces are replayed. Experimental results show that, compared with the Redirection scheme running on a 9-node storage cluster, RAM-RS improves user response time and reconstruction time by factors of 1.78 and 1.20, respectively.
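The redirection idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a single XOR parity node as a simplified stand-in for RS coding, models only data-node failures, and uses hypothetical names throughout (`ToyRedirector`, `ram_region`, `pending_rebuild`). It only shows how buffering missed writes and caching co-rebuilt missed reads in memory shrinks the set of blocks the rebuilding node still has to reconstruct.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks column by column."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


class ToyRedirector:
    """Toy cluster: k data nodes plus one XOR parity node.

    Assumption: XOR parity replaces the RS code of the paper, so only a
    single data-node failure is tolerated in this sketch.
    """

    def __init__(self, stripes):
        # stripes: list of stripes, each a list of k equal-length blocks
        self.k = len(stripes[0])
        self.num_stripes = len(stripes)
        self.nodes = [{} for _ in range(self.k + 1)]  # node -> {stripe: block}
        for sid, blocks in enumerate(stripes):
            for n, blk in enumerate(blocks):
                self.nodes[n][sid] = blk
            self.nodes[self.k][sid] = xor_blocks(blocks)  # parity node
        self.failed = None
        self.ram_region = {}  # stripe -> current block of the failed node

    def fail(self, node):
        assert 0 <= node < self.k, "only data-node failures are modeled"
        self.failed = node
        self.nodes[node] = {}
        self.ram_region = {}

    def read(self, node, sid):
        if node != self.failed:
            return self.nodes[node][sid]
        if sid not in self.ram_region:
            # Missed read: surviving nodes co-rebuild the block, which is
            # then kept in the RAM region.
            survivors = [s[sid] for n, s in enumerate(self.nodes) if n != node]
            self.ram_region[sid] = xor_blocks(survivors)
        return self.ram_region[sid]

    def write(self, node, sid, block):
        assert 0 <= node < self.k, "writes target data nodes only"
        if node == self.failed:
            # Missed write: buffer in the RAM region (deferred write).
            self.ram_region[sid] = block
            return
        old = self.nodes[node][sid]
        self.nodes[node][sid] = block
        parity = self.nodes[self.k]
        parity[sid] = xor_blocks([parity[sid], old, block])  # delta update

    def pending_rebuild(self):
        """Stripes the rebuilding node must still reconstruct from disk."""
        return [s for s in range(self.num_stripes) if s not in self.ram_region]
```

In this sketch, every redirected read or write removes one stripe from `pending_rebuild()`, which mirrors the abstract's claim that the RAM region reduces the amount of data the rebuilding node must handle.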
Pages: 3037-3050
Number of pages: 14