Beyond MTTDL: A Closed-Form RAID 6 Reliability Equation

Cited by: 25
Authors
Elerath, Jon G. [1]
Schindler, Jiri [2]
Affiliations
[1] Reliabil Consulting Serv, Placerville, CA 95667 USA
[2] NetApp, Sunnyvale, CA 94089 USA
Keywords
Storage systems; RAID reliability; Design; Algorithms; Reliability; Measurement;
DOI
10.1145/2577386
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
We introduce a new closed-form equation for estimating the number of data-loss events for a redundant array of inexpensive disks in a RAID-6 configuration. The equation expresses operational failures, their restorations, latent (sector) defects, and disk media scrubbing by time-based distributions that can represent non-homogeneous Poisson processes. It uses two-parameter Weibull distributions, which allow the distributions to take on many different shapes, modeling increasing, decreasing, or constant occurrence rates. This article focuses on the statistical basis of the equation. It also presents time-based distributions of the four processes based on an extensive analysis of field data collected over several years from tens of thousands of commercially available systems with hundreds of thousands of disk drives. Our results for RAID-6 groups of size 16 indicate that the closed-form expression yields results that are much more accurate than those of the MTTDL reliability equation and that match those of computationally intensive Monte Carlo simulations.
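The abstract's remark that a two-parameter Weibull distribution can model increasing, decreasing, or constant occurrence rates can be illustrated with a minimal sketch. This is not the article's closed-form equation. The function names and the shape and scale values below are hypothetical and chosen only for illustration. The sketch uses the standard Weibull hazard function h(t) = (shape/scale) * (t/scale)^(shape - 1).

```python
import math


def weibull_hazard(t, shape, scale):
    """Standard Weibull hazard h(t) = (shape/scale) * (t/scale)**(shape - 1) for t > 0."""
    if t <= 0 or shape <= 0 or scale <= 0:
        raise ValueError("t, shape, and scale must all be positive")
    return (shape / scale) * (t / scale) ** (shape - 1)


def weibull_cdf(t, shape, scale):
    """Probability that the event has occurred by time t."""
    return 1.0 - math.exp(-((t / scale) ** shape))


def rate_trend(shape, scale, times):
    """Classify how the hazard changes across the given increasing time points."""
    rates = [weibull_hazard(t, shape, scale) for t in times]
    pairs = list(zip(rates, rates[1:]))
    if all(math.isclose(a, b) for a, b in pairs):
        return "constant"
    if all(b > a for a, b in pairs):
        return "increasing"
    if all(b < a for a, b in pairs):
        return "decreasing"
    return "mixed"


if __name__ == "__main__":
    # Hypothetical parameters (hours); not taken from the article's field data.
    hours = [100.0, 1000.0, 5000.0]
    for shape in (0.8, 1.0, 1.5):
        print(shape, rate_trend(shape, scale=1000.0, times=hours))
```

A shape parameter below 1 gives a decreasing rate, a shape of exactly 1 reduces to the constant-rate exponential case, and a shape above 1 gives an increasing rate.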
Pages: 21