Enhancing Data Availability in Disk Drives through Background Activities

Cited by: 12

Authors:

Mi, Ningfang [1]

Riska, Alma [2]

Smirni, Evgenia [1]

Riedel, Erik [2]

Affiliations:

[1] Coll William & Mary, Dept Comp Sci, Williamsburg, VA 23187 USA

[2] Seagate Res, Pittsburgh, PA 15222 USA

Source:

2008 IEEE INTERNATIONAL CONFERENCE ON DEPENDABLE SYSTEMS & NETWORKS WITH FTCS & DCC | 2008

Keywords:

DOI:

10.1109/DSN.2008.4630120

Chinese Library Classification number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

Latent sector errors in disk drives affect only a few data sectors. They occur silently and are detected only when the affected area is accessed again. If a latent error is detected while the storage system is operating under reduced redundancy, i.e., during a RAID rebuild, then data loss may occur. Various features such as scrubbing and intra-disk data redundancy are proposed to detect and/or recover from latent errors and avoid data loss. While such features enhance data availability in the storage system, their execution may cause performance degradation. In this paper, we evaluate the effectiveness of scrubbing and intra-disk data redundancy in improving data availability while the overall goal is to maintain user performance within predefined bounds. We show that by treating them as low-priority background activities and scheduling them efficiently during idle times, these features remain performance-wise transparent to the storage system user while still improving data reliability. Detailed trace-driven simulations show that the Mean Time To Data Loss (MTTDL) improves by up to 5 orders of magnitude if these features are implemented independently. By scheduling both scrubbing and intra-disk parity updates concurrently during idle times in disk drives, MTTDL improves by as much as 8 orders of magnitude.

Citation

Pages: 492 / +

Page count: 2
