Failure Prediction, Lead Time Estimation and Health Degree Assessment for Hard Disk Drives Using Voting Based Decision Trees

Cited by: 8
Authors
Kaur, Kamaljit [1]
Kaur, Kuljit [2]
Affiliations
[1] Guru Nanak Dev Univ, Dept Comp Engg & Technol, Amritsar 143005, Punjab, India
[2] Guru Nanak Dev Univ, Dept Comp Sci, Amritsar 143005, Punjab, India
Source
CMC-COMPUTERS MATERIALS & CONTINUA | 2019 / Vol. 60 / Issue 03
Keywords
Hard disk drive; lead time; health status; N-splitting algorithm; machine learning; deep learning; data storage; unbalancing problem
DOI
10.32604/cmc.2019.07675
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Hard disk drives (HDDs) are an essential component of cloud computing and big data systems, responsible for storing very large volumes of collected data. However, HDD failures pose a major challenge to big data servers and cloud service providers. Every year, about 10% of the disk drives used in servers crash at least twice, leading to data loss, recovery costs, and reduced reliability. Researchers have recently used SMART parameters to develop various prediction techniques. These methods still need to be improved for reliability and real-world use, for three reasons: they cannot account for the gradual change or deterioration of HDDs; they fail to handle data imbalance and bias; and they lack adequate mechanisms for predicting the health status of HDDs. This paper introduces a novel voting-based decision tree classifier for failure prediction, a balance splitting algorithm for the data imbalance problem, an advanced procedure for lead time estimation, and an R-CNN based approach for health status estimation. The system works robustly by considering gradual change in SMART parameters. It was tested rigorously on 3 datasets and delivered benchmark results compared with the state of the art.
Pages: 913-946
Page count: 34