Predicting Failures in HDDs with Deep NN and Irregularly-Sampled Data

Cited by: 0
Authors
Pereira, Francisco Lucas F. [1]
Bucar, Raif C. B. [1]
Brito, Felipe T. [1]
Gomes, Joao Paulo P. [1]
Machado, Javam C. [1]
Affiliations
[1] Univ Fed Ceara, LSBD, Fortaleza, Ceara, Brazil
Source
INTELLIGENT SYSTEMS, PT II | 2022 / Vol. 13654
Keywords
Hard disk drives; Recurrent neural networks; Missing data; Irregular sampling
DOI
10.1007/978-3-031-21689-3_15
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
As information systems became basic requirements for essential human services, safeguarding stored data became a requirement for maintaining these services. Predicting Hard Disk Drive (HDD) failure can bring efficiency gains in HDD maintenance and reduce the risk of data loss. Recurrent Neural Networks (RNNs) are powerful tools for predicting HDD failure, but they require complete data entries sampled at regular intervals for efficient model training, testing, and deployment. Data imputation is a baseline method for preprocessing data for RNN models. However, typical data imputation methods introduce noise into datasets and erase missing-data patterns that would otherwise improve model predictions. This article surveys existing RNN models that are robust to substantial amounts of missing data and benchmarks their predictive capabilities on HDD failure prediction using Self-Monitoring, Analysis, and Reporting Technology (SMART) data. To evaluate different missing-data conditions, we simulate binomial and exponential sampling schemes with varying levels of missing data. The implementation and comparison of these methods show that GRU-D, phased-LSTM, and CT-LSTM perform well across multiple missing-data conditions, achieving better performance than basic LSTM networks.
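The abstract does not describe how the binomial and exponential sampling schemes were simulated, so the following is only a minimal sketch of one plausible reading, not the authors' procedure. It assumes that binomial sampling drops each regularly spaced reading independently with a fixed probability, and that exponential sampling draws the gaps between consecutive observations from an exponential distribution. The function names, the `missing_rate` and `mean_gap` parameters, and the synthetic series are all hypothetical.

```python
import random
from itertools import accumulate


def binomial_sample(series, missing_rate, rng):
    """Keep each regularly spaced reading independently with
    probability 1 - missing_rate (assumed reading of 'binomial sampling').
    Returns (time_index, value) pairs for the readings that survive."""
    return [(t, x) for t, x in enumerate(series) if rng.random() >= missing_rate]


def exponential_sample_times(n_obs, mean_gap, rng):
    """Return n_obs increasing timestamps whose consecutive gaps are
    exponentially distributed with the given mean (assumed reading of
    'exponential sampling')."""
    gaps = [rng.expovariate(1.0 / mean_gap) for _ in range(n_obs)]
    return list(accumulate(gaps))


rng = random.Random(0)

# Synthetic stand-in for one SMART attribute; not real drive data.
series = [rng.gauss(0.0, 1.0) for _ in range(100)]

kept = binomial_sample(series, missing_rate=0.3, rng=rng)
times = exponential_sample_times(50, mean_gap=2.0, rng=rng)
```

Raising `missing_rate` or `mean_gap` gives sparser observations, which is one way to produce the "varying levels of missing data" the abstract mentions.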
Pages: 196-209
Page count: 14