SiaDFP: A Disk Failure Prediction Framework Based on Siamese Neural Network in Large-Scale Data Center

Authors
Fang, Xiaoyu [1]
Guan, Wenbai [1]
Li, Jiawen [1]
Cao, Chenhan [1]
Xia, Bin [2]
Affiliations
[1] Nanjing Univ Posts & Telecommun, Nanjing 210049, Peoples R China
[2] Nanjing Univ Posts & Telecommun, Jiangsu Key Lab Big Data Secur & Intelligent Proc, Nanjing 210049, Peoples R China
Keywords
Neural networks; Market research; Task analysis; Predictive models; Faces; Data centers; Web and internet services; Attention mechanism; Change point detection; Disk failure prediction; Siamese neural network
DOI
10.1109/TSC.2024.3394692
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline Classification Code
0812
Abstract
With the rapid development of cloud services, service providers increasingly rely on dependable storage systems equipped with large-capacity disks to ensure data availability. The primary source of unreliability in such storage systems is disk failure. In recent years, proactive methods based on machine learning models have emerged that aim to predict impending disk failures from the SMART attributes of disks. These methods allow service providers to back up stored data in time. Although such methods have proven more effective and efficient at disk failure prediction, they still face challenges, such as inadequate mining of abnormal information and imbalanced classification. In this paper, we analyze how the data distribution changes in hard disks. From this analysis, we observe that the distribution change in a failed disk is pronounced in the period before the failure, whereas the change in a healthy disk is insignificant throughout its running time. Motivated by this observation, we propose SiaDFP, a novel framework based on a Siamese neural network that predicts impending disk failures by capturing the distribution changes in failed disks. By analyzing disk data trends, we also observe that failed disks exhibit change points as an abnormal feature. To fully mine the abnormal information inherent in failed disks, we propose the CP-MAP mechanism and the 2D-Attention mechanism. Furthermore, we present a subsampling approach named Region Balanced Sampling to address the challenge of imbalanced classification. Experiments on the real-world Backblaze and Baidu datasets demonstrate that SiaDFP achieves outstanding performance in the task of disk failure prediction.
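The observation in the abstract, that a failed disk's readings shift in distribution before failure while a healthy disk's do not, can be illustrated with a small sketch. This is not the SiaDFP method: it swaps in a plain two-sample Kolmogorov-Smirnov statistic as a stand-in measure of distribution change. The function names, the window size, and the readings are all hypothetical and invented for illustration; the values are synthetic, not drawn from Backblaze or Baidu.

```python
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap
    between the empirical CDFs of samples a and b."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    points = sorted(set(a_sorted) | set(b_sorted))
    return max(
        abs(bisect_right(a_sorted, x) / len(a_sorted)
            - bisect_right(b_sorted, x) / len(b_sorted))
        for x in points
    )


def distribution_shift(series, window):
    """Compare the first and last `window` readings of one attribute.
    (Hypothetical helper; the paper does not define it.)"""
    if len(series) < 2 * window:
        raise ValueError("series must hold at least two full windows")
    return ks_statistic(series[:window], series[-window:])


# Synthetic, made-up readings of a single SMART-like attribute.
healthy = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
failing = [0, 1, 0, 1, 0, 1, 5, 8, 13, 21, 34, 55]

healthy_shift = distribution_shift(healthy, window=6)
failing_shift = distribution_shift(failing, window=6)
print(healthy_shift, failing_shift)
```

On these toy series the healthy disk's early and late windows are identical, so the statistic is at its minimum, while the failing disk's windows do not overlap at all, so it is at its maximum. A learned model such as the Siamese network described in the abstract would presumably capture subtler shifts than this hand-built statistic can.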
Pages: 2890-2903
Page count: 14