A data-driven approach to simultaneous fault detection and diagnosis in data centers

Cited by: 20
Authors
Asgari, Sahar [1, 2]
Gupta, Rohit [2]
Puri, Ishwar K. [1, 2]
Zheng, Rong [1, 3]
Affiliations
[1] McMaster Univ, Comp Infrastruct Res Ctr, Hamilton, ON, Canada
[2] McMaster Univ, Dept Mech Engn, Hamilton, ON, Canada
[3] McMaster Univ, Dept Comp & Software Engn, Hamilton, ON, Canada
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
Data center; Fault diagnosis; Classification; Time-series analysis; Gray-box model; MULTIPLE SIMULTANEOUS FAULTS; QUANTITATIVE MODEL; NEURAL-NETWORKS; AIR; TEMPERATURE; PREDICTIONS; ENVIRONMENT; MANAGEMENT; BUILDINGS; STRATEGY;
DOI
10.1016/j.asoc.2021.107638
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104; 0812; 0835; 1405;
Abstract
The failure of cooling systems in data centers (DCs) leads to higher indoor temperatures, causing crucial electronic devices to fail and producing significant economic loss. To circumvent this issue, fault detection and diagnosis (FDD) algorithms and associated control strategies can be applied to detect, diagnose, and isolate faults. Existing methods that apply FDD to DC cooling systems are designed to handle individually occurring faults but have difficulty with simultaneous faults. These methods require either expensive measurements or measurements made over a wide range of conditions to develop training models, which can be time-consuming and costly. We develop a rapid and accurate FDD strategy for single and multiple faults in a DC with a row-based cooling system, using data-driven fault classifiers informed by a gray-box temperature prediction model. The gray-box model provides thermal maps of the DC airspace for single as well as a few simultaneous failure conditions, which are used as inputs for two different data-driven classifiers, a convolutional neural network (CNN) and a recurrent neural network (RNN), to rapidly predict multiple simultaneous failures. The model is validated with testing data from an experimental DC. We also examine the effect of adding Gaussian white noise to the training data and observe that, even in an environment with low-level noise, the FDD strategy can diagnose multiple faults with accuracy as high as 100% while requiring relatively few simultaneous-fault training samples. Finally, the classifiers are compared in terms of accuracy, confusion matrix, precision, recall, and F1-score. © 2021 Elsevier B.V. All rights reserved.
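The abstract names two generic steps that can be illustrated without reference to the paper's own code: adding Gaussian white noise to training samples, and comparing classifiers through a confusion matrix with per-class precision, recall, and F1-score. The following is a minimal sketch under stated assumptions, not the authors' implementation. The function names, the noise level `sigma`, and the integer class labels (for example 0 = no fault, 1 = a single fault, 2 = a simultaneous-fault combination) are all hypothetical.

```python
import random


def add_gaussian_noise(sample, sigma, rng):
    """Return a copy of `sample` with zero-mean Gaussian noise (std `sigma`) added.

    `sample` stands in for one flattened thermal map; this layout is an assumption.
    """
    return [x + rng.gauss(0.0, sigma) for x in sample]


def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def per_class_scores(cm):
    """Return a list of (precision, recall, f1) tuples, one per class."""
    n = len(cm)
    scores = []
    for k in range(n):
        tp = cm[k][k]
        fp = sum(cm[r][k] for r in range(n)) - tp  # predicted k, truly another class
        fn = sum(cm[k]) - tp                       # truly k, predicted another class
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores.append((precision, recall, f1))
    return scores


def accuracy(cm):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in cm)
    return sum(cm[k][k] for k in range(len(cm))) / total


# Hypothetical labels, for illustration only.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 2, 2, 2]
cm = confusion_matrix(y_true, y_pred, 3)
scores = per_class_scores(cm)

# Noise augmentation of one hypothetical temperature vector (degrees Celsius).
rng = random.Random(0)
noisy = add_gaussian_noise([22.0, 24.5, 27.1], sigma=0.1, rng=rng)
```

Treating each fault combination as its own class keeps the metrics single-label; a multi-label formulation is equally plausible, and the abstract does not say which one the paper uses.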
Pages: 17