Smart predictive maintenance for high-performance computing systems: a literature review

Authors
André Luis da Cunha Dantas Lima
Vitor Moraes Aranha
Caio Jordão de Lima Carvalho
Erick Giovani Sperandio Nascimento
Affiliations
[1] SENAI CIMATEC Manufacturing and Technology Integrated Campus, Faculdade de Tecnologia SENAI CIMATEC Salvador
Source
The Journal of Supercomputing | 2021 / Vol. 77
Keywords
Predictive maintenance; High-performance computing; HPC; Artificial intelligence; Deep learning
DOI
Not available
Abstract
Predictive maintenance is an invaluable tool for preserving the health of mission-critical assets while minimizing the operational costs of scheduled intervention. Artificial intelligence techniques have been shown to be effective at handling large volumes of data, such as those collected by the sensors typically present in equipment. In this work, we aim to identify and summarize existing publications in the field of predictive maintenance that explore machine learning and deep learning algorithms to improve the performance of failure classification and detection. We show a significant upward trend in the use of deep learning methods applied to sensor data collected from mission-critical assets for early failure detection in support of predictive maintenance schedules. We also identify aspects that require further investigation in future work, namely the exploration of life support systems for supercomputing assets and the standardization of performance metrics.
Pages: 13494 - 13513
Page count: 19