Graph neural networks for detecting anomalies in scientific workflows

Cited by: 2
Authors
Jin, Hongwei [1, 6]
Raghavan, Krishnan [1]
Papadimitriou, George [2]
Wang, Cong [3]
Mandal, Anirban [3]
Kiran, Mariam [4]
Deelman, Ewa [2]
Balaprakash, Prasanna [5]
Affiliations
[1] Argonne Natl Lab, Lemont, IL USA
[2] Univ Southern Calif, Los Angeles, CA USA
[3] Renaissance Comp Inst RENCI, Chapel Hill, NC USA
[4] Energy Sci Network ESnet, Berkeley, CA USA
[5] Oak Ridge Natl Lab, Oak Ridge, TN USA
[6] Argonne Natl Lab, Math & Comp Sci Div, 9700 S Cass Ave, Lemont, IL 60439 USA
Keywords
Anomaly detection; machine learning; graph neural networks; scientific workflows; hyperparameter tuning; explainable predictions
DOI
10.1177/10943420231172140
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Identifying and addressing anomalies in complex, distributed systems is a challenge for the reliable execution of scientific workflows. We model these workflows as directed acyclic graphs (DAGs), where the nodes and edges of the DAGs represent jobs and their dependencies, respectively. We develop graph neural networks (GNNs) to learn patterns in the DAGs and to detect anomalies at the node (job) and graph (workflow) levels. We investigate workflow-specific GNN models that are trained on a particular workflow and workflow-agnostic GNN models that are trained across workflows. Our GNN models, which incorporate both individual job features and topological information from the workflow, show improved accuracy and efficiency compared to conventional learning methods for detecting anomalies. When jointly trained on multiple scientific workflows, our GNN models reached an accuracy of more than 80% for workflow-level anomalies and 75% for job-level anomalies. In addition, we illustrate the importance of hyperparameter tuning, which can significantly improve the metrics used to evaluate the GNN models. Finally, we integrate explainable GNN methods to provide insights into the job features in the workflow that cause an anomaly.
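
As a rough illustration of the DAG modeling described in the abstract, the sketch below builds a toy workflow graph and performs one round of neighbor averaging to produce job-level and workflow-level representations. It is not the authors' model: there are no learned weights, the job names, feature values, and helper functions (`neighbors`, `mean_vec`, `message_pass`) are hypothetical, and a real GNN would pass these representations through trained layers and a classifier.

```python
from collections import defaultdict

# Hypothetical toy workflow: job name -> per-job features
# (say, runtime in seconds and data moved in MB). All values are made up.
features = {
    "stage_in": [2.0, 10.0],
    "process_a": [5.0, 1.0],
    "process_b": [50.0, 1.0],
    "stage_out": [3.0, 12.0],
}

# Directed edges (parent, child) encode job dependencies in the DAG.
edges = [
    ("stage_in", "process_a"),
    ("stage_in", "process_b"),
    ("process_a", "stage_out"),
    ("process_b", "stage_out"),
]


def neighbors(edges):
    """Map each job to its parents and children (undirected view of the DAG)."""
    nbrs = defaultdict(list)
    for parent, child in edges:
        nbrs[parent].append(child)
        nbrs[child].append(parent)
    return nbrs


def mean_vec(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def message_pass(features, edges):
    """One aggregation round: concatenate a job's features with its neighbors' mean."""
    nbrs = neighbors(edges)
    out = {}
    for job, x in features.items():
        if nbrs[job]:
            agg = mean_vec([features[n] for n in nbrs[job]])
        else:
            agg = [0.0] * len(x)  # isolated job: no topological signal
        out[job] = x + agg
    return out


# Node (job) level: each representation mixes job features with topology.
node_emb = message_pass(features, edges)

# Graph (workflow) level: mean-pool the job representations.
graph_emb = mean_vec(list(node_emb.values()))

print(node_emb["process_b"])
print(graph_emb)
```

In this toy setup, `process_a` and `process_b` share the same neighborhood, so only their own features distinguish them; a job-level detector would operate on `node_emb`, while a workflow-level detector would operate on the pooled `graph_emb`.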
Pages: 394-411
Page count: 18