Graph neural networks for detecting anomalies in scientific workflows

Cited by: 2

Authors:

Jin, Hongwei ^{[1,6]}

Raghavan, Krishnan ^{[1]}

Papadimitriou, George ^{[2]}

Wang, Cong ^{[3]}

Mandal, Anirban ^{[3]}

Kiran, Mariam ^{[4]}

Deelman, Ewa ^{[2]}

Balaprakash, Prasanna ^{[5]}

Affiliations:

[1] Argonne Natl Lab, Lemont, IL USA

[2] Univ Southern Calif, Los Angeles, CA USA

[3] Renaissance Comp Inst RENCI, Chapel Hill, NC USA

[4] Energy Sci Network ESnet, Berkeley, CA USA

[5] Oak Ridge Natl Lab, Oak Ridge, TN USA

[6] Argonne Natl Lab, Math & Comp Sci Div, 9700 S Cass Ave, Lemont, IL 60439 USA

Source:

INTERNATIONAL JOURNAL OF HIGH PERFORMANCE COMPUTING APPLICATIONS | 2023 / Vol. 37 / Issue 3-4

Keywords:

Anomaly detection; machine learning; graph neural networks; scientific workflows; hyperparameter tuning; explainable predictions

DOI:

10.1177/10943420231172140

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology]

Discipline classification code:

0812

Abstract:

Identifying and addressing anomalies in complex, distributed systems is a challenge for the reliable execution of scientific workflows. We model these workflows as directed acyclic graphs (DAGs), where the nodes and edges of the DAGs represent jobs and their dependencies, respectively. We develop graph neural networks (GNNs) to learn patterns in the DAGs and to detect anomalies at the node (job) and graph (workflow) levels. We investigate workflow-specific GNN models that are trained on a particular workflow and workflow-agnostic GNN models that are trained across multiple workflows. Our GNN models, which incorporate both individual job features and topological information from the workflow, show improved accuracy and efficiency compared to conventional learning methods for detecting anomalies. When jointly trained on multiple scientific workflows, our GNN models reached an accuracy of more than 80% for workflow-level anomalies and 75% for job-level anomalies. In addition, we illustrate the importance of hyperparameter tuning, which can significantly improve the metrics used to evaluate the GNN models. Finally, we integrate explainable GNN methods to provide insight into the job features in the workflow that cause an anomaly.
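The DAG formulation in the abstract can be illustrated with a small sketch. This is not the authors' model: it is a minimal, weight-free example of one round of neighbor aggregation (the basic operation a GNN layer builds on) followed by a mean-pooling readout, so that both job-level and workflow-level vectors are available. The job names, the two features per job, and the choice to aggregate over both parents and children are all assumptions made for illustration.

```python
from statistics import mean

# Hypothetical toy workflow: each job maps to the jobs that depend on it (DAG edges).
edges = {
    "stage_in": ["process_a", "process_b"],
    "process_a": ["merge"],
    "process_b": ["merge"],
    "merge": [],
}

# Hypothetical per-job features: [runtime in seconds, data transferred in MB].
features = {
    "stage_in": [10.0, 200.0],
    "process_a": [30.0, 50.0],
    "process_b": [95.0, 50.0],
    "merge": [12.0, 20.0],
}


def neighbors(job):
    """Return parents and children of a job (dependencies treated as undirected here)."""
    children = edges[job]
    parents = [src for src, dsts in edges.items() if job in dsts]
    return parents + children


def message_pass(feats):
    """One aggregation round: each job averages its own features with its neighbors'."""
    updated = {}
    for job, own in feats.items():
        stacked = [own] + [feats[n] for n in neighbors(job)]
        updated[job] = [mean(column) for column in zip(*stacked)]
    return updated


def graph_readout(feats):
    """Mean-pool the job vectors into a single workflow-level vector."""
    return [mean(column) for column in zip(*feats.values())]


# Job-level (node) representations and a workflow-level (graph) representation.
node_embeddings = message_pass(features)
workflow_embedding = graph_readout(node_embeddings)
```

In a trained GNN, the averaging step would be combined with learned weight matrices and nonlinearities, and classifiers on top of `node_embeddings` and `workflow_embedding` would produce the job-level and workflow-level anomaly predictions, respectively.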


Pages: 394-411

Page count: 18
