Anomaly detection for key performance indicators by fusing self-supervised spatio-temporal graph attention networks

Cited by: 4

Authors:

Chen, Ningjiang [1,2,3]

Tu, Huan [1]

Zeng, Haoyang [1]

Ou, Yangjie [1]

Affiliations:

[1] Guangxi Univ, Sch Comp Elect & Informat, Nanning 530004, Peoples R China

[2] Guangxi Intelligent Digital Serv Res Ctr Engn Tech, Nanning 530004, Peoples R China

[3] Guangxi Univ, Educ Dept Guangxi Zhuang Autonomous Reg, Key Lab Parallel Distributed & Intelligent Comp, Nanning 530004, Peoples R China

Source:

KNOWLEDGE-BASED SYSTEMS | 2024 / Vol. 300

Keywords:

Key performance indicators; Anomaly detection; Spatio-temporal features; Graph attention network (GAT)

DOI:

10.1016/j.knosys.2024.112167

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

With the development of Artificial Intelligence for IT Operations (AIOps), numerous software systems and services are monitored by Key Performance Indicator (KPI) collection components. Multivariate KPIs, a type of time series data, are essential for managing an entity's service quality effectively. In recent years, deep learning methods have greatly improved anomaly detection for multivariate time series; however, existing methods do not fully consider how to explicitly capture the correlations among multivariate time series in the feature dimension and the temporal dimension, which leads to false positives. This paper therefore proposes MAD-STA, a self-supervised anomaly detection method for multivariate KPIs that combines graph structure learning with a spatio-temporal Graph Attention Network (GAT). In the feature dimension, MAD-STA introduces a node embedding mechanism for graph structure learning and then uses a feature-oriented GAT layer to compute graph attention coefficients that capture the correlations between different KPIs. In the temporal dimension, MAD-STA uses a time-oriented GAT layer to compute attention weights between correlated timestamps, and a GRU-based VAE encoder captures long-term dependencies to extract more comprehensive temporal feature representations. Finally, MAD-STA uses a GRU-based VAE decoder to reconstruct the captured high-level features and achieves efficient anomaly detection and localization by calculating anomaly scores for the multiple KPIs. Compared with baseline methods on multiple datasets, the experimental results show that MAD-STA achieves higher anomaly detection accuracy. In particular, on the KPI datasets of the two server clusters, SMD and CKM, MAD-STA improves on the best baseline method in both performance and F1 score. In addition, MAD-STA performs well in terms of false positive rate and is highly interpretable, which can assist anomaly diagnosis and root-cause indicator analysis.
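The feature-dimension attention and the per-KPI anomaly score described in the abstract can be illustrated with a minimal sketch. It assumes the generic GAT-style scoring (LeakyReLU over a weighted concatenation of two node vectors, followed by a softmax over neighbours) and a squared reconstruction error as the anomaly score; the abstract does not give MAD-STA's exact formulas. The function names `feature_attention` and `anomaly_scores` and the weight vector are hypothetical, chosen only for illustration.

```python
import math


def leaky_relu(x, slope=0.2):
    """LeakyReLU activation, as commonly used in GAT-style attention scoring."""
    return x if x > 0 else slope * x


def feature_attention(series, weights):
    """Generic GAT-style attention between KPIs (not the paper's exact layer).

    series:  list of k KPI windows, each a list of n floats (one node per KPI).
    weights: list of 2n floats standing in for a learned attention vector
             (hypothetical values; a real model would learn them).
    Returns a k x k matrix whose row i holds the attention of KPI i over all KPIs.
    """
    n = len(series[0])
    if len(weights) != 2 * n:
        raise ValueError("weights must have length 2 * window size")
    alpha = []
    for v_i in series:
        # Raw score e_ij = LeakyReLU(w . [v_i || v_j]) for every KPI j.
        scores = [
            leaky_relu(sum(w * x for w, x in zip(weights, v_i + v_j)))
            for v_j in series
        ]
        # Numerically stable softmax so that each row sums to 1.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        alpha.append([e / total for e in exps])
    return alpha


def anomaly_scores(observed, reconstructed):
    """Per-KPI squared reconstruction error at one timestamp (assumed score)."""
    return [(o - r) ** 2 for o, r in zip(observed, reconstructed)]
```

In this sketch a KPI with a large score points to where the reconstruction failed, which is one simple way a per-KPI score can support the localization the abstract mentions.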


Pages: 13
