An Empirical Analysis of Anomaly Detection Methods for Multivariate Time Series

Cited by: 3
Authors
Li, Dongwen [1]
Zhang, Shenglin [1,2]
Sun, Yongqian [1]
Guo, Yang [1]
Che, Zeyu [1]
Chen, Shiqi [1]
Zhong, Zhenyu [1]
Liang, Minghan [1]
Shao, Minyi [1]
Li, Mingjie [1]
Liu, Shuyang [1]
Zhang, Yuzhi [1,2]
Pei, Dan [3]
Affiliations
[1] Nankai Univ, Tianjin, Peoples R China
[2] Haihe Lab Informat Technol Applicat Innovat, Tianjin, Peoples R China
[3] Tsinghua Univ, Beijing, Peoples R China
Source
2023 IEEE 34TH INTERNATIONAL SYMPOSIUM ON SOFTWARE RELIABILITY ENGINEERING, ISSRE | 2023
Funding
National Natural Science Foundation of China;
Keywords
Multivariate Time Series; Anomaly Detection; Practical Challenges; Empirical Analysis;
DOI
10.1109/ISSRE59848.2023.00014
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Anomaly detection on multivariate time series (MTS) data is widely adopted in service systems, such as web services and financial businesses. Researchers have recently proposed several well-performing algorithms for MTS anomaly detection from different perspectives. When these algorithms are applied in the real world, we observe that none of them adapts to all scenarios, because the data and anomaly characteristics are complex. Moreover, there is currently no comprehensive analysis of these algorithms to guide operators in selecting the appropriate one in practice. To bridge this gap, we conduct an empirical study using various real-world data to gain an in-depth understanding of state-of-the-art anomaly detection algorithms. First, we provide general recommendations to guide operators in selecting suitable models based on the volume of training data, computational resources, and effectiveness requirements. Then, we summarize the typical data characteristics and types of anomalies and offer tailored model selection suggestions for different data characteristics and anomaly types. Finally, we apply the summarized model selection suggestions to all the datasets we collected. The results show that most of our suggestions achieve better performance than any single algorithm alone, demonstrating the effectiveness and generalizability of our recommendations.
Pages: 57-68
Number of pages: 12