Robust and accurate performance anomaly detection and prediction for cloud applications: a novel ensemble learning-based framework

Cited by: 23
Authors
Xin, Ruyue [1]
Liu, Hongyun [1]
Chen, Peng [2]
Zhao, Zhiming [1]
Affiliations
[1] Univ Amsterdam, Multiscale Networked Syst MNS Res Grp, Amsterdam, Netherlands
[2] Xihua Univ, Sch Comp & Software Engn, Chengdu, Peoples R China
Source
JOURNAL OF CLOUD COMPUTING-ADVANCES SYSTEMS AND APPLICATIONS | 2023 / Vol. 12 / Issue 01
Funding
EU Horizon 2020;
Keywords
Performance anomaly detection; Algorithm robustness; Anomaly prediction; Ensemble learning; Deep ensemble; SUPPORT;
DOI
10.1186/s13677-022-00383-6
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Effectively detecting run-time performance anomalies is crucial for clouds to identify abnormal performance behavior and forestall future incidents. To be used for real-world applications, an effective anomaly detection framework should meet three main challenging requirements: high accuracy for identifying anomalies, good robustness when application patterns change, and prediction ability for upcoming anomalies. Unfortunately, existing research about performance anomaly detection usually focuses on improving detection accuracy, while little research tackles the three challenges simultaneously. We conduct experiments for existing detection methods on multiple application monitoring data, and results show that existing detection methods usually focus on different features in data, which will lead to their diverse performance on different data patterns. Therefore, existing anomaly detection methods have difficulty improving detection accuracy and robustness and predicting anomalies. To address the three requirements, we propose an Ensemble Learning-Based Detection (ELBD) framework which integrates existing well-selected detection methods. The framework includes three classic linear ensemble methods (maximum, average, and weighted average) and a novel deep ensemble method. Our experiments show that the ELBD framework realizes better detection accuracy and robustness, where the deep ensemble method can achieve the most accurate and robust detection for cloud applications. In addition, it can predict anomalies in the next four minutes with an F1 score higher than 0.8. The paper also proposes a new indicator ARP_score to measure detection accuracy, robustness, and multi-step prediction ability. The ARP_score of the deep ensemble method is 5.1821, which is much higher than other detection methods.
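The three classic linear ensemble methods named in the abstract (maximum, average, and weighted average) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the per-detector min-max normalization, and the example weights are all assumptions made for illustration, and the abstract does not say how the weights are chosen.

```python
import numpy as np


def min_max_normalize(scores):
    """Scale each detector's scores to [0, 1] so they are comparable.

    `scores` has shape (n_detectors, n_samples). Normalization is an
    assumption of this sketch, not a step stated in the abstract.
    """
    scores = np.asarray(scores, dtype=float)
    lo = scores.min(axis=1, keepdims=True)
    hi = scores.max(axis=1, keepdims=True)
    # Avoid division by zero for a detector with constant scores.
    span = np.where(hi > lo, hi - lo, 1.0)
    return (scores - lo) / span


def linear_ensembles(scores, weights):
    """Combine per-detector anomaly scores with three linear rules.

    `weights` holds one hypothetical non-negative weight per detector;
    they are rescaled here to sum to 1.
    """
    normalized = min_max_normalize(scores)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return {
        "maximum": normalized.max(axis=0),
        "average": normalized.mean(axis=0),
        "weighted_average": w @ normalized,
    }


# Two hypothetical detectors scoring three samples.
example = linear_ensembles([[0, 1, 2], [10, 10, 30]], weights=[3, 1])
```

Each combined score vector can then be thresholded to flag anomalies. The deep ensemble method mentioned in the abstract is not covered by this sketch.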
Pages: 16