Infrastructure Fault Detection and Prediction in Edge Cloud Environments

Cited by: 36
Authors
Soualhia, Mbarka [1]
Fu, Chunyan [2]
Khomh, Foutse [1]
Affiliations
[1] Polytech Montreal, Montreal, PQ, Canada
[2] Ericsson Res Canada, Mississauga, ON, Canada
Source
SEC'19: PROCEEDINGS OF THE 4TH ACM/IEEE SYMPOSIUM ON EDGE COMPUTING | 2019
DOI
10.1145/3318216.3363305
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
As an emerging 5G system component, the edge cloud is becoming one of the key enablers of services such as mission-critical, IoT, and content delivery applications. However, because fail-over mechanisms in edge clouds are limited, faults (e.g., CPU or HDD faults) are highly undesirable. When infrastructure faults occur in edge clouds, they can accumulate and propagate, leading to severe degradation of system and application performance. It is therefore crucial to identify these faults early and mitigate them. In this paper, we propose a framework to detect and predict several faults at the infrastructure level of edge clouds using supervised machine learning and statistical techniques. The proposed framework is composed of three main components responsible for (1) data pre-processing, (2) fault detection, and (3) fault prediction. The results show that the framework can detect and predict several faults online in a timely manner. For instance, using Support Vector Machine (SVM), Random Forest (RF), and Neural Network (NN) models, the framework detects non-fatal CPU and HDD overload faults with an F1 score of more than 95%. For prediction, the Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) models have comparable accuracy: 96.47% vs. 96.88% for the CPU-overload fault and 85.52% vs. 88.73% for the network fault.
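The abstract reports detection quality as an F1 score. As a reminder of what that metric measures for a binary fault detector, here is a minimal sketch. The helper `f1_score` and the label vectors are hypothetical illustrations, not the authors' code or data.

```python
def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up labels: 1 = window with an overload fault, 0 = normal window.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]

score = f1_score(y_true, y_pred)
print(f"F1 = {score:.2f}")
```

In this toy case there are 3 true positives, 1 false positive, and 1 false negative, so precision and recall are both 0.75 and the F1 score is 0.75. Because F1 ignores true negatives, it is a common choice when fault windows are rare relative to normal ones.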
Pages: 222-235
Page count: 14