Experiences With Deep Learning Enhanced Steering Mechanisms for Debugging of Fundamental Cloud Services

Cited by: 1
Authors
Lovas, Robert [1]
Rigo, Erno
Unyi, Daniel [2]
Gyires-Toth, Balint
Affiliations
[1] Eotvos Lorand Res Network ELKH, Inst Comp Sci & Control SZTAK, H-1111 Budapest, Hungary
[2] Budapest Univ Technol & Econ, Dept Telecommun & Media Informat, H-1111 Budapest, Hungary
Keywords
Cloud computing; Computer architecture; Deep learning; Monitoring; Debugging; Circuit faults; Fault detection; software debugging; reference architecture; service mesh; formal verification; Markov chains; autoencoder; long short-term memory; graph neural networks
DOI
10.1109/ACCESS.2023.3243201
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Cloud architecture blueprints, or reference architectures, allow the reuse of existing knowledge and best practices when creating new cloud-native solutions. Debugging reference architecture candidates (or their new versions) is therefore a crucial but tedious and time-consuming task, because complex services are deployed in typically multi-tenant and non-deterministic environments. In debugging, testing, and maintenance scenarios, we might achieve greater test coverage (and eventually improved reliability) by modelling and verifying at least the most fundamental building blocks and their interconnections. The main objective of our work is to integrate stochastic modelling and verification techniques based on deep learning methods into the debugging cycle in order to handle large state spaces more efficiently, i.e., by steering the state-space traversal towards suspicious situations that may correspond to bugs in the actual system. For this purpose, the approach we present and illustrate combines (among others) Continuous-Time Markov Chain (CTMC) modelling techniques with deep learning methods, including autoencoder, Long Short-Term Memory (LSTM), and Graph Neural Network (GNN) models. We summarize our experiences with widespread cloud design patterns, including load balancing and service mesh topologies. According to the results, the debugging cycle can be partly automated through the application of deep learning methods. The autoencoders are able to detect erroneous load balancer behaviors (anomalies) in complex configurations; the LSTMs implicitly reveal some of the random nature of the inspected processes; and the GNNs exploit the additional topology-related information in service meshes.
Pages: 26403-26418
Page count: 16