Microservices Monitoring with Event Logs and Black Box Execution Tracing

Cited by: 32
Authors
Cinque, Marcello [1]
Della Corte, Raffaele [1]
Pecchia, Antonio [1]
Affiliations
[1] Univ Napoli Federico II, Dipartimento Ingn Elettr & Tecnol Informaz DIETI, Via Claudio 21, I-80125 Naples, Italy
Keywords
monitoring; microservices; REST; docker; clearwater; kubernetes; log analysis; VERIFICATION; DESIGN; TIME
DOI
10.1109/TSC.2019.2940009
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Monitoring is a core practice in any software system. Trends in microservices systems heighten the role of monitoring and pose novel challenges to the data sources used for monitoring, such as event logs. Current deployments create a distinct log per microservice; moreover, composing microservices from different vendors exacerbates the format and semantic heterogeneity of logs. Understanding and traversing the logs from different microservices demands substantial cognitive work by human experts. This paper proposes a novel approach that accompanies microservices logs with black box tracing to help practitioners make informed decisions for troubleshooting. Our approach is based on the passive tracing of request-response messages of the REpresentational State Transfer (REST) communication model. Unlike many existing tools for microservices, our tracing is application transparent and non-intrusive. We present an implementation called MetroFunnel and conduct an assessment in the context of two case studies: a Clearwater IP Multimedia Subsystem (IMS) setup consisting of Docker microservices and a Kubernetes orchestrator deployment hosting tens of microservices. MetroFunnel allows making useful attributions in traversing the logs; more importantly, it reduces the size of collected monitoring data at negligible performance overhead with respect to traditional logs.
Pages: 294-307
Page count: 14