Murphy: Performance Diagnosis of Distributed Cloud Applications

Cited by: 1
Authors
Harsh, Vipul [1, 2]
Zhou, Wenxuan [2]
Ashok, Sachin [1]
Mysore, Radhika Niranjan [3]
Godfrey, P. Brighten [1, 2]
Banerjee, Sujata [3]
Affiliations
[1] Univ Illinois, Chicago, IL 60680 USA
[2] VMware, Palo Alto, CA 94304 USA
[3] VMware Res, Palo Alto, CA USA
Source
PROCEEDINGS OF THE 2023 ACM SIGCOMM 2023 CONFERENCE, SIGCOMM 2023 | 2023
Keywords
performance diagnosis; cyclic dependencies; enterprise networks; microservices
DOI
10.1145/3603269.3604877
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Modern cloud-based applications have complex inter-dependencies on both distributed application components and network infrastructure, making it difficult to reason about their performance. As a result, a rich body of work seeks to automate performance diagnosis of enterprise networks and such cloud applications. However, existing methods either ignore inter-dependencies, which results in poor accuracy, or require causal acyclic dependencies, which cannot model common enterprise environments. We describe the design and implementation of Murphy, an automated performance diagnosis system that works with commonly available telemetry in practical enterprise environments while achieving high accuracy. Murphy utilizes loosely defined associations between entities obtained from commonly available monitoring data. Its learning algorithm is based on a Markov Random Field (MRF) that can take advantage of such loose associations to reason about how entities affect each other in the context of a specific incident. We evaluate Murphy in an emulated microservice environment and on real incidents from a large enterprise. Compared to past work, Murphy reduces diagnosis error by approximately 1.35x in the restrictive environments supported by past work, and by at least 4.7x in more general environments.
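The abstract does not spell out Murphy's model, so the following is only a generic illustration of why an MRF suits cyclic dependencies: it is undirected, so a loop among entities needs no special handling. This minimal sketch is not Murphy's algorithm. The entity names, the binary healthy/degraded states, the potentials, and the brute-force inference are all assumptions chosen for illustration.

```python
from itertools import product

# Hypothetical entities and undirected associations (assumed for illustration).
# Note the cycle: frontend - database - switch - frontend.
ENTITIES = ["frontend", "database", "switch"]
EDGES = [("frontend", "database"), ("database", "switch"), ("switch", "frontend")]


def edge_potential(x, y, coupling=2.0):
    """Favor associated entities sharing a state (0 = healthy, 1 = degraded)."""
    return coupling if x == y else 1.0


def node_potential(state, prior_degraded=0.1):
    """Assumed prior weight for an entity being degraded or healthy."""
    return prior_degraded if state == 1 else 1.0 - prior_degraded


def marginals(evidence):
    """Exact P(entity degraded | evidence) by enumerating all joint states.

    Enumeration is exponential in the number of entities, so this is only
    workable for toy graphs.
    """
    totals = {e: 0.0 for e in ENTITIES}
    z = 0.0  # normalizing constant over states consistent with the evidence
    for states in product([0, 1], repeat=len(ENTITIES)):
        assign = dict(zip(ENTITIES, states))
        if any(assign[e] != s for e, s in evidence.items()):
            continue  # skip joint states that contradict the observation
        weight = 1.0
        for e in ENTITIES:
            weight *= node_potential(assign[e])
        for a, b in EDGES:
            weight *= edge_potential(assign[a], assign[b])
        z += weight
        for e in ENTITIES:
            totals[e] += weight * assign[e]
    return {e: totals[e] / z for e in ENTITIES}


baseline = marginals({})
incident = marginals({"frontend": 1})  # observe a degraded frontend
```

Comparing `incident["database"]` with `baseline["database"]` shows the observed degradation raising the belief that an associated entity is degraded, despite the cycle in the graph.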
Pages: 438-451
Number of pages: 14