Murphy: Performance Diagnosis of Distributed Cloud Applications

Cited by: 1
Authors
Harsh, Vipul [1 ,2 ]
Zhou, Wenxuan [2 ]
Ashok, Sachin [1 ]
Mysore, Radhika Niranjan [3 ]
Godfrey, P. Brighten [1 ,2 ]
Banerjee, Sujata [3 ]
Affiliations
[1] Univ Illinois, Chicago, IL 60680 USA
[2] VMware, Palo Alto, CA 94304 USA
[3] VMware Res, Palo Alto, CA USA
Source
PROCEEDINGS OF THE 2023 ACM SIGCOMM 2023 CONFERENCE, SIGCOMM 2023 | 2023
Keywords
performance diagnosis; cyclic dependencies; enterprise networks; microservices
DOI
10.1145/3603269.3604877
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Modern cloud-based applications have complex inter-dependencies on both distributed application components and network infrastructure, making it difficult to reason about their performance. As a result, a rich body of work seeks to automate performance diagnosis of enterprise networks and such cloud applications. However, existing methods either ignore inter-dependencies, which results in poor accuracy, or require causal acyclic dependencies, which cannot model common enterprise environments. We describe the design and implementation of Murphy, an automated performance diagnosis system that can work with commonly available telemetry in practical enterprise environments while achieving high accuracy. Murphy utilizes loosely defined associations between entities obtained from commonly available monitoring data. Its learning algorithm is based on a Markov Random Field (MRF) that can take advantage of such loose associations to reason about how entities affect each other in the context of a specific incident. We evaluate Murphy in an emulated microservice environment and in real incidents from a large enterprise. Compared to past work, Murphy reduces diagnosis error by ≈1.35x in restrictive environments supported by past work, and by ≥4.7x in more general environments.
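As a rough illustration of why an undirected model such as an MRF can accommodate cyclic dependencies that a causal DAG cannot, the toy sketch below builds a small pairwise binary MRF over four entities and computes, by brute-force enumeration, how likely each one is to be degraded given that one entity is observed to be degraded. This is not Murphy's learning algorithm: the entity names, the topology, the bias and coupling weights, and the helper `mrf_marginals` are all assumptions chosen for illustration.

```python
from itertools import product
import math


def mrf_marginals(nodes, edges, unary, coupling, evidence=None):
    """Brute-force P(node = 1 | evidence) for a pairwise binary MRF.

    nodes:    list of entity names (hypothetical)
    edges:    undirected (a, b) associations; cycles are allowed
    unary:    name -> bias added to the score when the entity is degraded (1)
    coupling: score added for each edge whose endpoints share a state
    evidence: name -> observed state (0 = healthy, 1 = degraded)
    """
    evidence = evidence or {}
    free = [n for n in nodes if n not in evidence]
    totals = {n: 0.0 for n in free}
    z = 0.0  # partition function over all assignments consistent with evidence
    for values in product((0, 1), repeat=len(free)):
        state = dict(evidence)
        state.update(zip(free, values))
        score = sum(unary[n] * state[n] for n in nodes)
        score += sum(coupling for a, b in edges if state[a] == state[b])
        weight = math.exp(score)
        z += weight
        for n in free:
            if state[n] == 1:
                totals[n] += weight
    return {n: totals[n] / z for n in free}


# Assumed toy topology: frontend, cache, and db form a cycle; switch hangs off db.
nodes = ["frontend", "cache", "db", "switch"]
edges = [("frontend", "cache"), ("cache", "db"),
         ("db", "frontend"), ("db", "switch")]
unary = {n: -1.0 for n in nodes}  # each entity is a priori more likely healthy

# Condition on the incident symptom: the frontend is observed to be degraded.
marginals = mrf_marginals(nodes, edges, unary, coupling=1.5,
                          evidence={"frontend": 1})
ranked = sorted(marginals, key=marginals.get, reverse=True)
print(ranked)
```

Exhaustive enumeration is exponential in the number of entities, so it only serves to make the idea concrete; a practical system would rely on approximate inference.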
Pages: 438-451
Page count: 14
Related papers
50 in total
  • [21] Enriching Cloud-native Applications with Sustainability Features
    Vitali, Monica
    Schmiedmayer, Paul
    Bootz, Valentin
    2023 IEEE INTERNATIONAL CONFERENCE ON CLOUD ENGINEERING, IC2E, 2023, : 21 - 31
  • [22] Cloud Elasticity of Microservices-Based Applications: A Survey
    Fourati, Mohamed Hedi
    Marzouk, Soumaya
    Jmaiel, Mohamed
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2025, 37 (02)
  • [23] Anomaly Detection and Failure Root Cause Analysis in (Micro) Service-Based Cloud Applications: A Survey
    Soldani, Jacopo
    Brogi, Antonio
    ACM COMPUTING SURVEYS, 2023, 55 (03)
  • [24] Causal Inference Techniques for Microservice Performance Diagnosis: Evaluation and Guiding Recommendations
    Wu, Li
    Tordsson, Johan
    Elmroth, Erik
    Kao, Odej
    2021 IEEE INTERNATIONAL CONFERENCE ON AUTONOMIC COMPUTING AND SELF-ORGANIZING SYSTEMS (ACSOS 2021), 2021, : 21 - 30
  • [25] VITASENIOR-MT: A distributed and scalable cloud-based telehealth solution
    Mendes, Diogo
    Panda, Renato
    Dias, Pedro
    Jorge, Dario
    Antonio, Ricardo
    Oliveira, Luis
    Pires, Gabriel
    2019 IEEE 5TH WORLD FORUM ON INTERNET OF THINGS (WF-IOT), 2019, : 767 - 772
  • [26] Performance Diagnosis for Inefficient Loops
    Song, Linhai
    Lu, Shan
    2017 IEEE/ACM 39TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE), 2017, : 370 - 380
  • [27] OXN - Automated Observability Assessments for Cloud-Native Applications
    Borges, Maria C.
    Bauer, Joshua
    Werner, Sebastian
    IEEE 21ST INTERNATIONAL CONFERENCE ON SOFTWARE ARCHITECTURE COMPANION, ICSA-C 2024, 2024, : 167 - 170
  • [28] An Asynchronous Panel Discussion What Are Cloud-Native Applications?
    Gannon, Dennis
    Barga, Roger
    Sundaresan, Neel
    Goasguen, Sebastien
    Gustaffson, Niklas
    Subramanian, Balan
    Davis, Cornelia
    Kohn, Dan
    IEEE CLOUD COMPUTING, 2017, 4 (05): : 50 - 54
  • [29] Security-as-a-Service for Microservices-Based Cloud Applications
    Sun, Yuqiong
    Nanda, Susanta
    Jaeger, Trent
    2015 IEEE 7TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING TECHNOLOGY AND SCIENCE (CLOUDCOM), 2015, : 50 - 57
  • [30] Technical Debt and Software Quality in Cloud-Native Applications
    Su, Ruoyu
    SOFTWARE ARCHITECTURE, ECSA 2024 TRACKS AND WORKSHOPS, 2024, 14937 : 65 - 71