Sieve: Attention-based Sampling of End-to-End Trace Data in Distributed Microservice Systems

Cited by: 11
Authors
Huang, Zicheng [1]
Chen, Pengfei [1]
Yu, Guangba [1]
Chen, Hongyang [1]
Zheng, Zibin [1]
Affiliations
[1] Sun Yat Sen Univ, Sch Data & Comp Sci, Guangzhou 510006, Peoples R China
Source
2021 IEEE INTERNATIONAL CONFERENCE ON WEB SERVICES, ICWS 2021 | 2021
Funding
National Natural Science Foundation of China;
Keywords
End-to-end tracing; Weighted sampling; Microservice; Robust Random Cut Forest;
DOI
10.1109/ICWS53863.2021.00063
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
End-to-end tracing plays an important role in understanding and monitoring distributed microservice systems. Trace data help reveal anomalous or erroneous system behavior. However, the volume of trace data is huge, which places a heavy burden on analysis and storage. Sampling is widely adopted to reduce this volume, but existing uniform sampling approaches are unable to capture uncommon traces, which are more interesting and informative. To tackle this problem, we design and implement Sieve, an online sampler that aims to bias sampling towards uncommon traces by taking advantage of the attention mechanism. Evaluation results on trace datasets collected from real-world and experimental microservice systems show that Sieve is effective at increasing the sampling probabilities of structurally and temporally uncommon traces and, at a low sampling rate, reduces storage space to a large extent.
Pages: 436-446
Number of pages: 11