DIFFY: Data-Driven Bug Finding for Configurations

Cited by: 0
Authors
Kakarla, Siva Kesava Reddy [1]
Yan, Francis Y. [1]
Beckett, Ryan [1]
Affiliations
[1] Microsoft, Redmond, WA 98052 USA
Source
Proceedings of the ACM on Programming Languages (PACMPL) | 2024 / Vol. 8 / Issue PLDI
Keywords
configuration bug finding; template synthesis; anomaly detection
DOI
10.1145/3656385
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification codes
081202; 0835
Abstract
Configuration errors remain a major cause of system failures and service outages. One promising approach to identify configuration errors automatically is to learn common usage patterns (and anti-patterns) using data-driven methods. However, existing data-driven learning approaches analyze only simple configurations (e.g., those with no hierarchical structure), identify only simple types of issues (e.g., type errors), or require extensive domain-specific tuning. In this paper, we present DIFFY, the first push-button configuration analyzer that detects likely bugs in structured configurations. From example configurations, DIFFY learns a common template, with "holes" that capture their variation. It then applies unsupervised learning to identify anomalous template parameters as likely bugs. We evaluate DIFFY on a large cloud provider's wide-area network, an operational 5G network testbed, and MySQL configurations, demonstrating its versatility, performance, and accuracy. During DIFFY's development, it caught and prevented a bug in a configuration timer value that had previously caused an outage for the cloud provider.
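The two-step idea in the abstract (learn a template with holes, then look for anomalous hole values) can be illustrated with a toy sketch. This is not DIFFY's algorithm: it assumes flat, whitespace-tokenized configurations with identical token counts, and it swaps in a simple median-based outlier score for the unsupervised learner. The function names `learn_template` and `flag_anomalies`, the `<?>` hole marker, the threshold, and the sample `keepalive` lines are all hypothetical.

```python
from statistics import median


def learn_template(configs):
    """Position-wise alignment: tokens shared by all configs stay literal,
    positions that vary become holes (assumption: equal token counts)."""
    rows = [c.split() for c in configs]
    if len({len(r) for r in rows}) != 1:
        raise ValueError("this sketch only handles configs with the same token count")
    template, holes = [], []
    for i, column in enumerate(zip(*rows)):
        if len(set(column)) == 1:
            template.append(column[0])
        else:
            template.append("<?>")  # hypothetical hole marker
            holes.append(i)
    # One parameter vector per config: the values that fill the holes.
    params = [[r[i] for i in holes] for r in rows]
    return template, params


def flag_anomalies(values, threshold=3.5):
    """Return indices whose MAD-based modified z-score exceeds the threshold."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # No spread at all: anything that differs from the median stands out.
        return [i for i, v in enumerate(values) if v != med]
    return [i for i, v in enumerate(values) if 0.6745 * abs(v - med) / mad > threshold]


# Invented example data: five router configs, one with a suspicious timer.
configs = [
    "router r1 keepalive 30",
    "router r2 keepalive 30",
    "router r3 keepalive 31",
    "router r4 keepalive 29",
    "router r5 keepalive 3000",
]
template, params = learn_template(configs)
timers = [int(p[1]) for p in params]  # second hole holds the timer value
suspects = flag_anomalies(timers)
print(" ".join(template))
print([configs[i] for i in suspects])
```

In this toy data the router name and the timer value become holes, and only the `r5` line is flagged. Handling the hierarchical, structured configurations the abstract describes would need a far more capable template learner than this position-wise alignment.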
Pages: 24