Using Lightweight Formal Methods to Validate a Key-Value Storage Node in Amazon S3

被引:44
作者
Bornholt, James [1 ,2 ]
Joshi, Rajeev [1 ]
Astrauskas, Vytautas [3 ]
Cully, Brendan [1 ]
Kragl, Bernhard [1 ]
Markle, Seth [1 ]
Sauri, Kyle [1 ]
Schleit, Drew [1 ]
Slatton, Grant [1 ]
Tasiran, Serdar [1 ]
Van Geffen, Jacob [4 ]
Warfield, Andrew [1 ]
机构
[1] Amazon Web Serv, Seattle, WA 98108 USA
[2] Univ Texas Austin, Austin, TX 78712 USA
[3] Swiss Fed Inst Technol, Zurich, Switzerland
[4] Univ Washington, Seattle, WA 98195 USA
来源
PROCEEDINGS OF THE 28TH ACM SYMPOSIUM ON OPERATING SYSTEMS PRINCIPLES, SOSP 2021 | 2021年
关键词
cloud storage; lightweight formal methods;
D O I
10.1145/3477132.3483540
中图分类号
TP31 [计算机软件];
学科分类号
081202 ; 0835 ;
摘要
This paper reports our experience applying lightweight formal methods to validate the correctness of ShardStore, a new key-value storage node implementation for the Amazon S3 cloud object storage service. By "lightweight formal methodsz we mean a pragmatic approach to verifying the correctness of a production storage node that is under ongoing feature development by a full-time engineering team. We do not aim to achieve full formal verification, but instead emphasize automation, usability, and the ability to continually ensure correctness as both software and its specification evolve over time. Our approach decomposes correctness into independent properties, each checked by the most appropriate tool, and develops executable reference models as specifications to be checked against the implementation. Our work has prevented 16 issues from reaching production, including subtle crash consistency and concurrency problems, and has been extended by non-formal-methods experts to check new features and properties as ShardStore has evolved.
引用
收藏
页码:836 / 850
页数:15
相关论文
共 53 条
[1]  
Amazon Web Services, 2020, Shuttle
[2]  
Amazon Web Services, 2021, Rust Model Checker (RMC)
[3]  
[Anonymous], 2021, Facebook
[4]   Leveraging Rust Types for Modular Specification and Verification [J].
Astrauskas, Vytautas ;
Muller, Peter ;
Poli, Federico ;
Summers, Alexander J. .
PROCEEDINGS OF THE ACM ON PROGRAMMING LANGUAGES-PACMPL, 2019, 3 (OOPSLA)
[5]   How Do Programmers Use Unsafe Rust? [J].
Astrauskas, Vytautas ;
Matheja, Christoph ;
Poli, Federico ;
Muller, Peter ;
Summers, Alexander J. .
PROCEEDINGS OF THE ACM ON PROGRAMMING LANGUAGES-PACMPL, 2020, 4 (OOPSLA)
[6]   Verifying Rust Programs with SMACK [J].
Baranowski, Marek ;
He, Shaobo ;
Rakamaric, Zvonimir .
AUTOMATED TECHNOLOGY FOR VERIFICATION AND ANALYSIS (ATVA 2018), 2018, 11138 :528-535
[7]   Specifying and Checking File System Crash-Consistency Models [J].
Bornholt, James ;
Kaufmann, Antoine ;
Li, Jialin ;
Krishnamurthy, Arvind ;
Torlak, Emina ;
Wang, Xi .
ACM SIGPLAN NOTICES, 2016, 51 (04) :83-98
[8]  
Burckhardt S, 2010, ASPLOS XV: FIFTEENTH INTERNATIONAL CONFERENCE ON ARCHITECTURAL SUPPORT FOR PROGRAMMING LANGUAGES AND OPERATING SYSTEMS, P167
[9]  
Chajed T, 2021, PROCEEDINGS OF THE 15TH USENIX SYMPOSIUM ON OPERATING SYSTEMS DESIGN AND IMPLEMENTATION (OSDI '21), P423
[10]   Verifying concurrent, crash-safe systems with Perennial [J].
Chajed, Tej ;
Tassarotti, Joseph ;
Kaashoek, M. Frans ;
Zeldovich, Nickolai .
PROCEEDINGS OF THE TWENTY-SEVENTH ACM SYMPOSIUM ON OPERATING SYSTEMS PRINCIPLES (SOSP '19), 2019, :243-258