Zyzzyva: Speculative Byzantine Fault Tolerance

被引:0
作者
Kotla, Ramakrishna [1 ]
Alvisi, Lorenzo [2 ]
Dahlin, Mike [2 ]
Clement, Allen [2 ]
Wong, Edmund [2 ]
机构
[1] Microsoft Res Silicon Valley, Mountain View, CA 94043 USA
[2] Univ Texas Austin, Dept Comp Sci, Austin, TX 78712 USA
来源
ACM TRANSACTIONS ON COMPUTER SYSTEMS | 2009年 / 27卷 / 04期
关键词
Performance; Reliability; Byzantine fault tolerance; speculative execution; replication; output commit;
D O I
10.1145/1658357.1658358
中图分类号
TP301 [理论、方法];
学科分类号
081202 ;
摘要
A longstanding vision in distributed systems is to build reliable systems from unreliable components. An enticing formulation of this vision is Byzantine Fault-Tolerant (BFT) state machine replication, in which a group of servers collectively act as a correct server even if some of the servers misbehave or malfunction in arbitrary ("Byzantine") ways. Despite this promise, practitioners hesitate to deploy BFT systems, at least partly because of the perception that BFT must impose high overheads. In this article, we present Zyzzyva, a protocol that uses speculation to reduce the cost of BFT replication. In Zyzzyva, replicas reply to a client's request without first running an expensive three-phase commit protocol to agree on the order to process requests. Instead, they optimistically adopt the order proposed by a primary server, process the request, and reply immediately to the client. If the primary is faulty, replicas can become temporarily inconsistent with one another, but clients detect inconsistencies, help correct replicas converge on a single total ordering of requests, and only rely on responses that are consistent with this total order. This approach allows Zyzzyva to reduce replication overheads to near their theoretical minima and to achieve throughputs of tens of thousands of requests per second, making BFT replication practical for a broad range of demanding services.
引用
收藏
页数:39
相关论文
共 43 条
[1]  
Abd-El-Malek M, 2005, USENIX Association Proceedings of the 4th Usenix Conference on File and Storage Technologies, P59
[2]  
Aiyer Amitanand S, 2005, S OP SYST PRINC SOSP, P45, DOI [10.1145/1095810.1095816, DOI 10.1145/1095810.1095816]
[3]  
Anjali Singh Anjali Singh, 2008, Phytochemicals: a therapeutant for critical disease management, P189
[4]  
BELLARE M, 1997, P 14 ANN EUR C EUR 9, P163
[5]   Practical byzantine fault tolerance and proactive recovery [J].
Castro, M ;
Liskov, B .
ACM TRANSACTIONS ON COMPUTER SYSTEMS, 2002, 20 (04) :398-461
[6]   Practical byzantine fault tolerance and proactive recovery [J].
Castro, M ;
Liskov, B .
ACM TRANSACTIONS ON COMPUTER SYSTEMS, 2002, 20 (04) :398-461
[7]  
Castro M, 2000, USENIX ASSOCIATION PROCEEDINGS OF THE FOURTH SYMPOSIUM ON OPERATING SYSTEMS DESIGN AND IMPLEMENTATION, P273
[8]  
CASTRO M, 2001, THESIS MIT CAMBRIDGE
[9]  
CHUN BG, 2007, SIGOPS OPER SYST REV, V41, P189
[10]  
CLEMENT A, 2009, P 22 ACM S OP SYST P, P270