Hypervisor-based fault-tolerance

Cited by: 102
Authors
Bressoud, TC [1]
Schneider, FB [1]
Affiliation
[1] Cornell Univ, Ithaca, NY 14853
Source
ACM TRANSACTIONS ON COMPUTER SYSTEMS | 1996 / Vol. 14 / No. 01
Keywords
algorithms; reliability; fault-tolerant computing system; primary/backup approach; virtual-machine manager;
DOI
10.1145/225535.225538
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Protocols to implement a fault-tolerant computing system are described. These protocols augment the hypervisor of a virtual-machine manager and coordinate a primary virtual machine with its backup. No modifications to the hardware, operating system, or application programs are required. A prototype system was constructed for HP's PA-RISC instruction-set architecture. Even though the prototype was not carefully tuned, it ran programs about a factor of 2 slower than a bare machine would.
Pages: 80-107
Page count: 28