Efficient causal message logging protocol integrated with asynchronous checkpointing

Cited by: 0
Author
Ahn, Jinho [1]
Affiliation
[1] Kyonggi Univ, Dept Comp Sci, Suwon 443760, Gyeonggido, South Korea
Keywords
distributed systems; message passing; fault-tolerance; asynchronous checkpointing; causal message logging; recovery
DOI
Not available
Chinese Library Classification
TP39 [Computer applications]
Subject classification code
081203; 0835
Abstract
Log-based rollback recovery is a well-known fault-tolerance technique that combines message logging with checkpointing. Among log-based recovery approaches, causal message logging offers the failure-free performance advantage of optimistic message logging while, like pessimistic message logging, ensuring the always-no-orphans property in case of failures. However, most previous causal message logging protocols may not allow surviving processes to make progress and incur a number of stable storage accesses during recovery. A previous protocol attempts to address these issues, but it relies on centralized recovery and may leave the system's global state inconsistent when recovering from concurrent process crashes. This paper proposes an efficient causal message logging protocol that lets surviving processes continue executing regardless of simultaneous process crashes and alleviates the limitation of the previous protocol by performing synchronous and distributed recovery. In addition, when integrated with asynchronous checkpointing, the proposed protocol has each process keep only its latest checkpoint on stable storage and perform globally consistent recovery, because it forces each recovering process to obtain its recovery information from the other recovering processes as well as from all live processes.
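The general idea of causal message logging mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the paper's protocol: the names `Determinant`, `Process`, `collect_recovery_info`, and `demo` are hypothetical, every message piggybacks all determinants the sender knows (real protocols prune this), and checkpoints and stable storage are left out. It only shows how a recovering process might rebuild its message delivery order from determinants held in the volatile logs of the other processes.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Determinant:
    """Delivery record of one message (hypothetical layout, not from the paper)."""
    sender: int
    send_seq: int
    receiver: int
    recv_seq: int


@dataclass
class Process:
    pid: int
    send_seq: int = 0
    recv_seq: int = 0
    # Volatile log of determinants this process knows about (its own and others').
    known: set = field(default_factory=set)

    def send(self, dest: "Process", payload: str) -> None:
        self.send_seq += 1
        # Simplification: piggyback every known determinant on the message.
        dest.deliver(self.pid, self.send_seq, payload, frozenset(self.known))

    def deliver(self, sender: int, send_seq: int, payload: str, piggyback) -> None:
        self.recv_seq += 1
        self.known |= piggyback
        self.known.add(Determinant(sender, send_seq, self.pid, self.recv_seq))

    def crash(self) -> None:
        # Volatile state is lost; a real process would restart from a checkpoint.
        self.send_seq = 0
        self.recv_seq = 0
        self.known = set()


def collect_recovery_info(recovering: Process, others: list) -> list:
    """Gather determinants of messages delivered to `recovering` from all other processes."""
    found = set()
    for p in others:
        found |= {d for d in p.known if d.receiver == recovering.pid}
    # The receive sequence number gives the order in which to replay messages.
    return sorted(found, key=lambda d: d.recv_seq)


def demo() -> list:
    p0, p1, p2 = Process(0), Process(1), Process(2)
    p0.send(p1, "a")
    p2.send(p1, "b")
    p1.send(p2, "c")  # p1's determinants now also live in p2's volatile log
    p1.crash()
    return collect_recovery_info(p1, [p0, p2])
```

In `demo()`, process 1 loses its own log on crashing, but the determinants it piggybacked on its message to process 2 survive there, so the delivery order of the two earlier messages can still be recovered.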
Pages: 300-305
Page count: 6