An efficient communication induced rollforward checkpointing and recovery protocol for distributed systems

Cited by: 0
Authors
Gu, MM [1]
Zeng, L [1]
Liang, ZH [1]
Gupta, B [1]
Affiliation
[1] So Illinois Univ, Dept Comp Sci, Carbondale, IL 62901 USA
Source
COMPUTERS AND THEIR APPLICATIONS | 2000
Keywords
distributed systems; forced checkpoints; recovery; rollback; rollforward
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents a novel approach to checkpointing and recovery in distributed systems. It introduces a new concept for creating forced checkpoints and uses it to guarantee a short re-execution time (a large rollforward) for the processes after the system recovers from a failure. The proposed algorithm offers a very simple recovery scheme, comparable to that of the synchronous approach.
Pages: 298-302
Number of pages: 5