An Efficient Checkpointing and Rollback Recovery Scheme for Cluster-based Multi-channel Ad-hoc Wireless Networks

Cited by: 10
Authors
Men, Chaoguang [1]
Xu, Zhenpeng [1]
Li, Xiang [1]
Affiliations
[1] Harbin Engn Univ, Coll Comp Sci & Technol, Harbin 150001, Heilongjiang, Peoples R China
Source
PROCEEDINGS OF THE 2008 INTERNATIONAL SYMPOSIUM ON PARALLEL AND DISTRIBUTED PROCESSING WITH APPLICATIONS | 2008
Keywords
ad-hoc wireless networks; cluster; fault tolerance; checkpoint; rollback recovery
DOI
10.1109/ISPA.2008.35
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Compared with wired distributed computing systems, cluster-based ad-hoc wireless networks have certain new characteristics. The probability of a transient failure in a computing process increases greatly as the system scale grows. If a process fails and no appropriate protection mechanism is in place, restarting the program incurs additional cost. This paper presents an efficient checkpointing and rollback recovery scheme based on CMMP. The fault-tolerant scheme cooperates seamlessly with cluster-based multi-channel ad-hoc wireless networks. Only a small number of coordination messages are exchanged between a cluster head and its ordinary members. The recovery scheme is free of the domino effect, and a failed process can roll back to its latest local consistent checkpoint. Simulation results show that the proposed scheme recovers quickly from transient failures and incurs only a low additional overhead.
Pages: 371-378
Number of pages: 8