Fault-Tolerance Mechanism of Computation Grid Service System Based on Mobile Agent

Cited by: 0
Authors
Zhang, Zhirou [1]
Li, Ying [2]
Affiliations
[1] N China Elect Power Univ, Network & Informat Ctr, Beijing 102206, Peoples R China
[2] Commun Univ China, Sch Comp Sci, Beijing 100024, Peoples R China
Source
2008 ISECS INTERNATIONAL COLLOQUIUM ON COMPUTING, COMMUNICATION, CONTROL, AND MANAGEMENT, VOL 1, PROCEEDINGS | 2008
DOI
10.1109/CCCM.2008.39
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Building a computation grid service system from an organization's idle computers to provide computation services for mobile agents reduces spending on high-performance computing and makes full use of idle resources. However, a fault-tolerance mechanism is needed to keep computation tasks running when nodes or networks in the system fail. This paper studies the three main parts of the system's fault-tolerance mechanism and proposes an adaptive fault-detection mechanism; a non-close, non-blocking, low-overhead checkpointing mechanism; and a partial rollback mechanism based on communication domains, all of which reduce the overhead of fault tolerance. Experiments have shown their advantages.
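The abstract does not describe how the adaptive fault-detection mechanism works. As a generic illustration only, and not the authors' method, the following Python sketch shows one common way to make heartbeat-based failure detection adaptive: the timeout is recomputed from recently observed heartbeat intervals. The class name `AdaptiveHeartbeatDetector`, its parameters (`window`, `safety_factor`, `initial_timeout`), and the mean-plus-spread rule are all assumptions introduced for this example.

```python
from collections import deque
from statistics import mean, pstdev


class AdaptiveHeartbeatDetector:
    """Hypothetical detector: suspects a node when its next heartbeat is
    later than a timeout derived from recent heartbeat intervals."""

    def __init__(self, window=10, safety_factor=3.0, initial_timeout=5.0):
        self.intervals = deque(maxlen=window)  # recent inter-arrival times (s)
        self.safety_factor = safety_factor     # how many spreads to tolerate
        self.initial_timeout = initial_timeout # used until enough samples exist
        self.last_arrival = None

    def heartbeat(self, now):
        """Record a heartbeat received at time `now` (seconds)."""
        if self.last_arrival is not None:
            self.intervals.append(now - self.last_arrival)
        self.last_arrival = now

    def timeout(self):
        """Current timeout: mean interval plus a multiple of its spread."""
        if len(self.intervals) < 2:
            return self.initial_timeout
        return mean(self.intervals) + self.safety_factor * pstdev(self.intervals)

    def suspected(self, now):
        """True if no heartbeat has arrived within the current timeout."""
        if self.last_arrival is None:
            return False  # nothing observed yet, so nothing to suspect
        return now - self.last_arrival > self.timeout()
```

On a steady network the timeout tightens toward the observed heartbeat period, so failures are noticed quickly; on a jittery network the spread term widens the timeout, which reduces false suspicions at the cost of slower detection.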
Pages: 161 / +
Page count: 3