A dynamic shadow approach to fault-tolerant mobile agents in an autonomic environment

Cited by: 4
Authors
Xu, J [1]
Pears, S [1]
Affiliation
[1] Univ Leeds, Sch Comp, Leeds LS2 9JT, W Yorkshire, England
Keywords
autonomic computing; exception handling; fault tolerance; mobile agents; performance evaluation; server crash failures
DOI
10.1007/s11241-005-4682-5
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Large-scale distributed applications, such as online information retrieval and collaboration over computational elements, demand self-managed computing systems that need a minimum of human intervention. However, large scale and full distribution often lead to poor system dependability and security, and make it harder to manage and control redundancy for fault tolerance. In particular, fault tolerance schemes that let mobile agents survive agent server crash failures in an autonomic environment are complex, since developers normally have no control over remote agent servers. Some solutions place a replica of the agent in stable storage when it arrives at an agent server, but if that agent server crashes the replica is unavailable until the server recovers. In this paper we present a failure model and an exception handling framework for mobile agent systems. An exception handling scheme is developed that allows mobile agents to survive agent server crash failures. A replica mobile agent operates at the agent server visited immediately before its master's current location. If the master is lost in a crash, its replica is available as a replacement. The proposed scheme is compared with a simple time-out scheme. Experimental evaluation shows that the scheme adds some overhead to the round trip time when fault tolerance measures are exercised. However, the scheme has the advantage that fault tolerance is provided during the mobile agent's trip, i.e. in the event of an agent server crash, not all agent servers are revisited.
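The shadow idea described in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' implementation: the function names `shadow_trip` and `timeout_trip`, the `"home"` starting location, the rule that a crashed server is simply skipped, and the model of the time-out scheme as a full relaunch from the start are all hypothetical choices made for illustration.

```python
from typing import List, Set, Tuple


def shadow_trip(
    itinerary: List[str], crashed: Set[str]
) -> Tuple[List[str], List[str], List[Tuple[str, str]]]:
    """Simulate a trip in which a replica (shadow) trails the master by one server.

    Returns (servers where work completed, servers visited,
    recoveries as (crashed server, server the shadow recovered from)).
    """
    completed: List[str] = []
    visits: List[str] = []
    recoveries: List[Tuple[str, str]] = []
    shadow_at = "home"  # assumption: the first shadow waits at a home server
    for server in itinerary:
        visits.append(server)
        if server in crashed:
            # The master is lost with the server. The shadow at the previous
            # server takes over and (assumption) skips the crashed server.
            recoveries.append((server, shadow_at))
            continue
        completed.append(server)
        shadow_at = server  # a new shadow stays here when the master moves on
    return completed, visits, recoveries


def timeout_trip(
    itinerary: List[str], crashed: Set[str]
) -> Tuple[List[str], List[str]]:
    """Simulate a simple time-out scheme (assumed model): after a time-out the
    agent is relaunched from the start, avoiding servers known to have crashed."""
    visits: List[str] = []
    known_bad: Set[str] = set()
    while True:
        completed: List[str] = []
        failed = False
        for server in itinerary:
            if server in known_bad:
                continue
            visits.append(server)
            if server in crashed:
                known_bad.add(server)  # learned only after the time-out fires
                failed = True
                break
            completed.append(server)
        if not failed:
            return completed, visits


if __name__ == "__main__":
    route = ["A", "B", "C", "D"]
    done, seen, recovered = shadow_trip(route, {"C"})
    print("shadow scheme visits:", seen, "recoveries:", recovered)
    t_done, t_seen = timeout_trip(route, {"C"})
    print("time-out scheme visits:", t_seen)
```

With server C crashed on a four-server route, the shadow model makes four visits and recovers from B, while the assumed time-out model makes six visits because A and B are visited again. The sketch counts visits only; it does not model the round trip time overhead that the abstract reports for the shadow scheme.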
Pages: 235-252
Number of pages: 18