An agent oriented proactive fault-tolerant framework for grid computing

Cited by: 7
Authors
Huda, MT [1]
Schmidt, HW [1]
Peake, ID [1]
Affiliations
[1] Monash Univ, Ctr Distributed Syst & Software Engn, Melbourne, Vic 3004, Australia
Source
First International Conference on e-Science and Grid Computing, Proceedings | 2005
DOI
10.1109/E-SCIENCE.2005.15
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Because of the heterogeneity, scale and complexity of computational grids, faults become likely. Grid infrastructure must therefore have mechanisms to deal with faults while also providing efficient and reliable services to its end users. Existing fault-tolerant approaches are inefficient because they are reactive and incomplete. They are reactive because they only deal with faults when they take place; they are incomplete because they only deal with certain types of faults. Proactive approaches increase efficiency by maintaining the state of executing applications and resuming operation when rescheduled, which reduces the cost and time of operations and network resource usage. This paper presents an agent oriented, fault-tolerant grid framework where agents deal with individual faults proactively. Agents maintain information about hardware conditions, executing process memory consumption, available resources, network conditions and component mean time to failure. Based on this information and critical states, agents can improve the reliability and efficiency of grid services.
Pages: 304-311
Number of pages: 8