A fault-tolerant scheduling system for computational grids

Cited by: 23
Author
Amoon, Mohammed [1, 2]
Affiliations
[1] Menoufia Univ, Fac Elect Eng, Comp Sci & Eng Dept, Menoufia, Egypt
[2] King Saud Univ, RCC, Dept Comp Sci, Riyadh 11437, Saudi Arabia
Keywords
Grid computing; Response time (computer systems); Fault tolerance; Failure analysis
DOI
10.1016/j.compeleceng.2011.11.004
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Fault-tolerant scheduling is an important issue for computational grid systems, as grids typically consist of strongly varying and geographically distributed resources. The main scheduling strategy of most fault-tolerant scheduling systems depends on the response time and fault index when selecting a resource to execute a certain job. In this paper, a scheduling system is presented that depends on a new factor, called the scheduling indicator, in selecting resources. This factor comprises the response time and the failure rate of grid resources. Whenever a grid scheduler has jobs to schedule on grid resources, it uses the scheduling indicator to generate the scheduling decisions. The main scheduling strategy of the system is to select resources that have the lowest tendency to fail. Extensive simulation experiments are conducted to quantify the performance of the proposed system. Experiments have shown that the proposed system can considerably improve grid performance in terms of throughput, unavailability, turnaround time, and fail tendency. © 2011 Elsevier Ltd. All rights reserved.
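The abstract does not give the formula for the scheduling indicator, so the following is only an illustrative sketch of the general idea: combine a resource's response time and failure rate into one score and pick the resource with the lowest score. The `Resource` class, the weighted-sum form, the normalization, and the `weight` parameter are all assumptions made for this example and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """Hypothetical grid resource record (field names are assumptions)."""
    name: str
    response_time: float  # expected time to complete the job, in seconds
    failure_rate: float   # observed fraction of failed jobs, in [0, 1]


def scheduling_indicator(resource, max_response_time, weight=0.5):
    """Assumed indicator: weighted sum of normalized response time and failure rate.

    Lower is better. The paper's actual definition may differ.
    """
    normalized_time = resource.response_time / max_response_time
    return weight * normalized_time + (1.0 - weight) * resource.failure_rate


def select_resource(resources, weight=0.5):
    """Return the resource with the lowest assumed scheduling indicator."""
    if not resources:
        raise ValueError("no resources available")
    max_response_time = max(r.response_time for r in resources)
    if max_response_time <= 0:
        raise ValueError("response times must be positive")
    return min(
        resources,
        key=lambda r: scheduling_indicator(r, max_response_time, weight),
    )


resources = [
    Resource("a", response_time=10.0, failure_rate=0.5),
    Resource("b", response_time=20.0, failure_rate=0.1),
    Resource("c", response_time=5.0, failure_rate=0.9),
]
chosen = select_resource(resources)
```

Setting `weight=0.0` in this sketch ranks resources by failure rate alone, while `weight=1.0` ranks them by response time alone; intermediate values trade the two off.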
Pages: 399-412
Page count: 14
Related papers
18 in total
[1]  
Abawajy J, 2004, P 18 IEEE INT PAR DI
[2]   An ant algorithm for balanced job scheduling in grids [J].
Chang, Ruay-Shiung ;
Chang, Jih-Sheng ;
Lin, Po-Sheng .
FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2009, 25 (01) :20-27
[3]  
Chtepen M, 2006, PROCEEDINGS OF THE 18TH IASTED INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED COMPUTING AND SYSTEMS, P622
[4]   Static strategy and dynamic adjustment: An effective method for Grid task scheduling [J].
Huang, Peijie ;
Peng, Hong ;
Lin, Piyuan ;
Li, Xuezhen .
FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2009, 25 (08) :884-892
[5]   An agent oriented proactive fault-tolerant framework for grid computing [J].
Huda, MT ;
Schmidt, HW ;
Peake, ID .
First International Conference on e-Science and Grid Computing, Proceedings, 2005, :304-311
[6]  
Jiang C, 2010, P 3 INT S EL COMM SE
[7]   Fault-tolerant grid architecture and practice [J].
Jin, H ;
Zou, DQ ;
Chen, HH ;
Sun, JH ;
Wu, S .
JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2003, 18 (04) :423-433
[8]   Performance evaluation of fault tolerance techniques in grid computing system [J].
Khan, Fiaz Gul ;
Qureshi, Kalim ;
Nazir, Babar .
COMPUTERS & ELECTRICAL ENGINEERING, 2010, 36 (06) :1110-1122
[9]   A resource management and fault tolerance services in grid computing [J].
Lee, HM ;
Chung, KS ;
Chin, SH ;
Lee, JH ;
Lee, DW ;
Park, S ;
Yu, HC .
JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2005, 65 (11) :1305-1317
[10]   Resource scheduling with conflicting objectives in grid environments: Model and evaluation [J].
Li Chunlin ;
Xiu, Zhong Jin ;
Li Layuan .
JOURNAL OF NETWORK AND COMPUTER APPLICATIONS, 2009, 32 (03) :760-769