A hierarchical reliability-driven scheduling algorithm in grid systems

Times cited: 106
Authors
Tang, Xiaoyong [1]
Li, Kenli [1]
Qiu, Meikang [2]
Sha, Edwin H.-M. [1, 3]
Affiliations
[1] Hunan Univ, Sch Informat Sci & Engn, Natl Supercomp Ctr Changsha, Changsha 410082, Hunan, Peoples R China
[2] Univ Kentucky, Lexington, KY 40506 USA
[3] Univ Texas Dallas, Dept Comp Sci, Dallas, TX 75230 USA
Funding
National Science Foundation (USA); National Natural Science Foundation of China
Keywords
Grid computing; Hierarchical; Scheduling algorithm; Reliability; Application; Task-allocation algorithms; Independent tasks; Maximizing reliability; Performance; Model
DOI
10.1016/j.jpdc.2011.12.004
Chinese Library Classification number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
In a Grid computing system, many distributed scientific and engineering applications often require multi-institutional collaboration, large-scale resource sharing, wide-area communication, etc. Applications executing in such systems inevitably encounter different types of failures such as hardware failure, program failure, and storage failure. One way of taking failures into account is to employ a reliable scheduling algorithm. However, most existing Grid scheduling algorithms do not adequately consider the reliability requirements of an application. In recognition of this problem, we design a hierarchical reliability-driven scheduling architecture that includes both a local scheduler and a global scheduler. The local scheduler aims to effectively measure task reliability of an application in a Grid virtual node and incorporate the precedence constrained tasks' reliability overhead into a heuristic scheduling algorithm. In the global scheduler, we propose a hierarchical reliability-driven scheduling algorithm based on quantitative evaluation of independent application reliability. Our experiments, based on both randomly generated graphs and the graphs of some real applications, show that our hierarchical scheduling algorithm performs much better than the existing scheduling algorithms in terms of system reliability, schedule length, and speedup. (C) 2011 Elsevier Inc. All rights reserved.
Pages: 525-535
Page count: 11