Meeting Soft Deadlines in Scientific Workflows Using Resubmission Impact

Cited by: 34
Authors
Plankensteiner, Kassian [1]
Prodan, Radu [1]
Affiliations
[1] Univ Innsbruck, Inst Comp Sci, A-6020 Innsbruck, Austria
Keywords
Scientific workflows; fault tolerance; scheduling; cloud computing; grid computing
DOI
10.1109/TPDS.2011.221
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
We propose a new heuristic called Resubmission Impact to support fault-tolerant execution of scientific workflows in heterogeneous parallel and distributed computing environments. In contrast to related approaches, our method can be used effectively in new or unfamiliar environments, even in the absence of historical executions or failure trace models. On top of this method, we propose a dynamic enactment and rescheduling heuristic that executes workflows with a high degree of fault tolerance while taking soft deadlines into account. Simulated experiments with three real-world workflows in the Austrian Grid demonstrate that our method significantly reduces resource waste compared to conservative task replication and resubmission techniques, while achieving a comparable makespan and only a slight decrease in success probability. In addition, the dynamic enactment method meets soft deadlines in faulty environments in the absence of historical failure trace information or models.
Pages: 890-901
Page count: 12