Contention-Aware Reliability Management Scheme for Parallel Tasks Scheduling in Heterogeneous Computing Systems

Cited by: 0
Authors
Zhang, Longxin [1]
Li, Kenli [2]
Wen, Zhicheng [1]
Peng, Cheng [1]
Li, Keqin [2, 3]
Affiliations
[1] Hunan Univ Technol, Coll Comp & Commun, Zhuzhou 412007, Peoples R China
[2] Hunan Univ, Coll Comp Sci & Elect Engn, Natl Supercomp Ctr Changsha, Changsha 410082, Hunan, Peoples R China
[3] SUNY Coll New Paltz, Dept Comp Sci, New Paltz, NY 12561 USA
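Source
2016 SEVENTH INTERNATIONAL GREEN AND SUSTAINABLE COMPUTING CONFERENCE (IGSC) | 2016
Funding
National Natural Science Foundation of China; Foreign Science and Technology Cooperation Program (International Science and Technology Projects);
Keywords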
Contention-aware; parallel tasks; reliability; task scheduling; ENERGY; TIME;
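DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract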
Energy efficiency and high system reliability are two primary measures in modern high-performance computing. Most recent studies focus heavily on reducing the energy consumption or execution time of parallel task scheduling. In addition, these approaches are proposed for the classic scheduling model. It is increasingly recognized that the contention model is more realistic and helps to create accurate and efficient schedules. This paper presents a contention-aware reliability management algorithm for parallel task scheduling in heterogeneous computing systems. Extensive experiments are performed to evaluate the algorithm. The results demonstrate that our algorithm significantly improves system reliability.
Pages: 6
Related papers
12 in total
[1]   Matching and scheduling algorithms for minimizing execution time and failure probability of applications in heterogeneous computing [J].
Dogan, A ;
Özgüner, F .
IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2002, 13 (03) :308-323
[2]   MEASUREMENT AND MODELING OF COMPUTER RELIABILITY AS AFFECTED BY SYSTEM ACTIVITY [J].
IYER, RK ;
ROSSETTI, DJ ;
HSUEH, MC .
ACM TRANSACTIONS ON COMPUTER SYSTEMS, 1986, 4 (03) :214-237
[3]   Energy Conscious Scheduling for Distributed Computing Systems under Different Operating Conditions [J].
Lee, Young Choon ;
Zomaya, Albert Y. .
IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2011, 22 (08) :1374-1381
[4]   Energy and time constrained task scheduling on multiprocessor computers with discrete speed levels [J].
Li, Keqin .
JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2016, 95 :15-28
[5]   Scheduling Precedence Constrained Tasks with Reduced Processor Energy on Multiprocessor Computers [J].
Li, Keqin .
IEEE TRANSACTIONS ON COMPUTERS, 2012, 61 (12) :1668-1681
[6]   A Survey of Techniques for Modeling and Improving Reliability of Computing Systems [J].
Mittal, Sparsh ;
Vetter, Jeffrey S. .
IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2016, 27 (04) :1226-1238
[7]   Contention Aware Energy Efficient Scheduling on Heterogeneous Multiprocessors [J].
Singh, Jagpreet ;
Betha, Sandeep ;
Mangipudi, Bhargav ;
Auluck, Nitin .
IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2015, 26 (05) :1251-1264
[8]   A hierarchical reliability-driven scheduling algorithm in grid systems [J].
Tang, Xiaoyong ;
Li, Kenli ;
Qiu, Meikang ;
Sha, Edwin H. -M. .
JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2012, 72 (04) :525-535
[9]   Maximizing reliability with energy conservation for parallel task scheduling in a heterogeneous cluster [J].
Zhang, Longxin ;
Li, Kenli ;
Xu, Yuming ;
Mei, Jing ;
Zhang, Fan ;
Li, Keqin .
INFORMATION SCIENCES, 2015, 319 :113-131
[10]   Zhang Y, 2003, DESIGN, AUTOMATION AND TEST IN EUROPE CONFERENCE AND EXHIBITION, PROCEEDINGS, P918