Exploiting Redundancies to Enhance Schedulability in Fault-Tolerant and Real-Time Distributed Systems

被引:27
作者
Luo, Wei [1 ]
Qin, Xiao [2 ]
Tan, Xian-Chun [1 ]
Qin, Ke [1 ]
Manzanares, Adam [2 ]
机构
[1] China Ship Dev & Design Ctr, Dept Informat Syst, Wuhan 430064, Peoples R China
[2] Auburn Univ, Dept Comp Sci & Software Engn, Auburn, AL 36849 USA
来源
IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART A-SYSTEMS AND HUMANS | 2009年 / 39卷 / 03期
基金
中国国家自然科学基金; 美国国家科学基金会;
关键词
Distributed systems; fault tolerance; rate-monotonic (RM) algorithm; real-time task scheduling; primary-backup copy; SCHEDULING ALGORITHM; APERIODIC TASKS; ASSIGNMENT; RECOVERY; SOFT;
D O I
10.1109/TSMCA.2009.2013192
中图分类号
TP3 [计算技术、计算机技术];
学科分类号
0812 ;
摘要
In the past decades, distributed systems have been widely applied to real-time applications, most of which have fault-tolerance requirements to assure high reliability. Due to the stringent space constraints of real-time systems, the issue of schedulability becomes a major concern in the design of fault-tolerant and real-time distributed systems. Most existing real-time and fault-tolerant scheduling algorithms, which are based on the primary-backup scheme for periodic real-time tasks, introduce unnecessary redundancies by aggressively using active-backup copies. To solve this problem, we propose two novel fault-tolerant techniques, which are seamlessly integrated with fixed-priority-based scheduling algorithms. These techniques leverage redundancies to enhance schedulability in fault-tolerant and real-time distributed systems. Our fault-tolerant techniques make use of the primary-backup scheme to tolerate permanent hardware failures. The first technique (referred to as Tercos) terminates the execution of active-backup copies, when corresponding primary copies are successfully completed. Tercos is designed to reduce scheduling lengths in fault-free scenarios to enhance schedulability by virtue of executing portions of active-backup copies in passive forms. The second technique (referred to as Debus) uses a deferred-active-backup scheme to further minimize schedule lengths to improve the schedulability performance. Debus schedules active-backup copies as late as possible, while terminating active-backup copies when their primary copies are completed. Experimental results show that, compared with existing algorithms in literature, Tercos can significantly improve schedulability by up to 17.0% (with an average of 9.7%). Furthermore, empirical results reveal that Debus can enhance schedulability over Tercos by up to 12% (with an average of 7.8%).
引用
收藏
页码:626 / 639
页数:14
相关论文
共 37 条
[1]   An adaptive scheme for fault-tolerant scheduling of soft real-time tasks in multiprocessor systems [J].
Al-Omari, R ;
Somani, AK ;
Manimaran, G .
JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2005, 65 (05) :595-608
[2]  
ALOMARI R, 2001, P INT PAR PROC S SAN
[3]  
[Anonymous], 1996, HDB SOFTWARE RELIABI
[4]  
Bernat G., 1998, THESIS U ILLES BALEA
[5]   Fault-tolerant rate-monotonic first-fit scheduling in hard-real-time systems [J].
Bertossi, AA ;
Mancini, LV ;
Rossini, F .
IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 1999, 10 (09) :934-945
[6]   NEW STRATEGIES FOR ASSIGNING REAL-TIME TASKS TO MULTIPROCESSOR SYSTEMS [J].
BURCHARD, A ;
LIEBEHERR, J ;
OH, YF ;
SON, SH .
IEEE TRANSACTIONS ON COMPUTERS, 1995, 44 (12) :1429-1442
[7]   Optimal scheduling for fault-tolerant and firm real-time systems [J].
Caccamo, M ;
Buttazzo, G .
FIFTH INTERNATIONAL CONFERENCE ON REAL-TIME COMPUTING SYSTEMS AND APPLICATIONS, PROCEEDINGS, 1998, :223-231
[8]  
Davis R, 1995, IEEE REAL TIME, P100, DOI 10.1109/REAL.1995.495200
[9]   REAL-TIME SCHEDULING PROBLEM [J].
DHALL, SK ;
LIU, CL .
OPERATIONS RESEARCH, 1978, 26 (01) :127-140
[10]   Modeling and analysis of real-time cooperative systems using Petri nets [J].
Du, YuYue ;
Jiang, ChangJun ;
Zhou, MengChu .
IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART A-SYSTEMS AND HUMANS, 2007, 37 (05) :643-654