Instruction-Level Fault Tolerance Configurability

Cited by: 6
Authors
Borodin, Demid [1]
Juurlink, B. H. H. [1]
Hamdioui, Said [1]
Vassiliadis, Stamatis [1]
Affiliation
[1] Delft Univ Technol, Comp Engn Lab, Fac Elect Engn Math & Comp Sci, NL-2628 CD Delft, Netherlands
Source
JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY | 2009, Vol. 57, No. 1
Keywords
Fault tolerance; Reliability; Performance; Energy consumption; Instruction-level configurability; ERROR-DETECTION; WATCHDOG PROCESSORS;
DOI
10.1007/s11265-008-0175-9
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Due to technology trends such as decreasing feature sizes and lower voltage levels, fault tolerance (FT) is becoming increasingly important in computing systems. Several schemes have been proposed that let a user configure FT at the application level, trading stronger FT for performance or vice versa. In this paper, we propose supporting instruction-level rather than application-level configurability of FT, since different parts of some applications (e.g., multimedia) can have different reliability requirements. Weak or no FT is applied to less critical parts, yielding time and/or resource gains. These gains can be used to apply stronger FT techniques to the more critical parts, thereby increasing overall reliability. The paper shows how some existing FT techniques can be adapted to support instruction-level FT configurability, how a programmer can specify the desired FT level of instructions, and how the compiler can manage it automatically. The existing FT scheme EDDI (which duplicates all instructions) is compared with the proposed approach at both the kernel level and the full-application level. The simulation results show that both performance and energy consumption improve significantly (by up to 50% at the kernel level and up to 16% at the full-application level), while fault coverage depends on the application. For the full application (a JPEG encoder), our approach is applied to only one kernel to avoid significantly increasing the programming effort.
Pages: 89-105
Page count: 17