Application-level fault tolerance as a complement to system-level fault tolerance

Cited by: 14
Authors
Haines, J [1]
Lakamraju, V [1]
Koren, I [1]
Krishna, CM [1]
Affiliations
[1] Univ Massachusetts, Dept Elect & Comp Engn, Amherst, MA 01003 USA
Keywords
distributed real-time systems; fault tolerance; checkpointing; imprecise computation; target tracking; beam forming
DOI
10.1023/A:1008181429693
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
As multiprocessor systems become more complex, their reliability will need to increase as well. In this paper we propose a novel technique which is applicable to a wide variety of distributed real-time systems, especially those exhibiting data parallelism. System-level fault tolerance involves reliability techniques incorporated within the system hardware and software whereas application-level fault tolerance involves reliability techniques incorporated within the application software. We assert that, for high reliability, a combination of system-level fault tolerance and application-level fault tolerance works best. In many systems, application-level fault tolerance can be used to bridge the gap when system-level fault tolerance alone does not provide the required reliability. We exemplify this with the RTHT target tracking benchmark and the ABF beamforming benchmark.
Pages: 53-68
Page count: 16
Related papers
50 in total
  • [22] A decentralized fault tolerance model based on level of performance for grid environment
    Rebbah, Mohammed
    Slimani, Yahya
    Benyettou, Abdelkader
    Brunie, Lionel
    CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2016, 19 (01): : 13 - 27
  • [23] RedThreads: An Interface for Application-Level Fault Detection/Correction Through Adaptive Redundant Multithreading
    Hukerikar, Saurabh
    Teranishi, Keita
    Diniz, Pedro C.
    Lucas, Robert F.
    INTERNATIONAL JOURNAL OF PARALLEL PROGRAMMING, 2018, 46 (02) : 225 - 251
  • [25] System Implementation of AUSF Fault Tolerance
    Chen, Wei-Sheng
    Leu, Fang-Yie
    Susanto, Heru
    ADVANCES ON BROAD-BAND WIRELESS COMPUTING, COMMUNICATION AND APPLICATIONS, 2020, 97 : 678 - 687
  • [26] A Simple Dual Three-Level Inverter Topology With Improved Fault Tolerance
    Karthik, A.
    Loganathan, Umanand
    IEEE JOURNAL OF EMERGING AND SELECTED TOPICS IN POWER ELECTRONICS, 2021, 9 (05) : 5954 - 5961
  • [27] Enhancing NetBeans with Transparent Fault Tolerance Using Meta-Level Architecture
    Rytter, Martin
    Jorgensen, Bo Norregaard
    JOURNAL OF OBJECT TECHNOLOGY, 2010, 9 (05): : 55 - 73
  • [28] Fault Tolerance and Overmodulation Algorithm for Three Level Neutral Point Clamped Inverters
    Zhang, Ming
    Li, Ze
    Guo, Yuanbo
    Zhang, Xiaohua
    PROCEEDINGS OF THE 36TH CHINESE CONTROL CONFERENCE (CCC 2017), 2017, : 7358 - 7363
  • [29] System-level fault diagnosis in fixed topology mobile ad hoc networks
    Sahoo, Manmath Narayan
    Khilar, Pabitra Mohan
    INTERNATIONAL JOURNAL OF COMMUNICATION NETWORKS AND DISTRIBUTED SYSTEMS, 2013, 10 (03) : 216 - 232
  • [30] Comparison-Based System-Level Fault Diagnosis: A Neural Network Approach
    Elhadef, Mourad
    Nayak, Amiya
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2012, 23 (06) : 1047 - 1059