An adaptive programming model for fault-tolerant distributed computing

Cited by: 14
Authors
Gorender, Sergio
Macedo, Raimundo Jose de Araujo
Raynal, Michel
Affiliations
[1] Univ Fed Bahia, Dept Comp Sci, Distributed Syst Lab, BR-40170110 Salvador, BA, Brazil
[2] Univ Rennes 1, IRISA, F-35042 Rennes, France
Keywords
adaptability; asynchronous/synchronous distributed system; consensus; distributed computing model; fault tolerance; quality of service;
DOI
10.1109/TDSC.2007.3
Chinese Library Classification
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
The capability of dynamically adapting to distinct runtime conditions is an important issue when designing distributed systems where negotiated quality of service (QoS) cannot always be delivered between processes. Providing fault tolerance for such dynamic environments is a challenging task. Considering such a context, this paper proposes an adaptive programming model for fault-tolerant distributed computing, which provides upper-layer applications with process state information according to the current system synchrony (or QoS). The underlying system model is hybrid, composed of a synchronous part (where there are time bounds on processing speed and message delay) and an asynchronous part (where there is no time bound). However, such a composition can vary over time, and, in particular, the system may become totally asynchronous (e.g., when the underlying system QoS degrades) or totally synchronous. Moreover, processes are not required to share the same view of the system synchrony at a given time. To illustrate what can be done in this programming model and how to use it, the consensus problem is taken as a benchmark problem. This paper also presents an implementation of the model that relies on a negotiated quality of service (QoS) for communication channels.
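As a rough illustration of what "process state information according to the current system synchrony" could look like, the sketch below classifies peers from one process's local point of view. It is not taken from the paper: the names `PeerObservation`, `classify`, `timely`, `last_heard`, and `bound`, and the three-way split into live, down, and uncertain, are all assumptions made for this example. The idea is only that a silent peer can be declared crashed when its channel currently has a time bound, and that nothing definite can be said otherwise.

```python
from dataclasses import dataclass


@dataclass
class PeerObservation:
    # Hypothetical record kept locally by one process for each peer.
    timely: bool       # assumed flag: channel to the peer currently has bounded delay
    last_heard: float  # local time at which the last message from the peer arrived


def classify(peers, now, bound):
    """Split peers into (live, down, uncertain) sets.

    `bound` is an assumed upper limit on the silence allowed over a
    timely channel; it stands in for the processing and message-delay
    bounds of the synchronous part of the system.
    """
    live, down, uncertain = set(), set(), set()
    for pid, obs in peers.items():
        if not obs.timely:
            # Asynchronous part: silence proves nothing about the peer.
            uncertain.add(pid)
        elif now - obs.last_heard <= bound:
            live.add(pid)
        else:
            # Synchronous part: silence beyond the bound implies a crash.
            down.add(pid)
    return live, down, uncertain


peers = {
    "p1": PeerObservation(timely=True, last_heard=9.5),
    "p2": PeerObservation(timely=True, last_heard=2.0),
    "p3": PeerObservation(timely=False, last_heard=9.9),
}
live, down, uncertain = classify(peers, now=10.0, bound=1.0)
```

Because each process runs this classification on its own observations, two processes may hold different views at the same moment, and a peer moves between sets as the `timely` flag changes with the delivered QoS.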
Pages: 18-31
Page count: 14
Related papers
50 in total
  • [21] FAULT-TOLERANT PROGRAMMING FOR NETWORK-BASED PARALLEL COMPUTING
    CLEMATIS, A
    MICROPROCESSING AND MICROPROGRAMMING, 1994, 40 (10-12): : 765 - 768
  • [22] PROGRAMMING LANGUAGE SUPPORT FOR WRITING FAULT-TOLERANT DISTRIBUTED SOFTWARE
    SCHLICHTING, RD
    THOMAS, VT
    IEEE TRANSACTIONS ON COMPUTERS, 1995, 44 (02) : 203 - 212
  • [23] LOAD-LEVELING IN FAULT-TOLERANT DISTRIBUTED COMPUTING SYSTEMS
    PATNAIK, LM
    IYER, KV
    IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 1986, 12 (04) : 554 - 560
  • [24] Fault-Tolerant Parallel and Distributed Computing for Software Engineering Undergraduates
    Ebnenasir, Ali
    Mayo, Jean
    2015 IEEE 29TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS, 2015, : 788 - 794
  • [25] Deterministic Fault-Tolerant Distributed Computing in Linear Time and Communication
    Chlebus, Bogdan S.
    Kowalski, Dariusz R.
    Olkowski, Jan
    PROCEEDINGS OF THE 2023 ACM SYMPOSIUM ON PRINCIPLES OF DISTRIBUTED COMPUTING, PODC 2023, 2023, : 344 - 354
  • [26] MARKOV RELIABILITY MODELS OF FAULT-TOLERANT DISTRIBUTED COMPUTING SYSTEMS
    LIRON, M
    MELAMED, B
    YAU, SS
    INFORMATION SCIENCES, 1986, 40 (03) : 183 - 206
  • [27] Fault-tolerant distributed computing in full-information networks
    Goldwasser, Shafi
    Pavlov, Elan
    Vaikuntanathan, Vinod
    47TH ANNUAL IEEE SYMPOSIUM ON FOUNDATIONS OF COMPUTER SCIENCE, PROCEEDINGS, 2006, : 15 - +
  • [28] Reliability modeling and optimization for distributed fault-tolerant computing systems
    Albeanu, G
    Popentiu-Vladicescu, F
    Serbanescu, L
    SAFETY AND RELIABILITY, VOLS 1 AND 2, 2003, : 19 - 24
  • [29] Reconciling fault-tolerant distributed computing and systems-on-chip
    Fuegger, Matthias
    Schmid, Ulrich
    DISTRIBUTED COMPUTING, 2012, 24 (06) : 323 - 355
  • [30] Reconciling fault-tolerant distributed computing and systems-on-chip
    Matthias Függer
    Ulrich Schmid
    Distributed Computing, 2012, 24 : 323 - 355