An adaptive programming model for fault-tolerant distributed computing

Cited by: 14
Authors
Gorender, Sergio
Macedo, Raimundo Jose de Araujo
Raynal, Michel
Affiliations
[1] Univ Fed Bahia, Dept Comp Sci, Distributed Syst Lab, BR-40170110 Salvador, BA, Brazil
[2] Univ Rennes 1, IRISA, F-35042 Rennes, France
Keywords
adaptability; asynchronous/synchronous distributed system; consensus; distributed computing model; fault tolerance; quality of service
DOI
10.1109/TDSC.2007.3
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The capability of dynamically adapting to distinct runtime conditions is an important issue when designing distributed systems where negotiated quality of service (QoS) cannot always be delivered between processes. Providing fault tolerance for such dynamic environments is a challenging task. In this context, this paper proposes an adaptive programming model for fault-tolerant distributed computing, which provides upper-layer applications with process state information according to the current system synchrony (or QoS). The underlying system model is hybrid, composed of a synchronous part (where there are time bounds on processing speed and message delay) and an asynchronous part (where there is no time bound). However, this composition can vary over time and, in particular, the system may become totally asynchronous (e.g., when the underlying system QoS degrades) or totally synchronous. Moreover, processes are not required to share the same view of the system synchrony at a given time. To illustrate what can be done in this programming model and how to use it, the consensus problem is taken as a benchmark problem. This paper also presents an implementation of the model that relies on a negotiated quality of service (QoS) for communication channels.
Pages: 18-31
Page count: 14